# HIPXX: H3Dex-based PoC Targeting

- Author(s): [@Vagabond](https://github.com/Vagabond),
[@vihu](https://github.com/vihu), et al
- Start Date: 2022-02-02
- Category: Technical
- Original HIP PR: <!-- leave this empty; maintainer will fill in ID of this pull request -->
- Tracking Issue: <!-- leave this empty; maintainer will create a discussion issue -->
- Status: Draft

# Summary
[summary]: #summary

This HIP serves as both an explanation of the current Proof-of-Coverage (PoC)
targeting behaviour as well as a proposal for a more scalable replacement using
an H3-based index. We are proposing it as a HIP to communicate and acknowledge
that this is a _change_ to the current implementation but we believe it still
falls within the original intent of PoC.

# Motivation
[motivation]: #motivation

The current targeting mechanism relies on a global list of asserted Hotspots.
This list, as the network grows, is increasingly expensive to maintain and
examine. Additionally, this list does not support garbage collection, so
inactive Hotspots are counted towards targeting probability, leading to
targeting skew in hexes with many inactive Hotspots.

An example of such mis-targeting is here.

<!-- TODO: insert link to hex in india with 400 dead Hotspots -->

The goal of this proposed work is to maintain the existing targeting semantics
as closely as possible but rework them to operate on a more scalable data
structure. A secondary goal is to reduce the active use of this edge case in
order to enable more frequent PoC activity in an area. We don't believe this
particular edge case is being exploited in a widespread manner (although we do
see a few instances of it). With the publishing of this HIP, however, we believe
it may become more interesting to arbitrageurs.

# Stakeholders
[stakeholders]: #stakeholders

All Hotspot owners are directly affected by this HIP. It should improve fairness
in targeting for PoC but may affect those Hotspot deployers who are taking
advantage of the current mis-targeting behaviour.

# Detailed Explanation
[detailed-explanation]: #detailed-explanation

We will begin by explaining the current targeting behaviour and then explain the
proposed changes.

The first thing to grasp here is that the `poc_request` is not user-submissible.
It is generated by the miner itself. It depends on the `poc_challenge_interval`
chain variable which determines if the miner is eligible to write a challenge to
the blockchain.
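
The eligibility gate can be sketched roughly as follows. This is an illustrative model only, not the actual Erlang logic in `miner_poc_statem.erl`; the function name and parameters are hypothetical.

```python
def eligible_to_challenge(current_height, last_challenge_height,
                          poc_challenge_interval):
    """Hypothetical sketch: a miner may write a new challenge once at least
    poc_challenge_interval blocks have passed since its last one."""
    return current_height - last_challenge_height >= poc_challenge_interval
```

For example, with an interval of 240 blocks, a miner that last challenged at height 700 becomes eligible again at height 940.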

Steps leading up to targeting:

1. If the criteria for `poc_request` are satisfied, the Challenger will try to
construct a `poc_request` txn. This involves getting the hash of the block at
which said miner is eligible to issue a request and the public key of the
Challenger.

2. The `create_request` function (in `miner_poc_statem.erl`) then constructs a
`poc_request` txn by first creating an ephemeral `ecc_compact` keypair,
hashing the secret key and also hashing the public key, and putting both in
the request transaction. The secret is then stored in local state to be used
later. This completes the request phase.

3. Next the Challenger waits for the `poc_request` txn to appear in a block
(assuming the `poc_request` was valid). Once it's seen in a block, the
incoming block hash for the `poc_request` txn is also stored in the local
state. At this point the Challenger moves to the targeting phase by supplying
entropy used for randomizing the target Hotspot (Challengee). The entropy is
a combination of the secret, the hash of the request block and the public key
of the Challenger. Thus, the entropy is unpredictably random, but
deterministic, implying that nobody can know what the entropy is until it
appears.

4. Once in targeting phase:

- A list of all the populated hexes (at h3 resolution 5) is fetched from the
Ledger.
- A random state is initialized using the incoming entropy and the
Challenger's public key
- An initial h3 hex `zone` is selected using an Inverse Cumulative
Distribution Function (ICDF) using random state and the hex list
- Random state is also stored for future computation

5. Once an initial h3 hex (zone) is selected, it starts a recursive target
selection mechanism which operates upon the following filters:

- Filter out any inactive gateways (those which haven't been challenged in a
long time, as determined by the `poc_v4_target_challenge_age` chain variable).
- Do not target the Challenger Hotspot itself (this is obvious).
- Do not target Hotspots which do not have relevant capability. Each Hotspot
in the ledger has a `mode` defined in `blockchain_caps.hrl`.

6. Once there is a filtered list of eligible targets, it re-runs the ICDF with
the updated random state and finds a Hotspot to challenge.
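
The entropy construction (step 3) and the weighted selection (steps 4 and 6) can be sketched as follows. This is an illustrative Python model, not the Erlang implementation; the hash function, weights, and names are assumptions made for the sketch.

```python
import hashlib
import random

def make_entropy(secret, request_block_hash, challenger_pubkey):
    # Unpredictable until the request block exists, but deterministic
    # afterwards: everyone derives the same entropy from on-chain data.
    return hashlib.sha256(secret + request_block_hash + challenger_pubkey).digest()

def icdf_select(rng, weighted):
    # Inverse-CDF sampling over (item, weight) pairs: walk the cumulative
    # weights until the uniform draw falls inside an item's interval.
    total = sum(w for _, w in weighted)
    draw = rng.random() * total
    acc = 0.0
    for item, weight in weighted:
        acc += weight
        if draw <= acc:
            return item
    return weighted[-1][0]

entropy = make_entropy(b"secret", b"block-hash", b"challenger-key")
rng = random.Random(entropy)  # seed a deterministic random state
zone = icdf_select(rng, [("hex-a", 3.0), ("hex-b", 1.0)])
```

Because the random state is seeded from the entropy, every node that replays the same on-chain data selects the same zone and, ultimately, the same target.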

Note, critically, that inactive Hotspots are removed at step 5, but only after
the initial h3 hex has been selected. This is what leads to the skew towards
hexes with substantial numbers of inactive Hotspots. For historical context,
this implementation was designed when the network was significantly smaller and
there was a sparser distribution of Hotspots deployed.

After the "list of all the populated hexes" in the ledger, mentioned at step 4
above, was added to the ledger, another piece of information, called the
[H3Dex](https://github.com/helium/blockchain-core/pull/681), was added to the
ledger to improve performance of HIP17. The H3Dex, short for H3 inDex, is an
arrangement of H3 indices formatted in such a way that they can be iterated over
with a ranged iterator so that any Hotspots inside that hex, of an arbitrary
resolution, can be queried. This is superior in several ways to the older
"hexes" list:

* It can be updated granularly
* It can be queried efficiently
* It does not require the entire list to be read into memory to be used
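
The ranged-iteration idea can be sketched as follows. This is a simplified model, not the real H3Dex: actual H3 indices need a normalisation step before descendants of a parent cell form one contiguous key range, which we fake here with hypothetical path-style string keys.

```python
import bisect

# Keys are ordered so that all cells under a parent occupy one contiguous
# range; the hotspot names and keys below are made up for illustration.
hotspots = sorted([
    ("85283473fff.1", "hotspot-a"),
    ("85283473fff.2", "hotspot-b"),
    ("85283477fff.1", "hotspot-c"),
])
keys = [k for k, _ in hotspots]

def hotspots_in_hex(parent):
    # Ranged iteration: bisect to the first key at or after the parent
    # prefix, then walk forward until the prefix no longer matches.
    lo = bisect.bisect_left(keys, parent)
    out = []
    for k, name in hotspots[lo:]:
        if not k.startswith(parent):
            break
        out.append(name)
    return out
```

The point is that a query touches only the keys inside the requested range, never the whole list.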

However, it shared a flaw with the hexes list: it, too, had no garbage
collection for inactive Hotspots. The HIP17 code also performed inactive Hotspot
filtering.

As noted above, as the network has grown, storing all 500,000+ Hotspots in a
single list has become infeasible. Updating that list on a Hotspot location
assert has become one of the most expensive single operations on the chain
today. The H3Dex, which is updated at the same time, updates typically in orders
of magnitude less time.

Therefore, the core developers have been discussing for some time the proposal
to replace the current targeting with something that can use the more efficient
H3Dex.

Essentially the steps outlined above remain the same except for step 4. Step 4
is replaced by the following:

- A set of `poc_target_pool_size` populated hexes is selected randomly, using
the challenge entropy, from the h3dex. `poc_target_pool_size` is a new chain
variable.
- The set of populated h3 hexes is weighted by their *active* population counts
and an ICDF is run as before in step 4

At this point the code continues with step 5, although the inactive Hotspot
filtering is removed since it's no longer needed.

Now, randomly selecting from a set of populated hexes, without examining the
entire set, is a bit more complicated than it appears. Essentially all the
populated hexes at a fixed resolution, the existing chain variable
`poc_target_hex_parent_res`, are iterated and stored in h3 order, keyed by their
position in the list, into the ledger. This allows a relatively simple random
selection by simply taking a random number between 0 and N, where N is the count
of populated hexes at the `poc_target_hex_parent_res` resolution. When new
hexes are populated or unpopulated (the population goes from 0 to something, or
from something to 0), the random lookup table is recalculated. With a relatively
low `poc_target_hex_parent_res` this is very infrequent.
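
The position-keyed lookup table might look like the following sketch. The class and method names are hypothetical; only the idea (random index into a list of populated hexes, rebuilt on populate/unpopulate events) comes from the proposal.

```python
import random

class HexLookupTable:
    """Illustrative sketch: populated hexes at the parent resolution,
    keyed by list position so a random index in [0, N) selects a hex
    without scanning the full set."""

    def __init__(self, populated_hexes):
        self.by_position = sorted(populated_hexes)  # stored in h3 order

    def random_hex(self, rng):
        return self.by_position[rng.randrange(len(self.by_position))]

    def on_population_change(self, populated_hexes):
        # Only rebuilt when a hex goes 0 -> nonzero or nonzero -> 0,
        # which is rare at a coarse parent resolution.
        self.by_position = sorted(populated_hexes)
```

Seeding the `rng` from the challenge entropy keeps the selection deterministic across nodes.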

The second complexity here is how to garbage collect the h3dex without causing a
performance problem. The current implementation uses PoC receipts to GC the
first N Challengee locations for the first PoC receipts in a block. This is
deterministic and cheap and should cover all challenged hexes over time. Another
possible option is to do some GC as a side effect of HIP17 calculations.
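
A minimal sketch of the receipt-driven GC idea follows. The data shapes (`h3dex` as a dict of hex to Hotspot list, receipts as dicts) are assumptions for illustration, not the ledger's actual representation.

```python
def gc_h3dex(h3dex, block_receipts, n, is_active):
    # Deterministic and cheap: every node processes the same first n
    # receipts in the block, so all nodes prune the same hexes in step.
    for receipt in block_receipts[:n]:
        loc = receipt["challengee_location"]
        if loc in h3dex:
            h3dex[loc] = [h for h in h3dex[loc] if is_active(h)]
            if not h3dex[loc]:
                del h3dex[loc]  # hex is now unpopulated
    return h3dex
```

Over time every challenged hex is visited, so inactive entries cannot accumulate indefinitely.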

This proposal is currently implemented in

# Drawbacks
[drawbacks]: #drawbacks


The only known drawback is a slight change in targeting selection semantics
because an ICDF over all populated hexes is no longer used. This is expected to
average out over time, but testing for biases and presentation of results will
be done and added to the HIP here.
# Rationale and Alternatives
[alternatives]: #rationale-and-alternatives

This design was considered superior to some alternatives because it uses an
existing data structure that the ledger already tracks and uses and does not add
any excessive additional computation. It does require a new chain variable which
will require all node types (Hotspots, Validators, Routers, ETLs, etc.) to
update.

Alternatives to this design were improving the hexes list in the ledger somehow
or adding an entirely new structure to track hex populations for targeting.
Neither was as optimal as moving more code to using an existing, superior data
structure, as well as adding garbage collection to that existing data structure
which should be able to improve HIP17 calculation performance as well.

Without this change, the list continues to grow and becomes more expensive to
interact with, and targeting becomes more and more biased by inactive Hotspots.

# Unresolved Questions
[unresolved]: #unresolved-questions

The other mechanics of targeting are considered out of bounds. This HIP aims to
maintain existing behaviour as much as possible and simply improve performance
and scalability.

# Deployment Impact
[deployment-impact]: #deployment-impact

Normal users should not expect to see a measurable impact, other than
improvements to chain performance.

There is not a large amount of information about the targeting in the
documentation, indeed this HIP attempts to improve the situation. Any existing
documentation should be updated to reference this HIP.

It is backwards compatible. It will be activated by a chain variable which will
trigger the creation of the lookup indices and begin the garbage collection. The
existing hexes list can remain active on the ledger while the stability of the
new system is verified. Once the update has been verified the old hexes list can
be deactivated/removed by a subsequent chain variable or ledger upgrade hook.

Code could also be written to re-calculate the hexes list on a downgrade back to
the old behaviour.

# Success Metrics
[success-metrics]: #success-metrics

The community should notice an improvement in the performance of validating and
absorbing blocks and more expected targeting distributions.