# HIPXX: H3Dex-based PoC Targeting

- Author(s): [@Vagabond](https://github.com/Vagabond),
[@vihu](https://github.com/vihu), et al
- Start Date: 2022-02-02
- Category: Technical
- Original HIP PR: <!-- leave this empty; maintainer will fill in ID of this pull request -->
- Tracking Issue: <!-- leave this empty; maintainer will create a discussion issue -->
- Status: Draft

# Summary
[summary]: #summary

This HIP serves as both an explanation of the current Proof-of-Coverage (PoC)
targeting behaviour as well as a proposal for a more scalable replacement using
an H3-based index. We are proposing it as a HIP to communicate and acknowledge
that this is a _change_ to the current implementation but we believe it still
falls within the original intent of PoC.

# Motivation
[motivation]: #motivation

The current targeting mechanism relies on a global list of asserted Hotspots.
This list, as the network grows, is increasingly expensive to maintain and
examine. Additionally, this list does not support garbage collection, so
inactive Hotspots are counted towards targeting probability, leading to
targeting skew in hexes with many inactive Hotspots.

An example of such mis-targeting is here.

<!-- TODO: insert link to hex in india with 400 dead Hotspots -->

The goal of this proposed work is to maintain the existing targeting semantics
as closely as possible but rework them to operate on a more scalable data
structure. A secondary goal is to reduce the active use of this edge case in
order to enable more frequent PoC activity in an area. We don't believe this
particular edge case is being exploited in a widespread manner (although we do
see a few instances of it). With the publishing of this HIP, however, we believe
it may become more interesting to arbitrageurs.

# Stakeholders
[stakeholders]: #stakeholders

All Hotspot owners are directly affected by this HIP. It should improve fairness
in targeting for PoC but may affect those Hotspot deployers who are taking
advantage of the current mis-targeting behaviour.

# Detailed Explanation
[detailed-explanation]: #detailed-explanation

We will begin by explaining the current targeting behaviour and then explain the
proposed changes.

The first thing to grasp here is that the `poc_request` is not user-submissible.
It is generated by the miner itself. It depends on the `poc_challenge_interval`
chain variable which determines if the miner is eligible to write a challenge to
the blockchain.
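
The eligibility gate can be sketched roughly as follows. This is an illustrative model only, not the actual Erlang logic in `miner_poc_statem.erl`; the function name and parameters are hypothetical.

```python
def eligible_to_challenge(current_height, last_challenge_height,
                          poc_challenge_interval):
    """Hypothetical sketch: a miner may write a new challenge once at least
    poc_challenge_interval blocks have passed since its last one."""
    return current_height - last_challenge_height >= poc_challenge_interval
```

For example, with an interval of 240 blocks, a miner that last challenged at height 700 becomes eligible again at height 940.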

Steps leading up to targeting:

1. If the criteria for `poc_request` are satisfied, the Challenger will try to
construct a `poc_request` txn. This involves getting the hash of the block at
which said miner is eligible to issue a request and the public key of the
Challenger.

2. The `create_request` function (in `miner_poc_statem.erl`) then constructs a
`poc_request` txn by first creating an ephemeral `ecc_compact` keypair,
hashing the secret key and also hashing the public key, and putting both in
the request transaction. The secret is then stored in local state to be used
later. This completes the request phase.

3. Next the Challenger waits for the `poc_request` txn to appear in a block
(assuming the `poc_request` was valid). Once it's seen in a block, the
incoming block hash for the `poc_request` txn is also stored in the local
state. At this point the Challenger moves to the targeting phase by supplying
entropy used for randomizing the target Hotspot (Challengee). The entropy is
a combination of the secret, the hash of the request block and the public key
of the Challenger. Thus, the entropy is unpredictably random, but
deterministic, implying that nobody can know what the entropy is until it
appears.

4. Once in targeting phase:

- A list of all the populated hexes (at h3 resolution 5) is fetched from the
Ledger.
- A random state is initialized using the incoming entropy and the
Challenger's public key
- An initial h3 hex `zone` is selected using an Inverse Cumulative
Distribution Function (ICDF) using random state and the hex list
- Random state is also stored for future computation

5. Once an initial h3 hex (zone) is selected, it starts a recursive target
selection mechanism which operates upon the following filters:

- Filter out any inactive gateways (those which haven't been challenged in a
long time, as determined by the `poc_v4_target_challenge_age` chain variable).
- Do not target the Challenger Hotspot itself (this is obvious).
- Do not target Hotspots which do not have relevant capability. Each Hotspot
in the ledger has a `mode` defined in `blockchain_caps.hrl`.

6. Once there is a filtered list of eligible targets, it re-runs the ICDF with
the updated random state and finds a Hotspot to challenge.
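
The entropy construction (step 3) and the weighted selection (steps 4 and 6) can be sketched as follows. This is an illustrative Python model, not the Erlang implementation; the hash function, weights, and names are assumptions made for the sketch.

```python
import hashlib
import random

def make_entropy(secret, request_block_hash, challenger_pubkey):
    # Unpredictable until the request block exists, but deterministic
    # afterwards: everyone derives the same entropy from on-chain data.
    return hashlib.sha256(secret + request_block_hash + challenger_pubkey).digest()

def icdf_select(rng, weighted):
    # Inverse-CDF sampling over (item, weight) pairs: walk the cumulative
    # weights until the uniform draw falls inside an item's interval.
    total = sum(w for _, w in weighted)
    draw = rng.random() * total
    acc = 0.0
    for item, weight in weighted:
        acc += weight
        if draw <= acc:
            return item
    return weighted[-1][0]

entropy = make_entropy(b"secret", b"block-hash", b"challenger-key")
rng = random.Random(entropy)  # seed a deterministic random state
zone = icdf_select(rng, [("hex-a", 3.0), ("hex-b", 1.0)])
```

Because the random state is seeded from the entropy, every node that replays the same on-chain data selects the same zone and, ultimately, the same target.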

Note, critically, that inactive Hotspots are removed at step 5, but only after
the initial h3 hex has been selected. This is what leads to the skew towards
hexes with substantial numbers of inactive Hotspots. For historical context,
this implementation was designed when the network was significantly smaller and
there was a sparser distribution of Hotspots deployed.

After the "list of all the populated hexes" in the ledger, mentioned at step 4
above, was added to the ledger, another piece of information, called the
[H3Dex](https://github.com/helium/blockchain-core/pull/681), was added to the
ledger to improve performance of HIP17. The H3Dex, short for H3 inDex, is an
arrangement of H3 indices formatted in such a way that they can be iterated over
with a ranged iterator so that any Hotspots inside that hex, of an arbitrary
resolution, can be queried. This is superior in several ways to the older
"hexes" list:

* It can be updated granularly
* It can be queried efficiently
* It does not require the entire list to be read into memory to be used
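
The ranged-iteration idea can be sketched as follows. This is a simplified model, not the real H3Dex: actual H3 indices need a normalisation step before descendants of a parent cell form one contiguous key range, which we fake here with hypothetical path-style string keys.

```python
import bisect

# Keys are ordered so that all cells under a parent occupy one contiguous
# range; the hotspot names and keys below are made up for illustration.
hotspots = sorted([
    ("85283473fff.1", "hotspot-a"),
    ("85283473fff.2", "hotspot-b"),
    ("85283477fff.1", "hotspot-c"),
])
keys = [k for k, _ in hotspots]

def hotspots_in_hex(parent):
    # Ranged iteration: bisect to the first key at or after the parent
    # prefix, then walk forward until the prefix no longer matches.
    lo = bisect.bisect_left(keys, parent)
    out = []
    for k, name in hotspots[lo:]:
        if not k.startswith(parent):
            break
        out.append(name)
    return out
```

The point is that a query touches only the keys inside the requested range, never the whole list.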

However, it shared a flaw with the hexes list: it, too, had no garbage
collection for inactive Hotspots. The HIP17 code also performed inactive Hotspot
filtering.

As noted above, as the network has grown, storing all 500,000+ Hotspots in a
single list has become infeasible. Updating that list on a Hotspot location
assert has become one of the most expensive single operations on the chain
today. The H3Dex, which is updated at the same time, updates typically in orders
of magnitude less time.

Therefore, the core developers have been discussing for some time the proposal
to replace the current targeting with something that can use the more efficient
H3Dex.

Essentially the steps outlined above remain the same except for step 4. Step 4
is replaced by the following:

- A set of `poc_target_pool_size` populated hexes is selected randomly, using
the challenge entropy, from the h3dex. `poc_target_pool_size` is a new chain
variable.
- The set of populated h3 hexes is weighted by their *active* population counts
and an ICDF is run as before in step 4

At this point the code continues with step 5, although the inactive Hotspot
filtering is removed since it's no longer needed.

Now, randomly selecting from a set of populated hexes, without examining the
entire set, is a bit more complicated than it appears. Essentially all the
populated hexes at a fixed resolution, the existing chain variable
`poc_target_hex_parent_res`, are iterated and stored in h3 order, keyed by their
position in the list, into the ledger. This allows a relatively simple random
selection by simply taking a random number between 0 and N, where N is the count
of populated hexes at the `poc_target_hex_parent_res` resolution. When new
hexes are populated or unpopulated (the population goes from 0 to something, or
from something to 0), the random lookup table is recalculated. With a relatively
low `poc_target_hex_parent_res` this is very infrequent.
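
The position-keyed lookup table might look like the following sketch. The class and method names are hypothetical; only the idea (random index into a list of populated hexes, rebuilt on populate/unpopulate events) comes from the proposal.

```python
import random

class HexLookupTable:
    """Illustrative sketch: populated hexes at the parent resolution,
    keyed by list position so a random index in [0, N) selects a hex
    without scanning the full set."""

    def __init__(self, populated_hexes):
        self.by_position = sorted(populated_hexes)  # stored in h3 order

    def random_hex(self, rng):
        return self.by_position[rng.randrange(len(self.by_position))]

    def on_population_change(self, populated_hexes):
        # Only rebuilt when a hex goes 0 -> nonzero or nonzero -> 0,
        # which is rare at a coarse parent resolution.
        self.by_position = sorted(populated_hexes)
```

Seeding the `rng` from the challenge entropy keeps the selection deterministic across nodes.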

The second complexity here is how to garbage collect the h3dex without causing a
performance problem. The current implementation uses PoC receipts to GC the
first N Challengee locations for the first PoC receipts in a block. This is
deterministic and cheap and should cover all challenged hexes over time. Another
possible option is to do some GC as a side effect of HIP17 calculations.
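
A minimal sketch of the receipt-driven GC idea follows. The data shapes (`h3dex` as a dict of hex to Hotspot list, receipts as dicts) are assumptions for illustration, not the ledger's actual representation.

```python
def gc_h3dex(h3dex, block_receipts, n, is_active):
    # Deterministic and cheap: every node processes the same first n
    # receipts in the block, so all nodes prune the same hexes in step.
    for receipt in block_receipts[:n]:
        loc = receipt["challengee_location"]
        if loc in h3dex:
            h3dex[loc] = [h for h in h3dex[loc] if is_active(h)]
            if not h3dex[loc]:
                del h3dex[loc]  # hex is now unpopulated
    return h3dex
```

Over time every challenged hex is visited, so inactive entries cannot accumulate indefinitely.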

This proposal is currently implemented in

# Drawbacks
[drawbacks]: #drawbacks


The only known drawback is a slight change in targeting selection semantics
because an ICDF over all populated hexes is no longer used. This is expected to
average out over time, but testing for biases and presentation of results will
be done and added to the HIP here.
# Rationale and Alternatives
[alternatives]: #rationale-and-alternatives

This design was considered superior to some alternatives because it uses an
existing data structure that the ledger already tracks and uses and does not add
any excessive additional computation. It does require a new chain variable which
will require all node types (Hotspots, Validators, Routers, ETLs, etc.) to
update.

Alternatives to this design were improving the hexes list in the ledger somehow
or adding an entirely new structure to track hex populations for targeting.
Neither was as optimal as moving more code to using an existing, superior data
structure, as well as adding garbage collection to that existing data structure
which should be able to improve HIP17 calculation performance as well.

Without this change, the list continues to grow and becomes more expensive to
interact with, and targeting becomes more and more biased by inactive Hotspots.

# Unresolved Questions
[unresolved]: #unresolved-questions

The other mechanics of targeting are considered out of bounds. This HIP aims to
maintain existing behaviour as much as possible and simply improve performance
and scalability.

# Deployment Impact
[deployment-impact]: #deployment-impact

Normal users should not expect to see a measurable impact, other than
improvements to chain performance.

There is not a large amount of information about the targeting in the
documentation, indeed this HIP attempts to improve the situation. Any existing
documentation should be updated to reference this HIP.

It is backwards compatible. It will be activated by a chain variable which will
trigger the creation of the lookup indices and begin the garbage collection. The
existing hexes list can remain active on the ledger while the stability of the
new system is verified. Once the update has been verified the old hexes list can
be deactivated/removed by a subsequent chain variable or ledger upgrade hook.

Code could also be written to re-calculate the hexes list on a downgrade back to
the old behaviour.

# Success Metrics
[success-metrics]: #success-metrics

The community should notice an improvement in the performance of validating and
absorbing blocks and more expected targeting distributions.