# SN netuid - sn_name demo

This demo notebook should fulfill the following tasks.
 - (A) Demonstrate the quality of the communication between miners and validators coming from the top miner.
 - (B) Justify the difference in incentive for miners in different tiers (e.g. quantile 1 vs. quantile 3).
 - (C) (If applicable) Show the landscape and variety of miners. 
 - (D) (If applicable) Demonstrate the effectiveness of the scoring mechanism.
 - (E) (If applicable) Show the dataset that was used by the validator.
 - (F) (If applicable) Show the use of any API and/or links to a frontend.
 
- ** If you have difficulty completing any of the tasks and need an alternative, please feel free to contact Isabella (isabella618033) / Eugene (eugene3684) from the Opentensor Foundation on Discord.
- ** We understand that SNs run very differently, so please feel free to make any modification to the notebook that best suits your SN, as long as it can still demonstrate the tasks above.
- ** For any wallet/ dendrite calls needed, we will be using the foundation hotkey 

## Objective
>  Please write the objective of the SN with precision. 
> - The task of the miner (what task does the validator send to the miner, and what is the miner supposed to respond with.)

The miner has one main task: to host data and make it available. This breaks down into three sub-tasks: (1) `store` data, (2) pass random `challenges` to prove that the data still exists, and (3) `retrieve` requested data. Thus, each miner has three axon endpoints, one for each of these tasks.

FileTAO's major objective is to be a generic, agnostic data storage platform for Bittensor, the surrounding ecosystem, and non-crypto participants as well.  Storing any data encrypted end-to-end in a decentralized, robust protocol so that users can reduce dependencies on third-party centralized actors is paramount. For example, one major goal of FileTAO is to provide storage for backing up chain data, machine learning model weights, structured and unstructured data, machine learning training data, end-user files, and more while providing levels of redundancy and data saliency.

There are several components to SN21's mechanism:
(1) Cryptographic Proof System: how do we verify the data still exists and miners have data integrity.
(2) Reward System: How do we distribute rewards fairly and meritocratically while minimizing unnecessary churn.
(3) Data Preservation & Recovery: How do we rebalance the data on the network so that no data loss is observed, even if/when some nodes fail.


### Store
In the Store phase, the goal is to securely store data and create a commitment to prove its storage without revealing the data itself. The mathematical steps are:

Validators query miners to store user data that is encrypted by the end-user coldkey pubkey. The encrypted data is sent to miners along with a random seed, which the miners use to initiate a sequential chain of seed + hash verification proofs. The previous seed value and the data are required for each subsequent commitment proof. The miner then applies a Pedersen commitment to the entire data using the random seed, and forwards the proofs to the validator.

Upon receipt, validators verify the commitment and the initial hash chain, storing the associated metadata.

1. **Data Receiving**
   - Base64 decode incoming bytes data stream (saves bandwidth in transit)
   - Hash the data with `sha256` for creating a file ID to store in the index that matches the validator file ID
   - Save the data to the filesystem and save associated metadata, including data `size` (bytes), validator random `seed`, and elliptic curve initialization parameters `(g,h)`.

2. **Data Encryption**:
   - Data `D` is encrypted using a symmetric encryption scheme whose keys are private to the client. The miner cannot decrypt the data or know what it is receiving.
   - Encrypted Data: `encrypted_D = encrypt(D, key)`.

3. **Hashing and Commitment**:
   - Hash the encoded data with a unique random seed to create a unique identifier for the data.
   - Data Hash: `data_hash = hash(encrypted_D + r_seed)`.
   - Create a cryptographic commitment using an Elliptic Curve Commitment Scheme (ECC), which involves a commitment function `commit` with curve points `g` and `h`.
   - Pedersen Commitment: `(c, m, r) = commit(encrypted_D + seed)`, where `c` is the commitment, `m` is the message (or commitment hash), and `r` is the randomness used in the commitment.
   - Chained Hash Proof: `m` is used as the initial `C_0`, which contains the initial random seed and the data itself. The random seed is stored for the next challenge in the chain.

4. **Storage**:
   - Store the encrypted data (`encrypted_D`) and the random seed (`r_seed`) in local storage.

5. **Response** 
   - Return randomness value `r`, commitment point `c`, produced by the Pedersen commitment, along with commitment hash `m` (a.k.a `C_0`) to the calling validator.
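
The commitment step above, `(c, m, r) = commit(encrypted_D + seed)`, can be sketched as follows. This is a toy illustration only: it swaps the elliptic-curve group for a tiny multiplicative group modulo a safe prime so that it runs with the standard library alone. The parameters `P`, `Q`, `G`, `H` and the helpers `hash_to_scalar`, `commit`, and `open_commitment` are hypothetical names chosen for this sketch, not FileTAO's API.

```python
import hashlib
import secrets

# Toy group parameters (assumption, far too small for real use).
# P = 2*Q + 1 is a safe prime; G and H are squares mod P, so both
# generate the subgroup of prime order Q. In a real scheme nobody
# may know the discrete log of H with respect to G.
P = 1019
Q = 509
G = 4
H = 9


def hash_to_scalar(data: bytes) -> int:
    """Hash bytes with SHA3-256 and reduce the digest into [0, Q)."""
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big") % Q


def commit(message: bytes):
    """Return (c, m, r) where c = G^m * H^r mod P."""
    m = hash_to_scalar(message)   # the "message" (commitment hash)
    r = secrets.randbelow(Q)      # fresh randomness that hides m
    c = (pow(G, m, P) * pow(H, r, P)) % P
    return c, m, r


def open_commitment(c: int, m: int, r: int) -> bool:
    """Check that (m, r) is a valid opening of commitment c."""
    return c == (pow(G, m, P) * pow(H, r, P)) % P


encrypted_D = b"ciphertext bytes"
seed = b"validator-seed"
c, m, r = commit(encrypted_D + seed)
print(open_commitment(c, m, r))
```

The miner returns `c`, `m`, and `r`; anyone holding the same group parameters can then re-check the opening without seeing the underlying data.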


### Challenge

In the Challenge phase, the system verifies the possession of the data without actually retrieving the data itself.

Validators request the miner prove that it currently stores the data claimed by issuing an index-based challenge, where the miner must apply Pedersen commitments to the entire data table given a random seed and a chunk size.

Data is chunked according to the chunk size, and each slice is committed to using a Pedersen commitment with the current random seed. Each commitment is appended to a Merkle tree, and a subsequent proof is generated to obtain the path along the Merkle tree such that a validator can verify the random seed was indeed used to commit to each data chunk at challenge time, thus proving the miner has the data at the time of the challenge. 

The mathematical operations involved in the `Challenge` phase of the data storage and verification protocol can be broken down into a few key steps. Here's a simplified explanation:

1. **Chunking Data**:
   - The encrypted data is split into chunks: `chunks = chunk(encrypted_D, chunk_size)`.

2. **Selecting a Chunk for Challenge**:
   - A random chunk is selected for the challenge.
   - Selected Chunk: `chunk_j = chunks[j]`.

3. **Computing Commitment for the Chunk**:
   - A commitment is computed for the selected chunk.
   - Commitment for Chunk: `(c_j, m_j, r_j) = commit(chunk_j + seed)`.

4. **Creating a Merkle Tree**:
   - A Merkle tree is constructed using all chunk commitments.
   - Merkle Tree: `merkle_tree = MerkleTree([c_1, c_2, ..., c_n])`.

5. **Generating Merkle Proof**:
   - A Merkle proof is generated for the selected chunk to recreate the path along the merkle tree to the leaf that represents `chunk_j`.
   - Merkle Proof: `proof_j = merkle_tree.get_proof(j)`.

6. **Generating chained commitment**:
   - Compute commitment hash `Cn = hash( hash( encrypted_D + prev_seed ) + synapse.seed )`
   - Update previous seed `prev_seed = synapse.seed`

7. **Response**:
   - The challenge response includes the Pedersen elliptic curve commitment, the chained commitment hash, the Merkle proof, and the Merkle tree root.
   - The validator verifies the triple of proofs: chained commitment, elliptic-curve commitment, and the merkle proof.
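
Steps 4 and 5 can be sketched with a minimal Merkle tree. This is an illustrative sketch under stated assumptions, not the tree FileTAO uses: leaves here are arbitrary byte strings standing in for the chunk commitments, each leaf is hashed once with SHA3-256, an unpaired node is carried up unchanged, and the helper names `build_levels`, `get_proof`, and `verify_proof` are hypothetical. The proof uses the same `{'left': ...}` / `{'right': ...}` shape that `validate_merkle_proof` in section (B) consumes.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is carried up unchanged
        level = nxt
        levels.append(level)
    return levels


def get_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            side = "left" if sibling < index else "right"
            proof.append({side: level[sibling].hex()})
        index //= 2
    return proof


def verify_proof(proof, leaf, root_hex):
    """Recompute the root from a leaf and its proof, then compare."""
    node = _h(leaf)
    for step in proof:
        if "left" in step:
            node = _h(bytes.fromhex(step["left"]) + node)
        else:
            node = _h(node + bytes.fromhex(step["right"]))
    return node.hex() == root_hex


# Stand-ins for the chunk commitments c_1 .. c_n
leaves = [f"commitment-{i}".encode() for i in range(5)]
levels = build_levels(leaves)
root = levels[-1][0].hex()
proof_j = get_proof(levels, 2)
print(verify_proof(proof_j, leaves[2], root))
```

The proof length grows with the logarithm of the number of chunks, which is why the validator only needs the root, one leaf, and a handful of sibling hashes.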


### Retrieve
In this phase, the data is retrieved, decrypted, and its integrity is verified. The validator supplies the hash of a piece of data that the miner is expected to hold, along with a new random seed to incorporate into the proofs.

1. **Fetching Encrypted Data**:
   - The encrypted data is fetched from the database based on its hash.
   - `encrypted_D = fetch(data_hash)`.

2. **Chained Verification Challenge**:
   - A new commitment is computed on the encrypted data with a new seed and the previous seed.
       - `Ch = hash( hash( encrypted_D + prev_seed ) + synapse.seed )`.

3. **Data Integrity Check**:
   - The retrieved data's integrity is verified by checking if the newly computed commitment matches the expected value.
   - `verify_chained(commitment, expected_commitment) == True`.

4. **Final Data Commitment Proof**
   - Pedersen Commitment: `(c, m, r) = commit(encrypted_D + seed)`, where `c` is the commitment, `m` is the message (or commitment hash), and `r` is the randomness used in the commitment.

5. **Submission**:
   - Return the encrypted data `encrypted_D`, the chained hash `Ch`, the randomness value `r`, and the commitment point `c`.
   - Data is validated by proofs and decrypted on validator side to be sent back to the end-user: `D = decrypt(encrypted_D, key)`.
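
The chained commitment used in both the Challenge and Retrieve phases, `hash( hash( encrypted_D + prev_seed ) + synapse.seed )`, can be sketched as below. This is a minimal sketch assuming SHA3-256 and hex-string seeds; `hash_data`, `compute_chained_commitment`, and `verify_chained` are illustrative helpers written for this example and may differ from the functions in the FileTAO codebase.

```python
import hashlib


def hash_data(data: bytes) -> str:
    """Hex digest of the input (SHA3-256 is an assumption of this sketch)."""
    return hashlib.sha3_256(data).hexdigest()


def compute_chained_commitment(encrypted_D: bytes, prev_seed: str, new_seed: str):
    """Miner side: only someone holding the data AND the previous seed can do this."""
    proof = hash_data(encrypted_D + prev_seed.encode())          # inner hash
    commitment = hash_data(proof.encode() + new_seed.encode())   # outer hash
    return commitment, proof


def verify_chained(proof: str, new_seed: str, commitment: str) -> bool:
    """Validator side: re-derive the outer hash from the proof and the seed it issued."""
    return hash_data(proof.encode() + new_seed.encode()) == commitment


encrypted_D = b"ciphertext bytes"
prev_seed = "seed-from-store-request"

# Two consecutive rounds: the seed of round n becomes prev_seed of round n + 1
for new_seed in ("seed-round-1", "seed-round-2"):
    commitment, proof = compute_chained_commitment(encrypted_D, prev_seed, new_seed)
    print(new_seed, verify_chained(proof, new_seed, commitment))
    prev_seed = new_seed
```

Because each round folds in the seed from the previous round, a miner that discarded the data or skipped a round cannot produce the next link in the chain.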




> - The scoring mechanism (please specify if there is any model involved)

The reward mechanism is multi-layered and has several components. In a nutshell, the goals are as follows:

- Incentivize data durability and salience via the cryptographic proof system. (E.g. don't lose shit.)
- Incentivize good performance over a long period of time. (Tier system to reward based on reputation.)
- Incentivize fast response times. We want to retrieve user data quickly and ensure that it is over high network bandwidth.

### Cryptographic Proof System:
See above description for details on the algorithm. It is rewarded as follows:
```

SUPER_SAIYAN_TIER_REWARD_FACTOR = 1.0
DIAMOND_TIER_REWARD_FACTOR = 0.9
GOLD_TIER_REWARD_FACTOR = 0.8
SILVER_TIER_REWARD_FACTOR = 0.7
BRONZE_TIER_REWARD_FACTOR = 0.6
```

### Tier System:
The tier system classifies miners into five distinct categories, each with specific requirements and storage limits. These tiers are designed to reward miners based on their performance, reliability, and the total volume of data they can store.

Importance of Tier System:
- **Encourages High Performance:** Higher tiers reward miners with greater benefits, motivating them to maintain high Wilson Scores.
- **Enhances Network Reliability:** The tier system ensures that only the most reliable and efficient miners handle significant volumes of data, enhancing the overall reliability of the network.
- **Fair Reward Distribution:** The reward factors are proportional to the miners' tier, ensuring a fair distribution of rewards based on performance.

1. 🎇 **Super Saiyan Tier:** 
   - **Storage Limit:** 1 Exabyte (EB)
   - **Store Wilson Score:** 0.88
   - **Minimum Successes Required:** 10,000
   - **Reward Factor:** 1.0 (100% rewards)

2. 💎 **Diamond Tier:**
   - **Storage Limit:** 1 Petabyte (PB)
   - **Store Wilson Score:**  0.77
   - **Minimum Successes Required:** 5,000
   - **Reward Factor:** 0.9 (90% rewards)

3. 🥇 **Gold Tier:**
   - **Storage Limit:** 100 Terabytes (TB)
   - **Store Wilson Score:** 0.66
   - **Minimum Successes Required:** 2,000
   - **Reward Factor:** 0.8 (80% rewards)

4. 🥈 **Silver Tier:**
   - **Storage Limit:** 10 Terabytes (TB)
   - **Store Wilson Score:** 0.55
   - **Minimum Successes Required:** 1,000
   - **Reward Factor:** 0.7 (70% rewards)

5. 🥉 **Bronze Tier:**
   - **Storage Limit:** 1 Terabyte (TB)
   - **Store Wilson Score:** Not specifically defined for this tier
   - **Minimum Successes Required:** Not specifically defined for this tier
   - **Reward Factor:** 0.6 (60% rewards)
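
The tier table above can be expressed as a simple lookup. This is a minimal sketch that assumes the thresholds are inclusive and that a miner lands in the highest tier for which it meets both requirements; `TIERS` and `get_tier` are hypothetical names for this example.

```python
# (name, minimum Wilson score, minimum total successes, reward factor),
# ordered from the highest tier to the lowest
TIERS = [
    ("Super Saiyan", 0.88, 10_000, 1.0),
    ("Diamond", 0.77, 5_000, 0.9),
    ("Gold", 0.66, 2_000, 0.8),
    ("Silver", 0.55, 1_000, 0.7),
]
BRONZE = ("Bronze", 0.6)  # default tier with no minimum requirements


def get_tier(wilson_score: float, total_successes: int):
    """Return (tier name, reward factor) for a miner's current statistics."""
    for name, min_score, min_successes, factor in TIERS:
        if wilson_score >= min_score and total_successes >= min_successes:
            return name, factor
    return BRONZE


print(get_tier(0.79, 6842))
print(get_tier(0.95, 500))
```

Note that a high Wilson score alone is not enough: the second call stays in Bronze because the miner has not yet accumulated 1,000 successes.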

#### Maintaining and Advancing Tiers:
- To advance to a higher tier, miners must consistently achieve the required minimum Wilson Scores in their operations.
- Periodic evaluations are conducted to ensure miners maintain the necessary performance standards to stay in their respective tiers, or move up or down tiers.
- Advancing to a higher tier takes time. To ascend to the first higher tier (Silver), a miner needs at least 1000 successful requests, whether they are challenge requests, store requests, or retry requests, and must maintain the minimum Wilson Score for successes / attempts. 
- Depending on how often a miner is queried, how many validators are operating at one given time, and primarily the performance of the miner, this can take several hours to several days. Assuming all 64 validator slots are occupied, this should take ~100 hours.

As miners move up tiers, their responsibility increases proportionally. Thus, miners who move from Silver -> Gold are expected to be able to store up to 100 Terabytes (TB) of data, 10x the storage cap for Silver. Similarly, such a miner must maintain a minimum Wilson score of 0.66 over its lifetime, which raises the lower bound for expected performance and reliability. It also means that this miner will now receive a higher percentage of rewards for each successful query, going from 70% -> 80%. This does not mean that the miner may rest on its laurels: it may move right back down a tier (or more) if it does not meet the minimum requirements for Gold.

### Wilson score
Wilson score is an asymmetric approximation of the confidence interval suited for low sample sizes, given a particular z-score target. It doesn't suffer from problems of overshoot and zero-width intervals that afflict the normal interval approximation. 

See: https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Wilson_score_interval for more information.
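
A minimal sketch of the Wilson score interval follows. The `z = 1.96` default (a 95% confidence level) and the function name `wilson_score_interval` are assumptions made for illustration; the source does not state which z-score or which bound the subnet uses.

```python
import math


def wilson_score_interval(successes: int, attempts: int, z: float = 1.96):
    """Return (lower, upper) bounds of the Wilson score interval."""
    if attempts == 0:
        return 0.0, 1.0  # no observations yet, so no information
    p = successes / attempts
    denom = 1 + z * z / attempts
    center = (p + z * z / (2 * attempts)) / denom
    margin = (
        z * math.sqrt(p * (1 - p) / attempts + z * z / (4 * attempts**2)) / denom
    )
    return max(0.0, center - margin), min(1.0, center + margin)


# Same 90% success ratio, very different amounts of evidence
print(wilson_score_interval(9, 10))
print(wilson_score_interval(900, 1000))
```

With only ten attempts the lower bound sits well below the raw 90% ratio, and it tightens toward that ratio as more attempts accumulate. Even a perfect 10/10 record does not collapse to a zero-width interval.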

### Tier Ascension Time
Assuming perfect performance and ~200 miner UIDs, each miner is queried roughly 34 times every 1000 rounds (a 3.4% chance per query round), so one can expect to reach the next tier within:
```python
hours = total_successes / prob_of_query_per_round * time_per_round / 3600
hours = 98 # roughly 4 days at perfect performance to next (Silver) tier (assuming no challenge failures)
```
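
As a worked version of the estimate above: the 1000 successes and the 3.4% query probability come from the text, while `time_per_round = 12` seconds is an assumption inferred from the stated result of roughly 98 hours, not a value given in the source.

```python
total_successes = 1000             # successes needed to reach Silver
prob_of_query_per_round = 34 / 1000
time_per_round = 12                # seconds per round (assumed, see lead-in)

rounds_needed = total_successes / prob_of_query_per_round
hours = rounds_needed * time_per_round / 3600
print(f"{rounds_needed:.0f} rounds, about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Any failed request lowers the effective success rate per round and stretches this estimate accordingly.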

#### Miner Advancement Program
As a strategy for helping miners who perform well but sit in lower tiers to move more quickly up through the ranks, we apply an inverse boosting strategy for the top miners per query batch. The top two (2) performers per batch of `store`, `challenge` or `retrieve` requests receive a one-time, tier-appropriate boost. These tier-specific boosts decrease as you ascend the tier structure, and are defined in `constants.py`. For example:

```python
TIER_BOOSTS = {
    b"Super Saiyan": 1.02, # 2%  -> 1.02
    b"Diamond": 1.05,      # 5%  -> 0.945
    b"Gold": 1.1,          # 10% -> 0.88
    b"Silver": 1.15,       # 15% -> 0.805
    b"Bronze": 1.2,        # 20% -> 0.72
}
```

Concretely, a `Bronze` miner who is one of the top 2 within a batch of requests will receive a proportionally larger boost than a `Diamond` miner who is also in the top 2.

```
REWARD = TIER_FACTOR * BASE_REWARD * BOOST
Bronze  -> 0.72  = 0.6 * 1.0 * 1.2
Diamond -> 0.945 = 0.9 * 1.0 * 1.05
```

This mechanism *significantly* closes the gap for newer miners who perform well and should be able to ascend the tier structure honestly and faithfully. Essentially, it exists so that miners who consistently perform well but are in lower tiers can more readily survive the immunity period and make it to successively higher tiers, rather than having access "gated" by older miners. This directly negates the "grandfathering" effect. Higher tier miners that are in the top 2 are boosted significantly less than Bronze or other lower tier miners who make the top 2.
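
To make the effect concrete, the sketch below combines the tier reward factors with the tier boosts for every tier. It uses plain string keys rather than the byte-string keys shown in `TIER_BOOSTS`, and `boosted_factor` is a hypothetical helper written for this example.

```python
TIER_FACTORS = {
    "Super Saiyan": 1.0,
    "Diamond": 0.9,
    "Gold": 0.8,
    "Silver": 0.7,
    "Bronze": 0.6,
}
TIER_BOOSTS = {
    "Super Saiyan": 1.02,
    "Diamond": 1.05,
    "Gold": 1.1,
    "Silver": 1.15,
    "Bronze": 1.2,
}


def boosted_factor(tier: str, in_top_2: bool) -> float:
    """Reward factor for a successful response, boosted if the miner is in the batch's top 2."""
    factor = TIER_FACTORS[tier]
    return factor * TIER_BOOSTS[tier] if in_top_2 else factor


for tier in TIER_FACTORS:
    base = boosted_factor(tier, in_top_2=False)
    boosted = boosted_factor(tier, in_top_2=True)
    print(f"{tier:<13} {base:.3f} -> {boosted:.3f}")
```

The boosted values match the comments in `TIER_BOOSTS`: the absolute and relative gain is largest for Bronze and smallest for Super Saiyan.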

#### Periodic Statistics Rollover

Statistics for `store_successes/attempts`, `challenge_attempts/successes`, and `retrieve_attempts/successes` are reset every 2 epochs (720 blocks), while the `total_successes` are carried over for accurate tier computation. This "sliding window" of the previous 720 blocks of `N` successes vs `M` attempts effectively resets the `N / M` ratio before Wilson Scoring is applied. This makes the tier calculation less punishing for early failures that would otherwise have to be "outpaced", while simultaneously discouraging the grandfathering of older miners who were able to succeed early and cement their status in a higher tier. The net effect is greater mobility across the tiers, keeping the network competitive while incentivizing reliability and consistency.

For example:
```python
store_successes = 2
store_attempts = 2
challenge_successes = 3
challenge_attempts = 5
retrieve_successes = 1
retrieve_attempts = 1
total_successes_epoch = 6
total_attempts_epoch = 8
total_successes = 6842
wilson_score = 0.79
```
This miner would qualify for Diamond tier this round, as it has passed the minimum threshold for total successes (5000) and wilson score (0.77).
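
The rollover itself can be sketched as a small bookkeeping class. This is a hypothetical in-memory model for illustration only (the `MinerStats` class and its methods are not from the FileTAO codebase), and it assumes that every success also increments the lifetime `total_successes` counter.

```python
from dataclasses import dataclass

REQUEST_KINDS = ("store", "challenge", "retrieve")


@dataclass
class MinerStats:
    store_attempts: int = 0
    store_successes: int = 0
    challenge_attempts: int = 0
    challenge_successes: int = 0
    retrieve_attempts: int = 0
    retrieve_successes: int = 0
    total_successes: int = 0  # lifetime counter, never reset

    def record(self, kind: str, success: bool) -> None:
        """Count one request of the given kind in the current window."""
        setattr(self, f"{kind}_attempts", getattr(self, f"{kind}_attempts") + 1)
        if success:
            setattr(self, f"{kind}_successes", getattr(self, f"{kind}_successes") + 1)
            self.total_successes += 1

    def window_totals(self):
        """Return (successes, attempts) summed over the current window."""
        successes = sum(getattr(self, f"{k}_successes") for k in REQUEST_KINDS)
        attempts = sum(getattr(self, f"{k}_attempts") for k in REQUEST_KINDS)
        return successes, attempts

    def rollover(self) -> None:
        """Reset the per-window counters; total_successes is carried over."""
        for k in REQUEST_KINDS:
            setattr(self, f"{k}_attempts", 0)
            setattr(self, f"{k}_successes", 0)


stats = MinerStats(total_successes=6836)
events = [("store", True), ("store", True), ("challenge", True),
          ("challenge", False), ("retrieve", True)]
for kind, ok in events:
    stats.record(kind, ok)
window = stats.window_totals()
stats.rollover()
print(window, stats.window_totals(), stats.total_successes)
```

After the rollover the success ratio starts from a clean slate, while the lifetime count that gates tier eligibility is untouched.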


> - The dataset used
None. Client and challenge data are intermingled. Thus, miners cannot determine if data is from a validator or from a client, forcing miners to respond to any and *all* requests by validators, as it directly impacts their incentive, regardless of random challenge or client API data.

## Setup
> Please give instruction for how to run the notebook

## (A) Top miner responses
- This section should demonstrate the quality of the communication between miners and validators coming from the top miner.

> - (1) Define the group of top miners.
The top miner group is typically (but not exclusively) in the highest tier, has the highest incentive (roughly the top 15% of miners), and provides the best service for the end goal of data storage and retrieval. The top miner group nearly always succeeds in challenges, and clients/validators are able to retrieve data consistently, with low latency.

> - (2) Define the forward function. 
There are three (3) distinct forward functions, as described in detail above: `store`, `challenge` and `retrieve`, all of which are rewarded or punished based on responses from miners. The fundamental reward is binary, based on answering this simple question: "Did you, or did you not, store (or still possess) this particular piece of data?" The 0 or 1 answer is then modified by several factors, such as tier, latency, and relative position in the query group.

Here is an example (simplified) description and pseudocode for the `challenge` forward:

Miner handles a data challenge by providing a series of cryptographic proofs of data possession. This method retrieves
the specified data from storage, calculates its commitment using elliptic curve cryptography, and
constructs a Merkle proof. The response includes the requested data chunk, Merkle proof, root, and
the commitment, which collectively serve as verifiable evidence of data possession.

The method performs the following steps:
1. Fetches the encrypted data from storage using the hash provided in the challenge.
2. Splits the data into chunks based on the specified chunk size.
3. Computes a new commitment hash to provide a time-bound proof of possession.
4. Generates a Merkle tree from the committed data chunks and extracts a proof for the requested chunk.
5. Encodes the requested chunk and Merkle proof in base64 for transmission.
6. Updates the challenge synapse with the commitment, data chunk, randomness, and Merkle proof.
7. Records the updated commitment hash in storage for future challenges.

This method ensures data integrity and allows the verification of data possession without disclosing the
entire data. It is designed to fulfill data verification requests in a secure manner with minimal overhead.


```python

async def challenge(synapse: Challenge) -> Challenge:
    # Look up the stored metadata (file path, previous seed) for the challenged data
    data = await get_chunk_metadata(database)

    # Load the encrypted data itself from the miner's filesystem
    encrypted_data_bytes = load_from_filesystem(data["filepath"])

    # Construct the next commitment hash using previous commitment and hash
    # of the data to prove storage over time
    prev_seed = data["seed"]

    # Compute "chained hash commitment" given prev_seed, data and new_seed
    next_commitment, proof = compute_subsequent_commitment(encrypted_data_bytes, prev_seed, synapse.seed)

    # Attach the values to the synapse so they are returned to the validator for verification.
    synapse.commitment_hash = next_commitment
    synapse.commitment_proof = proof

    # update the commitment seed challenge hash in metadata storage
    await update_seed_info(
        database,
        chunk_hash=synapse.challenge_hash,
        hotkey=synapse.dendrite.hotkey,
        seed=synapse.seed,
    )

    # Chunk the data according to the provided chunk_size
    data_chunks = chunk_data(encrypted_data_bytes, synapse.chunk_size)

    # Extract setup params (initial points along the curve)
    g = hex_to_ecc_point(synapse.g, synapse.curve)
    h = hex_to_ecc_point(synapse.h, synapse.curve)

    # Commit the data chunks based on the provided curve points
    randomness, chunks, commitments, merkle_tree = commit_data_with_seed(
        ECCommitment(g, h),
        data_chunks=data_chunks,
        n_chunks=size(encrypted_data_bytes) // synapse.chunk_size + 1,
        seed=synapse.seed,
    )

    # Prepare return values to validator for proper proof verification
    # Includes: commitment point, randomness value, and merkle proof up to leaf node of index `j`
    # Base64 encode the data chunk and the merkle proof for compression during transit.
    synapse.commitment = commitments[synapse.challenge_index]
    synapse.data_chunk = base64.b64encode(chunks[synapse.challenge_index])
    synapse.randomness = randomness[synapse.challenge_index]
    synapse.merkle_proof = b64_encode(
        merkle_tree.get_proof(synapse.challenge_index)
    )

    synapse.merkle_root = merkle_tree.get_merkle_root()
    return synapse

```

> - (3) Call the forward function for the top miners. 
The forward function is not designed to be called on the "top" miners specifically, but rather on only the miners who have stored the piece (or pieces) of data desired for both `challenge`s and `retrieve`s.

In FileTAO it is not useful to query the "top" miner set directly; instead, the validator acts as a proxy to store and retrieve data on the miner set. E.g. you cannot store data directly on miners without a registered validator hotkey, and if you were to attempt to retrieve data you would need to know the specific CIDs and the miner hotkeys where those data are stored. Instead we should query the validator using `StoreUser` and `RetrieveUser` `synapse`s. See the API example below for more details.

However, if you only want to query the top miners for the purposes of this evaluation, you can call `Store` on the highest tier miners at the current time, given a registered validator hotkey. The example below stores and then retrieves a specific piece of data from the network given its content identifier. This requires importing and using some low-level functions.


```python
import base64
import numpy as np
import bittensor as bt
from redis import asyncio as aioredis
from Crypto.Random import get_random_bytes
from storage import protocol
from storage.shared.utils import get_redis_password
from storage.validator.database import *
from storage.shared.ecc import (
    hash_data,
    setup_CRS,
    ecc_point_to_hex,
)


data = b"Some bytes data to store on the network!"
metagraph = bt.subtensor("test").metagraph(netuid=22, lite=False)

# Grab top 10% of miners by incentive
def get_top_miner_uids(n=0.1):
    top_i = np.quantile(metagraph.I, 1 - n)
    uids = metagraph.uids[metagraph.I > top_i]
    return [uid.item() for uid in uids]

axons = [metagraph.axons[uid] for uid in get_top_miner_uids()]

# Setup CRS (common reference string) for this round of validation
curve = "P-256"
g, h = setup_CRS(curve=curve)

# Hash the data
data_hash = hash_data(data)

# Convert to base64 for compactness
b64_encrypted_data = base64.b64encode(data).decode("utf-8")

# Create Store synapse
synapse = protocol.Store(
    encrypted_data=b64_encrypted_data,
    curve=curve,
    g=ecc_point_to_hex(g),
    h=ecc_point_to_hex(h),
    seed=get_random_bytes(32).hex(),  # 256-bit seed
    ttl=12345, # how many seconds before miners can safely delete
)

# Must use a registered validator hotkey wallet
dendrite = bt.dendrite(wallet=bt.wallet())

# Store the data on the top miner set
store_responses = await dendrite(
    axons,
    synapse,
    deserialize=False,
    timeout=20,
)

# Now retrieve the data using the same CID (data_hash in this case) and miner set that we stored on.
synapse = protocol.Retrieve(
    data_hash=data_hash,
    seed=get_random_bytes(32).hex(), # New seed
)

retrieve_responses = await dendrite(
    axons,
    synapse,
    deserialize=False,
    timeout=20,
)

rdata = retrieve_responses[0].data
rdata = base64.b64decode(rdata)
assert rdata == data
```

> - (4) Show the responses from the miners.

```python

print("Store responses:", store_responses)
print("Retrieve responses:", retrieve_responses)

# Pick 1
print("Single store response:", store_responses[0])
print("Single retrieve response:", retrieve_responses[0])
```

## (B) Justification for incentive distribution 
- Justify the difference in incentive for miners in different incentive tiers (eg. sample 5 miners from quantile 1 VS 5 miners from quantile 3) with code.

Miner incentive is largely dependent on three factors: (1) tier, (2) latency, (3) success rate. Tier is an aggregation of the 2nd and 3rd qualities. The more often you consistently pass challenges as a miner and the lower your latency, the higher your tier will be over time.

For example, let's imagine a set of miners was queried to store a given piece of data. The validator would create the `Store` synapse and send the query out, and use the responses to fill the rewards. Let's walk through this logic together:

(1) `create_reward_vector()` is called, verifying the responses and applying the initial tier-based rewards.
Here is a simplified pseudocode implementation for explanation:

```python
# Define boosts for top 2 miners per query batch. Lower tier miners who perform well get boosted more.
TIER_BOOSTS = {
    b"Super Saiyan": 1.02, # 2%  -> 1.02
    b"Diamond": 1.05,      # 5%  -> 0.945
    b"Gold": 1.1,          # 10% -> 0.88
    b"Silver": 1.15,       # 15% -> 0.805
    b"Bronze": 1.2,        # 20% -> 0.72
}

async def create_reward_vector(
    synapse: Union[Store, Retrieve, Challenge],
    rewards: torch.FloatTensor,
    uids: List[int],
    responses: List[Synapse],
):

    # Sort the miner response times (ascending)
    sorted_times = ...

    # Get the top 2 lowest latency miners in this query batch and mark them for extra proportional boost
    in_top_2_dict = {
        uid: True if time < synapse.timeout else False
        for (uid, time) in sorted_times[:2]
    }

    for idx, (uid, response) in enumerate(zip(uids, responses)):
        # Verify the commitment
        success = verify_fn(synapse=response)

        ...

        # Apply reward for this task
        tier_factor = await get_tier_factor(hotkey, self.database)

        # Boost the factor for the top 2 miners by the percentage assigned to their current tier
        if in_top_2_dict.get(uid, False):
            tier_factor *= TIER_BOOSTS[tier]

        # Apply the reward based on tier factor
        rewards[idx] = 1.0 * tier_factor if success else failure_reward * tier_factor

    # Scale rewards according to latency
    scaled_rewards = scale_rewards(
        uids,
        responses,
        rewards,
    )

    # Scatter the rewards into the full 256 UID vector.
    scattered_rewards: torch.FloatTensor = (
        self.moving_averaged_scores.to(self.device)
        .scatter(
            0,
            uids,
            scaled_rewards,
        )
        .to(self.device)
    )

    # Update moving_averaged_scores with rewards produced by this step.
    # shape: [ metagraph.n ]
    alpha: float = 0.05
    self.moving_averaged_scores: torch.FloatTensor = alpha * scattered_rewards + (
        1 - alpha
    ) * self.moving_averaged_scores

```
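
The final scatter and moving-average update can be illustrated without `torch`. This is a simplified sketch using plain Python lists, with `update_moving_average` as a hypothetical helper: UIDs that were not queried keep their current score as the "reward" for this step, so the exponential moving average leaves them unchanged.

```python
def update_moving_average(scores, uids, rewards, alpha=0.05):
    """Blend this step's rewards into the running scores for the queried UIDs."""
    scattered = list(scores)           # start from the current scores
    for uid, reward in zip(uids, rewards):
        scattered[uid] = reward        # overwrite only the queried UIDs
    return [alpha * new + (1 - alpha) * old for new, old in zip(scattered, scores)]


scores = [0.5, 0.5, 0.5, 0.5]
updated = update_moving_average(scores, uids=[1, 3], rewards=[1.0, 0.0])
print([round(s, 3) for s in updated])
```

With `alpha = 0.05`, a single response moves a score only 5% of the way toward the new reward, so sustained performance matters far more than any single query.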

- If there is no significant difference in the incentive distribution, you can also show that miners in the SN have about the same performance in multiple ways.

- There could be many reasons for the difference in incentive for miners. 
    - Case 1: Difference in the quality of response
        - Show that miners with higher incentive generally give better answers than those with lower incentive through the following ways
            - lower loss; higher accuracy
            - human eval for text/ image/ audio quality 

The quality of response is not relevant here, simply because it is a binary outcome. If successful, the cryptographic proofs will be verifiable and return `True`. If the cryptographic verification fails for any reason, we either give `0` or `negative` rewards based on the request type and the miner tier.

For completeness, here are the verification algorithms for `challenge` proofs:
```python

def verify_chained_commitment(proof, seed, commitment, verbose=True):
    """
    Verifies the accuracy of a chained commitment using the provided proof, seed, and commitment.
    The function hashes the concatenation of the proof and seed and compares this result with the provided commitment
    to determine if the commitment is valid.
    Args:
        proof (str): The proof string involved in the commitment.
        seed (str): The seed string used in generating the commitment.
        commitment (str): The expected commitment hash to validate against.
        verbose (bool, optional): Enables verbose logging for debugging. Defaults to True.
    Returns:
        bool: True if the commitment is verified successfully, False otherwise.
    """
    if proof is None or seed is None or commitment is None:
        return False

    expected_commitment = str(hash_data(proof.encode() + seed.encode()))

    return expected_commitment == commitment

def validate_merkle_proof(proof, target_hash, merkle_root, hash_type="sha3_256"):
    """
    Validates a Merkle proof, verifying that a target element is part of a Merkle tree with a given root.

    A Merkle proof is a sequence of hashes that, when combined with the target hash through the hash function
    specified by `hash_type`, should result in the Merkle root if the target hash is indeed part of the tree.

    Parameters:
        proof (list of dicts): A list of dictionaries where each dictionary has one key, either 'left' or 'right',
            corresponding to whether the sibling hash at that level in the tree is to the left or right of the path
            leading to the target hash.
        target_hash (str): The hexadecimal string representation of the target hash being proven as part of the tree.
        merkle_root (str): The hexadecimal string representation of the Merkle root of the tree to which the target
            hash is being validated against.
        hash_type (str, optional): The type of hash function used to construct the Merkle tree. This must match the
            hash function used in constructing the original Merkle tree. Defaults to "sha3_256", and it must be an
            attribute of the `hashlib` module that takes a bytes object and returns a hash object that has a `digest`
            method.

    Example:
        # Example of validating a Merkle proof
        valid_proof = [{'left': 'abc...'}, {'right': 'def...'}]
        target = 'a1b2c3...'
        root = '123abc...'
        is_valid = validate_merkle_proof(valid_proof, target, root)
        print(is_valid)  # Outputs True if the proof is valid, False otherwise
    """
    hash_func = getattr(hashlib, hash_type)
    merkle_root = bytearray.fromhex(merkle_root)
    target_hash = bytearray.fromhex(target_hash)
    if len(proof) == 0:
        return target_hash == merkle_root
    else:
        proof_hash = target_hash
        for p in proof:
            if "left" in p:
                # the sibling is a left node
                sibling = bytearray.fromhex(p["left"])
                proof_hash = hash_func(sibling + proof_hash).digest()
            else:
                # the sibling is a right node
                sibling = bytearray.fromhex(p["right"])
                proof_hash = hash_func(proof_hash + sibling).digest()
        return proof_hash == merkle_root

def verify_challenge_with_seed(synapse, seed, verbose=False):
    """
    Verifies a challenge in a decentralized network using a seed and the details contained in a synapse.
    The function validates the initial commitment hash against the expected result, checks the integrity of the commitment,
    and verifies the merkle proof.
    Args:
        synapse (Synapse): The synapse object containing challenge details.
        seed (str): The seed the validator issued for this challenge.
        verbose (bool, optional): Enables verbose logging for debugging. Defaults to False.
    Returns:
        bool: True if the challenge is verified successfully, False otherwise.
    """
    if synapse.commitment_hash is None or synapse.commitment_proof is None:
        return False

    if not verify_chained_commitment(
        synapse.commitment_proof, seed, synapse.commitment_hash, verbose=verbose
    ):
        return False

    committer = ECCommitment(
        hex_to_ecc_point(synapse.g, synapse.curve),
        hex_to_ecc_point(synapse.h, synapse.curve),
    )
    commitment = hex_to_ecc_point(synapse.commitment, synapse.curve)

    if not committer.open(
        commitment,
        hash_data(base64.b64decode(synapse.data_chunk) + str(seed).encode()),
        synapse.randomness,
    ):
        return False

    if not validate_merkle_proof(
        b64_decode(synapse.merkle_proof),
        ecc_point_to_hex(commitment),
        synapse.merkle_root,
    ):
        return False

    return True
```

Only if all three (3) proofs succeed do we consider the response "successful". The challenged miner must provide:
(1) A "chained hash" commitment proof. It contains the previous random seed chosen by the validator, the current seed chosen by the validator, and the original data. All three are required to pass this check.
(2) A Pedersen commitment proof at data index `j` containing data chunk `D_j`. The miner commits to this specific data chunk (small, usually < 128 KB) and returns the commitment with the randomness value `r` to complete this proof.
(3) A Merkle proof wherein the leaves of the Merkle tree are the commitments (points) along the elliptic curve up to index `j`, so that validators can validate the entire Merkle tree up to index `j` without having to receive any previous data or hold any special knowledge other than the Merkle root and the commitment for index `j`.
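
To make item (2) more concrete, here is a toy Pedersen-style commitment over integers modulo a prime. It is only a sketch under assumptions: the subnet commits to points on an elliptic curve through `ECCommitment`, whereas the modulus `P`, the generators `G` and `H`, the sample values, and the helper names below are hypothetical choices made for readability and are not secure parameters.

```python
import hashlib

# Toy parameters (assumptions for illustration only, NOT secure).
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3           # first generator
H = 7           # second generator; in practice nobody may know log_G(H)


def chunk_to_message(data_chunk: bytes, seed: str) -> int:
    """Hash the chunk together with the seed and map the digest to an exponent."""
    digest = hashlib.sha3_256(data_chunk + seed.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1)


def commit(message: int, randomness: int) -> int:
    """Pedersen-style commitment: G^message * H^randomness mod P."""
    return (pow(G, message, P) * pow(H, randomness, P)) % P


def open_commitment(commitment: int, message: int, randomness: int) -> bool:
    """Recompute the commitment from the revealed values and compare."""
    return commitment == commit(message, randomness)


seed = "a1b2c3"       # validator-chosen seed (hypothetical value)
chunk = b"chunk D_j"  # the challenged data chunk (hypothetical value)
r = 123456789         # miner's randomness; fixed here for reproducibility

message = chunk_to_message(chunk, seed)
commitment = commit(message, r)

opens_with_real_chunk = open_commitment(commitment, message, r)
opens_with_other_message = open_commitment(commitment, message + 1, r)
```

The commitment opens only with the message derived from the actual chunk and seed, which is why a miner needs the chunk itself to answer a challenge issued with a fresh seed.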

If *any* of these fail, the entire request is considered a failure and gets zero or negative reward.
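
The Merkle check in item (3) can be tried end to end on a small tree. The sketch below is an illustration under assumptions: the leaves are SHA3-256 digests of placeholder strings rather than elliptic-curve commitments, odd levels are padded by duplicating the last node, and `merkle_root_and_proof` and `validate` are hypothetical helpers. The proof format (a list of `{'left': ...}` or `{'right': ...}` siblings) follows `validate_merkle_proof` above.

```python
import hashlib


def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index]; leaves are already-hashed bytes."""
    proof = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        sibling_index = index ^ 1    # the neighbour within the same pair
        side = "left" if sibling_index < index else "right"
        proof.append({side: level[sibling_index].hex()})
        # Hash each pair to build the next level up
        level = [sha3(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def validate(proof, target_hex, root_hex):
    """Fold the siblings into the target hash and compare with the root."""
    current = bytes.fromhex(target_hex)
    for step in proof:
        if "left" in step:
            current = sha3(bytes.fromhex(step["left"]) + current)
        else:
            current = sha3(current + bytes.fromhex(step["right"]))
    return current == bytes.fromhex(root_hex)


# Placeholder leaves standing in for the commitments at indices 0..4
leaves = [sha3(f"commitment-{i}".encode()) for i in range(5)]
root, proof = merkle_root_and_proof(leaves, index=2)

is_valid = validate(proof, leaves[2].hex(), root.hex())
is_valid_for_wrong_leaf = validate(proof, leaves[3].hex(), root.hex())
```

The validator only needs the root, the leaf for index `j`, and one sibling per tree level, so the proof grows logarithmically with the number of chunks.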

- 
    - Case 2: Difference in miner availability
        - Then you can show that, given a certain number of trials (100), there are more successful calls to higher-incentive miners.

This information is encapsulated in two places: (1) the `Wilson score` and (2) the `monitor` protocol. 

Given `N` trials, the Wilson score establishes a lower bound for a confidence interval on miner success rates given low sample or population sizes. For example, if a miner responds successfully 8 of 9 times, its lower bound on success rate is roughly `0.8701769999681506`. In other words, `wilson_score(8,9) == 0.8701769999681506`, suggesting that this miner is expected to succeed at least roughly 87% of the time. This calculation is used as the lower bound to determine tier eligibility and establishes a "minimum trust level" for the miner. If a miner continues to fail challenges, the Wilson score lower bound will cross below the current tier threshold, and when the tiers are recalculated, that miner will move down a tier (e.g. `Diamond -> Gold`).
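
For readers who want to experiment with the statistic, here is a minimal sketch of the textbook Wilson score interval. It is not the subnet's `wilson_score` function: the helper name `wilson_interval`, the `z` values, and the choice to return the lower bound, center, and upper bound together are assumptions made for illustration.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Return (lower, center, upper) of the Wilson score interval."""
    if total == 0:
        return 0.0, 0.5, 1.0  # no observations yet: maximally uncertain
    p = successes / total
    denominator = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denominator
    margin = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denominator
    )
    return max(0.0, center - margin), center, min(1.0, center + margin)


# 8 successes out of 9 trials at two different confidence levels
lower_95, center_95, upper_95 = wilson_interval(8, 9)            # z = 1.96
lower_50, center_50, upper_50 = wilson_interval(8, 9, z=0.6745)  # roughly a 50% interval

# Additional failures pull the lower bound down
lower_after_failures, _, _ = wilson_interval(8, 12)
```

With the conventional `z = 1.96`, the lower bound for 8 of 9 is about `0.565`. The figure of roughly `0.8702` quoted above coincides with the interval's center at `z ≈ 0.6745`; which point of the interval and which `z` the subnet actually uses is not confirmed here.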

Miners are also incentivized to be available and are punished for being offline (e.g. failing `n` pings) by the `monitor` protocol. Periodically (every `m` steps) validators ping a set of `k` miners (default is 40), and after 5 successive failed attempts to ping a specific UID, that miner is negatively rewarded for unavailability.

Here is a simplified implementation of the `monitor` protocol:
```python
async def monitor(self):
    """
    Monitor all UIDs by ping and keep track of how many failures
    occur. If a UID fails too many times, reset its counter and
    negatively reward it for being unavailable.
    """
    # Ping current subset of UIDs, keep track of who failed
    query_uids = await get_available_query_miners(self, k=40)
    _, failed_uids = await ping_uids(self, query_uids)

    down_uids = []
    for uid in failed_uids:
        self.monitor_lookup[uid] += 1
        if self.monitor_lookup[uid] > 5:
            self.monitor_lookup[uid] = 0
            # If threshold reached this round, negatively reward 
            down_uids.append(uid)

    if down_uids:
        # Negatively reward miners who are down.
        rewards = torch.full(
            (len(down_uids),), MONITOR_FAILURE_REWARD, dtype=torch.float32
        )  # typically -0.005

        # Update moving averaged scores given new observations based on alpha.
        # `scatter` needs an int64 index tensor, so convert the UID list first.
        uids_tensor = torch.tensor(down_uids, dtype=torch.long)
        scattered_rewards: torch.FloatTensor = self.moving_averaged_scores.scatter(
            0, uids_tensor, rewards
        )
        alpha: float = 0.05
        self.moving_averaged_scores: torch.FloatTensor = alpha * scattered_rewards + (
            1 - alpha
        ) * self.moving_averaged_scores
```
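
To see what the moving-average update does numerically, the following sketch replays it with plain Python lists instead of torch tensors. The starting scores are made-up values and `apply_monitor_penalty` is a hypothetical helper; `alpha = 0.05` and the `-0.005` penalty are taken from the snippet above.

```python
MONITOR_FAILURE_REWARD = -0.005
ALPHA = 0.05


def apply_monitor_penalty(scores, down_uids, alpha=ALPHA, penalty=MONITOR_FAILURE_REWARD):
    """Blend the penalty into the scores of down UIDs; leave the others untouched."""
    # Equivalent of `scatter`: start from the current scores and overwrite
    # only the entries that belong to unavailable miners.
    observed = list(scores)
    for uid in down_uids:
        observed[uid] = penalty
    # Exponential moving average between the observation and the old score
    return [alpha * o + (1 - alpha) * s for o, s in zip(observed, scores)]


scores = [0.5, 0.8, 0.3, 0.9]  # made-up moving averages for UIDs 0..3
updated = apply_monitor_penalty(scores, down_uids=[2])
```

Only UID 2 moves (from `0.3` toward the penalty), because for every other UID the "observation" equals its current score and the weighted average leaves it unchanged.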

- 
    - Case 3: Difference in latency.
        - Then you can show that miners in Q1 generally respond faster than miners in Q3.

Yes, miners with lower latency receive higher rewards. A modified sigmoid function is applied to the normalized response times, with its parameters adjusted to the `max` response time in each query group. As a result, the laggards in the group (the right tail of the response times) are rewarded less than the fastest responders (the left tail).

For example, if a group of miners respond with times (pre-sorted):
```python
proc_times = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

sigm_factor = sigmoid_normalize(proc_times, max(proc_times) * 2)
sigm_factor
> [0.9752, 0.9652, 0.9514, 0.9324, 0.9067, 0.8726, 0.8284, 0.7729, 0.7057, 0.6283]
```

Here the sigmoid parameters are derived from a `max_time` of `2.0` (twice the slowest response time, `1.0`). The resulting factors are then multiplied by the initial rewards vector, which is based on tiers.

For example:
```python
tiers = ["Bronze", "Bronze", "Gold", "Diamond", "Bronze", "Silver", "Silver", "Super Saiyan", "Gold", "Bronze"]
rewards_vector = [0.6, 0.6, 0.8, 0.9, 0.6, 0.7, 0.7, 1.0, 0.8, 0.6]

initial_rewards = rewards_vector * sigm_factor
> [0.5851, 0.5791, 0.7611, 0.8392, 0.544, 0.6108, 0.5799, 0.7729, 0.5646, 0.3769]


# Success or failure then decides whether a reward is kept (failures are zeroed in this example):
verified = [True, True, False, True, False, False, True, True, True, False]
final_rewards = []
for i,v in enumerate(verified):
    final_rewards.append(initial_rewards[i] * (1 if v else 0))
    
final_rewards
> [0.5851, 0.5791, 0.0, 0.8392, 0.0, 0.0, 0.5799, 0.7729, 0.5646, 0.0]
```


An example calculation is below for illustration:

```python
import numpy as np
import torch
import bittensor as bt

def adjusted_sigmoid_inverse(x, steepness=1, shift=0):
    """
    Inverse of the adjusted sigmoid function.

    This function is a modified version of the sigmoid function that is shifted to
    the right by a certain amount but inverted such that low completion times are
    rewarded and high completion times are punished relative to the batch.
    """
    return 1 / (1 + np.exp(steepness * (x - shift)))

def calculate_sigmoid_params(timeout):
    """
    Calculate sigmoid parameters based on the timeout value.

    Args:
    - timeout (float): The current timeout value.

    Returns:
    - tuple: A tuple containing the 'steepness' and 'shift' values for the current timeout.
    """
    base_timeout = 1
    base_steepness = 7
    base_shift = 0.3

    # Calculate the ratio of the current timeout to the base timeout
    ratio = timeout / base_timeout

    # Calculate steepness and shift based on the pattern
    steepness = base_steepness / ratio
    shift = base_shift * ratio

    return steepness, shift

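def sigmoid_normalize(process_times, max_time):
    """
    Map process times to factors in (0, 1); faster times receive larger factors.
    """
    # Accept plain lists as well as arrays
    process_times = np.asarray(process_times, dtype=float)
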
    # Center the completion times around 0 for effective sigmoid scaling
    centered_times = process_times - np.mean(process_times)

    # Calculate steepness and shift based on maximum time in the batch
    steepness, shift = calculate_sigmoid_params(max_time)

    # Apply adjusted sigmoid function to scale the times
    return adjusted_sigmoid_inverse(centered_times, steepness, shift)

# Main scaling function for reward latency. This modifies the initial tier based rewards further for more granularity.
# `rewards` (tier-based rewards, a torch tensor) and `data_sizes` (one size per response) are passed in explicitly.
async def scale_rewards(uids, responses, rewards, data_sizes):
    max_time = max(
        [
            response.dendrite.process_time for response in responses
            if response.dendrite.process_time is not None
        ] or [1] # nobody responded successfully
    )

    # Helper not shown here; yields (uid, process_time) pairs
    sorted_axon_times = get_sorted_response_times(uids, responses, max_time=max_time)

    # Extract only the process times
    process_times = [proc_time for _, proc_time in sorted_axon_times]

    # Apply logarithmic scaling to data sizes
    log_data_sizes_np = np.log1p(data_sizes)

    # Normalize the response times by data size (unit time)
    data_normalized_process_times = np.asarray(np.array(process_times) / log_data_sizes_np)

    # Normalize the response times
    normalized_times = sigmoid_normalize(data_normalized_process_times, max(data_normalized_process_times) * 2)

    # Create a dictionary mapping UIDs to normalized times
    uid_to_normalized_time = {
        uid: normalized_time
        for (uid, _), normalized_time in zip(sorted_axon_times, normalized_times)
    }

    # Scale the data size-scaled rewards with normalized times
    time_scaled_rewards = torch.tensor(
        [   # tier reward * latency based normalized reward
            rewards[i] * uid_to_normalized_time[uid]
            for i, uid in enumerate(uids)
        ]
    )

    # Final normalization: keep the total reward equal to the sum of tier rewards
    rescale_factor = torch.sum(rewards) / torch.sum(time_scaled_rewards)
    bt.logging.trace(f"Rescale factor: {rescale_factor}")
    scaled_rewards = [reward * rescale_factor for reward in time_scaled_rewards]
    return scaled_rewards


# Usage (inside an async context, e.g. a notebook cell):
uids = ...
responses = await dendrite(...)
rewards = ...     # tier-based rewards, one entry per UID
data_sizes = ...  # size of the data handled by each response

scaled_rewards = await scale_rewards(uids, responses, rewards, data_sizes)

```
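
The pipeline above depends on live responses, so here is a dependency-free sketch of the same arithmetic that can be run anywhere. It assumes the base values shown above (`base_timeout = 1`, `base_steepness = 7`, `base_shift = 0.3`); the helper names `sigmoid_factor` and `scale`, and the sample rewards, times, and sizes, are hypothetical.

```python
import math


def sigmoid_factor(times, max_time, base_steepness=7.0, base_shift=0.3):
    """Pure-Python counterpart of sigmoid_normalize (base timeout of 1)."""
    steepness = base_steepness / max_time
    shift = base_shift * max_time
    mean = sum(times) / len(times)
    # Center the times, then apply the inverted, shifted sigmoid
    return [1 / (1 + math.exp(steepness * ((t - mean) - shift))) for t in times]


def scale(rewards, process_times, data_sizes):
    """Apply data-size normalization, latency scaling, then the final rescale."""
    unit_times = [t / math.log1p(size) for t, size in zip(process_times, data_sizes)]
    factors = sigmoid_factor(unit_times, max(unit_times) * 2)
    time_scaled = [r * f for r, f in zip(rewards, factors)]
    rescale_factor = sum(rewards) / sum(time_scaled)
    return [r * rescale_factor for r in time_scaled]


# Reproduce the latency factors from the earlier example
proc_times = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
factors = sigmoid_factor(proc_times, max(proc_times) * 2)

# Made-up inputs for the full pipeline
tier_rewards = [0.6, 0.8, 1.0]
process_times = [0.2, 0.4, 0.8]   # seconds
data_sizes = [1024, 1024, 1024]   # bytes
scaled = scale(tier_rewards, process_times, data_sizes)
```

The final rescale keeps the total reward equal to the sum of the tier rewards, so latency only redistributes reward within a query group rather than shrinking it.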

- 
    - Case 4: Please provide your own justification if the reasons above don't fit.

In short, miners are incentivized to be (1) available, (2) performant, (3) correct, and (4) consistent and trustworthy over time. This creates an incentive landscape that is multidimensional and has degrees of freedom such that miners are able to ascend and descend the tier system organically, based on their participation over time. The challenge is to take a binary outcome (do you have the data or not?) and turn it into a gradated landscape with a pseudo-linear incentive distribution curve.

Incentive typically ranges from 350 to 528. After immunity (2 days), miners either cross this incentive threshold and compete, or drop below the ~350 level and are deregistered. The curve is smoothed out by the additions to the reward mechanism so that there are no distinct levels or groups by tier. The goal is to avoid a step function and foster a granular competitive landscape.



## (C) (If applicable) Miner landscape
- How many unique responses can we get from the network, and how many miners are giving the same responses? It is perfectly fine even if all of the miners respond with the same thing.
N/A: Miners are expected to respond successfully to challenges only for data they possess, so it does not make sense to send the same request to all miners.

> (1) Send the same request to all miners over the SN
>
> (2) Check the number of unique responses  
> 
> (3) Check the number of miners giving the same response
  

 ## (D) (If applicable) Demonstrate the effectiveness of the scoring mechanism.
- If you are using a reward/penalty model: 
    - Please load the reward or penalty model one by one and then show that the reward of a good response > the reward of a bad response
    - Please also allow us to customise the input of the reward model

    > (1) Load the reward/penalty model one by one 
    >
    > (2) Define the good/bad response
    >
    > (3) Score the response with the model

N/A No explicit ML reward model is used.

- Otherwise, you may just give a brief explanation of how your scoring mechanism works.

### Algorithm
Multi-dimensional reward mechanism across 4 main axes:
(1) Availability: Is the miner reachable when requested? Incentivizes uptime.
(2) Speed (performant): how fast was the normalized response time?
(3) Correctness (reliable): did the proof succeed? y/n
(4) Trustworthiness (correctness over time): Tier definitions encapsulate consistency and trustworthiness over time. A miner that rarely fails challenges and faithfully returns data when requested will receive a proportionally higher initial reward.

The main driver behind the reward mechanism is to model meritocratic systems, as seen in academia and in the business world, that scale trust across time and observation of output. For example, there are various "tiers" in education (Undergraduate, Masters, PhD, Post Doc) and in software (Junior Engineer, Senior Engineer, Principal Engineer, Staff Engineer, etc.), where each subsequent level carries proportionally greater expectations on the performance and qualification of the individual.

However, the performance of the individual must match the expectations of the pedigree, and behavior that is inconsistent with a given level will be adjusted. If, for example, a Junior Engineer proves their output is substantial and completes projects that provide value, that engineer will be promoted to Senior Engineer over time, and potentially beyond. Conversely, a Principal Engineer who consistently underperforms expectations will be demoted to a role with lower expectations until able to prove otherwise. The same logic applies to miners in FileTAO (SN21).

Tier (class) mobility is at the heart of the mechanism and provides a balance between competition and trust: over time, competitiveness breeds a degree of trust, which lets us associate a degree of reliability with a given entity (or miner).

### Example Store Reward
```python

miner_uids = [0, 1, 2, 3, 4]
miner_tier = ["Bronze", "Silver", "Gold", "Diamond", "Super Saiyan"]
tier_miner_rewards = [0.6, 0.7, 0.8, 0.9, 1.0]
response_times = [0.15, 0.12, 0.19, 0.11, 0.3]
response_factor = sigmoid_normalize(response_times, max(response_times) * 2)
top_2_miner_uids = [1, 3] # UIDs that returned fastest
verified_responses = [True, True, True, True, True] # Assume all respond successfully

rewards = [None for _ in range(len(miner_uids))]
for uid in miner_uids:
    rewards[uid] = tier_miner_rewards[uid] * response_factor[uid] if verified_responses[uid] else 0.0

rewards
> [0.5491, 0.6571, 0.6971, 0.8506, 0.6524]

# Bump top 2 miners
TIER_BOOSTS = {
    "Super Saiyan": 1.02, # 2%  -> 1.02
    "Diamond": 1.05,      # 5%  -> 0.945
    "Gold": 1.1,          # 10% -> 0.88
    "Silver": 1.15,       # 15% -> 0.805
    "Bronze": 1.2,        # 20% -> 0.72
}
for uid in top_2_miner_uids:
    rewards[uid] *= TIER_BOOSTS[miner_tier[uid]]

# boosts:    15%             5%
rewards 
> [0.5491, 0.7557, 0.6971, 0.8931, 0.6524]
```


## Rebalance Protocol (Data Preservation)
A critical component of the subnet mechanism is the rebalance protocol wherein data is recovered and minimum data redundancy is restored when miners are (1) deregistered, or (2) unavailable after a certain amount of time.

The general algorithm is as follows:
(1) Block watcher checks each block's events for `NeuronRegistered` events for `netuid=21`.
(2) When found, all client data for replaced `hotkey` is gathered from the `k-1` redundant miners and distributed to 2 new miners, thus *increasing* redundancy factor for the subnet as a whole.
(3) All challenge data is wiped for that `hotkey` and purged from the index. This way miners who deregister are not responsible for old challenge data they may no longer possess.
(4) The tier and statistics are reset for that `hotkey`.

Pseudo-code below:
```python

async def rebalance_data_for_hotkey(
    database, k: int, source_hotkey: str, hotkey_replaced: bool = False
):
    """
    Get all data from a given miner/hotkey and rebalance it to other miners.

    (1) Get all data from a given miner/hotkey.
    (2) Find out which chunks belong to full files, ignore the rest (challenges)
    (3) Distribute the data that belongs to full files to other miners.
    """
    metadata = await get_metadata_for_hotkey(source_hotkey, database)

    miner_hashes = list(metadata)

    rebalance_hashes = []
    for _hash in miner_hashes:
        if await is_file_chunk(_hash, database):
            rebalance_hashes.append(_hash)

    if hotkey_replaced:
        # Reset miner statistics
        await register_miner(source_hotkey, database)
        # Update index for full and chunk hashes for retrieve
        # Iterate through ordered metadata for all full hashes this miner had
        async for file_key in database.scan_iter("file:*"):
            file_key = file_key.decode("utf-8")
            file_hash = file_key.split(":")[1]
            # Get all ordered metadata for this file
            ordered_metadata = await get_ordered_metadata(file_hash, database)
            for chunk_metadata in ordered_metadata:
                # Remove the dropped miner from the chunk metadata
                await remove_hotkey_from_chunk(
                    chunk_metadata, source_hotkey, database
                )
        # Purge challenge hashes so new miner doesn't get hosed
        await purge_challenges_for_hotkey(source_hotkey, database)

    # Take each data that needs to be spread and send it to `k` (usually 2) new miners.
    for _hash in rebalance_hashes:
        await rebalance_data_for_hash(data_hash=_hash, k=k)


# Usage:
await rebalance_data_for_hotkey(
    database, k, hotkey, hotkey_replaced=True
)
```
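
The effect of the protocol on redundancy can be illustrated with an in-memory model that replaces the database with a plain dictionary. This is a sketch under assumptions: `rebalance_chunks`, the `chunk_holders` mapping, and the hotkey names are hypothetical, and no data is actually transferred.

```python
import random


def rebalance_chunks(chunk_holders, all_hotkeys, dropped_hotkey, k=2, rng=None):
    """Remove `dropped_hotkey` from every chunk and add up to `k` new holders."""
    rng = rng or random.Random()
    for holders in chunk_holders.values():
        if dropped_hotkey not in holders:
            continue
        holders.discard(dropped_hotkey)
        # Candidates: miners that neither hold this chunk nor were just dropped
        candidates = sorted(set(all_hotkeys) - holders - {dropped_hotkey})
        holders.update(rng.sample(candidates, min(k, len(candidates))))
    return chunk_holders


all_hotkeys = [f"hk{i}" for i in range(8)]
chunk_holders = {
    "chunk_a": {"hk0", "hk1", "hk2"},
    "chunk_b": {"hk1", "hk3", "hk4"},
    "chunk_c": {"hk5", "hk6", "hk7"},
}

rebalance_chunks(chunk_holders, all_hotkeys, dropped_hotkey="hk1", rng=random.Random(0))
```

Each chunk that lost `hk1` ends up with four holders instead of three, which mirrors the point made above that replacing one miner with two new ones increases redundancy.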


## Distribute Protocol

Periodically, in the `forward` step, data is `distribute`d in a manner similar to the `rebalance` protocol, but with the goal of increasing data redundancy so that data never becomes "stale" on specific miners. A client data hash is chosen at random and then spread to 2 new miners.

Simplified implementation below:
```python

import random

async def distribute_data(self, k: int):
    """
    Distribute data storage among miners by migrating data from a set of miners to others.

    Parameters:
    - k (int): The number of miners to query and distribute data from.
      (Not used in this simplified version.)

    Returns:
    - None in this simplified version.
    """

    full_hashes = [key async for key in self.database.scan_iter("file:*")]

    # Keys look like b"file:<hash>", so strip the prefix as in the rebalance code
    full_hash = random.choice(full_hashes).decode("utf-8").split(":")[1]
    ordered_metadata = await get_ordered_metadata(full_hash, self.database)

    # Get the hotkeys/uids to query
    exclude_uids = set()
    for chunk_metadata in ordered_metadata:
        uids = [
            self.metagraph.hotkeys.index(hotkey)
            for hotkey in chunk_metadata["hotkeys"]
            if hotkey
            in self.metagraph.hotkeys
        ]
        # Collect all uids for later exclusion
        exclude_uids.update(uids)

    # Use primitives to retrieve and store all the chunks:
    retrieved_data, retrieved_payload = await retrieve_broadband(self, full_hash)

    # Pick new miners to store data on that are not previously utilized, thereby increasing redundancy and data decentralization.
    await store_broadband(
        self,
        retrieved_data,
        encryption_payload=retrieved_payload,
        exclude_uids=list(exclude_uids),
    )
```

 ## (E) (If applicable) Show the dataset that was used by the validator.
 > (1) Load the dataset 
 > 
 > (2) Show the first 10 samples of the dataset 

N/A No dataset. Validation is performed on both random and intermingled user data.

 ## (F) (If applicable) Demonstrate the use of any API and/or links to a frontend.

Dashboard: COMING SOON!: 
[Dash Screenshot](Dashboard.png)

Front End: COMING SOON!: 
[Front End Screenshot](FrontEnd.png)

### High Level API
API example (python):
```python
import time
import bittensor as bt
from typing import List, Optional, Tuple
from storage.api import store, retrieve
bt.trace()

# setup wallet and subtensor connection (shared by both helpers below)
wallet = bt.wallet()
subtensor = bt.subtensor("test")

# Example usage
async def store_things() -> Tuple[str, List[str]]:
    # Store some data using the validator set
    data = b"This is a test of the API high level abstraction"
    print("Storing data on the Bittensor testnet.")
    cid, hotkeys = await store(data, wallet, subtensor, netuid=22)
    print("Stored {} with {} hotkeys".format(cid, hotkeys))
    return cid, hotkeys

async def retrieve_things(cid: str, hotkeys: Optional[List[str]] = None) -> bytes:
    print("Now retrieving data with CID: ", cid)
    # Retrieve some data by querying validator set that contains said data
    data = await retrieve(cid, wallet, subtensor, netuid=22, hotkeys=hotkeys)
    return data


cid, hotkeys = await store_things()

time.sleep(5)
data = await retrieve_things(cid, hotkeys)
print(f"Retrieved data: {data}!")
```


### Low level API
This uses the Bittensor Subnets API, slightly modified for our use case here.

Here is how we utilize the Subnets API abstract class to extend two simple functions, `prepare_synapse` and `process_responses`:

Subnets API:
```python

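from abc import ABC, abstractmethod
from typing import Any, List, Optional, Union

import bittensor as bt

class Subnet21API(ABC):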
    def __init__(self, wallet: "bt.wallet"):
        self.wallet = wallet
        self.dendrite = bt.dendrite(wallet=wallet)

    async def __call__(self, *args, **kwargs):
        return await self.query_api(*args, **kwargs)

    @abstractmethod
    def prepare_synapse(self, *args, **kwargs) -> Any:
        """
        Prepare the synapse-specific payload.
        """
        ...

    @abstractmethod
    def process_responses(self, responses: List[Union["bt.Synapse", Any]]) -> Any:
        """
        Process the responses from the network.
        """
        ...

    async def query_api(
        self,
        axons: Union[bt.axon, List[bt.axon]],
        deserialize: Optional[bool] = False,
        timeout: Optional[int] = 12,
        n: Optional[float] = 0.1,
        uid: Optional[int] = None,
        **kwargs: Optional[Any],
    ) -> Any:
        ... # Implementation not shown for brevity. Find src in `storage/api/base.py`
```

E.g., how do we prepare a synapse for querying, and what do we do with the responses when they are received?

```python
class StoreUserAPI(Subnet21API):
    def __init__(self, wallet: "bt.wallet"):
        super().__init__(wallet)
        self.netuid = 21

    def prepare_synapse(
        self, data: bytes, encrypt=False, ttl=60 * 60 * 24 * 30, encoding="utf-8"
    ) -> StoreUser:
        data = bytes(data, encoding) if isinstance(data, str) else data
        encrypted_data, encryption_payload = (
            encrypt_data(data, self.wallet) if encrypt else (data, "{}")
        )
        expected_cid = generate_cid_string(encrypted_data)
        encoded_data = base64.b64encode(encrypted_data)

        synapse = StoreUser(
            encrypted_data=encoded_data,
            encryption_payload=encryption_payload,
            ttl=ttl,
        )

        return synapse

    def process_responses(self, responses: List[Union["bt.Synapse", Any]], return_failures: bool = False) -> Union[str, List[str]]:
        success = False
        successful_hotkeys = []
        failure_modes = {"code": [], "message": []}
        for response in responses:
            if response.dendrite.status_code != 200:
                failure_modes["code"].append(response.dendrite.status_code)
                failure_modes["message"].append(response.dendrite.status_message)
                continue

            stored_cid = (
                response.data_hash.decode("utf-8")
                if isinstance(response.data_hash, bytes)
                else response.data_hash
            )
            success = True
            bt.logging.debug(f"Successfully stored CID {stored_cid} with hotkey {response.axon.hotkey}")
            successful_hotkeys.append(response.axon.hotkey)

        if success:
            bt.logging.info(
                f"Stored data on the Bittensor network with CID {stored_cid}"
            )
        else:
            bt.logging.error(
                f"Failed to store data. Response failure codes & messages {failure_modes}"
            )
            stored_cid = ""

        if return_failures:
            return stored_cid, successful_hotkeys, failure_modes

        return stored_cid, successful_hotkeys
```
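
The response-handling logic can be exercised without a live network by feeding it stand-in objects. The sketch below is an assumption-laden illustration: `summarize_store_responses` is a simplified, standalone variant of `process_responses`, and `fake_response` builds mock objects with `types.SimpleNamespace` that carry only the attributes read here.

```python
from types import SimpleNamespace


def summarize_store_responses(responses):
    """Collect the CID, the hotkeys that stored it, and any failure details."""
    stored_cid = ""
    successful_hotkeys = []
    failure_modes = {"code": [], "message": []}
    for response in responses:
        if response.dendrite.status_code != 200:
            failure_modes["code"].append(response.dendrite.status_code)
            failure_modes["message"].append(response.dendrite.status_message)
            continue
        data_hash = response.data_hash
        # The hash may arrive as bytes or str; normalize to str
        stored_cid = data_hash.decode("utf-8") if isinstance(data_hash, bytes) else data_hash
        successful_hotkeys.append(response.axon.hotkey)
    return stored_cid, successful_hotkeys, failure_modes


def fake_response(status_code, status_message, data_hash, hotkey):
    """Build a stand-in object with only the attributes read above."""
    return SimpleNamespace(
        dendrite=SimpleNamespace(status_code=status_code, status_message=status_message),
        data_hash=data_hash,
        axon=SimpleNamespace(hotkey=hotkey),
    )


responses = [
    fake_response(200, "Success", b"cid-123", "hotkey-A"),
    fake_response(408, "Timeout", None, "hotkey-B"),
    fake_response(200, "Success", "cid-123", "hotkey-C"),
]
cid, hotkeys, failures = summarize_store_responses(responses)
```

Two of the three mock responses succeed, so the call yields the CID, the two successful hotkeys, and one recorded failure code and message.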

Now we can use our lower level primitives to accomplish the same goal as the abstraction above in the `store` and `retrieve` functions.

```python
import random
import time
import bittensor as bt
from storage.api import StoreUserAPI, RetrieveUserAPI, get_query_api_axons

wallet = bt.wallet()
store_handler = StoreUserAPI(wallet)

# Fetch the axons of the available API nodes, or specify UIDs directly
metagraph = bt.subtensor("test").metagraph(netuid=22)
all_axons = await get_query_api_axons(wallet=wallet, metagraph=metagraph)
axons = random.choices(all_axons, k=3)

# Store some data!
raw_data = b"Hello FileTao! This is a test of storing data on SN21."

bt.logging.info(f"Storing data {raw_data} on the Bittensor testnet.")
cid, hotkeys = await store_handler(
    axons=axons,
    # any arguments for the proper synapse
    data=raw_data,
    encrypt=False, # optionally encrypt the data with your bittensor wallet
    ttl=60 * 60 * 24 * 30,
    encoding="utf-8",
    uid=None,
    timeout=60,
)
print("Stored {} with {} hotkeys".format(cid, hotkeys))

time.sleep(5)
bt.logging.info(f"Now retrieving data with CID: {cid}")
retrieve_handler = RetrieveUserAPI(wallet)
rdata = await retrieve_handler(
    axons=axons,
    # Arguments for the proper synapse
    cid=cid,
    timeout=60,
)
print(rdata)
assert raw_data == rdata

```


To test these directly, you can run `storage/api/examples.py`

```python
# RUN THIS CELL TO QUERY MINERS USING A VALIDATOR HOTKEY
import base64
import numpy as np
import bittensor as bt
from redis import asyncio as aioredis
from Crypto.Random import get_random_bytes
from storage import protocol
from storage.shared.utils import get_redis_password
from storage.validator.database import *
from storage.shared.ecc import (
    hash_data,
    setup_CRS,
    ecc_point_to_hex,
)


data = b"Some bytes data to store on the network!"
subtensor = bt.subtensor("finney")
metagraph = subtensor.metagraph(netuid=21, lite=False)

# Grab top 10% of miners by incentive
def get_top_miner_uids(n=0.1):
    top_i = np.quantile(metagraph.I, 1 - n)
    uids = metagraph.uids[metagraph.I > top_i]
    return [uid.item() for uid in uids]

axons = [metagraph.axons[uid] for uid in get_top_miner_uids()]

# Setup CRS (common reference string) for this round of validation
curve = "P-256"
g, h = setup_CRS(curve=curve)

# Hash the data
data_hash = hash_data(data)

# Convert to base64 for compactness
b64_encrypted_data = base64.b64encode(data).decode("utf-8")

# Create Store synapse
synapse = protocol.Store(
    encrypted_data=b64_encrypted_data,
    curve=curve,
    g=ecc_point_to_hex(g),
    h=ecc_point_to_hex(h),
    seed=get_random_bytes(32).hex(),  # 256-bit seed
    ttl=12345, # how many seconds before miners can safely delete
)

# NOTE: Must be a registered validator hotkey to work.
name = "default"
hotkey = "default"
wallet = bt.wallet(name, hotkey)
dendrite = bt.dendrite(wallet)

responses = await dendrite(
    axons,
    synapse,
    deserialize=False
)

responses
```

```python
# RUN THIS CELL TO STORE ON TESTNET

import time
import random
import bittensor as bt
from storage.api import store, retrieve
from storage.api import StoreUserAPI, RetrieveUserAPI, get_query_api_axons
bt.trace()

# setup wallet and subtensor connection
wallet_name = "default"
wallet_hotkey = "default"
wallet = bt.wallet(wallet_name, wallet_hotkey)
subtensor = bt.subtensor("test")

# Store some data and retrieve it
data = b"This is a test of the API high level abstraction"

print("Storing data on the Bittensor testnet.")
cid, hotkeys = await store(data, wallet, subtensor, netuid=22)

print("Stored {} with {} hotkeys".format(cid, hotkeys))
```

```python
# RUN THIS CELL TO RETRIEVE ON TESTNET
print("Now retrieving data with CID: ", cid)
rdata = await retrieve(cid, wallet, subtensor, netuid=22, hotkeys=hotkeys)
print(rdata)
assert data == rdata, "Data does not match!"
```

```python
# RUN THIS CELL TO STORE ON MAINNET (Assumes at least 1 API node is running with --open_access for whitelisting all keys.)

# setup wallet and subtensor connection
wallet_name = "default"
wallet_hotkey = "default"
wallet = bt.wallet(wallet_name, wallet_hotkey)
subtensor = bt.subtensor("finney")

# Store some data and retrieve it
data = b"This is a test of the API high level abstraction"

print("Storing data on the Bittensor mainnet.")
cid, hotkeys = await store(data, wallet, subtensor, netuid=21)

print("Stored {} with {} hotkeys".format(cid, hotkeys))
```

```python
# RUN THIS CELL TO RETRIEVE ON MAINNET
print("Now retrieving data with CID: ", cid)
rdata = await retrieve(cid, wallet, subtensor, netuid=21, hotkeys=hotkeys)
print(rdata)
assert data == rdata, "Data does not match!"
```

## Consent: Do you want this demo notebook to be public? Yes/No 
Yes. 