# SN 9 Demo

## Overview

The code and documentation for the subnet can be found at https://github.com/RaoFoundation/pretraining.

Bittensor subnet 9 rewards miners for producing pretrained foundation models on the Falcon RefinedWeb dataset. It acts as a continuous benchmark: miners are rewarded for attaining the lowest losses on randomly sampled pages of Falcon RefinedWeb, given a consistent model architecture.

1. Miners train and periodically publish models to Hugging Face and commit the metadata for each model to the Bittensor chain. See https://github.com/RaoFoundation/pretraining/blob/main/docs/miner.md for more details.

2. Validators download each miner's model from Hugging Face based on the Bittensor chain metadata and continuously evaluate the models, setting weights based on each model's performance against the Falcon dataset. See https://github.com/RaoFoundation/pretraining/blob/main/docs/validator.md for more details.

3. The Bittensor chain aggregates weights from all active validators using Yuma Consensus to determine the proportion of TAO emission rewarded to miners and validators. See https://docs.bittensor.com/learn/anatomy-of-incentive-mechanism for more details.


## Finding the best uploaded model

The leaderboard at https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard is regularly updated with the best model, along with benchmark results comparing it against other well-known models.

From the leaderboard, you can follow the links to the Hugging Face repo.

Alternatively, you can use the built-in helpers in the following code block to programmatically get the best_uid and the corresponding repository from the chain.

In [None]:
import pretrain as pt

best_uid = pt.graph.best_uid()
repo_url = pt.mining.get_repo(best_uid)

In [None]:
download_directory = "your/download/directory"
best_model = pt.mining.load_remote_model(best_uid, download_directory)

In [None]:
# To reload the model with bfloat16 and flash_attention_2 for memory/performance efficiency:
from transformers import AutoModelForCausalLM
import torch

best_model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=download_directory,
    local_files_only=True,
    use_safetensors=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

## Running inference with the model

Note that all models use the same tokenizer, so we create it once and use it to encode the desired prompts.

You will need to run this notebook on a device with an NVIDIA GPU that supports bfloat16 and Flash Attention 2.

In [None]:
tokenizer = pt.model.get_tokenizer()

prompt = """Permaculture is a design process mimicking the diversity, functionality and resilience of natural ecosystems. The principles and practices are drawn from traditional ecological knowledge of indigenous cultures combined with modern scientific understanding and technological innovations. Permaculture design provides a framework helping individuals and communities develop innovative, creative and effective strategies for meeting basic needs while preparing for and mitigating the projected impacts of climate change.
Write a summary of the above text.
Summary:
"""

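# Move the model to the GPU so it is on the same device as the inputs.
best_model.to("cuda")

# Tokenize the prompt. Calling the tokenizer directly (rather than
# tokenizer.encode) returns an object that exposes input_ids.
inputs = tokenizer(prompt, return_tensors="pt")

# Generate the output token ids.
ids = best_model.generate(inputs.input_ids.cuda(), num_return_sequences=1)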

# Decode the output tokens into text.
text = tokenizer.decode(ids[0], skip_special_tokens=True)
print(text)


## Chain Operations

You can also directly inspect the metadata that has been uploaded to the chain for a given hotkey.

In [None]:
import bittensor as bt
from model.storage.chain.chain_model_metadata_store import ChainModelMetadataStore

metadata_store = ChainModelMetadataStore(subtensor=bt.subtensor())

model_metadata = await metadata_store.retrieve_model_metadata(
    "HOTKEY_TO_CHECK"
)
print(model_metadata)

## Validation Loop

On each step, the validator:

1. Identifies valid models for evaluation (the top N from the last run plus newly updated models).
2. Generates random pages for evaluation and prepares batches for each page from the dataset.
3. Computes the scoring for each model based on the losses incurred on the evaluation batches.
4. Calculates wins and win rates for each model to determine their performance relative to others.
5. Updates the weights of each model based on their performance and applies a softmax normalization.
6. Implements a blacklist mechanism to remove underperforming models from the evaluation set.
7. Logs all relevant data for the step, including model IDs, pages, batches, wins, win rates, and losses.
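
Steps 4 and 5 above can be illustrated with a minimal sketch. It assumes each model has one loss per evaluation batch, counts a win whenever a model's loss on a batch is strictly lower than another model's loss on the same batch, and turns win rates into weights with a temperature-scaled softmax. The function names, the example UIDs and losses, the tie handling, and the temperature value are all hypothetical; the validator in the repository may compute wins and weights differently.

```python
import math
from typing import Dict, List


def compute_win_rates(losses_per_uid: Dict[int, List[float]]) -> Dict[int, float]:
    """Return each model's head-to-head win rate across all other models and batches."""
    uids = list(losses_per_uid)
    wins = {uid: 0 for uid in uids}
    total = {uid: 0 for uid in uids}
    for uid_i in uids:
        for uid_j in uids:
            if uid_i == uid_j:
                continue
            # Compare the two models batch by batch (assumption: ties are not wins).
            for loss_i, loss_j in zip(losses_per_uid[uid_i], losses_per_uid[uid_j]):
                total[uid_i] += 1
                if loss_i < loss_j:
                    wins[uid_i] += 1
    return {uid: (wins[uid] / total[uid] if total[uid] else 0.0) for uid in uids}


def softmax_weights(win_rates: Dict[int, float], temperature: float = 0.1) -> Dict[int, float]:
    """Normalize win rates into weights that sum to 1 (temperature is an assumed value)."""
    uids = list(win_rates)
    scaled = [win_rates[uid] / temperature for uid in uids]
    peak = max(scaled)  # Subtract the max for numerical stability.
    exps = [math.exp(value - peak) for value in scaled]
    norm = sum(exps)
    return {uid: value / norm for uid, value in zip(uids, exps)}


# Hypothetical per-batch losses for three miner UIDs.
example_losses = {
    11: [2.10, 2.30, 2.05],
    42: [2.20, 2.25, 2.15],
    97: [2.50, 2.60, 2.40],
}
win_rates = compute_win_rates(example_losses)
weights = softmax_weights(win_rates)
print(win_rates)
print(weights)
```

A lower temperature concentrates more of the weight on the model with the highest win rate, while a higher temperature spreads weight more evenly.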


In [None]:
# Generate random pages for evaluation and prepare batches for each page from the dataset.
import constants
import random
import pretrain as pt

pages = [
    random.randint(1, pt.dataset.SubsetFalconLoader.max_pages)
    for _ in range(constants.n_eval_pages)
]

tokenizer = pt.model.get_tokenizer()
loader = pt.dataset.SubsetFalconLoader(
    batch_size=constants.batch_size,
    sequence_length=constants.SEQUENCE_LENGTH_2,
    pages=pages,
    tokenizer=tokenizer,
)
batches = list(loader)

In [None]:
# Compute each model's losses on the evaluation batches, which determine its score.

# The validator will load each model in the current evaluation one by one.
# This notebook uses the previously loaded best_model.

losses = pt.validation.compute_losses(best_model, batches, "cuda", tokenizer.eos_token)
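
To make the resulting losses easier to read, they can be reduced to a mean loss and a perplexity. The sketch below assumes `losses` is a list of per-batch cross-entropy values in nats and that a failed batch is recorded as infinity; both are assumptions about the return value, not something the notebook confirms. `summarize_losses` is a hypothetical helper and is not part of the `pretrain` package.

```python
import math
from typing import Dict, List


def summarize_losses(losses: List[float]) -> Dict[str, float]:
    """Reduce per-batch losses to a mean loss and perplexity, skipping non-finite values."""
    finite = [value for value in losses if math.isfinite(value)]
    if not finite:
        # No usable batches: report the worst possible summary.
        return {"mean_loss": math.inf, "perplexity": math.inf, "n_batches": 0}
    mean_loss = sum(finite) / len(finite)
    return {
        "mean_loss": mean_loss,
        # Perplexity is exp(mean cross-entropy) when the loss is measured in nats.
        "perplexity": math.exp(mean_loss),
        "n_batches": len(finite),
    }


# Hypothetical losses; in the notebook you would pass the `losses` list from above.
summary = summarize_losses([2.0, 2.5, math.inf, 3.0])
print(summary)
```

Skipping non-finite values keeps one failed batch from dominating the summary, but it also hides failures, so `n_batches` is reported alongside the averages.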