Merged
76 commits
0dd9cb2
Docker containerization for ReproducibleVLLM Validator
chrisu-inigra May 9, 2025
423fd30
Use build-context
chrisu-inigra May 12, 2025
09a73c9
Container building script for sn1-validator-api
chrisu-inigra May 13, 2025
797968d
Fixes for formatters and linters
chrisu-inigra May 27, 2025
1beb8a0
Add reliable changes
dbobrenko May 27, 2025
05ba731
Add model revert
dbobrenko May 27, 2025
76de801
Reduce number of workers
dbobrenko May 27, 2025
436f558
Increase queue size
dbobrenko May 27, 2025
9301505
Add chain module to api
dbobrenko May 27, 2025
ef3a2ed
Fix settings
dbobrenko May 27, 2025
8a046f6
Remove tracker from chat completion
dbobrenko May 27, 2025
2914352
Add delayed startup for calibration
dbobrenko May 27, 2025
60a8f59
Fixes from pre-commit hooks
chrisu-inigra May 28, 2025
b2c1d09
Move to sqlite with lock on one worker
dbobrenko May 28, 2025
06660fa
Fix worker sync
dbobrenko May 28, 2025
fa30176
Add calibration and DR improvements
dbobrenko May 28, 2025
f5e3ea5
Fix uid tracker
dbobrenko May 28, 2025
cad6ac6
Don't Generate Validator MSR Every Time
richwardle May 28, 2025
31388a2
Merge pull request #720 from backend-developers-ltd/SN1-Validator-Con…
bkb2135 May 29, 2025
62eb95c
Add calibration fixes
dbobrenko May 30, 2025
758f89e
Fix short response stream
dbobrenko Jun 1, 2025
793f42c
Fix weight sync on testnet
dbobrenko Jun 2, 2025
0d8b366
Add Wandb Log For Organics
richwardle Jun 2, 2025
bb2ec95
WIP
dbobrenko Jun 5, 2025
6a2fa54
Add extra uids query to API
dbobrenko Jun 6, 2025
344ca8c
Depreciate Target Uids
richwardle Jun 6, 2025
cc4f6d5
Remove overlapping linter (black + isort)
richwardle Jun 6, 2025
5f5137b
Black reformatting
richwardle Jun 6, 2025
d72130c
CI to just use Ruff Checks
richwardle Jun 6, 2025
28483dd
Add primary stream fallbacks
dbobrenko Jun 7, 2025
7686790
Clean up code
dbobrenko Jun 7, 2025
1ba5394
Reduce fallback conditions
dbobrenko Jun 7, 2025
faf0a9d
Speed up calibration
dbobrenko Jun 7, 2025
3ae2f92
Tune calibration params
dbobrenko Jun 7, 2025
0f2583f
Reduce fallback reqs
dbobrenko Jun 8, 2025
a92a21b
Increase logits penalty
dbobrenko Jun 8, 2025
a89cb76
Refactor web retrieval endpoint, enhance openai compatibility
dbobrenko Jun 9, 2025
bf6116d
Revert task creation interval
dbobrenko Jun 9, 2025
bd83cf7
Revert debug code parts
dbobrenko Jun 9, 2025
8645761
Run precommit hook
dbobrenko Jun 9, 2025
1342337
Remove commented unused code
dbobrenko Jun 9, 2025
abb5df7
Run pre-commit
dbobrenko Jun 9, 2025
4d3a776
Merge pull request #749 from macrocosm-os/fix/testnet
dbobrenko Jun 10, 2025
f34507c
Merge pull request #752 from macrocosm-os/feature/SN1-493
dbobrenko Jun 10, 2025
79d6cac
Merge branch 'staging' into SN1-508-remove-possibility-to-query-exact…
richwardle Jun 10, 2025
295634f
Deprecation In Docs
richwardle Jun 10, 2025
6ed2299
Precommit
richwardle Jun 10, 2025
a3b57cf
Gather UIDs by coldkeys
dbobrenko Jun 10, 2025
e19bf43
Comment optional request params
dbobrenko Jun 10, 2025
2f72f49
Log all organic query rewards and timings to wandb
richwardle Jun 10, 2025
8a883cb
Fixed test failures in tests/validator_api/test_chain.py by adding mi…
richwardle Jun 10, 2025
4cdb380
Fix wrong chunk assingment for primary stream
dbobrenko Jun 11, 2025
edf79f2
Always add 1 reliable uid
dbobrenko Jun 11, 2025
5b73432
Reduce extra uids, remove top incentive for primary stream
dbobrenko Jun 11, 2025
6ec2147
Improve logging
dbobrenko Jun 11, 2025
34c1dc0
Pair with organic checks
richwardle Jun 11, 2025
5e6125b
Merge pull request #751 from macrocosm-os/SN1-509-run-pre-commit-auto…
bkb2135 Jun 11, 2025
0832ed8
Merge pull request #750 from macrocosm-os/SN1-508-remove-possibility-…
bkb2135 Jun 11, 2025
f48f62d
Merge pull request #745 from macrocosm-os/SN1-502-remove-validator-ge…
bkb2135 Jun 11, 2025
86273b1
Fix primary stream fallback
dbobrenko Jun 13, 2025
15853cc
Fix rate logging
dbobrenko Jun 14, 2025
59beb6a
Fix metrics increment
dbobrenko Jun 14, 2025
c55d4c0
Remove model None
dbobrenko Jun 16, 2025
45be966
Run pre-commit
dbobrenko Jun 16, 2025
d2d70b7
Increase penalty
dbobrenko Jun 16, 2025
23bc988
Clean up settings and code
dbobrenko Jun 16, 2025
8503538
Add typing hints
dbobrenko Jun 16, 2025
c284178
More hints
dbobrenko Jun 16, 2025
c765dd8
Fix code style
dbobrenko Jun 16, 2025
2ae968b
Merge pull request #753 from macrocosm-os/feature/SN1-512
dbobrenko Jun 16, 2025
69285db
Merge branch 'staging' into SN1-507-log-organics-but-only-the-scoring…
dbobrenko Jun 16, 2025
ead1753
Add uids to organic log
dbobrenko Jun 16, 2025
1684f84
Code clean up
dbobrenko Jun 16, 2025
641ed5b
Merge pull request #754 from macrocosm-os/SN1-507-log-organics-but-on…
dbobrenko Jun 16, 2025
4c96631
Bump v2.19.5
dbobrenko Jun 16, 2025
9855450
Merge pull request #757 from macrocosm-os/release/v2.19.5
dbobrenko Jun 16, 2025
12 changes: 4 additions & 8 deletions .github/workflows/python-package.yml
@@ -41,18 +41,14 @@ jobs:
         poetry run pip list

     # Style/format checks.
-    - name: Run Black (code formatter)
-      run: |
-        poetry run black --check --diff .
-
-    - name: Run isort (import sorting)
-      run: |
-        poetry run isort --check-only --diff --profile black .
-
     - name: Run Ruff (linter)
       run: |
         poetry run ruff check --diff .

+    - name: Run Ruff (formatter)
+      run: |
+        poetry run ruff format --check --diff .
+
     - name: Test with pytest
       run: |
         # run tests in tests/ dir and only fail if there are failures or errors
16 changes: 1 addition & 15 deletions .pre-commit-config.yaml
@@ -20,18 +20,4 @@ repos:
     hooks:
       - id: ruff
         args: [--fix]
-
-  - repo: https://github.com/psf/black
-    rev: 23.7.0
-    hooks:
-      - id: black
-        name: black (code formatter)
-        language_version: python3.10
-        additional_dependencies: ["black[jupyter]"]
-
-  - repo: https://github.com/pycqa/isort
-    rev: 5.13.2
-    hooks:
-      - id: isort
-        name: isort (import sorting)
-        args: ["--profile", "black"]
+      - id: ruff-format
24 changes: 24 additions & 0 deletions containerized_job/Dockerfile
@@ -0,0 +1,24 @@
FROM python:3.10-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    git build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY download_model.py .

ARG LLM_MODEL
ENV MODEL_PATH=./downloaded_model

RUN python download_model.py --model-name "$LLM_MODEL" --model-path "$MODEL_PATH"

COPY . .
COPY --from=external_context /vllm_llm.py .

EXPOSE 8000

CMD ["python", "app.py"]
50 changes: 50 additions & 0 deletions containerized_job/app.py
@@ -0,0 +1,50 @@
import os

import uvicorn
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from schema import ChatRequest, LogitsRequest
from vllm_llm import ReproducibleVLLM

MODEL_PATH = os.getenv("MODEL_PATH")


class ReproducibleVllmApp:
    def __init__(self):
        self.llm = ReproducibleVLLM(model_id=MODEL_PATH)
        self.app = FastAPI()
        self.app.post("/generate")(self.generate)
        self.app.post("/generate_logits")(self.generate_logits)

    async def generate(self, request: ChatRequest):
        try:
            result = await self.llm.generate(
                messages=[m.dict() for m in request.messages],
                sampling_params=request.sampling_parameters.dict(),
                seed=request.seed,
                continue_last_message=request.continue_last_message,
            )
            return {"result": result}
        except Exception as e:
            return JSONResponse(status_code=500, content={"error": str(e)})

    async def generate_logits(self, request: LogitsRequest):
        try:
            logits, prompt = await self.llm.generate_logits(
                messages=[m.dict() for m in request.messages],
                top_logprobs=request.top_logprobs,
                sampling_params=request.sampling_parameters.dict(),
                seed=request.seed,
                continue_last_message=request.continue_last_message,
            )
            return {"logits": logits, "prompt": prompt}
        except Exception as e:
            return JSONResponse(status_code=500, content={"error": str(e)})

    def run(self):
        uvicorn.run(self.app, host="0.0.0.0", port=8000)


if __name__ == "__main__":
    server = ReproducibleVllmApp()
    server.run()
10 changes: 10 additions & 0 deletions containerized_job/build.sh
@@ -0,0 +1,10 @@
#!/bin/bash

IMAGE_NAME="sn1-validator-api"
MODEL_NAME="mrfakename/mistral-small-3.1-24b-instruct-2503-hf"

DOCKER_BUILDKIT=1 docker build \
    --build-arg LLM_MODEL="$MODEL_NAME" \
    -t "$IMAGE_NAME" \
    --build-context external_context=../prompting/llms \
    .
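
Review note: the `--build-context external_context=../prompting/llms` flag is what lets the Dockerfile's `COPY --from=external_context /vllm_llm.py .` reach a file outside `containerized_job/`. As a rough illustration only, the same command can be assembled as an argument list; the helper name `docker_build_command` is hypothetical and not part of this PR, and nothing here invokes Docker.

```python
import shlex
from typing import List


def docker_build_command(
    image_name: str,
    model_name: str,
    context_name: str,
    context_path: str,
    build_dir: str = ".",
) -> List[str]:
    """Assemble the argv that build.sh passes to `docker build` (hypothetical helper)."""
    return [
        "docker",
        "build",
        "--build-arg",
        f"LLM_MODEL={model_name}",
        "-t",
        image_name,
        # Named context: referenced in the Dockerfile as `COPY --from=<context_name> ...`.
        "--build-context",
        f"{context_name}={context_path}",
        build_dir,
    ]


command = docker_build_command(
    image_name="sn1-validator-api",
    model_name="mrfakename/mistral-small-3.1-24b-instruct-2503-hf",
    context_name="external_context",
    context_path="../prompting/llms",
)
print(shlex.join(command))
```

The name on the left of `=` has to match the name used in `COPY --from=...`, otherwise the build cannot resolve the source.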
24 changes: 24 additions & 0 deletions containerized_job/download_model.py
@@ -0,0 +1,24 @@
import argparse

from huggingface_hub import snapshot_download

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Download model files")
    parser.add_argument(
        "--model-name",
        type=str,
        help="Model name to use",
    )
    parser.add_argument(
        "--model-path",
        type=str,
        help="Path to save the model files",
    )

    args = parser.parse_args()

    print(f"Downloading Model {args.model_name}, files downloaded to {args.model_path}")

    snapshot_download(repo_id=args.model_name, local_dir=args.model_path)

    print(f"Model files downloaded to {args.model_path}")
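
Review note: as written, both options are optional, so a missing flag yields `None` and an unset `LLM_MODEL` build argument yields an empty string, and either would only fail later inside `snapshot_download`. A minimal sketch of stricter parsing follows; `non_empty` and `parse_args` are hypothetical names, not code from this PR.

```python
import argparse
from typing import Optional, Sequence


def non_empty(value: str) -> str:
    """argparse type that rejects empty or whitespace-only values."""
    if not value.strip():
        raise argparse.ArgumentTypeError("value must not be empty")
    return value


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Download model files")
    # required=True makes argparse exit with a usage error when a flag is missing.
    parser.add_argument("--model-name", type=non_empty, required=True, help="Model name to use")
    parser.add_argument("--model-path", type=non_empty, required=True, help="Path to save the model files")
    return parser.parse_args(argv)


args = parse_args(["--model-name", "org/model", "--model-path", "./downloaded_model"])
print(args.model_name, args.model_path)
```

With this shape, a build that forgets `--build-arg LLM_MODEL=...` would stop at argument parsing instead of partway through a download attempt.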
8 changes: 8 additions & 0 deletions containerized_job/requirements.txt
@@ -0,0 +1,8 @@
fastapi==0.115.0
uvicorn==0.23.2
pydantic==2.9.0
vllm==0.8.3
torch==2.6.0
numpy==1.26.4
loguru==0.7.2
huggingface-hub==0.30.0
29 changes: 29 additions & 0 deletions containerized_job/schema.py
@@ -0,0 +1,29 @@
from typing import List, Literal, Optional

from pydantic import BaseModel


class ChatMessage(BaseModel):
    content: str
    role: Literal["user", "assistant", "system"]


class SamplingParameters(BaseModel):
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    max_tokens: Optional[int] = 512
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    top_k: Optional[int] = -1
    logprobs: Optional[int] = None


class ChatRequest(BaseModel):
    messages: List[ChatMessage]
    seed: Optional[int]
    sampling_parameters: Optional[SamplingParameters] = SamplingParameters()
    continue_last_message: Optional[bool] = False


class LogitsRequest(ChatRequest):
    top_logprobs: Optional[int] = 10
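
Review note: a request body matching `ChatRequest` could be built roughly as below. This is a sketch under assumptions: `BASE_URL`, `build_generate_payload` and `post_generate` are hypothetical names, and the host is a guess (the diff only shows the app listening on port 8000). Note that under Pydantic v2 a field annotated `Optional[int]` with no default is still required, so the sketch always sends `seed`, even when it is `null`.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

# Hypothetical address; adjust to wherever the container is published.
BASE_URL = "http://localhost:8000"


def build_generate_payload(
    messages: List[Dict[str, str]],
    seed: Optional[int] = None,
    max_tokens: int = 512,
    temperature: float = 1.0,
    continue_last_message: bool = False,
) -> Dict[str, Any]:
    """Build a JSON body shaped like ChatRequest in schema.py."""
    allowed_roles = {"user", "assistant", "system"}
    for message in messages:
        if message.get("role") not in allowed_roles:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    return {
        "messages": messages,
        # Always present: the schema declares `seed` without a default value.
        "seed": seed,
        "sampling_parameters": {"temperature": temperature, "max_tokens": max_tokens},
        "continue_last_message": continue_last_message,
    }


def post_generate(payload: Dict[str, Any], base_url: str = BASE_URL) -> Dict[str, Any]:
    """POST the payload to /generate and decode the JSON response."""
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_generate_payload([{"role": "user", "content": "Hello"}], seed=42)
print(json.dumps(payload, indent=2))
```

Because the handlers return HTTP 500 with an `error` field on any exception, a caller of `post_generate` should expect `urllib.error.HTTPError` in that case.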
8 changes: 4 additions & 4 deletions docs/API_docs.md
@@ -87,11 +87,11 @@ bash run_api.sh

 **Endpoint:** `POST /miner_availabilities/miner_availabilities`

-**Description:** Fetches miner availabilities based on provided UIDs.
+**Description:** Fetches miner availabilities based on provided UIDs. **Note: Specifying UIDs is deprecated.**

 **Request Body:**

-- JSON array of integers or null (optional).
+- JSON array of integers or null (optional, deprecated).

 ---

@@ -169,13 +169,13 @@ Web Retrieval

 **Endpoint:** `GET /web_retrieval`

-**Description:** Retrieves a list websites about a search query
+**Description:** Retrieves a list websites about a search query. **Note: The `uids` parameter is deprecated.**

 **Parameters:**

 - **search_query** (str): The search term you'd like to look up
 - **n_miners** (int, optional): How many miners to query
-- **uids**: (list[int], optional): which specific uids to query (Deprecated)
+- **uids**: (list[int], optional, deprecated): which specific uids to query

 ---

2 changes: 1 addition & 1 deletion prompting/api/weight_syncing/api.py
@@ -22,7 +22,7 @@ async def verify_weight_signature(request: Request):
         raise HTTPException(status_code=400, detail="Bad Request, message is not intended for self")
     validator_hotkeys = [shared_settings.METAGRAPH.hotkeys[uid] for uid in WHITELISTED_VALIDATORS_UIDS]
     if signed_by not in validator_hotkeys:
-        logger.error("Signer not the expected ss58 address")
+        logger.error(f"Signer not the expected ss58 address: {signed_by}")
         raise HTTPException(status_code=401, detail="Signer not the expected ss58 address")
     now = time.time()
     body = await request.body()
4 changes: 1 addition & 3 deletions prompting/datasets/sn13.py
@@ -10,10 +10,8 @@
 class SN13Dataset(BaseDataset):
     _url: ClassVar[str] = "arrmlet/x_dataset_218"
     name: ClassVar[str] = "x_dataset_218"
-    _chance_word_synonym: ClassVar[float] = 0.10
-    _chance_char_typo: ClassVar[float] = 0.02
     exception: Exception | None = None
-    dataset: datasets.Dataset = None
+    dataset: datasets.Dataset | None = None

     class Config:
         arbitrary_types_allowed = True
2 changes: 1 addition & 1 deletion prompting/llms/utils.py
@@ -117,7 +117,7 @@ def gpu_utilization(cls):
         return cls.used_memory / cls.total_memory


-TEXT_MODELS = ["mrfakename/mistral-small-3.1-24b-instruct-2503-hf"]
+TEXT_MODELS: set[str | None] = set([None, "mrfakename/mistral-small-3.1-24b-instruct-2503-hf"])


 def model_factory(model_name: str) -> type[ReproducibleHF]:
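
Review note: turning `TEXT_MODELS` into a set that contains `None` means a membership check now accepts requests that name no model. The sketch below assumes (this is not confirmed by the diff) that callers treat `None` as "use the default model"; `is_supported_model` is a hypothetical helper.

```python
from typing import Optional, Set

TEXT_MODELS: Set[Optional[str]] = {None, "mrfakename/mistral-small-3.1-24b-instruct-2503-hf"}


def is_supported_model(model: Optional[str]) -> bool:
    """Return True when the requested model is allowed (None is assumed to mean 'default')."""
    return model in TEXT_MODELS


print(is_supported_model(None))
print(is_supported_model("mrfakename/mistral-small-3.1-24b-instruct-2503-hf"))
print(is_supported_model("some/other-model"))
```

A set also makes the check independent of ordering, which the previous list did not guarantee callers would rely on correctly.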
7 changes: 5 additions & 2 deletions prompting/rewards/exact_match.py
@@ -18,12 +18,14 @@
 TOP_LOGPROBS = 10
 MIN_VERIFY_TOKENS = 10
 MAX_VERIFY_TOKENS = 51
-PARTIAL_PENALTY = -1.0
+# Partial completion is much more harmful from API perspective, compared to no response.
+# TODO: Experimental aggressive value, revisit once the network is clean.
+PARTIAL_PENALTY = -100.0
 INCORRECT_PENALTY = -2.0
 NOT_ENOUGH_TOKENS_PENALTY_SCALE = 0.1
 MIN_SMOOTH_PENALTY_SCALE = 0.3
 MIN_TIME_PENALTY_SCALE = 0.3
-VERIFICATION_THRESH_CONTAINS = 0.92
+VERIFICATION_THRESH_CONTAINS = 0.90
 VERIFICATION_THRESH_SIM = 0.83
 VERIFICATION_SIM_EXP_SCALE = 2.0

@@ -108,6 +110,7 @@ async def reward( # noqa: C901
         to_complete = "".join(chunks[:check_idx])
         if to_complete:
             messages.extend([{"role": "assistant", "content": to_complete}])
+
         verification_logits, _ = await model_manager.generate_logits(
             model=task.llm_model_id,
             messages=messages,
78 changes: 48 additions & 30 deletions prompting/rewards/scoring.py
@@ -108,44 +108,62 @@ async def run_step(self) -> RewardLoggingEvent:
model_manager=self.model_scheduler.llm_model_manager,
task_queue=self.task_queue,
)
if scoring_config.task.organic:
logger.debug(f"Reward events size: {len(reward_events)}")

self.reward_events.append(reward_events)

logger.debug(
f"Scored {scoring_config.task.__class__.__name__} {scoring_config.task.task_id} with model "
f"{scoring_config.task.llm_model_id}"
)
if not scoring_config.task.organic:
# Reduce log size for raw chunks, wandb fails to log any data when overloaded.
response = copy.deepcopy(scoring_config.response)
response.stream_results_all_chunk_dicts_raw = []
for idx in range(len(response.stream_results)):
response.stream_results[idx].accumulated_chunk_dicts_raw = []

if isinstance(scoring_config.task, MSRv2Task):
if scoring_config.task.ground_truth is not None:
reference_value = str(scoring_config.task.ground_truth) # "0" or "1"
else:
reference_value = None

# Reduce log size for raw chunks, wandb fails to log any data when overloaded.
response = copy.deepcopy(scoring_config.response)
response.stream_results_all_chunk_dicts_raw = []
for idx in range(len(response.stream_results)):
response.stream_results[idx].accumulated_chunk_dicts_raw = []

if isinstance(scoring_config.task, MSRv2Task):
if scoring_config.task.ground_truth is not None:
reference_value = str(scoring_config.task.ground_truth) # "0" or "1"
else:
reference_value = scoring_config.task.reference

log_event(
RewardLoggingEvent(
response_event=response,
reward_events=reward_events,
reference=reference_value,
challenge=scoring_config.task.query,
task=scoring_config.task.name,
block=scoring_config.block,
step=scoring_config.step,
task_id=scoring_config.task_id,
task_dict=scoring_config.task.model_dump(),
source=scoring_config.dataset_entry.source,
)
reference_value = None
else:
reference_value = scoring_config.task.reference

if scoring_config.task.organic:
response.stream_results = []
response.axons = []
response.completions = []
response.stream_results_all_chunks = []
response.stream_results_all_tokens_per_chunk = []
reward_events = copy.deepcopy(reward_events)
for event in reward_events:
event.task = event.task.__class__()

reference = None
challenge = ""
task_dict = {}
source = "organic"
else:
reference = reference_value
challenge = scoring_config.task.query
task_dict = scoring_config.task.model_dump()
source = scoring_config.dataset_entry.source

log_event(
RewardLoggingEvent(
response_event=response,
reward_events=reward_events,
reference=reference,
challenge=challenge,
task=scoring_config.task.name,
block=scoring_config.block,
step=scoring_config.step,
task_id=scoring_config.task_id,
task_dict=task_dict,
source=source,
)

)
self.model_scheduler.llm_model_manager.lock.release()
await asyncio.sleep(0.01)
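
Review note: the hunk above appears to log organic tasks as well, but with user-facing content (completions, axons, stream results, query, task dict) blanked before the event is sent. A simplified sketch of that redaction follows; `ResponseSketch` and `build_log_fields` are hypothetical stand-ins that borrow field names from the diff, not the project's real classes.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResponseSketch:
    """Hypothetical stand-in for the response event; field names follow the diff."""

    completions: list = field(default_factory=list)
    axons: list = field(default_factory=list)
    stream_results: list = field(default_factory=list)


def build_log_fields(
    response: ResponseSketch,
    organic: bool,
    query: str,
    reference: Optional[str],
    task_dict: dict,
    source: str,
) -> dict:
    """Return the fields to log, stripping user content when the task is organic."""
    # Work on a copy so the original response stays intact for the caller.
    safe = copy.deepcopy(response)
    if organic:
        safe.completions = []
        safe.axons = []
        safe.stream_results = []
        return {
            "response_event": safe,
            "reference": None,
            "challenge": "",
            "task_dict": {},
            "source": "organic",
        }
    return {
        "response_event": safe,
        "reference": reference,
        "challenge": query,
        "task_dict": task_dict,
        "source": source,
    }


original = ResponseSketch(completions=["secret answer"], axons=["axon-1"], stream_results=["chunk"])
organic_fields = build_log_fields(original, True, "user question", "ref", {"k": "v"}, "dataset")
print(organic_fields["source"], organic_fields["response_event"].completions)
```

Copying before blanking matters here: the scoring path may still need the untouched response after the log event is emitted.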
