v2.7.2 #363

dbobrenko · 2024-09-03T13:46:33Z

Patch v2.7.2 changes

Add signature to W&B logs to ensure authenticity of logs.
Increase Multiple Choice task rate from 0.05 to 0.2.
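
For reviewers who want to see what the rate change means in practice, here is a minimal sketch of probability-weighted task sampling. The task names come from the v2.7.0 notes; every probability other than the 0.05 and 0.2 multi-choice values is a made-up placeholder, and `with_task_rate` and `sample_task` are hypothetical helpers, not functions from this repository. Rescaling the other tasks proportionally is an assumption of the sketch.

```python
import random


def with_task_rate(task_probabilities, name, rate):
    """Set one task's probability to `rate` and rescale the rest so all sum to 1.

    Hypothetical helper: proportional rescaling is an assumption of this sketch.
    """
    others_total = sum(p for n, p in task_probabilities.items() if n != name)
    scale = (1.0 - rate) / others_total
    return {
        n: (rate if n == name else p * scale)
        for n, p in task_probabilities.items()
    }


def sample_task(task_probabilities, rng=random):
    """Draw one task name with probability proportional to its weight."""
    names = list(task_probabilities)
    weights = [task_probabilities[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Placeholder probabilities; only the MultiChoice values appear in this PR.
before = {"QA": 0.5, "DateQA": 0.25, "Summary": 0.2, "MultiChoice": 0.05}
after = with_task_rate(before, "MultiChoice", 0.2)
```

With the updated table, roughly one in five sampled tasks is multi-choice instead of one in twenty.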


* v2.5.2 (#287)
* Point to macrocosmos entity
* Adjust project name
* Increase spec version to 2.5.2
* correct image switching based on system color scheme refs #302

Co-authored-by: bkb2135 <98138173+bkb2135@users.noreply.github.com>


Organic Scoring Implementation

Changes:
- This implementation is based on the Generic Organic Scoring framework introduced [here](macrocosm-os/organic-scoring#1).
- Organic scoring runs in a separate `asyncio` task alongside current benchmarking tasks.
- Organic queries are received via an open validator axon and stored in the organic queue.
- For each organic or synthetic query, a reference answer is generated by the LLM.
- Rewards and penalties are calculated based on the `relevance` metric for both organic and synthetic queries, which is defined as the cosine similarity between sentence embeddings of the reference and completions.
- Augmented LMSys-Chat is used for synthetic queries.
- Logging includes elapsed time between steps inside the organic loop, organic queue length, and other default logs used by benchmarking tasks, except prompts and completions, which are excluded from logging into W&B.
- Validator queries 5 random miners from the network to stream back completions for organic queries (defined in config as `neuron.organic_sample_size`).
- Reward step for organic or synthetic queue is triggered every 15 seconds and scaled down to 2 seconds if the organic queue is growing (defined in config as `neuron.organic_trigger`, `neuron.organic_trigger_frequency`, and `neuron.organic_trigger_frequency_min`).

Process Workflow:
1. **Trigger Check**: Upon triggering the rewarding process, the system checks if the organic queue is empty. If the queue is empty, synthetic datasets (defined in `organic_scoring/synth_dataset_base.py`) are used to bootstrap the organic scoring mechanism. Otherwise, samples from the organic queue are utilized.
2. **Data Processing**: The sampled data is concurrently passed to the `_query_miners` and `_generate_reference` methods.
3. **Reward Generation**: After receiving responses from miners and any reference data, the information is processed by the `_generate_rewards` method.
4. **Weight Setting**: The generated rewards are then applied through the `_set_weights` method.
5. **Logging**: Finally, the results can be logged using the `_log_results` method, along with all relevant data provided as arguments, and default time elapsed on each step of rewarding process.
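
The first three workflow steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `embed` is a toy letter-frequency stand-in for a real sentence-embedding model, the queue is a plain list, and `query_miners`, `generate_reference`, `generate_rewards`, and `reward_step` are hypothetical functions that only mirror the roles of `_query_miners`, `_generate_reference`, and `_generate_rewards`. Weight setting and logging are omitted.

```python
import asyncio
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed(text):
    """Toy stand-in for a sentence-embedding model: a letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


async def query_miners(sample, uids):
    """Placeholder for the miner query: one fake completion per miner UID."""
    await asyncio.sleep(0)
    return {uid: f"{sample} answer from miner {uid}" for uid in uids}


async def generate_reference(sample):
    """Placeholder for the LLM-generated reference answer."""
    await asyncio.sleep(0)
    return f"{sample} answer"


def generate_rewards(reference, completions):
    """Relevance reward: cosine similarity of each completion to the reference."""
    ref_vec = embed(reference)
    return {uid: cosine_similarity(ref_vec, embed(text)) for uid, text in completions.items()}


async def reward_step(organic_queue, synthetic_samples, all_uids, sample_size=5):
    # 1. Trigger check: fall back to synthetic data when the organic queue is empty.
    sample = organic_queue.pop(0) if organic_queue else random.choice(synthetic_samples)
    uids = random.sample(all_uids, sample_size)
    # 2. Data processing: query miners and build the reference concurrently.
    completions, reference = await asyncio.gather(
        query_miners(sample, uids), generate_reference(sample)
    )
    # 3. Reward generation from the reference and the miner completions.
    return sample, generate_rewards(reference, completions)
```

Calling `asyncio.run(reward_step(queue, synthetic, uids))` returns the sample that was used and a reward per queried UID. Running the two coroutines through `asyncio.gather` is what keeps reference generation from adding to the miner query latency.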


Restart when an error is encountered in the get_block function. Errors when making substrate calls usually result in the validator failing quietly, often requiring a manual restart. This PR is intended to catch errors originating from calls to the Bittensor package, raise them as BittensorError, and then restart.
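
The catch, re-raise, and restart pattern described above might look roughly like this. It is a sketch under assumptions: only the `BittensorError` name comes from the description, `fetch_block` is a hypothetical callable standing in for the real substrate call, and the restart is modeled as a bounded retry loop rather than a process restart.

```python
class BittensorError(Exception):
    """Raised when a call into the Bittensor package fails."""


def get_block(fetch_block):
    """Call `fetch_block` and surface any failure as BittensorError."""
    try:
        return fetch_block()
    except Exception as exc:
        # Convert the quiet failure into one explicit, catchable error type.
        raise BittensorError(f"get_block failed: {exc!r}") from exc


def run_with_restart(fetch_block, max_restarts=3):
    """Return (block, restarts); re-raise once `max_restarts` is exhausted."""
    restarts = 0
    while True:
        try:
            return get_block(fetch_block), restarts
        except BittensorError:
            if restarts >= max_restarts:
                raise
            restarts += 1
```

Wrapping the call in one place means the supervising loop only has to handle a single exception type, whatever the underlying substrate failure was.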


Hollyqui

LGTM

cassova

lgtm

bkb2135 and others added 30 commits July 12, 2024 10:28

Point to macrocosmos entity

c4988be

Adjust project name

bdad04c

Increase spec version to 2.5.2

b914060

Merge pull request #286 from macrocosm-os/hot-fix/redirect-wandb

5f5f9d3

Hot fix/redirect wandb

v2.5.2 (#287) (#293)

75546a6

* Point to macrocosmos entity
* Adjust project name
* Increase spec version to 2.5.2

Co-authored-by: bkb2135 <98138173+bkb2135@users.noreply.github.com>

v2.5.2 (#287) (#298)

2c35a27

* Point to macrocosmos entity
* Adjust project name
* Increase spec version to 2.5.2

Raised an exception after trying 3 times to generate the challenge qu…

bfdb5fb

…ery (#304) * Retry creation of challenge query multiple times

Hotfix undefined challenge_time (#308)

3142088

* Hotfix undefined HumanAgent.challenge_time
* Set begin_conversation to True for organics

Raise version to 2.6.0 (#310)

02dc70f

Remove Reliance on Wikipedia Sections (#306)

b643f10

Fix Wikipedia broken sections

Resolve merge conflicts with main branch (#311)

142c19b

Resolve merge conflict with main (#312)

8abe06c

Merge branch 'main' into staging

6346a72

Fix Dataset Creation (#314)

2407921

Fix Wiki IndexError and uninstall uvloop (#315)

592b345

Enable organic weight set, fix WASM errors (#324)

3582d67

- Enable organic scoring weight setting.
- Fix bittensor WASM errors by switching to another bittensor branch.

Fix Mock model and pipeline (#326)

3a1efb0

Fix 'MockPipeline' object has no attribute 'generate' errors when using --neuron.model_id mock.

Fix wikipedia summary and date (#325)

2914af4

Fix AttributeError (no attribute ‘isdigit’) for Wikipedia Summary and Date.

Increase version to 2.6.1 (#327)

b63beed

Merge branch 'main' into staging

42e4974

Remove code duplicate (#330)

2851ebf

Add miner UIDs to wandb

42f4ba1

Add miner UIDs wandb logging

a613d7b

Merge pull request #331 from macrocosm-os/feature/SN1-173-log-organic…

c0d8841

…-uids Log organic miner UIDs to wandb

Merge pull request #332 from macrocosm-os/main

f8eddc9

Merge main into staging (#329)

Hotfix: add content check for summary (#333)

ecb3378

Staging v2.7.0 (#352)

88eca33

Changes:
- New multi-choice benchmarking task;
- Refactor changes (.env config-based, decoupled parts of the code);
- Poetry setup;
- Only 5 tasks are included: QA, DateQA, Summary, MultiChoice, Organic.

dbobrenko and others added 4 commits August 28, 2024 01:24

Merge main into staging branch (#354)

5459ddf

Merge branch 'main' into staging

0e91e8d

Add Signature of Hotkey to WANDB run (#360)

e69fc1d

Add hotkey signature to the wandb run for multi-choice verification

v2.7.2 (#362)

4a5cd1a

Changes:
- Bump v2.7.2.
- Raise multi-choice probability from 0.05 to 0.2.

dbobrenko self-assigned this Sep 3, 2024

Hollyqui approved these changes Sep 3, 2024


bkb2135 approved these changes Sep 3, 2024


cassova approved these changes Sep 3, 2024


dbobrenko merged commit 91f1fc0 into main Sep 3, 2024


7 participants
