@dbobrenko dbobrenko commented Jun 14, 2024

Organic Scoring Implementation

Changes

  • This implementation is based on the Generic Organic Scoring framework introduced here.
  • Organic scoring runs in a separate asyncio task alongside current benchmarking tasks.
  • Organic queries are received via an open validator axon and stored in the organic queue.
  • For each organic or synthetic query, a reference answer is generated by the LLM.
  • Rewards and penalties for both organic and synthetic queries are calculated from a relevance metric, defined as the cosine similarity between the sentence embeddings of the reference answer and each completion.
  • Currently, LMSys-chat-1m is used for synthetic queries. (TODO: switch to LLM-generated conversations, existing organic queries modified by an LLM, or synthetic data queried via API.)
  • Logging includes elapsed time between steps inside the organic loop, organic queue length, and other default logs used by benchmarking tasks, except prompts and completions, which are excluded from logging into W&B.
  • The validator queries 5 random miners from the network, which stream back completions for organic queries (the sample size is defined in config as neuron.organic_sample_size).
  • The reward step for the organic or synthetic queue is triggered every 15 seconds; the interval is scaled down to as little as 2 seconds if the organic queue is growing (defined in config as neuron.organic_trigger, neuron.organic_trigger_frequency, and neuron.organic_trigger_frequency_min).
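
The relevance metric described above can be illustrated with a minimal sketch. It assumes the sentence embeddings have already been computed as plain lists of floats; the function names cosine_similarity and relevance_scores are hypothetical and are not taken from this PR, and the choice to score a zero vector as 0.0 is an assumption.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero embedding carries no signal, so score it 0.
        return 0.0
    return dot / (norm_a * norm_b)


def relevance_scores(
    reference: Sequence[float], completions: Sequence[Sequence[float]]
) -> List[float]:
    """Score each completion embedding against the reference embedding."""
    return [cosine_similarity(reference, c) for c in completions]
```

In this sketch a completion whose embedding points in the same direction as the reference scores close to 1, an orthogonal one scores close to 0, and an opposite one scores close to -1. How the PR maps these scores to rewards and penalties is not shown here.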

Process Workflow

  1. Trigger Check: When the rewarding process is triggered, the system checks whether the organic queue is empty.
    If it is, synthetic datasets (defined in organic_scoring/synth_dataset_base.py) are used to bootstrap
    the organic scoring mechanism. Otherwise, samples are drawn from the organic queue.
  2. Data Processing: The sampled data is concurrently passed to the _query_miners and _generate_reference
    methods.
  3. Reward Generation: After receiving responses from miners and any reference data, the information
    is processed by the _generate_rewards method.
  4. Weight Setting: The generated rewards are then applied through the _set_weights method.
  5. Logging: Finally, the results are logged using the _log_results method, along with all relevant data
    provided as arguments and, by default, the time elapsed on each step of the rewarding process.
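
The five steps above can be sketched as a single asyncio step. This is an illustration under stated assumptions, not the code merged in this PR: the method names come from the description, but the class name OrganicScoringSketch, the _sample helper, all signatures, and the stub return values are hypothetical.

```python
import asyncio
from collections import deque
from typing import Any, Dict, List


class OrganicScoringSketch:
    """Hypothetical stand-in for the organic scoring loop; all bodies are stubs."""

    def __init__(self, synth_samples: List[str]) -> None:
        self.organic_queue: deque = deque()
        self._synth = list(synth_samples)
        self.logged: List[Dict[str, Any]] = []

    def _sample(self) -> Dict[str, Any]:
        # Step 1: fall back to synthetic data when the organic queue is empty.
        if self.organic_queue:
            return {"query": self.organic_queue.popleft(), "organic": True}
        return {"query": self._synth[0], "organic": False}

    async def _query_miners(self, sample: Dict[str, Any]) -> List[str]:
        return [f"completion for {sample['query']}"]  # stubbed miner response

    async def _generate_reference(self, sample: Dict[str, Any]) -> str:
        return f"reference for {sample['query']}"  # stubbed LLM reference

    async def _generate_rewards(self, sample, responses, reference) -> List[float]:
        return [1.0 for _ in responses]  # stubbed relevance-based rewards

    async def _set_weights(self, rewards: List[float]) -> None:
        pass  # stub: the real method would apply the rewards

    async def _log_results(self, sample, rewards) -> None:
        self.logged.append({"organic": sample["organic"], "rewards": rewards})

    async def step(self) -> Dict[str, Any]:
        sample = self._sample()
        # Step 2: query miners and generate the reference concurrently.
        responses, reference = await asyncio.gather(
            self._query_miners(sample), self._generate_reference(sample)
        )
        rewards = await self._generate_rewards(sample, responses, reference)  # step 3
        await self._set_weights(rewards)  # step 4
        await self._log_results(sample, rewards)  # step 5
        return sample
```

Running the miner queries and the reference generation through asyncio.gather means the step waits roughly for the slower of the two rather than for their sum, which is the point of processing them concurrently.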

@dbobrenko dbobrenko self-assigned this Jun 14, 2024
@dbobrenko dbobrenko changed the base branch from feature/organic to staging June 19, 2024 14:51
@dbobrenko dbobrenko changed the title from "[WIP] Validator axon, organic task, dataset" to "[WIP] Organic Scoring implementation" Jul 18, 2024
@dbobrenko dbobrenko changed the title from "[WIP] Organic Scoring implementation" to "Organic Scoring implementation" Jul 18, 2024
@dbobrenko dbobrenko changed the base branch from staging to pre-staging July 18, 2024 11:17
@dbobrenko dbobrenko changed the base branch from pre-staging to staging July 18, 2024 11:17
@dbobrenko dbobrenko changed the base branch from staging to pre-staging July 18, 2024 11:37
@dbobrenko dbobrenko changed the base branch from pre-staging to staging July 18, 2024 11:38
@Hollyqui Hollyqui left a comment

LGTM, do check comments

@dbobrenko dbobrenko merged commit 91af582 into staging Jul 25, 2024
@dbobrenko dbobrenko deleted the feature/organic-task branch August 2, 2024 09:22