# Macrocosmos Programming Assessment

## Task

In Bittensor, there are miners and validators. A validator sends out a challenge to the miners, which they then try to solve. Validators then evaluate the miners' responses and reward them.

In this assessment, your job is to create the core logic for a miner on a simplified version of 'Subnet 1'. The whole challenge revolves around a single miner, for which you write the `forward` function.

```python
class ExampleMiner(BaseModel):
    # The forward function is the core logic of your miner. It takes in a synapse, which contains the query, and returns a synapse with the response.
    async def forward(self, synapse: Synapse) -> Synapse:
        print(synapse.query) # print the query
        # your logic goes here
        synapse.response = your_result # set the response
        return synapse
```

<br />
<br />
Your miner will be sent two types of questions:

1) Questions about Wikipedia articles, where your miner needs to respond as factually as possible. \
A challenge might be something like: `How fast do elephant shrews run and what weight are they?`. You must then return the optimal answer, e.g.: \
`Elephant shrews have been recorded to reach speeds of 28.8 kilometres per hour (17.9 mph).[8] They vary in size from about 10 to 30 centimetres (3.9 to 11.8 in), from 50 to 500 grams (1.8 to 17.6 oz). One species of giant sengi, the grey-faced sengi, weighs about 700 g.`
The Wikipedia questions are scored by the cosine similarity between the reference answer and your response.

2) Basic math questions, where you need to respond with the correct number. Your reward is either 0 or 1, depending on whether the number is correct.
<br />
<br />
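
To make the two scoring rules concrete, here is a minimal sketch. It is an assumption, not the validator's actual code: the real scorer most likely compares embedding vectors, whereas this sketch uses plain word-count vectors. The helper names `cosine_similarity` and `math_reward` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(response: str, reference: str) -> float:
    """Cosine similarity of two word-count vectors (stand-in for embeddings)."""
    a, b = bag_of_words(response), bag_of_words(reference)
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def math_reward(response: str, reference: float) -> int:
    """Return 1 if the response parses to the reference number, else 0."""
    try:
        return int(math.isclose(float(response.strip()), reference))
    except ValueError:  # the response was not a plain number
        return 0
```

Under this sketch, a math answer wrapped in extra words (for example `"The answer is 42"`) would score 0, so it may be worth returning the bare number for math questions.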


You are not expected to find an optimal solution, nor to do any clever prompt engineering (although you can if you want to). This programming test is meant to evaluate your skill at writing clean, legible Python code, as well as your understanding of some more advanced concepts.

As a miner, it is in your interest to maximise reward, so you may apply more advanced techniques such as async programming to improve your miner's performance. This assignment is your chance to show us your skills and creative thinking, so pretty much anything is allowed (even if it feels like cheating or an exploit).


## Testing setup

Your miner's `forward` function acts like an API endpoint that is called every 0.2 seconds for a total of 20 seconds, so you will receive 100 requests in total. This holds regardless of your response time: even if your function runs for 2 seconds, it will still be called once every 0.2 seconds. Your miner has a `busy` attribute, which you can set to `True` if you do not want to receive requests (e.g. to avoid hitting API limits). An example workflow might look like this:

```python
async def forward(self, synapse: Synapse) -> Synapse:
    self.busy = True # avoid getting new requests whilst your computation runs
    # do your main computation here
    synapse.response = your_computed_response # set the response so the validator can evaluate your response
    self.busy = False # make yourself available again
    return synapse # return the response
```

All requests that have completed by the end of those 20 seconds count towards your reward. For instance, if you respond to 30 requests with an average score of 0.3 per request, your overall score would be 30 * 0.3 = 9.
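
The trade-off between handling time and request count can be sketched with a small simulation. This is an illustration under stated assumptions, not the validator's logic: it models a miner that sets `busy = True` for the whole duration of each request, counts time in ticks of 0.2 seconds, and assumes a request arriving on the tick the miner becomes free is accepted. `accepted_requests` and `overall_score` are hypothetical helper names.

```python
def accepted_requests(handling_ticks: int, total_ticks: int = 100) -> int:
    """Count requests accepted by a miner that is busy while handling one.

    One tick is one 0.2 s validator interval; 100 ticks cover the 20 s test.
    """
    accepted = 0
    free_at = 0  # first tick at which the miner is available again
    for tick in range(total_ticks):
        if tick >= free_at:
            accepted += 1
            free_at = tick + max(1, handling_ticks)
    return accepted


def overall_score(completed: int, average_score: float) -> float:
    """Overall score as described above: completed requests times average score."""
    return completed * average_score


# A handler that takes 2 s (10 ticks) and blocks new requests accepts 10 of 100.
print(accepted_requests(handling_ticks=10))
# A handler that finishes within one tick accepts all 100.
print(accepted_requests(handling_ticks=1))
```

Under these assumptions, a strictly serial miner loses most requests as soon as a single call takes longer than 0.2 seconds, which is why handling several requests concurrently is attractive.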

 

## Dataset

Your queries are drawn from the dataset in `dataset.csv`. We advise you to have a look at the dataset first to understand what types of questions you'll be getting!

## APIs

You will be given access to two very basic (mock) APIs: a chat API (a GPT wrapper) and a web search API (which uses DuckDuckGo). Your chat API has the following endpoint:

`openai_api.get_openai_response(prompt: str)` 

which will simply take an input string as a query and respond with an output string using GPT. An average call to this API endpoint takes about 0.2 seconds.

Similarly, your web search API has a single endpoint:

`web_search_api.search_and_scrape(query: str, n_results: int)`

which will return a list of strings containing the website content of the first N results. An average call to this API endpoint takes about 2 seconds.

Examples of how both endpoints are called can be found further down in the API section. Both APIs may occasionally raise errors and have a rate limit of 5 concurrent requests.

NOTE: Async endpoints are available for both APIs, `async_get_openai_response` and `async_search_and_scrape`, which can be called as:

`await openai_api.async_get_openai_response(prompt: str)` 
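
Since both APIs can fail at random and allow at most 5 concurrent requests, one possible way to call them is through a semaphore with a small retry loop. The sketch below is only an illustration: `call_with_retry` is a hypothetical helper, `fake_chat_call` stands in for a real call such as `openai_api.async_get_openai_response(...)`, and it assumes the mock API errors can be caught as plain `Exception`s.

```python
import asyncio


async def call_with_retry(make_call, semaphore, retries=3, base_delay=0.05):
    """Run make_call() under the semaphore, retrying with exponential backoff."""
    for attempt in range(retries):
        try:
            async with semaphore:  # never more concurrent calls than the limit
                return await make_call()
        except Exception:  # assumption: API errors are ordinary exceptions
            if attempt == retries - 1:
                raise  # give up after the last attempt
            await asyncio.sleep(base_delay * 2 ** attempt)


async def demo():
    semaphore = asyncio.Semaphore(5)  # the stated limit of 5 concurrent requests
    attempts = 0

    async def fake_chat_call():  # stand-in for the real async chat endpoint
        nonlocal attempts
        attempts += 1
        if attempts == 1:
            raise RuntimeError("Random API error occurred")
        return "ok"

    return await call_with_retry(fake_chat_call, semaphore)


print(asyncio.run(demo()))  # prints "ok" after one failed attempt
```

The semaphore is released before the backoff sleep, so a failing call does not hold one of the five slots while it waits to retry.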


## Rules

You may only modify the `miner.py` file. You may look at the dataset and take advantage of patterns or issues in it, but you may not load the dataset (e.g. to simply take the correct responses from there).

## Important Notes/Quirks of the test

- You may run into issues if your internet connection is bad. If you see connection errors or actual API errors (other than our 'Random API error' and the custom rate limit error), please contact your interviewer.
- This test cannot be run inside this Jupyter notebook. Please modify the `miner.py` file and then run the `main.py` file.

## APIs

### Chat API

```python
from chat_api import openai_api

# sync:
print(openai_api.get_openai_response("What is an elephant shrew in 20 words or less?"))

# async:
print(await openai_api.async_get_openai_response("What is an elephant shrew in 20 words or less?"))
```

```
An elephant shrew is a small mammal that is native to Africa and resembles a shrew but has a unique trunk-like snout.
An elephant shrew is a small, insect-eating mammal with a long nose found in Africa and Madagascar.
```


### Web Search API

```python
from web_search_api import web_search_api

# sync:
print(web_search_api.search_and_scrape(query="Elephant Shrew"))

# async:
print(await web_search_api.async_search_and_scrape(query="Elephant Shrew"))
```

['What is an elephant shrew?\nElephant shrews are not, in fact, shrews. Recent evidence suggests that they are more closely related to a group of African mammals that includes elephants, sea cows, and aardvarks. Elephant shrews (also called sengis) are represented by a single family, the Macroscelididae, including four genera and 19 living species.\nThey take their name from their long pointed head and very long, mobile, trunk-like nose. They have rather long, legs for their size, which move in a hopping fashion like rabbits. They have a hunchbacked posture and a long, scaly tail. A gland on the underside of the tail produces a strong scent used to mark territories. This musky smell serves as a deterrent against many carnivores.\nRhynchocyon cirnei\n25 to 700 grams depending on the species (1 to 24 ounces)\n22 to 30 centimeters long, not including tail (9 to 12 inches)\n2 to 4 years\nDense forest to open plains\nInsectivorous\n45 to 60 days\nSnakes, birds of prey, various carnivores\nC

Exception: Random API error occurred

### Example Rate Limit Error

When too many calls are made simultaneously, the API raises a rate limit error:

```python
import asyncio

await asyncio.gather(*[openai_api.async_get_openai_response(prompt="How are you?") for _ in range(10)])
```

SimultaneousRequestLimitError: Maximum number of simultaneous requests reached. Please try again later.

## Dataset

```python
import pandas as pd

df = pd.read_csv("./dataset.csv")
challenges, references = [x.strip() for x in list(df["challenge"])], [x.strip() for x in list(df["reference"])]
df
```

| Unnamed: 0.1 | Unnamed: 0 | challenge | reference |
|---|---|---|---|
| 0 | 1 | `\n\nWhat type of vehicles is AM General best k...` | `\n\nAccording to the context, AM General is be...` |
| 1 | 2 | `\n\nWhat year did Ataş make his professional l...` | `\n\nAccording to the context, Ataş made his pr...` |
| 2 | 3 | `\n\nWhat is the format of the DVDs in the 4-di...` | `\n\nThe DVDs in the 4-disc set of The Swiss Fa...` |
| 3 | 4 | `\n\nWhat was Evins' role in his family's oil c...` | `\n\nUnfortunately, the provided context does n...` |
| 4 | 7 | `\n\nWhat year did the Imperial Tobacco Company...` | `\n\nAccording to the context, the Imperial Tob...` |
| 5 | 10 | `\n\nWhat is the time period examined in the bo...` | `\n\nThe time period examined in the book "The ...` |
| 6 | 11 | `\n\nWhat was the name given to the site of the...` | `\n\nThe name given to the site of the former C...` |
| 7 | 14 | `\n\nWhat prompted the formation of the British...` | `\n\nThe British Universities Ice Hockey Associ...` |
| 8 | 15 | `\n\nWhat was the population of Avaldsnes after...` | `\n\nAccording to the context, after the rural ...` |
| 9 | 16 | `\n\nWhat was the reason for the delay in Simba...` | `\n\nAccording to the context provided, after s...` |


## Example Miner

Here is a small demonstration of a _very_ simple miner implementation. It's your job to make this miner as good as possible, for example by:
- Making the forward function make use of the internet to give more factual outputs
- Improving the prompting so the responses more closely match the dataset
- Implementing ways to respond to more requests (e.g. async programming or similar)
- Ensuring all errors/exceptions are properly caught

```python
from validator import Validator
from synapse import Synapse
import asyncio
from chat_api import openai_api
from pydantic import BaseModel
from loguru import logger
import time


class Miner(BaseModel):
    busy: bool = False

    # You can use either sync or async programming. If you use async programming,
    # make sure you avoid blocking operations such as time.sleep().
    async def forward(self, synapse: Synapse) -> Synapse:
        """Implement your own forward function here"""

        self.busy = True
        synapse.response = await openai_api.async_get_openai_response(prompt=synapse.query)
        # synapse.response = "I'm sorry, I don't know the answer to that question."
        self.busy = False
        return synapse


# Top-level await works in a notebook cell; in a script, wrap these lines in an async main().
validator = Validator(miner=Miner(), show_logs=True)
logger.info("Validator created")
await validator.start()
```



```
2024-10-31 16:42:46.047 | INFO     | __main__:<module>:37 - Validator created
2024-10-31 16:42:46.049 | INFO     | validator:start:82 - Starting validator
2024-10-31 16:42:46.550 | INFO     | validator:run_step:42 - Requesting miner on step 1
2024-10-31 16:42:47.054 | INFO     | validator:run_step:42 - Requesting miner on step 2
2024-10-31 16:42:47.556 | INFO     | validator:run_step:42 - Requesting miner on step 3
2024-10-31 16:42:48.059 | INFO     | validator:run_step:42 - Requesting miner on step 4
2024-10-31 16:42:48.563 | INFO     | validator:run_step:42 - Requesting miner on step 5
2024-10-31 16:42:49.065 | IN
```
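
As one possible direction for the improvements listed above (handling more requests and catching errors), the sketch below keeps the miner available until it has 5 requests in flight and always resets its state in a `finally` block. It is an illustration under assumptions, not a reference solution: `FakeSynapse` and `fake_chat` are hypothetical stand-ins for `Synapse` and the async chat endpoint, and `ConcurrentMiner` is a plain class rather than the pydantic model used in `miner.py`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeSynapse:  # hypothetical stand-in for synapse.Synapse
    query: str
    response: str = ""


async def fake_chat(prompt: str) -> str:  # stand-in for the async chat endpoint
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


class ConcurrentMiner:
    MAX_IN_FLIGHT = 5  # assumption: mirrors the API limit of 5 concurrent requests

    def __init__(self) -> None:
        self.busy = False
        self.in_flight = 0

    async def forward(self, synapse: FakeSynapse) -> FakeSynapse:
        self.in_flight += 1
        # Only refuse new requests once the concurrency budget is used up.
        self.busy = self.in_flight >= self.MAX_IN_FLIGHT
        try:
            synapse.response = await fake_chat(synapse.query)
        except Exception:
            # Fall back to a fixed answer instead of failing the request.
            synapse.response = "I'm sorry, I don't know the answer to that question."
        finally:
            self.in_flight -= 1
            self.busy = self.in_flight >= self.MAX_IN_FLIGHT
        return synapse


async def demo() -> list:
    miner = ConcurrentMiner()
    synapses = [FakeSynapse(query=f"q{i}") for i in range(3)]
    done = await asyncio.gather(*[miner.forward(s) for s in synapses])
    return [s.response for s in done]


print(asyncio.run(demo()))
```

Whether the real validator dispatches overlapping calls to a miner that is not busy is something to confirm in `validator.py` before relying on this pattern.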