# Customize an LLM for a Math Application

In this notebook, we find the best customized model for the math domain by running multiple training experiments and picking the best model according to a customized evaluation system. Finally, you can deploy the customized model.

We utilize [MathInstruct 🤗](https://huggingface.co/datasets/TIGER-Lab/MathInstruct). MathInstruct is a carefully curated instruction tuning dataset designed to be both lightweight and highly generalizable. It is composed of 13 math rationale datasets, including six newly curated for this project. This dataset uniquely emphasizes the combined use of chain-of-thought (CoT) and program-of-thought (PoT) rationales, providing extensive coverage across various mathematical fields.

Please install the Python SDK:

In [None]:
!pip install leeroo-client --upgrade

or install it from source:

In [None]:
!git clone https://github.com/Leeroo-AI/leeroo-client
%cd leeroo-client
!pip install -e .
%cd ..

Leeroo dager supports the following format for the training dataset:

```json
[
    {
        "query": QUERY,
        "response": RESPONSE
    },
    {
        ....
    }
]
```
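
Before uploading, it can help to check that every record matches this shape. The following is a minimal sketch and not part of the Leeroo SDK: `validate_seed_data` is a hypothetical helper, and the rule that both fields must be non-empty strings is an assumption.

```python
import json


def validate_seed_data(records):
    """Return a list of problems found in the records (empty if none).

    Hypothetical helper, not part of leeroo-client. Assumes each record must
    be a dict with non-empty string values under "query" and "response".
    """
    if not isinstance(records, list):
        return ["top-level JSON value must be a list"]
    problems = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {i}: expected an object, got {type(record).__name__}")
            continue
        for field in ("query", "response"):
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{field}'")
    return problems


# Example with one valid record and one record that lacks "response".
sample = json.loads('[{"query": "2 + 2 = ?", "response": "4"}, {"query": "x"}]')
sample_problems = validate_seed_data(sample)
print(sample_problems)
```

Running the same check on the contents of your own JSON file before submitting a workflow gives an early warning about malformed records.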

In [None]:
# prepare seed examples in required format
import json
import datasets
import os
from tqdm import tqdm
from pprint import pprint
from leeroo_client.client import LeerooClient

In [None]:
dataset = datasets.load_dataset("TIGER-Lab/MathInstruct")['train']
## modify the number of training samples here
n_seed_samples = 1000
data = []
for d in tqdm(dataset):
    data.append({'query':d['instruction'],'response':d['output']})
    if len(data) == n_seed_samples:
        break

# write the seed examples to disk; the context manager closes the file
with open('math_tutor.json', 'w') as f:
    json.dump(data, f)
print(len(data))
pprint(data[-1])

If you don't have an API key yet, create one [here](http://app.leeroo.com/dashboard).

In [5]:
leeroo_api_key = "YOUR_API_KEY"  # paste your own key here; avoid committing real keys
client = LeerooClient(
    leeroo_api_key,
)

User: alireza@leeroo.com Logged in!


To design the experiment workflow, please provide:

- `evaluation_criteria` (optional): A short natural-language description of the factors that matter to you when scoring the LLM's responses.
- `workflow_name`: The name of this experiment. It is saved along with the workflow's id.
- `seed_data_path`: Path to the seed dataset, a JSON file whose records have `query` and `response` fields.

In [7]:
evaluation_criteria = \
"""
- Assess clarity of the response by determining if the response is well-structured and easy to understand.
- Evaluate accuracy by verifying the correctness of the mathematical content and solutions.
- Check completeness by ensuring that the response addresses all parts of the question thoroughly.
- Finally, judge pedagogical effectiveness by considering if the explanation is insightful and promotes understanding, 
utilizing examples and step-by-step reasoning where appropriate.

Each response should be rated on these aspects to ensure a comprehensive evaluation.
"""
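
If you keep criteria as a list and want to reuse them across workflows, a small formatter can assemble the text. This is only a sketch: `format_criteria` is a hypothetical helper, and the bulleted layout simply mirrors the string above rather than any format the service is known to require.

```python
def format_criteria(criteria):
    """Join short criterion descriptions into one bulleted string.

    Blank entries are skipped and surrounding whitespace is trimmed.
    """
    lines = [f"- {text.strip()}" for text in criteria if text.strip()]
    return "\n".join(lines)


criteria_text = format_criteria([
    "Evaluate accuracy by verifying the correctness of the mathematical content.",
    "  Check completeness: every part of the question is addressed.  ",
    "",
])
print(criteria_text)
```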

In [None]:
workflow_configs = client.initialize_workflow_configs(
    evaluation_criteria=evaluation_criteria,
    workflow_name="leeroo_math_tutor",
    seed_data_path="math_tutor.json",
    budget=2  # each experiment needs at least 2 units of time; increase it to run more experiments
) 

workflow_configs
# The data generation module is currently turned off. We expect to enable it in a few weeks.

You can edit the hyper-parameters of the suggested configs:

In [7]:
#workflow_configs['experiment_config']['0']['training_args']['num_train_epochs'] = 1
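
To apply the same override to several experiments at once, a helper along these lines could be used. It is a sketch under assumptions: the nested layout (`experiment_config` → experiment id → `training_args`) is inferred from the commented line above, `set_training_arg` is a hypothetical name, and `mock_configs` is a stand-in with made-up values rather than real output from `initialize_workflow_configs`.

```python
import copy


def set_training_arg(configs, name, value, experiment_ids=None):
    """Return a copy of configs with one training argument overridden.

    Assumes configs["experiment_config"] maps experiment ids to dicts that
    each hold a "training_args" dict. When experiment_ids is None, the
    override is applied to every experiment.
    """
    updated = copy.deepcopy(configs)  # leave the caller's dict untouched
    experiments = updated["experiment_config"]
    targets = experiment_ids if experiment_ids is not None else list(experiments)
    for exp_id in targets:
        experiments[exp_id]["training_args"][name] = value
    return updated


# Stand-in config with made-up values (assumption, for illustration only).
mock_configs = {
    "experiment_config": {
        "0": {"training_args": {"num_train_epochs": 3}},
        "1": {"training_args": {"num_train_epochs": 3}},
    }
}
edited = set_training_arg(mock_configs, "num_train_epochs", 1, experiment_ids=["0"])
print(edited["experiment_config"]["0"]["training_args"])
```

Working on a deep copy keeps the originally suggested configs available for comparison.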

In [None]:
workflow_configs

🚀 Once you're happy with the hyper-parameters, you can submit the training workflow. It will **automatically execute experiments, evaluate them, and pick the best model** based on your customized evaluation system!

In [None]:
# Submit workflow for execution
running_workflow_status = client.submit_workflow(
    workflow_configs=workflow_configs
)
print(" Workflow running state:", running_workflow_status)

You can get the status of all your workflows by running the following command. The result contains:

- `running_workflows`: the training workflows whose status is `running`.
- `finished_workflows`: the workflows that have finished executing.

In [None]:
# Retrieve user's workflows
user_workflows = client.all_workflows()

print(f"Total finished workflows : {len(user_workflows['finished_workflows'])}")
print(f"Total running workflows : {len(user_workflows['running_workflows'])}")

user_workflows['running_workflows']

If you need further details on a specific workflow, call the following function. The returned status includes:

- `status`: the overall status of the workflow
- `workflow_node_status`: the status of every node
- `workflow_name`: the name of your workflow
- `workflow_running_state_id`: the id of your workflow

In [None]:
# Check status of the running workflow
workflow_status = client.get_workflow_status('1723020957')
workflow_status
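
Training can take a while, so you may prefer to poll instead of re-running the cell by hand. The sketch below is not part of the SDK: `wait_for_workflow` is a hypothetical helper, and it assumes the status object is a dict whose `status` field reads `"running"` while work is in progress. The demo uses a fake status source so that it runs without a Leeroo account.

```python
import time


def wait_for_workflow(get_status, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll get_status() until the workflow is no longer running.

    get_status is any zero-argument callable returning a dict with a
    "status" key. The value "running" is an assumed status string.
    """
    status = get_status()
    for _ in range(max_polls):
        if status.get("status") != "running":
            return status
        sleep(poll_seconds)
        status = get_status()
    return status  # still running after max_polls; the caller decides what to do


# Fake status source: reports "running" twice, then "finished".
fake_states = iter(["running", "running", "finished"])
final = wait_for_workflow(lambda: {"status": next(fake_states)}, sleep=lambda s: None)
print(final)
```

With a real client, the callable could be `lambda: client.get_workflow_status('1723020957')`, provided the returned object behaves like the dict assumed here.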

In [None]:
print(client.print_workflow(workflow_runnning_state_id='1723020957'))

Once the workflow has finished executing, you can deploy it as follows:

In [None]:
## Deploy the workflow
workflow_id = '1723020957' 
deployment_status = client.deploy_workflow(
    workflow_id
)
print(deployment_status)

Check the status of the deployment with:

In [None]:
client.get_workflow_deployment_status('DeploymentState-1722772956.964921')

In [None]:
# Get the model id
import requests
model_id = requests.get("http://54.227.170.247:9000/v1/models").json()['data'][0]['id']
model_id

In [None]:
# Inference
url = "http://54.227.170.247:9000/v1/chat/completions"
data = {
    "model": model_id,
    "messages": [{"role": "user", "content": "1, 1, 3, 9, 5, 25, 7, 49 _, 81\nAnswer Choices: (A) 6 (B) 7 (C) 8 (D) 9 (E) 10"}],
    "max_tokens": 500,
    "temperature": 0.9
}
response = requests.post(url, json=data)
print(response.json()['choices'][0]['message'])
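
The request and response shapes used above can be wrapped in two small helpers so that the prompt and sampling settings are easy to vary. This is a sketch: `build_chat_payload` and `extract_reply` are hypothetical names, the response layout (`choices[0]['message']['content']`) is assumed from the cell above plus the usual chat-completions convention, and `mock_response` is invented sample data rather than real model output.

```python
def build_chat_payload(model_id, prompt, max_tokens=500, temperature=0.9):
    """Build the JSON body for a single-turn chat completion request."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def extract_reply(response_json):
    """Return the assistant text, or None if the body has no choices."""
    choices = response_json.get("choices") or []
    if not choices:
        return None
    return choices[0].get("message", {}).get("content")


payload = build_chat_payload("my-model", "What is 7 squared?", temperature=0.0)
# Invented response body, used only to exercise extract_reply.
mock_response = {"choices": [{"message": {"role": "assistant", "content": "49"}}]}
print(extract_reply(mock_response))
```

For math questions with a single correct answer, a lower `temperature` than the 0.9 used above may give more repeatable output; treat that as a setting to experiment with.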

To shut down the deployed model, run the following command. You can deploy it again later if needed.

In [None]:
client.kill_deployment(
    'DeploymentState-1722772956.964921'
)