# FireAct
## FireAct takes an initial step toward showing the advantages of fine-tuning LMs for agentic uses. In this demo, we use its method to fine-tune Llama2 7B to generate the multiple queries used in the [RAG-Fusion](https://github.com/anyscale/endpoint-cookbook/blob/main/App_RAG_Fusion.ipynb) demo.

### Even with prompt engineering, the Llama2 70B model does not generate the multi-queries in the form we need for RAG Fusion. The example below shows the verbose response from the Llama2 70B model.

### Later, we will show that a fine-tuned Llama2 7B model can do this.

In [13]:
import openai
import random

def generate_queries_llama(original_query):
    # Build the prompt inside the function: the f-string reads original_query,
    # which is not defined yet at module level.
    msg = (
        "Generates up to 5 search queries based on a single input query. "
        "You should generate queries relevant to the original one, output them "
        "in bullet points, be precise, no other verbose output needed. "
        f"Here is the original query {original_query} and output (4 queries):"
    )
    response = openai.ChatCompletion.create(
        api_base = "https://console.endpoints.anyscale.com/m/v1",
        api_key = "ANYSCALE_API_KEY",  # placeholder: replace with your own key
        model="meta-llama/Llama-2-70b-chat-hf",
        messages=[{"role": "user", "content": msg}]
    )

    # Split the reply into lines; the base model separates them with real newlines
    generated_queries = response.choices[0]["message"]["content"].strip().split("\n")
    return generated_queries

original_query = "What name is given to the explosive death of a star?"
generated_queries = generate_queries_llama(original_query)
generated_queries

['Sure, here are four search queries based on the original query "What name is given to the explosive death of a star?":',
 '',
 '* "Types of star explosions"',
 '* "Stellar explosions and their names"',
 '* "Supernova vs. hypernova: what\'s the difference?"',
 '* "The science behind a star\'s explosive death"',
 '',
 'I hope these queries are helpful! Let me know if you need any further assistance.']
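
To make the problem concrete, here is a minimal sketch of the post-processing this response would need before it could be used. `extract_bullets` is a hypothetical helper, not part of RAG-Fusion or FireAct, and it assumes that every query line starts with `* ` or `- `, which the base model does not guarantee.

```python
def extract_bullets(lines):
    """Keep only lines that look like bullet items; drop the marker and quotes."""
    queries = []
    for line in lines:
        text = line.strip()
        # Assumption: bullets are marked with "* " or "- "
        if text.startswith(("* ", "- ")):
            queries.append(text[2:].strip().strip('"'))
    return queries


# The response shown above, copied verbatim
raw_response = [
    'Sure, here are four search queries based on the original query "What name is given to the explosive death of a star?":',
    '',
    '* "Types of star explosions"',
    '* "Stellar explosions and their names"',
    '* "Supernova vs. hypernova: what\'s the difference?"',
    '* "The science behind a star\'s explosive death"',
    '',
    'I hope these queries are helpful! Let me know if you need any further assistance.',
]

cleaned = extract_bullets(raw_response)
for query in cleaned:
    print(query)
```

A filter like this breaks as soon as the model changes its bullet style or wording, which is the motivation for fine-tuning a model to emit only the queries.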

### Now let's use the code provided in FireAct to generate training datasets and fine-tune a model on Anyscale Endpoints. Here are the steps:
1. Clone the [repo](https://github.com/anchen1011/FireAct) and add the following contents:  
- `data/triviaqa/dev.json` (Download [TriviaQA](https://nlp.cs.washington.edu/triviaqa/) dataset)
- `tasks/__init__.py` and `tasks/triviaqa.py` (work with TriviaQA datasets)
- `prompts/triviaqa_multiqueries.txt` (few-shot examples for multi-query generation)
2. Run `python generation-triviaqa.py` to generate training data

In [None]:
!python generation-triviaqa.py

3. The training dataset is generated under the `trajs` folder. Convert it from JSON to GPT-format JSONL

In [None]:
import json
JSON_PATH = "trajs/triviaqa_dev_0_300_gpt-4_0.0_2023-10-26-19-34-02.json"

# Load the dataset
with open(JSON_PATH, 'r', encoding='utf-8') as f:
    items = json.load(f)

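sys_msg = "Generates multiple search queries based on a single input query."

# items is a dict keyed by example index; each value holds the input "Query"
# and the list of generated "Queries".
# The context manager closes (and flushes) the file when the loop finishes.
with open('trivia.jsonl', 'w', encoding='utf-8') as outfile:
    for item in items.values():
        # Build a fresh entry for every example
        entry = {'messages': [
            {'role': 'system', 'content': sys_msg},
            {'role': 'user', 'content': item["Query"]},
            # Queries are joined with a literal backslash-n (two characters),
            # not a real newline; the inference code splits on the same token.
            {'role': 'assistant', 'content': "\\n".join(item["Queries"])},
        ]}
        json.dump(entry, outfile)
        outfile.write('\n')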
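
The record layout produced by the conversion can be checked on a single example. The sketch below is illustrative only: the `item` dict is a made-up sample whose shape (`"Query"` plus a list of `"Queries"`) is inferred from the conversion code, and the leading `- ` on each query is an assumption about the generated trajectories. It shows that the assistant content uses a literal backslash-n as the separator and that splitting on the same token recovers the list.

```python
import json

sys_msg = "Generates multiple search queries based on a single input query."

# Hypothetical sample item; real items come from the file under trajs/
item = {
    "Query": "What name is given to the explosive death of a star?",
    "Queries": [
        "- Explosive death of a star name",
        "- What do we call the explosion of a star?",
    ],
}

record = {"messages": [
    {"role": "system", "content": sys_msg},
    {"role": "user", "content": item["Query"]},
    # "\\n" is two characters (backslash, n), not a newline
    {"role": "assistant", "content": "\\n".join(item["Queries"])},
]}

line = json.dumps(record)          # one line of the JSONL file
restored = json.loads(line)        # what a reader of the file gets back
content = restored["messages"][2]["content"]
queries = content.split("\\n")     # split on the same two-character token
print(queries)
```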

4. Create training and validation datasets

In [7]:
DATA_PATH = "trivia.jsonl"
# Load the dataset
with open(DATA_PATH, 'r') as f:
    items = [json.loads(line) for line in f]
threshold = int(len(items) * .85)
with open('trivia_train.jsonl', 'w') as outfile:
    for i, entry in enumerate(items):
        if i<threshold:
            json.dump(entry, outfile)
            outfile.write('\n')
            
with open('trivia_val.jsonl', 'w') as outfile:
    for i, entry in enumerate(items):
        if i>=threshold:
            json.dump(entry, outfile)
            outfile.write('\n')
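
The split above takes the first 85% of the records in file order for training. If the trajectories happen to be ordered in some meaningful way, a shuffled split may give a more representative validation set. The following is a minimal sketch of that variant; `split_train_val` is a hypothetical helper and the fixed seed is an arbitrary choice made only for reproducibility.

```python
import random


def split_train_val(items, train_fraction=0.85, seed=0):
    """Shuffle a copy of items deterministically, then cut it at train_fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    threshold = int(len(shuffled) * train_fraction)
    return shuffled[:threshold], shuffled[threshold:]


# Stand-in data: integers instead of the parsed JSONL records
train, val = split_train_val(range(20))
print(len(train), len(val))
```

Each half can then be written out with the same `json.dump` loop used in the cell above.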

5. Fine-tune the Llama2 7B model with Anyscale Endpoints

In [None]:
from finetune_utils import finetune_check
finetune_check('./trivia_train.jsonl')
finetune_check('./trivia_val.jsonl')
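
The internals of `finetune_check` are not shown here. As a rough idea of the kind of validation a chat-format JSONL file benefits from before upload, here is a minimal sketch. `find_format_errors` is a hypothetical function, not the `finetune_utils` implementation, and the set of accepted roles is an assumption based on the records written earlier.

```python
import json

# Assumption: only these roles appear in the training records
VALID_ROLES = {"system", "user", "assistant"}


def find_format_errors(lines):
    """Return (line_number, problem) pairs for lines that are not valid chat records."""
    errors = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            errors.append((number, "not valid JSON"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            errors.append((number, "missing 'messages' list"))
            continue
        for message in messages:
            ok = (
                isinstance(message, dict)
                and message.get("role") in VALID_ROLES
                and isinstance(message.get("content"), str)
            )
            if not ok:
                errors.append((number, "bad message entry"))
                break
    return errors


sample_lines = [
    '{"messages": [{"role": "user", "content": "q"}, {"role": "assistant", "content": "a"}]}',
    '{"messages": []}',
    'not json',
]
errors = find_format_errors(sample_lines)
print(errors)
```

To run it on the real files, pass an open file handle, for example `find_format_errors(open('trivia_train.jsonl'))`.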

In [4]:
from finetune_utils import finetune_run
finetune_run('./trivia_train.jsonl','./trivia_val.jsonl',
             token='ANYSCALE_API_TOKEN', 
             model='meta-llama/Llama-2-7b-chat-hf', 
             suffix='multiqueries')

## Let's generate multi-queries again with the fine-tuned model. The result is in exactly the format we need and can easily be converted into a Python list for further processing.

In [19]:
import openai
def generate_queries_llama(original_query):
    msg=f"Generates up to 5 search queries based on a single input query. \
Here is the original query {original_query} and output (4 queries):"
    response = openai.ChatCompletion.create(
        api_base = "https://console.endpoints.anyscale.com/m/v1",
        api_key = "ANYSCALE_API_KEY",  # placeholder: replace with your own key
        # Replace SUFFIX:ID with the values of your fine-tuned model
        model="meta-llama/Llama-2-7b-chat-hf:SUFFIX:ID",
        messages=[{"role": "user", "content": msg}]
    )

    # The training data joined queries with a literal backslash-n (two characters),
    # so split on that token rather than on a real newline.
    generated_queries = response.choices[0]["message"]["content"].strip().split("\\n")
    return generated_queries

original_query = "What name is given to the explosive death of a star?"
generated_queries = generate_queries_llama(original_query)
for generated_query in generated_queries:
    print(generated_query)

- Explosive death of a star name
- What is the technical term for exploding star?
- What do we call the explosion of a star?
- Name for the fatal star explosion
- What is the scientific term for a star explosion?
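
Converting this output into a plain Python list takes only a few lines. The sketch below assumes the raw model content uses the literal backslash-n separator and the `- ` prefix seen above; `to_query_list` is a hypothetical helper, and `sample_content` is assembled from the first three lines of the printed output rather than captured from a live call.

```python
def to_query_list(raw_content):
    """Split fine-tuned model output into queries and drop the '- ' prefix."""
    queries = []
    for part in raw_content.strip().split("\\n"):  # literal backslash-n separator
        part = part.strip()
        if not part:
            continue
        if part.startswith("- "):
            part = part[2:]
        queries.append(part.strip())
    return queries


# Assembled from the output above, joined with the literal separator
sample_content = (
    "- Explosive death of a star name\\n"
    "- What is the technical term for exploding star?\\n"
    "- What do we call the explosion of a star?"
)

query_list = to_query_list(sample_content)
print(query_list)
```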
