Name: Fabio Grassiotto  
RA: 890441

# Class 9_10 - Implementing ReAct with LLaMA 3 70B (Groq)

- Test on the IIRC dataset: the first 50 questions that have an answer (test_questions.json, attached)  
- Use the LlamaIndex prompt: https://github.com/run-llama/llama_index/blob/a87b63fce3cc3d24dc71ae170a8d431440025565/llama_index/agent/react/prompts.py  
- Save the final answers to the 50 questions as JSON for a future evaluation exercise  
- Instruct the model to follow the sequence Thought, Action, Input, Observation (the observation does not come from the model itself; it is the result of the search)  
- The parameter stop_sequence="Observation:" is required so that the model stops generating text and waits for the search result. Implement the search code and return the top-k documents to the model (suggestion: k=5).  
- Instruct the model to act step by step (question decomposition).  
- You may use LangChain, LlamaIndex, or another framework, or implement it by hand.  
- Use search as a tool  
- Use BM25 as the retriever (repeat the indexing from the previous exercise)  
- Use the Visconde indexing: https://github.com/neuralmind-ai/visconde/blob/main/iirc_create_indices.ipyn  
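
The loop described in the bullets above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LlamaIndex implementation: `react_loop`, `llm`, and `search` are hypothetical names, the `llm` callable is assumed to stop generating right before `Observation:` (which is what the stop sequence is for), and `Action Input` is assumed to be a plain string on a single line.

```python
import re
from typing import Callable, List


def react_loop(question: str,
               llm: Callable[[str], str],
               search: Callable[[str], List[str]],
               max_steps: int = 5) -> str:
    """Run a Thought / Action / Action Input / Observation loop (hypothetical helper)."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model is assumed to have stopped before writing "Observation:".
        step = llm(transcript)
        transcript += step
        # A final answer ends the loop.
        final = re.search(r"Answer:\s*(.*)", step, re.DOTALL)
        if final:
            return final.group(1).strip()
        # Otherwise a tool call is expected; the search is run outside the model.
        action = re.search(r"Action Input:\s*(.*)", step)
        if action is None:
            break  # the model did not follow the expected format
        docs = search(action.group(1).strip())
        # The observation is the search result, appended for the next turn.
        transcript += "\nObservation: " + " | ".join(docs) + "\n"
    return "no answer found"


# Scripted stand-ins for the model and the retriever, for illustration only.
replies = iter([
    "Thought: I need to look up Zeus.\nAction: search\nAction Input: Zeus",
    "Thought: I can answer now.\nAnswer: sky and thunder god",
])
fake_llm = lambda prompt: next(replies)
fake_search = lambda query: [f"document about {query}"]

result = react_loop("What is Zeus known for?", fake_llm, fake_search)
print(result)
```

Keeping the model and the retriever as plain callables means the same loop can be exercised with scripted replies before any API call is made.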


## Setup Environment
### Install packages

In [8]:
%%capture
%pip install -q torch
%pip install groq
%pip install -U sentence-transformers
%pip install faiss-cpu
%pip install spacy
%pip install pandas
!python -m spacy download en_core_web_sm

### Imports

In [9]:
import os
import sys
import torch
import faiss
import json
from transformers import AutoTokenizer, AutoModel
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer
import groq
from groq import Groq
from bs4 import BeautifulSoup
import warnings
warnings.simplefilter('ignore')
from collections import Counter
import string
import re
import spacy
from tqdm import tqdm

### Colab setup

In [10]:
# Colab environment
IN_COLAB = 'google.colab' in sys.modules

if (IN_COLAB):
    # Google Drive
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)

    project_folder="/content/drive/MyDrive/Classes/IA024/Aula_9_10"
    os.chdir(project_folder)
    !ls -la

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)

cuda


### Groq API

In [11]:
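def load_groq_key(path="groq-key.txt"):
    """Read the Groq API key from a local text file; return None on failure."""
    try:
        with open(path, 'r') as file:
            # Strip whitespace so a trailing newline does not end up in the key
            return file.read().strip()
    except FileNotFoundError:
        print(f"The file {path} does not exist.")
        return None
    except Exception as e:
        # Handle other potential exceptions (e.g., permission errors)
        print(f"An error occurred while reading the file: {str(e)}")
        return None

groq_key = load_groq_key()
if groq_key is None:
    # os.environ only accepts strings, so fail early with a clear message
    raise RuntimeError("Groq API key not found; cannot create the client.")
os.environ["GROQ_API_KEY"] = groq_key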

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

def groq_chat(content):
    try:

        chat_completion = client.chat.completions.create(
            #
            # Required parameters
            #
            messages=[
                # Set an optional system message. This sets the behavior of the
                # assistant and can be used to provide specific instructions for
                # how it should behave throughout the conversation.
                {
                    "role": "system",
                    "content": "you are a helpful assistant."
                },
                # Set a user message for the assistant to respond to.
                {
                    "role": "user",
                    "content": content,
                }
            ],

            # The language model which will generate the completion.
            model="llama3-70b-8192",

            #
            # Optional parameters
            #

            # Controls randomness: lowering results in less random completions.
            # As the temperature approaches zero, the model will become deterministic
            # and repetitive.
            temperature=0,

            # The maximum number of tokens to generate. For this model the
            # context window (prompt plus completion) is 8,192 tokens.
            #max_tokens=10,

            # Controls diversity via nucleus sampling: 0.5 means half of all
            # likelihood-weighted options are considered.
            top_p=1,

            # A stop sequence is a predefined or user-specified text string that
            # signals an AI to stop generating content, ensuring its responses
            # remain focused and concise. Examples include punctuation marks and
            # markers like "[end]".
            stop=None,

            # If set, partial message deltas will be sent.
            stream=False,
        )

    except groq.APIConnectionError as e:
        print("The server could not be reached")
        print(e.__cause__)  # an underlying Exception, likely raised within httpx.
        return None
    except groq.RateLimitError as e:
        print("A 429 status code was received; we should back off a bit.")
        return None
    except groq.APIStatusError as e:
        print("Another non-200-range status code was received")
        print(e.status_code)
        print(e.response)
        return None

    # Only reached when the request succeeded, so chat_completion is defined.
    return chat_completion.choices[0].message.content
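
The rate-limit branch above notes that the caller should back off. One possible way to do that is a retry wrapper with exponential backoff. The sketch below is an assumption about how this could be done and is not part of the original notebook: `call_with_backoff` and all of its parameters are hypothetical names, and it retries on whatever exception types are passed in, so the block runs without the Groq client.

```python
import time
from typing import Callable, Tuple, Type


def call_with_backoff(fn: Callable[[], str],
                      retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                      max_retries: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fn, retrying with exponentially growing delays (hypothetical helper)."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            # Give up once the retry budget is spent and re-raise the error.
            if attempt == max_retries:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

For this to help with `groq_chat`, the function would have to let `groq.RateLimitError` propagate instead of printing it; the call would then look like `call_with_backoff(lambda: groq_chat(prompt), retry_on=(groq.RateLimitError,))`.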

## Globals

In [12]:
NUM_QUESTIONS = 50
model_name = "sentence-transformers/msmarco-MiniLM-L-6-v3"

## IIRC Dataset

### Load and parse

In [13]:
test_dataset  = json.load(open('dataset/test_questions.json', 'r'))

In [28]:
test_dataset[0]

{'answer': {'type': 'span',
  'answer_spans': [{'text': 'sky and thunder god',
    'passage': 'zeus',
    'type': 'answer',
    'start': 83,
    'end': 102}]},
 'question': 'What is Zeus know for in Greek mythology?',
 'context': [{'text': 'he Palici the sons of Zeus',
   'passage': 'main',
   'indices': [684, 710]},
  {'text': 'in Greek mythology', 'passage': 'main', 'indices': [137, 155]},
  {'text': 'Zeus (British English , North American English ; , Zeús ) is the sky and thunder god in ancient Greek religion',
   'passage': 'Zeus',
   'indices': [0, 110]}],
 'question_links': ['Greek mythology', 'Zeus'],
 'title': 'Palici'}

In [29]:
# Grab the first 50 questions with an answer
questions_to_ask = []
questions_found = 0

documents = []
all_titles = []

for item in test_dataset:
  
  question = item['question']
  answer = item['answer']
  answer_type = answer['type']

  if answer_type == 'binary' or answer_type == 'value':
    final_answer = answer['answer_value']
  elif answer_type == 'span':
    final_answer = answer['answer_spans'][0]['text']
  elif answer_type == 'none':
    final_answer = 'none'
  else:
    final_answer = 'An error perhaps, bad type'
    print(answer_type)

  if (final_answer == 'none'):
    # Skip this one.
    continue
  else:
    # That's a good question; grab the documents and title associated with it.
    
    context_list = item['context']
    for context in context_list:
      text = context['text']
      # clean up html
      soup = BeautifulSoup(text, 'html.parser')
      clean_text = soup.get_text()

      documents.append({
              "title": item['title'],
              "content": clean_text
          }
      )
      all_titles.append(item['title'].lower())

    questions_to_ask.append({"Question": question, "Answer": final_answer})
    questions_found += 1
    if (questions_found == NUM_QUESTIONS):
      # found our questions
      break
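
The assignment asks for the final answers to be saved as JSON for a later evaluation exercise. A minimal sketch of that step follows. `save_answers`, `load_answers`, the `Predicted` field, and the output file name are hypothetical choices made for illustration; only the `Question` and `Answer` keys come from the cell above.

```python
import json
import os
import tempfile
from typing import Dict, List


def save_answers(records: List[Dict[str, str]], path: str) -> None:
    """Write question/answer records to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps accented characters readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_answers(path: str) -> List[Dict[str, str]]:
    """Read the records back, e.g. in the later evaluation exercise."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Example record: gold answer plus a placeholder for the model's final answer.
records = [{"Question": "What is Zeus know for in Greek mythology?",
            "Answer": "sky and thunder god",
            "Predicted": "sky and thunder god"}]
out_path = os.path.join(tempfile.mkdtemp(), "react_answers.json")
save_answers(records, out_path)
restored = load_answers(out_path)
```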

In [32]:
documents[:3]

[{'title': 'Palici', 'content': 'he Palici the sons of Zeus'},
 {'title': 'Palici', 'content': 'in Greek mythology'},
 {'title': 'Palici',
  'content': 'Zeus (British English , North American English ; , Zeús ) is the sky and thunder god in ancient Greek religion'}]
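
The assignment asks for BM25 as the retriever over passages like these. The sketch below is a small self-contained Okapi BM25 scorer, written only to illustrate how the ranking works. It is not the Visconde indexing code, and `tokenize`, `bm25_search`, and the parameter values `k1=1.5` and `b=0.75` are assumptions.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-word characters (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())


def bm25_search(query: str, docs: List[Dict[str, str]], k: int = 5,
                k1: float = 1.5, b: float = 0.75) -> List[Dict[str, str]]:
    """Return the k documents with the highest BM25 score for the query."""
    if not docs:
        return []
    tokenized = [tokenize(d["content"]) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:k]]


sample_docs = [
    {"title": "Palici", "content": "he Palici the sons of Zeus"},
    {"title": "Palici", "content": "in Greek mythology"},
    {"title": "Palici", "content": "Zeus is the sky and thunder god in ancient Greek religion"},
]
top = bm25_search("thunder god", sample_docs, k=1)
```

A function with this shape could serve as the search tool in the ReAct loop, returning the top-k passages that become the observation.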