**INSTALLING THE REQUIRED LIBRARIES**
- *The libraries were installed in an Anaconda environment*

*Required libraries*
- !pip install llama-index 
- !pip install openai 
- !pip install langchain 
- !pip install gpt-index 
- !pip install PyPDF2 
- !pip install pypdf

***TABLE OF CONTENTS***
1. Loading the files
2. Loading the file texts into Document() objects
3. Setting up the LLM
4. Creating the indices
5. Loading the indices from disk
6. Composing a graph to build an answer that draws on multiple documents
7. Setting up the Tools and the LangChain chatbot agent
   - *7.1 - Configuration for each index*
8. Creating the LangChain chatbot agent
9. Testing the agent
   - *9.1 - Clearing the agent's memory*
10. Chatbot loop

**1. LOADING THE FILES**

In [15]:
import os
files = os.listdir("./documenti")          # list every file in the "documenti" directory

lista_pdf = []
for file in files:                  # the loop fills "lista_pdf" with the files in the
    if file.endswith(".pdf"):       # directory whose name ends in .pdf
        lista_pdf.append(file)

print(lista_pdf)

['Chain of thoughts.pdf', 'NEUCOM-D-23-00321.pdf', 'Prompt patterns.pdf']


*NOTE:*\
If a previous run already created the indices we need and saved them to disk, then after loading the list of files (by running cell *1. Loading the files*) we can skip step 2 and go straight to step *3. Setting up the LLM* to load the OpenAI key.

We can then also skip step 4, where the indices are created (again assuming that the indices for the files we want to query were already created and saved earlier).

After that, we run the necessary steps from 5 onward.
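
Whether steps 2 and 4 can be skipped can also be checked programmatically. The sketch below is a minimal example, assuming the indices are persisted under `./storage/index_<name>.json` as in step 4. The helper names `index_dir` and `missing_indices` are hypothetical and not part of LlamaIndex.

```python
import os

def index_dir(nomefile, storage_root="./storage"):
    """Return the directory where step 4 persists the index for `nomefile`."""
    # Same naming scheme as step 4: strip ".pdf" and wrap the name in index_<name>.json
    return os.path.join(storage_root, f"index_{nomefile.replace('.pdf', '')}.json")

def missing_indices(lista_pdf, storage_root="./storage"):
    """Return the files in `lista_pdf` that have no persisted index directory yet."""
    return [f for f in lista_pdf if not os.path.isdir(index_dir(f, storage_root))]
```

If `missing_indices(lista_pdf)` returns an empty list, steps 2 and 4 can be skipped; otherwise they still need to run for the files it returns.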

**2. LOADING THE FILE TEXTS INTO DOCUMENT() OBJECTS**

In [16]:
import os
from llama_index.readers import SimpleDirectoryReader     # SimpleDirectoryReader reads the files we point it at


doc_set = {}              # dict of key:value pairs -> file name : [Document(text = '')], one entry per loaded file
all_docs = []             # list holding the Document objects of every file

for nomefile in lista_pdf:
  # "input_files": list of paths to read. Passing only the current file keeps each doc_set entry
  # limited to that file (input_dir = "./documenti/" would reload every PDF on each iteration)

  pdf_docs = SimpleDirectoryReader(input_files = [os.path.join("./documenti", nomefile)]).load_data()        # pdf_docs -> list of Document objects with the text read from the current file

  # insert nomefile metadata into each document
  for d in pdf_docs:
    d.extra_info = {"nome file": nomefile}            # storing the file name as metadata makes later querying easier

  doc_set[nomefile] = pdf_docs
  all_docs.extend(pdf_docs)

In [17]:
print(doc_set)

{'Chain of thoughts.pdf': [Document(text='Chain-of-Thought Prompting Elicits Reasoning\nin Large Language Models\nJason Wei Xuezhi Wang Dale Schuurmans Maarten Bosma\nBrian Ichter Fei Xia Ed H. Chi Quoc V . Le Denny Zhou\nGoogle Research, Brain Team\n{jasonwei,dennyzhou}@google.com\nAbstract\nWe explore how generating a chain of thought —a series of intermediate reasoning\nsteps—signiﬁcantly improves the ability of large language models to perform\ncomplex reasoning. In particular, we show how such reasoning abilities emerge\nnaturally in sufﬁciently large language models via a simple method called chain-of-\nthought prompting , where a few chain of thought demonstrations are provided as\nexemplars in prompting.\nExperiments on three large language models show that chain-of-thought prompting\nimproves performance on a range of arithmetic, commonsense, and symbolic\nreasoning tasks. The empirical gains can be striking. For instance, prompting a\nPaLM 540B with just eight chain-of-though

**3. SETTING UP THE LLM**

In [18]:
import os

# set the key needed to configure the LLM we will use; ours is an OpenAI key
os.environ['OPENAI_API_KEY'] = 'sk-...Iy'     

# TODO: try MockLLMPredictor to get the number of tokens used -> from llama_index import MockLLMPredictor

# Set up the LLM
from llama_index import LLMPredictor, GPTVectorStoreIndex, PromptHelper, GPTListIndex, ServiceContext 
from langchain.chat_models import ChatOpenAI

# Define the LLM
llm_predictor = LLMPredictor(llm = ChatOpenAI(temperature = 0, model_name = 'gpt-3.5-turbo', openai_api_key= os.environ["OPENAI_API_KEY"]))


# Define PromptHelper
## Set maximum input size
max_input_size = 4096

## Set max number of output tokens
num_output = 256

## Set maximum chunk overlap
max_chunk_overlap = 20


prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)       # store the prompting settings for the model


# Build the service context that will be used to create the index
service_context = ServiceContext.from_defaults(llm_predictor = llm_predictor,     # bundle the model and its
                                               prompt_helper = prompt_helper)     # prompting settings
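
To make the PromptHelper settings more concrete, the sketch below works through the token arithmetic. It is a simplified model rather than LlamaIndex's actual computation: it assumes the context window has to hold the prompt template, the retrieved text, and the generated output. The function name `context_budget` and the 200-token template size are hypothetical.

```python
def context_budget(max_input_size, num_output, prompt_tokens):
    """Tokens left for retrieved text after reserving room for the output and the prompt template."""
    budget = max_input_size - num_output - prompt_tokens
    if budget <= 0:
        raise ValueError("no room left for context; lower num_output or shorten the prompt")
    return budget

# Values from the cell above, plus a hypothetical 200-token prompt template
available = context_budget(max_input_size=4096, num_output=256, prompt_tokens=200)
print(available)  # 3640
```

Raising `num_output` therefore shrinks the room available for retrieved text by the same amount.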

**4. CREATING THE INDICES**

In [19]:
# initialize simple vector indices + global vector index

# index_set collects the indices, which will be useful later for loading them into a single graph

index_set = {}                    # dict of key:value pairs in the form -> "nomefile": index
for nomefile in lista_pdf:
    
  # Create one index per document
  cur_index = GPTVectorStoreIndex.from_documents(doc_set[nomefile], service_context=service_context)  # GPTVectorStoreIndex builds the index from the Document(text ='') objects of the selected file

  index_set[nomefile] = cur_index                                                                     # store the value "cur_index" under the key "nomefile"

  # Persist each index to its own directory (the directory name carries a .json suffix)
  cur_index.storage_context.persist(persist_dir = f'''./storage/index_{nomefile.replace(".pdf","")}.json''')

**5. LOADING THE INDICES FROM DISK**

In [20]:
from llama_index import StorageContext, load_index_from_storage


# Load indices from disk
index_set = {}
for nomefile in lista_pdf:
  # rebuild storage context
  storage_context = StorageContext.from_defaults(persist_dir = f'''./storage/index_{nomefile.replace(".pdf","")}.json/''')

  # load index
  cur_index = load_index_from_storage(storage_context)

  index_set[nomefile] = cur_index
print(index_set)

{'Chain of thoughts.pdf': <llama_index.indices.vector_store.base.GPTVectorStoreIndex object at 0x000002738491FA10>, 'NEUCOM-D-23-00321.pdf': <llama_index.indices.vector_store.base.GPTVectorStoreIndex object at 0x0000027384892790>, 'Prompt patterns.pdf': <llama_index.indices.vector_store.base.GPTVectorStoreIndex object at 0x00000273848ADC90>}


**6. COMPOSING A GRAPH TO BUILD AN ANSWER THAT DRAWS ON MULTIPLE DOCUMENTS**

When we query the graph, information is first retrieved from each of the indices created so far and then,
where needed, combined to produce the answer.
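
The two-stage behaviour described above can be illustrated with a toy sketch in plain Python. This is not how ComposableGraph is implemented: the sub-indices are stand-in functions, `query_graph` and `combine` are hypothetical names, and the "synthesis" step simply joins the partial answers.

```python
def query_graph(question, sub_indices, combine):
    """Ask every sub-index the question, then merge the non-empty partial answers."""
    partial_answers = {}
    for name, query_fn in sub_indices.items():
        answer = query_fn(question)      # stage 1: retrieve from each index
        if answer:                       # skip indices that returned nothing
            partial_answers[name] = answer
    return combine(partial_answers)      # stage 2: build a single response

# Stand-in "indices": each returns a canned answer, or None when it has nothing relevant
sub_indices = {
    "a.pdf": lambda q: "answer from a",
    "b.pdf": lambda q: "answer from b",
    "c.pdf": lambda q: None,
}

def combine(partials):
    """Join the partial answers, labelling each with its source file."""
    return "; ".join(f"{name}: {text}" for name, text in partials.items())

result = query_graph("What is each file about?", sub_indices, combine)
print(result)  # a.pdf: answer from a; b.pdf: answer from b
```

In the notebook, stage 1 corresponds to the per-document vector indices and stage 2 to the list index placed on top of them.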

In [21]:
from llama_index import GPTListIndex, LLMPredictor, ServiceContext
from langchain.chat_models import ChatOpenAI
from llama_index.indices.composability import ComposableGraph

# describe each index to help traversal of composed graph
index_summaries = [f"Information about the file {nomefile}" for nomefile in lista_pdf]


# Here we define the LLM used to query the graph
# define an LLMPredictor; max_tokens can be passed to ChatOpenAI to raise the number of output tokens
llm_predictor = LLMPredictor(llm = ChatOpenAI(temperature = 0))  # e.g. ChatOpenAI(temperature = 0, max_tokens = 512)
                                              

service_context = ServiceContext.from_defaults(llm_predictor = llm_predictor)


# define a list index over the vector indices
# allows us to synthesize information across each index

graph = ComposableGraph.from_indices(GPTListIndex,
                  [index_set[y] for y in lista_pdf],
                  index_summaries = index_summaries,
                  service_context = service_context,
                  storage_context = storage_context)

root_id = graph.root_id
print(root_id)

7ea86540-12b3-40a8-a9cf-496b4ff84710


In [22]:
# We can save the graph to disk so that we can reuse it later

from llama_index import load_graph_from_storage

# [optional] save to disk
storage_context.persist(persist_dir=f'./storage/root')

In [23]:
# [optional] load from disk, so you don't need to build graph from scratch
graph = load_graph_from_storage(root_id = root_id,
                                service_context = service_context,
                                storage_context = storage_context)

**7. SET UP DI: TOOL E LANGCHAIN CHATBOT AGENT**

We use LangChain to set up the outer chatbot agent, which has access to a set of Tools. 
LlamaIndex provides some wrappers around indices and graphs so that they can be easily used within a Tool interface.

We want to define a separate **Tool** for each index (corresponding to a given document), as well as the graph.\
We can define all tools under a central **LlamaToolkit** interface.

Below, we define an **IndexToolConfig** for our graph. Note that we also import a **DecomposeQueryTransform** module
for use within each vector index within the graph. This allows us to "decompose" the overall query into a query that
can be answered from each subindex.

In [24]:
# IMPORT 

from langchain.agents import Tool
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent
from llama_index.langchain_helpers.agents import LlamaToolkit, create_llama_chat_agent, IndexToolConfig

In [25]:
# define a decompose transform
from llama_index.indices.query.query_transform.base import DecomposeQueryTransform

# DecomposeQueryTransform: breaks down/rephrases the question being asked so that
# it can be applied to each index when the graph is used

decompose_transform = DecomposeQueryTransform(llm_predictor, verbose = True)



# Define Custom Retrievers
from llama_index.query_engine.transform_query_engine import TransformQueryEngine


# Here we define how each query is run and how the response is built
custom_query_engines = {}
for index in index_set.values():
  
  query_engine = index.as_query_engine()
  query_engine = TransformQueryEngine(query_engine,
                                      query_transform = decompose_transform,
                                      transform_extra_info = {'index_summary': index.index_struct.summary})
  
  
  custom_query_engines[index.index_id] = query_engine
  
custom_query_engines[graph.root_id] = graph.root_index.as_query_engine(response_mode = "tree_summarize", verbose = True)

# Build the query engine for the whole graph from the per-index engines above
graph_query_engine = graph.as_query_engine(custom_query_engines = custom_query_engines)


# Tool Config: the graph tool must wrap the graph's query engine, not the engine of a single index
graph_config = IndexToolConfig(query_engine = graph_query_engine,
                               name = "Graph Index",
                               description = "useful for when you want to answer queries about a single document or more given documents",
                               tool_kwargs = {"return_direct": True})

*7.1 - Configuration for each index*

In [26]:
# Besides the GraphToolConfig object, we also define an IndexToolConfig corresponding to each index:
# define toolkit
index_configs = []

for y in lista_pdf:
  query_engine = index_set[y].as_query_engine(similarity_top_k = 3)

  tool_config = IndexToolConfig(query_engine = query_engine,
                                name = f"Vector Index {y}",
                                description = f"useful for when you want to answer queries about the {y}.",
                                index_query_kwargs = {"similarity_top_k": 3},      
                                tool_kwargs = {"return_direct": True})

  index_configs.append(tool_config)

print(index_configs)

[IndexToolConfig(query_engine=<llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine object at 0x0000027384941050>, name='Vector Index Chain of thoughts.pdf', description='useful for when you want to answer queries about the Chain of thoughts.pdf.', tool_kwargs={'return_direct': True}), IndexToolConfig(query_engine=<llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine object at 0x00000273849226D0>, name='Vector Index NEUCOM-D-23-00321.pdf', description='useful for when you want to answer queries about the NEUCOM-D-23-00321.pdf.', tool_kwargs={'return_direct': True}), IndexToolConfig(query_engine=<llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine object at 0x0000027384920B90>, name='Vector Index Prompt patterns.pdf', description='useful for when you want to answer queries about the Prompt patterns.pdf.', tool_kwargs={'return_direct': True})]


In [28]:
toolkit = LlamaToolkit(index_configs = index_configs + [graph_config])


**8. CREATING THE LANGCHAIN CHATBOT AGENT**

The *create_llama_chat_agent* function creates the chatbot, which will use the Tools defined earlier.

In [29]:
memory = ConversationBufferMemory(memory_key = "chat_history")

llm = ChatOpenAI(temperature = 0)  # ,max_tokens = 1024)


agent_chain = create_llama_chat_agent(
  toolkit,
  llm,
  memory=memory,
  verbose=True,
  )

**9. TESTING THE AGENT**

In [None]:
agent_chain.run(input = "What are the main takeaways of the chain of thoughts file?")

In [16]:
# Let's try to trigger the graph

agent_chain.run(input = "Compare the files Chain of thoughts.pdf and Pormpt patterns.pdf.")



> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Vector Index Chain of thoughts.pdf
Action Input: Compare the files Chain of thoughts.pdf and Prompt patterns.pdf
Observation: 
The files Chain of thoughts.pdf and Prompt patterns.pdf both discuss the use of prompt patterns in the context of large language models (LLMs). Chain of thoughts.pdf focuses on the use of prompt patterns to automate software development tasks, while Prompt patterns.pdf provides an overview of prompt patterns, including their structure, classifications, and intent. Both files discuss the importance of prompt patterns and how they can be used to create larger and more complex capabilities.

> Finished chain.


'\nThe files Chain of thoughts.pdf and Prompt patterns.pdf both discuss the use of prompt patterns in the context of large language models (LLMs). Chain of thoughts.pdf focuses on the use of prompt patterns to automate software development tasks, while Prompt patterns.pdf provides an overview of prompt patterns, including their structure, classifications, and intent. Both files discuss the importance of prompt patterns and how they can be used to create larger and more complex capabilities.'

*9.1 - Clearing the agent's memory*

When we change the input in *agent_chain.run(input = "")* and run the cell again, the previous input and response are still stored in the agent's memory.\
After many runs this can cause errors related to the context length, or lead the bot to give wrong answers because it "remembers" the earlier conversations.

In [17]:
# We can inspect the contents of agent_chain and its memory
print(agent_chain)

memory=ConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='Compare the files Chain of thoughts.pdf and Pormpt patterns.pdf. What are the two main differences beetwen them?', additional_kwargs={}, example=False), AIMessage(content='\nThe question is not clear. It is possible that the question is asking about the pattern of prompts used in the Chain of Thoughts.pdf document. In this case, the answer is that the prompts used in the Chain of Thoughts.pdf document are formatted as questions and answers, with each exemplar augmented with a chain of thought for an associated answer.', additional_kwargs={}, example=False), HumanMessage(content='Compare the files Chain of thoughts.pdf and Pormpt patterns.pdf.', additional_kwargs={}, example=False), AIMessage(content='\nThe files Chain of thoughts.pdf and Prompt patterns.pdf both discuss the use of prompt patterns in the context of large language models (LLMs). Chain of thoughts.pdf focuses on the use of promp

In [18]:
# To clear its memory

agent_chain.memory.chat_memory.messages = []
print(agent_chain)

# messages will now be an empty list

memory=ConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[]), output_key=None, input_key=None, return_messages=False, human_prefix='Human', ai_prefix='AI', memory_key='chat_history') callbacks=None callback_manager=None verbose=True agent=ConversationalAgent(llm_chain=LLMChain(memory=None, callbacks=None, callback_manager=None, verbose=False, prompt=PromptTemplate(input_variables=['input', 'chat_history', 'agent_scratchpad'], output_parser=None, partial_variables={}, template='Assistant is a large language model trained by OpenAI.\n\nAssistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.\n\nAssistant is constantly learnin
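
Clearing the whole history is the simplest fix. A gentler alternative is to keep only the most recent messages. The sketch below shows one possible approach, with `trim_history` as a hypothetical helper that operates on a plain list. Assigning its result back to `agent_chain.memory.chat_memory.messages` assumes that attribute accepts a plain list, as the reset cell above suggests.

```python
def trim_history(messages, max_messages=6):
    """Return only the last `max_messages` entries of a chat history list."""
    if max_messages <= 0:
        # messages[-0:] would return the whole list, so handle this case explicitly
        return []
    return list(messages[-max_messages:])

history = [f"message {i}" for i in range(10)]
trimmed = trim_history(history, max_messages=4)
print(trimmed)  # ['message 6', 'message 7', 'message 8', 'message 9']
```

In the notebook this could be applied as `agent_chain.memory.chat_memory.messages = trim_history(agent_chain.memory.chat_memory.messages)`. Since human and AI messages alternate, an even `max_messages` tends to keep question/answer pairs together.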

**10. CHATBOT LOOP**

Type 'stop' to end the conversation.

In [30]:
while True:
  comm = input()
  
  if comm.lower() == "stop":
    # when the loop is interrupted, clear the conversation
    agent_chain.memory.chat_memory.messages = []
    break
  text_input = "User: " + str(comm)
  
  response = agent_chain.run(input=text_input)
  print(f'Agent: {response}')



> Entering new AgentExecutor chain...


Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 2dc345121ac9efb088f873bb568c75fd in your message.).


[32;1m[1;3mThought: Do I need to use a tool? Yes
Action: Vector Index NEUCOM-D-23-00321.pdf
Action Input: "title"[0m
Observation: [33;1m[1;3m
JCSE: Contrastive Learning of Japanese Sentence Embeddings and Its Applications[0m
[32;1m[1;3m[0m

[1m> Finished chain.[0m
Agent: 
JCSE: Contrastive Learning of Japanese Sentence Embeddings and Its Applications


> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Vector Index NEUCOM-D-23-00321.pdf
Action Input: English level
Observation: 
A: The context information does not provide enough information to answer the question.

> Finished chain.
Agent: 
A: The context information does not provide enough information to answer the question.
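
The loop above stops with a traceback if a call to the agent raises, for example when the retries after a rate-limit error like the one logged above are exhausted. The sketch below is one way to make it more tolerant. `chat_loop` is a hypothetical helper that takes the agent call, the input reader, and the output writer as parameters, so it can be tried without an API key.

```python
def chat_loop(run_agent, read_input=input, write_output=print, stop_word="stop"):
    """Read user input until `stop_word`; report agent errors instead of crashing."""
    turns = 0
    while True:
        comm = read_input()
        if comm.strip().lower() == stop_word:
            break
        try:
            response = run_agent("User: " + comm)
        except Exception as exc:  # broad on purpose: keep the session alive
            write_output(f"Agent error: {exc}")
            continue
        write_output(f"Agent: {response}")
        turns += 1
    return turns  # number of successfully answered turns

# Example with a fake agent and scripted inputs instead of the real agent_chain
scripted = iter(["hello", "stop"])
outputs = []
n = chat_loop(lambda text: text.upper(),
              read_input=lambda: next(scripted),
              write_output=outputs.append)
```

With the notebook's agent this would be called as `chat_loop(lambda text: agent_chain.run(input=text))`, followed by the memory reset from section 9.1.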
