### Based on:
https://docs.llamaindex.ai/en/stable/presentations/materials/2024-02-28-rag-bootcamp-vector-institute/?h=rag

- Use this notebook if you want to use OpenAI's LLM and embeddings.
- The index is loaded from previously saved (pickled) embeddings, so the documents are not re-embedded here; the main cost incurred is the LLM query cost.

In [25]:
import os
import nest_asyncio

# Allow nested event loops so async calls work inside the notebook's running loop
nest_asyncio.apply()

In [26]:
USE_OPENAI = True
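
When `USE_OPENAI` is `True`, the configuration cell reads the key with `os.getenv('OPENAI_API_KEY')`, which returns `None` if the variable is unset. A minimal sketch of a fail-fast check follows; the helper name `require_env` is hypothetical and not part of LlamaIndex or the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or empty, so a missing
    key surfaces here rather than at the first LLM call.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

One possible use is `api_key = require_env("OPENAI_API_KEY")` before constructing the OpenAI LLM.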

In [27]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.node_parser import SentenceSplitter

if USE_OPENAI:
    Settings.llm = OpenAI(model="gpt-3.5-turbo", api_key=os.getenv('OPENAI_API_KEY'))
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
else:
    Settings.llm = Ollama(model="llama3:instruct", request_timeout=120.0)
    Settings.embed_model = OllamaEmbedding(
        model_name="llama3:instruct",
        base_url="http://localhost:11434",
        ollama_additional_kwargs={"mirostat": 0})

Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)
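
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a simplified sketch of fixed-size windowing with overlap. It is not the `SentenceSplitter` algorithm: the real splitter counts tokens and tries to respect sentence boundaries, whereas this sketch assumes a plain list of whitespace-separated words. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a list of tokens into windows of `chunk_size` items,
    where consecutive windows share `chunk_overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Toy example: 10 words, windows of 4 that share 1 word with the previous window
words = "a b c d e f g h i j".split()
chunks = split_with_overlap(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

The overlap means text near a chunk boundary appears in both neighbouring chunks, which can help when an answer straddles the boundary.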

In [28]:
from llama_index.core import StorageContext, load_index_from_storage

# Load the previously persisted index instead of re-embedding the documents
storage_context = StorageContext.from_defaults(persist_dir="models/openAI_idpp_metagpt_state")
index = load_index_from_storage(storage_context=storage_context)

In [29]:
retriever = index.as_retriever(similarity_top_k=2)
retrieved_nodes = retriever.retrieve("What did the president say about Justice Breyer?")
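
Conceptually, `similarity_top_k=2` asks the retriever for the two stored chunks whose embeddings score highest against the query embedding. The sketch below shows that idea with toy 2-dimensional vectors, assuming cosine similarity as the scoring function; the function names and vectors are made up for illustration and are not the LlamaIndex implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return (index, score) pairs for the k most similar vectors."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy "embeddings" for three chunks and one query
toy_docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
hits = top_k([1.0, 0.1], toy_docs, k=2)
for index_in_store, score in hits:
    print(index_in_store, round(score, 3))
```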

In [30]:
# View the text of the two retrieved nodes
print(retrieved_nodes[0].text)
print("================")
print(retrieved_nodes[1].text)

I ask Democrats and Republicans alike to pass my budget and keep our neighborhoods safe.

And we’ll do everything in my power to crack down on gun trafficking of ghost guns that you can buy online, assemble at home — no serial numbers, can’t be traced.

I ask Congress to pass proven measures to reduce gun violence. Pass universal background checks. Why should anyone on the terrorist list be able to purchase a weapon. Why? Why?

And, folks, ban assault weapons with high-capacity magazines that hold up to 100 rounds. You think the deer are wearing Kevlar vests?

Look, repeal the liability shield that makes gun manufacturers the only industry in America that can’t be sued — the only one. Imagine had we done that with the tobacco manufactures.

These laws don’t infringe on the Second Amendment; they save lives.

The most fundamental right in America is the right to vote and have it counted. And look, it’s under assault.

In state after state, new laws have been passed not only to suppress 

In [31]:
query_engine = index.as_query_engine(similarity_top_k=2)
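
A query engine combines the two steps: it retrieves the top-k chunks and then asks the LLM to answer using them as context. The sketch below shows only the prompt-assembly idea. The template wording and the `build_prompt` helper are hypothetical and do not reproduce the prompt LlamaIndex actually sends.

```python
def build_prompt(question, context_chunks):
    """Assemble a single prompt string from retrieved chunks and a question."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Using only the context above, answer the question.\n"
        f"Question: {question}\n"
        "Answer: "
    )


# Toy chunks standing in for the text of retrieved nodes
prompt = build_prompt(
    "Which model had the lowest RMSE?",
    ["Chunk one text.  ", "  Chunk two text."],
)
print(prompt)
```

Because the answer is grounded in only the retrieved chunks, a poor retrieval step limits what the LLM can answer correctly.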

In [32]:
response = query_engine.query("What were the models tried for predicting ALS progression in the idpp paper?")
print(response)

The models tried for predicting ALS progression in the idpp paper included a naive model that carried the last observed value forward, various Machine Learning algorithms for regression, and a Long Short-Term Memory (LSTM) neural network to model the temporal dependencies in the sequential sensor data.


In [33]:
response = query_engine.query("What was the validation strategy used by the authors in the idpp paper?")
print(response)

The authors in the idpp paper used Grid search with cross validation on the entire training data to select the model that gave the best validation set RMSE.


In [34]:
response = query_engine.query("Which model performed the best with lowest RMSE in the idpp paper?")
print(response)

The ElasticNet + Naive model performed the best with the lowest RMSE of 0.5048 in the idpp paper.


In [35]:
response = query_engine.query("What did the president say about Justice Breyer?")
print(response)

The president expressed gratitude and appreciation for Justice Breyer's service to the country, acknowledging his dedication and service as a retiring Justice of the United States Supreme Court.


In [36]:
response = query_engine.query("How do agents share information with other agents in MetaGPT?")
print(response)

Agents in MetaGPT share information with other agents by utilizing a shared message pool. This shared message pool allows all agents to exchange messages directly, enabling them to publish their structured messages and access messages from other entities transparently. By storing information in this global message pool, agents can retrieve required information without the need to inquire about other agents individually, thus enhancing communication efficiency.
