# Demo Notebook 

This notebook demonstrates how Retrieval Augmented Generation (RAG) improves LLM code generation.

## LLM code creation without RAG and CoALA

First, we import the necessary libraries and create an agent object.

In [1]:
# imports and agent creation
import pandas as pd
from llms.agents.react import ReActAgent
from llms.clients.gpt import GPTClient
from llms.settings import settings

client = GPTClient(
    client_id=settings.CLIENT_ID,
    client_secret=settings.CLIENT_SECRET,
    auth_url=settings.AUTH_URL,
    api_base=settings.API_BASE,
    deployment_id="gpt-4-32k",
    max_response_tokens=1000,
    temperature=0.0,
)
agent = ReActAgent(client)

Next, we define a prompt and ask the agent, without RAG or CoALA, to generate the code. We then execute the generated code and store the function it defines as `agent_func`.

In [2]:
prompt = """Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib."""

generated_code = agent.run(prompt)

namespace_agent = {}
exec(generated_code, namespace_agent)
agent_func = namespace_agent['response_function']

22/01/24 09:48:30 INFO User prompt: Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib.
22/01/24 09:48:31 INFO Waiting 1s to avoid rate limit
22/01/24 09:48:32 INFO Starting call to 'llms.clients.base.BaseLLMClient._request_handler', this is the 1st time calling it.
22/01/24 09:48:43 INFO API response: 
{'choices': [{'finish_reason': 'stop', 'index': 0, 'message': {'content': '{\n    "Thought": "The user wants to calculate the rolling rank of a pandas Series. The rolling rank is calculated over a specified window of values. In this case, the window size is 3. The rank function ranks each number in the window from smallest to largest. If there are duplicate values, they get the average rank. The pandas library has a function called rolling which can be used to calculate the rolling rank.",\n    "Action": "def response_function(series):\\n    import pandas as pd\\n    s = pd.Series(series)\\n    return s.rolling(3).ap
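
The `exec` step above can be illustrated in isolation. The following is a minimal sketch, assuming the agent returns plain Python source that defines a function named `response_function`. The `generated_code` string here is a hypothetical stand-in, not real agent output.

```python
# Hypothetical stand-in for the string returned by agent.run(prompt).
generated_code = '''
def response_function(values):
    # Toy body for illustration only: return each value doubled.
    return [v * 2 for v in values]
'''

# exec() runs the source and stores every top-level name in the given dict.
namespace = {}
exec(generated_code, namespace)

# .get() avoids a bare KeyError if the function has a different name.
func = namespace.get("response_function")
if not callable(func):
    raise ValueError("generated code did not define response_function()")

doubled = func([1, 2, 3])
print(doubled)  # [2, 4, 6]
```

Passing a dedicated dictionary keeps the generated names out of the notebook's global scope. Note that `exec` runs arbitrary code, so outside a demo the generated source would normally be reviewed or sandboxed first.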

Next, we define our input and the expected function.

In [3]:
input = pd.Series([1, 4, 2, 3, 5, 3])

def expected_func(input):
    result = input.rolling(3).rank()
    return result

Now we can compare the results of the two functions:

In [4]:
print('Desired result:')
print(expected_func(input))
print()
print('Agent result:')
print(agent_func(input))

Desired result:
0    NaN
1    NaN
2    2.0
3    2.0
4    3.0
5    1.5
dtype: float64

Agent result:
0    NaN
1    NaN
2    2.0
3    2.0
4    3.0
5    1.5
dtype: float64
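
Both results agree because each approach ranks the last element of every 3-value window, averaging the ranks of tied values. The following pure-Python sketch reproduces that computation without pandas. The helper name `rolling_rank` is illustrative and not part of any library.

```python
import math

def rolling_rank(values, window):
    """Rank of the last element within each trailing window (ties averaged)."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough observations yet: pandas reports NaN here.
            result.append(math.nan)
            continue
        chunk = values[i + 1 - window : i + 1]
        last = chunk[-1]
        smaller = sum(1 for v in chunk if v < last)
        equal = sum(1 for v in chunk if v == last)
        # Tied values occupy ranks smaller+1 .. smaller+equal; take their mean.
        result.append(smaller + (equal + 1) / 2)
    return result

ranks = rolling_rank([1, 4, 2, 3, 5, 3], 3)
print(ranks)  # [nan, nan, 2.0, 2.0, 3.0, 1.5]
```

The last window `[3, 5, 3]` contains a tie: the two 3s share ranks 1 and 2, which gives 1.5.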


In [5]:
agent.reasoning

[{'User prompt': 'Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib.'},
 {'Thought': 'The user wants to calculate the rolling rank of a pandas Series. The rolling rank is calculated over a specified window of values. In this case, the window size is 3. The rank function ranks each number in the window from smallest to largest. If there are duplicate values, they get the average rank. The pandas library has a function called rolling which can be used to calculate the rolling rank.'},
 {'Tool': 'def response_function(series):\n    import pandas as pd\n    s = pd.Series(series)\n    return s.rolling(3).apply(lambda x: x.rank().iloc[-1])'},
 {'Observation': 'Your response format was correct and the code does not have any syntax errors.'},
 {'Thought': 'The code is correct and it will return the rolling rank of the series. Now I will provide the final answer.'},
 {'Answer': 'def response_function(series):\n    import pa
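
The `Observation` step in the trace reports that the code has no syntax errors. How `ReActAgent` performs that check is not shown in this notebook. One plausible way to do it, sketched here as an assumption, is to parse the source with the standard `ast` module without executing it.

```python
import ast

def check_syntax(source):
    """Return (True, None) if source parses, else (False, error message)."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"
    return True, None

good = "def response_function(series):\n    return series"
bad = "def response_function(series)\n    return series"  # missing colon

print(check_syntax(good))  # (True, None)
ok_bad, message = check_syntax(bad)
print(ok_bad, message)
```

A check like this only proves that the code parses. It says nothing about whether the result is correct, which is why the notebook compares the output against `expected_func`.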

## LLM code creation with RAG and CoALA

We start by creating a new LLM agent that is augmented by RAG and CoALA.

In [7]:
# import the FAISS vector storage for the retrieval augmented generation and CoALA
from llms.rag.faiss import FAISS
from llms.rag.coala import CoALA

filename = "embeddings_EUCLIDEAN_DISTANCE_k_results_1_threshold_0.0_chunk_size_512_avg_True"

docs_vector_store = FAISS.load_local("embeddings/semantic/", filename, client)

code_vector_store = FAISS.load_local("embeddings/episodic/", filename, client)

coala = CoALA(docs_vector_store, code_vector_store)
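
The file name suggests that the stores were built for Euclidean-distance search returning a single result (`k_results_1`). The sketch below illustrates that idea only. It is not the `llms.rag.faiss` API, and the vectors and chunk texts are hypothetical toy data.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, store, k=1):
    """Return the k (distance, text) pairs closest to the query vector."""
    scored = sorted((euclidean(query, vec), text) for vec, text in store)
    return scored[:k]

# Hypothetical 2-D embeddings; real embeddings have far more dimensions.
store = [
    ([0.0, 1.0], "Rolling.rank documentation chunk"),
    ([1.0, 0.0], "DataFrame.merge documentation chunk"),
    ([0.7, 0.7], "Series.rank documentation chunk"),
]

best = retrieve([0.1, 0.9], store, k=1)
print(best[0][1])  # Rolling.rank documentation chunk
```

In a RAG setup, the retrieved chunk is added to the prompt so the model can draw on documentation it might not otherwise recall.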

In [8]:
rag_agent = ReActAgent(client, rag=None, tools={"CoALA": coala})

In [9]:
# same prompt as in the first experiment
prompt = """Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib."""

generated_code_rag = rag_agent.run(prompt)

namespace_rag_agent = {}
exec(generated_code_rag, namespace_rag_agent)
rag_agent_func = namespace_rag_agent['response_function']

22/01/24 09:54:18 INFO User prompt: Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib.
22/01/24 09:54:18 INFO Waiting 1s to avoid rate limit
22/01/24 09:54:19 INFO Starting call to 'llms.clients.base.BaseLLMClient._request_handler', this is the 1st time calling it.
22/01/24 09:54:33 INFO Finished call to 'llms.clients.base.BaseLLMClient._request_handler' after 13.974(s), this was the 1st time calling it.
22/01/24 09:54:36 INFO Starting call to 'llms.clients.base.BaseLLMClient._request_handler', this is the 2nd time calling it.
22/01/24 09:54:47 INFO API response: 
{'choices': [{'finish_reason': 'stop', 'index': 0, 'message': {'content': '{\n    "Thought": "The user wants to calculate the rolling rank of a pandas Series. The rolling rank is a measure of how a value compares to the previous values in a rolling window. In this case, the window size is 3. I will use the pandas library to create a Series from the given 

Since `input` and `expected_func` stay the same, we can now compare the result of the RAG-enhanced agent with the expected result.

In [10]:
input = pd.Series([1, 4, 2, 3, 5, 3])

In [11]:
print('Desired result:')
print(expected_func(input))
print()
print('Rag enhanced agent result:')
print(rag_agent_func(input))

Desired result:
0    NaN
1    NaN
2    2.0
3    2.0
4    3.0
5    1.5
dtype: float64

Rag enhanced agent result:
[nan, nan, 2.0, 2.0, 3.0, 1.5]
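
The two results hold the same values but differ in type: the expected result is a `pd.Series`, while the RAG-enhanced agent returned a plain list. Comparing them by eye works for six values. For an automated check, a NaN-aware element-wise comparison is needed, because NaN never equals itself. The following is a minimal sketch, and the helper name `same_values` is illustrative.

```python
import math

def same_values(left, right, tol=1e-9):
    """Compare two sequences element-wise, treating NaN as equal to NaN."""
    left, right = list(left), list(right)
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        if math.isnan(a) and math.isnan(b):
            continue
        if math.isnan(a) or math.isnan(b) or abs(a - b) > tol:
            return False
    return True

expected = [float("nan"), float("nan"), 2.0, 2.0, 3.0, 1.5]
returned = [float("nan"), float("nan"), 2.0, 2.0, 3.0, 1.5]

print(same_values(expected, returned))              # True
print(same_values(expected, [2.0, 2.0, 3.0, 1.5]))  # False
```

Because the helper converts its arguments with `list()`, a pandas Series can be passed directly alongside a list.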


In [12]:
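# Note: `agent` is the first agent (without RAG and CoALA), so this repeats the trace from In [5].
# Use `rag_agent.reasoning` to inspect the RAG-enhanced agent instead.
agent.reasoning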

[{'User prompt': 'Please return the rolling rank(3) of this Series [1, 4, 2, 3, 5, 3]. Make sure to code your solution using the pandas lib.'},
 {'Thought': 'The user wants to calculate the rolling rank of a pandas Series. The rolling rank is calculated over a specified window of values. In this case, the window size is 3. The rank function ranks each number in the window from smallest to largest. If there are duplicate values, they get the average rank. The pandas library has a function called rolling which can be used to calculate the rolling rank.'},
 {'Tool': 'def response_function(series):\n    import pandas as pd\n    s = pd.Series(series)\n    return s.rolling(3).apply(lambda x: x.rank().iloc[-1])'},
 {'Observation': 'Your response format was correct and the code does not have any syntax errors.'},
 {'Thought': 'The code is correct and it will return the rolling rank of the series. Now I will provide the final answer.'},
 {'Answer': 'def response_function(series):\n    import pa