<a href="https://colab.research.google.com/github/luquelab/AI_lab_assistant/blob/AI_agent_examples/AI_agent_analyst_LangChain_practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Analyst Assistant (quick-guide)
An example of how to build an AI analyst assistant using LangChain's agent functionality.

The example is based on [this post](https://towardsai.net/p/machine-learning/create-your-own-data-analyst-assistant-with-langchain-agents) by Pere Martra on Towards AI.

# Packages and Libraries

This section installs the packages and imports the libraries needed in the Colab environment.



## Install Packages

+ `langchain`: A Python library that allows us to chain the model with different tools.
+ `openai`: Client for the API of OpenAI, the company behind ChatGPT. Through this API, we can access several of their models, including GPT-3.5 and GPT-4.
+ `tabulate`: Python library that simplifies the printing of data tables, which our agent may use.
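+ `xformers`: A library of optimized Transformer building blocks maintained by Meta (Facebook). It does not depend on LangChain; this notebook installs it as a requirement for the agent, although an agent backed by the OpenAI API may not strictly need it.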

In [12]:
# Install libraries
!pip install langchain
!pip install openai
!pip install tabulate
!pip install xformers

Collecting torch==2.1.0 (from xformers)
  Using cached torch-2.1.0-cp310-cp310-manylinux1_x86_64.whl (670.2 MB)
Collecting triton==2.1.0 (from torch==2.1.0->xformers)
  Using cached triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)
Installing collected packages: triton, torch
  Attempting uninstall: triton
    Found existing installation: triton 2.0.0
    Uninstalling triton-2.0.0:
      Successfully uninstalled triton-2.0.0
  Attempting uninstall: torch
    Found existing installation: torch 2.0.1
    Uninstalling torch-2.0.1:
      Successfully uninstalled torch-2.0.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fastai 2.7.12 requires torch<2.1,>=1.7, but you have torch 2.1.0 which is incompatible.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.1.0 which is incompatible.
torchdata 0.6.1 requires to

## Import Libraries

+ `OpenAI`: It allows us to interact with OpenAI’s models.
+ `create_pandas_dataframe_agent`: As the name suggests, this function creates our specialized agent, capable of handling data stored in a Pandas DataFrame. (Recent LangChain releases ship it in the separate `langchain_experimental` package.)
+ `pandas`: The well-known library for working with tabular data.
+ `os`: Module that provides a portable way of using operating system dependent functionality.
+ `files`: Module in `google.colab` for uploading and downloading files.

In [18]:
# Import libraries
from langchain.llms import OpenAI
from langchain.agents import create_pandas_dataframe_agent
import pandas
import os
from google.colab import files

# API Key
Set up the OpenAI API key so the agent can call the LLM.

In [17]:
# Read the key at runtime so it is never stored in the notebook
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
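A minimal sketch of a guard that checks the environment variable before any model call is made. `require_api_key` is a hypothetical helper introduced here for illustration; it is not part of LangChain or the OpenAI client, and the demo uses a made-up variable name so it does not touch a real key.

```python
import os

def require_api_key(name="OPENAI_API_KEY"):
    """Return the key stored in the environment, or raise a clear error."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; set it before creating the agent.")
    return key

# Demo with a placeholder variable (hypothetical name, dummy value)
os.environ["DEMO_API_KEY"] = "dummy-value"
demo_key = require_api_key("DEMO_API_KEY")
print("Key found, length:", len(demo_key))
```

In the notebook itself, calling `require_api_key()` with the default name right after setting the key would surface a missing or empty value early.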

# Create Document and Agent


## Document

In [19]:
# Upload data
def load_csv_file():
  """Uploads a CSV file from the user and returns a Pandas DataFrame."""

  uploaded_file = files.upload()
  file_path = next(iter(uploaded_file))
  document = pandas.read_csv(file_path)
  return document

document = load_csv_file()


Saving climate_change_data.csv to climate_change_data.csv
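`files.upload()` only works inside Colab. As a rough sketch of an alternative for local runs, the loader below reads from a path or file-like object and does a quick sanity check before the data is handed to the agent. `load_csv_from_path` is a hypothetical helper, the `Date` and `Temperature` column names are taken from the agent output shown later in this notebook, and the sample rows are invented for illustration.

```python
import io
import pandas

def load_csv_from_path(source):
    """Read a CSV from a path or file-like object into a DataFrame."""
    document = pandas.read_csv(source)
    # Parse the Date column when it exists so time-based analysis works
    if "Date" in document.columns:
        document["Date"] = pandas.to_datetime(document["Date"])
    return document

# Tiny in-memory sample, made up for illustration only
sample = io.StringIO(
    "Date,Temperature\n"
    "2000-01-01,14.2\n"
    "2000-01-02,15.1\n"
)
document = load_csv_from_path(sample)

# Quick checks: column types and missing values per column
print(document.dtypes)
print(document.isna().sum())
```

Checking types and missing values up front makes it easier to judge whether the agent's later summaries are plausible.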


## Agent

In [20]:
# Create agent
litte_ds = create_pandas_dataframe_agent(
 OpenAI(temperature=0), document, verbose=True
)

# Use Agent

In [22]:
# Request
Request = "Analyze this data, and write a brief explanation around 100 words."

# Run request:
litte_ds.run(Request)



> Entering new AgentExecutor chain...
Thought: I need to look at the data and think about what it is telling me.
Action: python_repl_ast
Action Input: df.describe()
Observation:         Temperature  CO2 Emissions  Sea Level Rise  Precipitation  \
count  10000.000000   10000.000000    10000.000000   10000.000000   
mean      14.936034     400.220469       -0.003152      49.881208   
std        5.030616      49.696933        0.991349      28.862417   
min       -3.803589     182.131220       -4.092155       0.010143   
25%       11.577991     367.109330       -0.673809      24.497516   
50%       14.981136     400.821324        0.002332      49.818967   
75%       18.305826     433.307905        0.675723      74.524991   
max       33.976956     582.899701        4.116559      99.991900   

           Humidity    Wind Speed  
count  10000.000000  10000.000000  
mean      49.771302     25.082066  
std       28.929320     14.466648  
min        0.018

'This dataframe contains information about the climate in various locations around the world. It includes numerical data such as temperature, CO2 emissions, sea level rise, precipitation, humidity, and wind speed. It also includes categorical data such as date, location, and country. The data is mostly centered around the mean, with some outliers.'

In [24]:
# Request
Request = "Clean the data \
Implement a statistical model \
Make a projection what will be the temperature in 15 years"

# Run request:
litte_ds.run(Request)



> Entering new AgentExecutor chain...
Thought: I need to clean the data, then use a statistical model to make a projection
Action: python_repl_ast
Action Input: df.dropna()
Observation:                                Date           Location                Country  \
0     2000-01-01 00:00:00.000000000    New Williamtown                 Latvia   
1     2000-01-01 20:09:43.258325832       North Rachel           South Africa   
2     2000-01-02 16:19:26.516651665   West Williamland          French Guiana   
3     2000-01-03 12:29:09.774977497        South David                Vietnam   
4     2000-01-04 08:38:53.033303330     New Scottburgh                Moldova   
...                             ...                ...                    ...   
9995  2022-12-27 15:21:06.966696576   South Elaineberg                 Bhutan   
9996  2022-12-28 11:30:50.225022464       Leblancville                  Congo   
9997  2022-12-29 07:40:33.483348224     West 



[32;1m[1;3m I now know the final answer
Final Answer: The predicted temperature in 15 years is 14.93510204.[0m

[1m> Finished chain.[0m


'The predicted temperature in 15 years is 14.93510204.'
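The visible trace does not show which statistical model the agent fitted, and the reported value is close to the `Temperature` mean from the earlier `describe()` output (14.936). For comparison, here is a minimal sketch of what an explicit projection could look like: an ordinary least-squares line fitted to yearly mean temperatures and extrapolated 15 years past the last year. The function names and the toy numbers are assumptions for illustration; they are not taken from `climate_change_data.csv` or from the agent's own code.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def project_temperature(years, temperatures, years_ahead=15):
    """Extrapolate the fitted line years_ahead past the last observed year."""
    slope, intercept = fit_line(years, temperatures)
    target_year = max(years) + years_ahead
    return target_year, slope * target_year + intercept

# Toy yearly means, invented purely for illustration
years = [2000, 2001, 2002, 2003, 2004]
temps = [14.0, 14.1, 14.2, 14.3, 14.4]

target_year, projected = project_temperature(years, temps)
print(target_year, round(projected, 2))
```

With the real data, the yearly means could be computed from the DataFrame first. A slope near zero would yield a projection close to the overall mean, which is one possible explanation for the agent's answer.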