# PandasAI

PandasAI is a Python library that adds generative artificial intelligence capabilities to pandas, the popular data analysis and manipulation tool.

Here is a very simple demo of how it works!

First of all we install the dependencies:

In [None]:
!pip install pandasai openai

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pandasai
  Downloading pandasai-0.1.1-py3-none-any.whl (9.5 kB)
Collecting openai
  Downloading openai-0.27.5-py3-none-any.whl (71 kB)
Collecting python-dotenv<2.0.0,>=1.0.0
  Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Collecting pandas<3.0.0,>=2.0.1
  Downloading pandas-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
Collecting astor<0.9.0,>=0.8.1
  Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)

Now we import the dependencies:

In [None]:
import pandas as pd
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI
from pandasai.llm.open_assistant import OpenAssistant

We create a dataframe using pandas:

In [None]:
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000, 1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0]
})

In [None]:
df

          country       gdp  happiness_index
0   United States  21400000              7.3
1  United Kingdom   2940000              7.2
2          France   2830000              6.5
3         Germany   3870000              7.0
4           Italy   2160000              6.0
5           Spain   1350000              6.3
6          Canada   1780000              7.3
7       Australia   1320000              7.3
8           Japan    516000              5.9
9           China  14000000              5.0


We instantiate the LLM (in this case OpenAI). Remember to replace the placeholder API key with your own OpenAI API key.

In [None]:
OPENAI_API_KEY = "sk-xxxx"
llm = OpenAI(api_token=OPENAI_API_KEY)
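
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. Below is a minimal sketch of one alternative, assuming the key is stored in an environment variable called `OPENAI_API_KEY`. The variable name and the `load_api_key` helper are illustrative choices for this demo and are not part of PandasAI.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper for this demo; not part of PandasAI or openai.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage in this notebook (assumes the variable is already set):
# llm = OpenAI(api_token=load_api_key())
```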

Then we instantiate PandasAI with the large language model and run it, passing the dataframe and the prompt:

In [None]:
pandas_ai = PandasAI(llm)

In [None]:

pandas_ai.run(df, prompt='Which are the 5 happiest countries?')

'According to the data, the top 5 happiest countries are the United States, Canada, Australia, the United Kingdom, and Germany.'
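
Since the answer is written by a language model, it can be worth cross-checking it against plain pandas. The sketch below rebuilds the same dataframe so it runs on its own and uses `DataFrame.nlargest`. This is a hand-written check, not necessarily the code PandasAI generates internally.

```python
import pandas as pd

# Same data as the demo dataframe above.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000,
            1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0],
})

# Five rows with the highest happiness index.
top5 = df.nlargest(5, "happiness_index")
print(top5[["country", "happiness_index"]])
```

The five countries returned are the same ones named in the answer above.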

In [None]:
pandas_ai.run(df, prompt='What are the names and gdp of the happiest countries?')

'The happiest countries and their GDPs are listed as follows: United States with a GDP of 21400000, Canada with a GDP of 1780000, and Australia with a GDP of 1320000.'
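
"The happiest countries" is ambiguous. Here the model appears to have read it as "all countries tied for the highest happiness index". A minimal pandas sketch of that interpretation follows, with the dataframe rebuilt so the snippet is self-contained. The boolean-mask approach is one way to write it, not necessarily what PandasAI generates.

```python
import pandas as pd

# Same data as the demo dataframe above.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000,
            1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0],
})

# Keep every row whose happiness index equals the maximum (ties included).
is_happiest = df["happiness_index"] == df["happiness_index"].max()
happiest = df.loc[is_happiest, ["country", "gdp"]]
print(happiest)
```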

In [None]:
pandas_ai.run(df, prompt='What are the countries with lowest gdp?')

"Well, according to the data, the countries with the lowest GDP are Japan, Australia, Spain, Canada, and Italy. However, it's important to note that GDP isn't the only factor that determines a country's overall well-being. The happiness index is also taken into account, which measures factors like quality of life, social support, and freedom to make life choices."

In [None]:
pandas_ai.run(df, prompt='If I have to choose 3 countries to migrate, which one will you recommend based on GDP and Happiness Index?')

'Based on GDP and Happiness Index, I would recommend the United States, China, or Germany as potential countries to migrate to.'
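
The recommendation above seems to lean mostly on GDP: China has the lowest happiness index in the table. For comparison, here is a sketch of one explicit scoring rule, assuming both columns are min-max scaled to [0, 1] and weighted equally. The scaling, the equal weights, and the `min_max` helper are assumptions made for illustration; other rules would give other rankings.

```python
import pandas as pd

# Same data as the demo dataframe above.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000,
            1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0],
})


def min_max(s: pd.Series) -> pd.Series:
    """Scale a numeric column linearly so its values lie in [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())


# Equal-weight average of the two scaled columns (an illustrative choice).
scored = df.assign(score=(min_max(df["gdp"]) + min_max(df["happiness_index"])) / 2)
top3 = scored.nlargest(3, "score")
print(top3[["country", "score"]])
```

Under this particular rule the top three are the United States, the United Kingdom, and Canada, which shows how much the result depends on how the two criteria are combined.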

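Open Assistant

PandasAI can also use Open Assistant as the LLM. Remember to replace the placeholder with your own Hugging Face API token.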

In [None]:
HF_API_KEY = "hf_xxxx"
oa_llm = OpenAssistant(api_token=HF_API_KEY)

In [None]:
pandas_ai_oa = PandasAI(oa_llm)