# Demo - Data Analysis with pandasai

This notebook shows how to use `pandasai` to manage and analyze data in a Jupyter notebook. It covers three steps:

1. **Data Loading**: Load a CSV file with pandas and wrap the resulting DataFrame in a `pandasai` `SmartDataframe`.
2. **Data Processing**: Clean and reshape the data by sending natural-language instructions through `pandasai`.
3. **Data Insight**: Ask `pandasai` to plot the data to surface insights.

The examples are a starting point that you can extend into more complex analysis workflows.


In [1]:
import pandas as pd
import os
from pandasai import SmartDataframe
from pandasai.llm import AzureOpenAI
from dotenv import load_dotenv
load_dotenv()

# Read the Azure OpenAI settings that load_dotenv() pulled in from the .env file
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_KEY")
OPENAI_DEPLOYMENT_NAME = os.getenv("OPENAI_DEPLOYMENT_NAME")
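
`os.getenv` returns `None` when a variable is not set, so a missing entry in the `.env` file only surfaces later, when the LLM client is created. A minimal sketch of an early check follows. The helper `missing_env_vars` is hypothetical (it is not part of `pandasai` or `python-dotenv`), and the variable names are simply the ones read in the cell above.

```python
import os

# Names taken from the cell above; adjust them if your .env file uses other keys.
REQUIRED_ENV_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
    "OPENAI_DEPLOYMENT_NAME",
)


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Hypothetical usage: report the problem before any LLM call is attempted.
missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the helper easy to try without touching the real environment.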

In [2]:
df_anime = pd.read_csv(r'C:\Users\(Ai)AiSukmoren\Desktop\powerbi + genai\data\anime.csv')

In [4]:
df_anime.head(5)

Unnamed: 0,anime_id,name,genre,type,episodes,rating,members
0,32281,Kimi no Na wa.,"Drama, Romance, School, Supernatural",Movie,1,9.37,200630
1,5114,Fullmetal Alchemist: Brotherhood,"Action, Adventure, Drama, Fantasy, Magic, Mili...",TV,64,9.26,793665
2,28977,Gintama°,"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.25,114262
3,9253,Steins;Gate,"Sci-Fi, Thriller",TV,24,9.17,673572
4,9969,Gintama',"Action, Comedy, Historical, Parody, Samurai, S...",TV,51,9.16,151266


# Working with pandas DataFrames Using PandasAI

Wrapping a pandas DataFrame with `PandasAI` lets you query and transform it with natural-language prompts. This part of the notebook covers:

## 1. Introduction to PandasAI
   - Overview of `PandasAI` and its installation.

## 2. Loading and Enhancing Data
   - Load data into a pandas DataFrame.
   - Apply `PandasAI` methods for improved data manipulation.

## 3. Advanced Data Analysis
   - Perform complex transformations and analyses with `PandasAI`.
   - Tips for optimizing data workflows.

In [5]:
# Set up the connection to the Azure OpenAI service
llm = AzureOpenAI(
    deployment_name=OPENAI_DEPLOYMENT_NAME,
    api_version="2023-12-01-preview",
    api_token=AZURE_OPENAI_API_KEY,
)

# Wrap the pandas DataFrame so it can be queried in natural language
df = SmartDataframe(df_anime, config={"llm": llm})

# Ask the LLM to clean the data; the result replaces `df`
df = df.chat("""
            Instruction
            - help me clean this dataset and order it by column 'anime_id' ascending
            - help me remove special characters
            - normalize the records by lowercasing all text
            - reset index
            """)

df.head()

Unnamed: 0,anime_id,name,genre,type,episodes,rating,members
0,1,cowboy bebop,"[action, adventure, comedy, drama, scifi, space]",tv,26,8.82,486824
1,5,cowboy bebop tengoku no tobira,"[action, drama, mystery, scifi, space]",movie,1,8.4,137636
2,6,trigun,"[action, comedy, scifi]",tv,26,8.32,283069
3,7,witch hunter robin,"[action, drama, magic, mystery, police, supern...",tv,26,7.36,64905
4,8,beet the vandel buster,"[adventure, fantasy, shounen, supernatural]",tv,52,7.06,9848
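
Because the LLM decides how to interpret the prompt, the cleaned result may vary between runs. For comparison, the same four instructions can be sketched in plain pandas. This is a minimal sketch, not what PandasAI runs internally: `strip_special` and `clean_anime` are hypothetical helpers, "special character" is assumed to mean anything other than ASCII letters, digits, and spaces, and the sample rows are copied from the `head()` output earlier in the notebook.

```python
import re

import pandas as pd


def strip_special(text: str) -> str:
    """Lowercase a string and keep only ASCII letters, digits, and spaces."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def clean_anime(frame: pd.DataFrame) -> pd.DataFrame:
    """Apply the prompt's steps: normalize text, sort by anime_id, reset the index."""
    out = frame.copy()
    # na_action="ignore" leaves missing values untouched instead of raising.
    out["name"] = out["name"].map(strip_special, na_action="ignore")
    out["type"] = out["type"].map(strip_special, na_action="ignore")
    # Split the comma-separated genre string into a list of cleaned tokens.
    out["genre"] = out["genre"].map(
        lambda text: [strip_special(part) for part in text.split(",")],
        na_action="ignore",
    )
    return out.sort_values("anime_id").reset_index(drop=True)


# Two rows copied from the raw data shown earlier in the notebook.
sample = pd.DataFrame(
    {
        "anime_id": [32281, 9253],
        "name": ["Kimi no Na wa.", "Steins;Gate"],
        "genre": ["Drama, Romance, School, Supernatural", "Sci-Fi, Thriller"],
        "type": ["Movie", "TV"],
    }
)
cleaned = clean_anime(sample)
print(cleaned)
```

A deterministic version like this is also a convenient baseline for checking whether the LLM-generated result did what the prompt asked.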


In [8]:
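# Inspect what kind of object chat() returned; `df.type()` would instead look up the 'type' column and fail
type(df)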

# Plotting with PandasAI

## Example of Using PandasAI to Plot a Chart from a Pandas DataFrame

This section shows how to use `PandasAI` to visualize data directly from a pandas DataFrame. The example asks for a bar chart of the 10 highest-rated anime.

In [3]:
# Set up the connection to the Azure OpenAI service
llm = AzureOpenAI(
    deployment_name=OPENAI_DEPLOYMENT_NAME,
    api_version="2023-12-01-preview",
    api_token=AZURE_OPENAI_API_KEY
)

# Directory for saved charts (used when "save_charts" is True)
user_defined_path = 'export/'

# Wrap the pandas DataFrame in pandasai's SmartDataframe so it can be queried through the LLM
df = SmartDataframe(df_anime, config={
    "llm": llm,  # LLM that generates the plotting code
    "save_charts": False,  # Set to True to save generated charts under save_charts_path
    "save_charts_path": user_defined_path
})

# Request a single bar chart with a different color for each bar
chart = df.chat("""
    Instruction:
    "Plot a single top 10 anime in barchart based on rating with different color on each bar"
    """)

# Print the value returned by chat(); for plot requests this is typically the path of the generated image
print(chart)
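
To sanity-check which titles the chart should contain, the "top 10 by rating" selection can be reproduced with plain pandas. The sketch below is an illustration under stated assumptions, not the code PandasAI generates: `top_rated` is a hypothetical helper, and the three sample rows are copied from the `head()` output earlier in the notebook.

```python
import pandas as pd


def top_rated(frame: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n rows with the highest rating, best first."""
    return frame.nlargest(n, "rating")[["name", "rating"]].reset_index(drop=True)


# Sample rows copied from the notebook output, deliberately out of order.
sample = pd.DataFrame(
    {
        "name": ["Steins;Gate", "Kimi no Na wa.", "Fullmetal Alchemist: Brotherhood"],
        "rating": [9.17, 9.37, 9.26],
    }
)
top = top_rated(sample, n=2)
print(top)
```

On the full dataset, `top_rated(df_anime)` gives the ten rows to compare against the generated chart. With matplotlib installed, passing the result to `DataFrame.plot.bar(x="name", y="rating")` draws a comparable bar chart.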