# Using PhiData Agent to work as a Data Analyst


In this notebook, we use **PhiData**, a framework for defining and running AI agents for specific tasks. This example shows how structured data (in the form of a table) can be used by an agent to extract insights.

## Steps Overview
1. Define a tool to access a data source (CSV).
2. Define an agent with an LLM and the data tool to perform data analysis.
3. Initiate the agent in interactive mode.
4. Add an internet search agent to the analyst's team.

---
### Prerequisites
- Install the `phidata` library.
- Obtain API keys for the LLM model you want to use (e.g., Groq or OpenAI).
- Set up a Python environment with necessary dependencies.

---
### Code Walkthrough
Below is the implementation to define and use an AI agent for answering queries.

### Step 0 : Required installation and import of dependencies

In [2]:
!pip install phidata groq duckdb duckduckgo-search

Collecting phidata
  Downloading phidata-2.7.10-py3-none-any.whl.metadata (38 kB)
Collecting groq
  Downloading groq-0.18.0-py3-none-any.whl.metadata (14 kB)
Collecting duckduckgo-search
  Downloading duckduckgo_search-7.5.0-py3-none-any.whl.metadata (17 kB)
Collecting pydantic-settings (from phidata)
  Downloading pydantic_settings-2.8.1-py3-none-any.whl.metadata (3.5 kB)
Collecting python-dotenv (from phidata)
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Collecting tomli (from phidata)
  Downloading tomli-2.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting primp>=0.14.0 (from duckduckgo-search)
  Downloading primp-0.14.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Downloading phidata-2.7.10-py3-none-any.whl (716 kB)
Downloading groq-0.18.0-py3-none-any.whl (121 kB)


#### imports

In [3]:
from phi.agent import Agent
from phi.model.groq import Groq
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo
from phi.storage.agent.json import JsonFileAgentStorage
from datetime import datetime
from phi.tools.csv_tools import CsvTools

### Step 1 : Load the CSV file (Data) and initialise the tool

#### Mount your GDrive to access the file

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
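
Before handing the file to the tool, it can help to confirm that the path resolves and to peek at the header row. The sketch below uses only the standard library. `preview_csv` is a hypothetical helper name, not part of PhiData, and the demo runs on a throwaway temporary CSV with invented columns rather than the real `RainFall.csv`.

```python
import csv
import tempfile
from pathlib import Path


def preview_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file, or raise if it is missing."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"CSV not found: {csv_path}")
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops after n_rows without reading the rest of the file
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


# Demo on a temporary file with invented columns (NOT the notebook's data).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo.csv"
    demo_path.write_text(
        "region,year,rain_mm\nNorth,2020,812\nSouth,2020,1104\n",
        encoding="utf-8",
    )
    demo_header, demo_rows = preview_csv(demo_path)

print(demo_header)
print(demo_rows)
```

In the notebook, the same helper could be pointed at the Drive path used in the next cell to catch a wrong path before the agent is created.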


#### Initialise the CSV tool with the required data file
The CSV tool exposes the data file as a table and provides **SQL** features that the agent can use. It behaves like a simple in-memory database.

In [5]:
# Initialise the tool with specific data file
CSV_Tool = CsvTools(csvs=['/content/drive/My Drive/GrowthSchool_RAG_and_AgenticAI/Agentic_AI/Doc/RainFall.csv'],
                    read_csvs=False)
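
The idea of treating a CSV as an in-memory SQL table can be sketched with the standard library alone. This example uses `sqlite3` purely as a stand-in (the notebook installs `duckdb`, which is presumably what the CSV tool runs its queries with), and the table name, columns, and values are invented for illustration.

```python
import csv
import io
import sqlite3

# Invented sample data; the real RainFall.csv columns are not shown in this notebook.
SAMPLE_CSV = """region,year,rain_mm
North,2020,812
North,2021,790
South,2020,1104
South,2021,1196
"""


def load_csv_into_memory(csv_text, table="rainfall"):
    """Load CSV text into an in-memory SQLite table and return the connection."""
    reader = csv.DictReader(io.StringIO(csv_text))
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (region TEXT, year INTEGER, rain_mm REAL)")
    # Each CSV row is a dict, so named placeholders map directly to columns.
    conn.executemany(
        f"INSERT INTO {table} VALUES (:region, :year, :rain_mm)",
        list(reader),
    )
    return conn


conn = load_csv_into_memory(SAMPLE_CSV)
# The kind of aggregate query an agent might generate from a natural-language question.
avg_by_region = conn.execute(
    "SELECT region, AVG(rain_mm) FROM rainfall GROUP BY region ORDER BY region"
).fetchall()
print(avg_by_region)
conn.close()
```

The point of the sketch is the division of labour: the data lives in a queryable table, and the agent only has to produce SQL text.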


### Step 2 : Define Agent with CSV tool access

Define an agent and link it to an LLM. The CSV tool is attached to the agent, and the instructions direct it to query the data for information.

In [6]:
from google.colab import userdata
import os

# Set your Groq API key or any other LLM API key
# os.environ['GROQ_API_KEY'] = userdata.get('GROQ_API_KEY')
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

Data_Analyst = Agent (
                        # model=Groq (id="llama3-70b-8192"),
                        model=OpenAIChat (id="gpt-4o-mini"),
                        name = 'Data Analyst',
                        agent_id = 'Data Analyst',
                        description = 'You are an expert data analyst with deep knowledge in statistics and very good at SQL',
                        add_history_to_messages=True,
                        role = 'Expert data analyst with Statistical expertise',
                        storage=JsonFileAgentStorage("./tmp/agent_sessions_json"),
                        tools=[CSV_Tool],
                        instructions=["You have access to databases through the tools",
                                      "Use your SQL expertise to analyse and manipulate data",
                                      "Use efficient SQL queries to get the data.",
                                      "Respond to the question precisely"],
                        show_tool_calls=True,
                        markdown=True,
                      )
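
The cell above reads the key from Colab secrets, which only works inside Google Colab. A minimal sketch of a fallback, assuming the key may instead be present as an ordinary environment variable, is shown below. `resolve_api_key` and `DEMO_API_KEY` are hypothetical names introduced for illustration.

```python
import os


def resolve_api_key(name):
    """Return the key called `name` from Colab secrets if available, else from the environment."""
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is not defined or not accessible there.
        pass
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it as a Colab secret or an environment variable"
        )
    return value


# Demo with a dummy variable name and value (not a real key).
os.environ["DEMO_API_KEY"] = "dummy-value"
demo_key = resolve_api_key("DEMO_API_KEY")
print(demo_key == "dummy-value")
```

With such a helper, the assignment in the cell could become `os.environ['OPENAI_API_KEY'] = resolve_api_key('OPENAI_API_KEY')`, and the notebook would fail early with a readable message when no key is configured.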


### Step 3 : Launch agent in interactive mode

In [None]:
prompt = ''

while prompt != 'exit':
  prompt = input("Enter your query (or 'exit' to quit): ")
  if prompt != 'exit':
    # Send the query to the agent and print the text of its response
    Res = Data_Analyst.run(prompt)
    print(Res.content)
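
The same loop can be written so that it is easy to try out without a live model. The sketch below is one possible shape, not the notebook's own code: `chat_loop` is a hypothetical helper that takes the "ask the agent" step as a function, and the demo drives it with scripted input and an echo function standing in for the agent.

```python
def chat_loop(ask, read=input, write=print, stop_word="exit"):
    """Read queries until `stop_word` is entered; return the number of queries answered."""
    answered = 0
    while True:
        query = read("Enter your query (or 'exit' to quit): ").strip()
        if query.lower() == stop_word:
            break
        if not query:
            continue  # ignore empty input instead of sending it to the model
        write(ask(query))
        answered += 1
    return answered


# Demo with scripted input and a stand-in for the agent call.
scripted = iter(["total rainfall?", "", "EXIT"])
outputs = []
count = chat_loop(
    ask=lambda q: f"echo: {q}",
    read=lambda _prompt: next(scripted),
    write=outputs.append,
)
print(count, outputs)
```

In the notebook this could be wired to the agent as `chat_loop(lambda q: Data_Analyst.run(q).content)`, which mirrors the `run` and `content` usage in the cell above.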

### Step 4 : Add an internet search agent to the team

In [None]:
# Current date, formatted like 05-Mar-2025
from datetime import datetime
Now = datetime.now ()
Today = Now.strftime ("%d-%b-%Y")

# Define an internet search agent
Explorer = Agent (
                      # model=Groq (id="llama3-70b-8192"),
                      model=OpenAIChat (id="gpt-4o-mini"),
                      name = 'Explorer',
                      agent_id = 'Explorer',
                      description = 'You are good at gathering important information from the internet for a given topic',
                      add_history_to_messages=True,
                      role = 'Expert internet information explorer',
                      storage=JsonFileAgentStorage("./tmp/agent_sessions_json"),
                      tools=[DuckDuckGo()],
                      instructions=["Get relevant information from the internet.",
                                    "Don't make up anything.",
                                    "Also provide your own summary.",
                                    "Assume today's date is "+Today],
                      # show_tool_calls=True,
                      markdown=True,
                  )

# Redefine the Data Analyst with the Explorer as a team member
Data_Analyst = Agent (
                        # model=Groq (id="llama3-70b-8192"),
                        model=OpenAIChat (id="gpt-4o-mini"),
                        name = 'Data Analyst',
                        agent_id = 'Data Analyst',
                        description = 'You are an expert data analyst with deep knowledge in statistics and very good at SQL',
                        add_history_to_messages=True,
                        role = 'Expert data analyst with Statistical expertise',
                        storage=JsonFileAgentStorage("./tmp/agent_sessions_json"),
                        tools=[CSV_Tool],
                        team=[Explorer],
                        instructions=["You have access to databases through the tools",
                                      "Use your SQL expertise to analyse and manipulate data",
                                      "Use efficient SQL queries to get the data.",
                                      "Look for additional information from the internet if needed",
                                      "Respond to the question precisely"],
                        # show_tool_calls=True,
                        markdown=True,
                      )
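
Instruction lists like the ones above are plain Python lists of strings, so two details are worth checking when editing them. Python joins adjacent string literals, which means a missing comma silently merges two instructions into one, and the date hint depends on the `strftime` format. The sketch below illustrates both. `build_instructions` is a hypothetical helper, and the instruction texts and the date are made up for the demo.

```python
from datetime import datetime

# A missing comma between two string literals silently merges them into one item.
merged = [
    "Look things up if needed"
    "Answer precisely",
]
separate = [
    "Look things up if needed",
    "Answer precisely",
]
print(len(merged), len(separate))


def build_instructions(base, today):
    """Return a copy of `base` plus a date hint; reject items that are not non-empty strings."""
    for item in base:
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"Bad instruction: {item!r}")
    # Same day-month-year format as the notebook's strftime call.
    return [*base, f"Assume today's date is {today:%d-%b-%Y}."]


instructions = build_instructions(separate, datetime(2025, 3, 5))
print(instructions[-1])
```

A quick `len(...)` check on an instruction list after editing it is a cheap way to notice a dropped comma.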