# BUSINESS SCIENCE UNIVERSITY
# Free AI Training

**This AI example is designed to get you started with learning AI for data science today!**

You will:

1. Learn by applying AI to a CRM sales pipeline (won/lost) dataset
2. Create a pandas DataFrame agent to interact with the dataset
3. Run a quick example that answers questions about the data in natural language

**AI COURSE LAUNCH ON THURSDAY, MAY 30TH AT 2PM EST:**

👉 REGISTER HERE: https://learn.business-science.io/python-generative-ai-apps

## GOALS:

1. SHOW HOW AI CAN BE USED FOR DATA SCIENCE
2. SHOW SOME OF THE LOW-LEVEL DATA SCIENCE TASKS THAT CAN BE AUTOMATED WITH AI

## Libraries and AI Setup

In [None]:
from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI

import pandas as pd
import yaml

# OPENAI SETUP:
# Read the key inside a context manager so the file handle is closed promptly
with open('../credentials.yml') as f:
    OPENAI_API_KEY = yaml.safe_load(f)['openai']

# PANDAS DATA FRAME:

df = pd.read_csv("data/sales_pipeline.csv")
df

# 1.0 CREATE A PANDAS AI AGENT:

model = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0,  
    api_key=OPENAI_API_KEY
)

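# NOTE: the agent runs model-generated Python against `df`. Newer releases of
# langchain_experimental may require allow_dangerous_code=True to opt in to this.
pandas_agent = create_pandas_dataframe_agent(
    model,
    df,
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
)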
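
If `data/sales_pipeline.csv` is not at hand, a small stand-in frame can help you follow along with plain pandas. The sketch below is an assumption-laden illustration: the column names come from the dataset shown in this notebook, but every row, the `SAMPLE_CSV` and `sample_df` names, and any stage label other than `Won` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; only the column names
# are taken from the dataset used in this notebook.
SAMPLE_CSV = """opportunity_id,sales_agent,product,account,deal_stage,engage_date,close_date,close_value
OPP001,Agent A,Product X,Account 1,Won,2017-01-05,2017-02-01,1000
OPP002,Agent B,Product Y,Account 2,Lost,2017-01-10,2017-02-15,0
OPP003,Agent A,Product X,,Engaging,2017-03-01,,
OPP004,Agent C,Product Z,Account 3,Won,2017-03-12,2017-04-02,3000
"""

# Parse the two date columns up front so date questions work as expected.
sample_df = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    parse_dates=["engage_date", "close_date"],
)

print(sample_df.dtypes)
print(sample_df["deal_stage"].value_counts())
```

Open deals have no close date or close value in this stand-in, so those cells load as `NaT` and `NaN`.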

## Explore the Pandas Agent

In [None]:
pandas_agent

AgentExecutor(verbose=True, agent=RunnableAgent(runnable=RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_to_openai_function_messages(x['intermediate_steps']))
})
| ChatPromptTemplate(input_variables=['agent_scratchpad', 'input'], input_types={'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessage(content='\nYou are working with a pandas dataframe in Python. The name of the dataframe is `df`.\nThis is the result of `print(df.head())`:\n|    | opportunity_id   | sales_agent     | product        | account   | deal_stage   | engage_date   | close_date   |   close_value |\n|---:|:-----------------|:----------------|:---------------|:----------|:-------------|:--------------|:-------------|

## Answer Questions from Natural Language

### Q1. What are the column names in the data?

In [None]:
pandas_agent.invoke("What are the column names in the data?")



> Entering new AgentExecutor chain...

Invoking: `python_repl_ast` with `{'query': 'df.columns'}`


Index(['opportunity_id', 'sales_agent', 'product', 'account', 'deal_stage',
       'engage_date', 'close_date', 'close_value'],
      dtype='object')
The column names in the data are: 'opportunity_id', 'sales_agent', 'product', 'account', 'deal_stage', 'engage_date', 'close_date', and 'close_value'.

> Finished chain.


{'input': 'What are the column names in the data?',
 'output': "The column names in the data are: 'opportunity_id', 'sales_agent', 'product', 'account', 'deal_stage', 'engage_date', 'close_date', and 'close_value'."}
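
The log above shows the agent sending a pandas expression (`df.columns`) to a tool named `python_repl_ast`. As a rough mental model only, and not LangChain's actual implementation, that step resembles evaluating a string with `df` in scope. The `run_query` helper and the `df_demo` frame below are hypothetical names introduced for this sketch.

```python
import pandas as pd


def run_query(query: str, df: pd.DataFrame):
    """Evaluate one pandas expression with `df` in scope (illustration only)."""
    # WARNING: eval() executes arbitrary code. Only use it with trusted input.
    return eval(query, {"pd": pd}, {"df": df})


# Hypothetical mini frame standing in for the real dataset.
df_demo = pd.DataFrame(
    {
        "deal_stage": ["Won", "Lost", "Won"],
        "close_value": [1000.0, 0.0, 3000.0],
    }
)

print(run_query("df.columns", df_demo))
print(run_query("df[df['deal_stage'] == 'Won'].shape[0]", df_demo))
```

Because the model writes the code that gets executed, it is worth reading the `Invoking:` line in each log before trusting the final answer.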

### Q2. How many deals were won?

In [None]:
pandas_agent.invoke("How many deals were won?")



> Entering new AgentExecutor chain...

Invoking: `python_repl_ast` with `{'query': "df[df['deal_stage'] == 'Won'].shape[0]"}`


4238
There were 4238 deals that were won.

> Finished chain.


{'input': 'How many deals were won?',
 'output': 'There were 4238 deals that were won.'}
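
One way to sanity-check an answer like this is to run the same filter yourself and compare it against an independent count. The sketch below uses a hypothetical `df_check` frame; with the real data you would use `df` instead and expect the counts to match the agent's 4238.

```python
import pandas as pd

# Hypothetical stand-in for the deal_stage column of the real dataset.
df_check = pd.DataFrame(
    {"deal_stage": ["Won", "Lost", "Won", "Engaging", "Won"]}
)

# Same expression the agent generated: filter rows, then count them.
won_by_filter = df_check[df_check["deal_stage"] == "Won"].shape[0]

# Independent cross-checks: sum a boolean mask, and count every stage.
won_by_mask = int((df_check["deal_stage"] == "Won").sum())
stage_counts = df_check["deal_stage"].value_counts()

print(won_by_filter, won_by_mask)
print(stage_counts)
```

If the three numbers disagree, the usual culprits are inconsistent labels (for example different capitalization or stray whitespace), which `value_counts()` makes visible.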

### Q3. What is the average value for deals that were won?

In [None]:
pandas_agent.invoke("What is the average value for deals that were won?")



> Entering new AgentExecutor chain...

Invoking: `python_repl_ast` with `{'query': "df[df['deal_stage'] == 'Won']['close_value'].mean()"}`


2360.9093912222747
The average value for deals that were won is $2360.91.

> Finished chain.


{'input': 'What is the average value for deals that were won?',
 'output': 'The average value for deals that were won is $2360.91.'}
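
The same average can be reproduced directly in pandas, which also makes the treatment of missing values explicit: `Series.mean()` skips `NaN` by default. The frame below is a hypothetical stand-in (`df_check` and its rows are not from the dataset); the last line only reformats the raw number the agent reported.

```python
import pandas as pd

# Hypothetical stand-in rows; the third won deal has no recorded value.
df_check = pd.DataFrame(
    {
        "deal_stage": ["Won", "Lost", "Won", "Won"],
        "close_value": [1000.0, 0.0, 3000.0, None],
    }
)

# Select close_value for won deals only, then average (NaN is skipped).
won_values = df_check.loc[df_check["deal_stage"] == "Won", "close_value"]
avg_won = won_values.mean()

print(f"${avg_won:,.2f}")             # $2,000.00
print(f"${2360.9093912222747:,.2f}")  # $2,360.91
```

Formatting with `:,.2f` rounds to cents and adds a thousands separator, which is easier to read than the raw float in the tool output.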