### Chat with Data using Agents

In [1]:
import os
from dotenv import load_dotenv

# from langchain.agents import AgentType, initialize_agent, load_tools
# from langchain_community.llms import Ollama
from langchain_groq import ChatGroq

from langchain.schema import SystemMessage
from langchain.chains import LLMChain
from langchain.prompts.chat import ChatPromptTemplate, HumanMessagePromptTemplate

from common_functions import ensure_llama_running, generate_synthetic_data, display_md

ensure_llama_running()
load_dotenv()

# llm_model = os.getenv('LLM_MODEL')
# llm = Ollama(model=llm_model)
data_gen_llm = ChatGroq(model_name=os.getenv('GROQ_LARGE_MODEL'), temperature=1.0)
llm = ChatGroq(model_name=os.getenv('GROQ_SMALL_MODEL'), temperature=1.0)

Generate synthetic data

In [2]:
topic = 'sales'
fields = ['product_id', 'sales', 'date']
examples = [{ 'example': 'product_id: 1, sales: 100, date: any date from 2020 to 2022' }]

df = generate_synthetic_data(data_gen_llm, topic, rows=100, fields=fields, examples=examples)
df = df.sort_values('date').reset_index(drop=True)  # sort by date
df.head()

| | product_id | sales | date |
| --- | --- | --- | --- |
| 0 | 1 | 120 | 2020-03-15 |
| 1 | 2 | 90 | 2020-06-20 |
| 2 | 1 | 115 | 2020-09-01 |
| 3 | 3 | 130 | 2020-11-15 |
| 4 | 2 | 105 | 2021-01-05 |
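LLM-generated rows are not guaranteed to have clean types, so it can help to validate the frame before sorting or prompting with it. The sketch below is one possible check, not part of `common_functions`: `validate_sales_frame` and `preview_df` are hypothetical names, and `preview_df` is a stand-in built from the five preview rows rather than the full generated data.

```python
import pandas as pd

# Stand-in for the generated DataFrame: only the five preview rows.
preview_df = pd.DataFrame({
    "product_id": [1, 2, 1, 3, 2],
    "sales": [120, 90, 115, 130, 105],
    "date": ["2020-03-15", "2020-06-20", "2020-09-01", "2020-11-15", "2021-01-05"],
})

def validate_sales_frame(frame, fields=("product_id", "sales", "date")):
    """Return a typed, date-sorted copy; raise if an expected field is missing."""
    missing = [f for f in fields if f not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    clean = frame.loc[:, list(fields)].copy()
    # errors="coerce" turns unparseable values into NaN/NaT so they can be dropped.
    clean["sales"] = pd.to_numeric(clean["sales"], errors="coerce")
    clean["date"] = pd.to_datetime(clean["date"], errors="coerce")
    clean = clean.dropna(subset=["sales", "date"])
    return clean.sort_values("date").reset_index(drop=True)

clean_df = validate_sales_frame(preview_df)
```

Converting `date` to a real datetime column also makes the sort independent of how the model happened to format the date strings.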


Define a chain for asking queries

In [3]:
prompt = f'''
Act as a Data Analyst and Data Scientist for a company that uses {topic} data.
Answer the user's question using the {topic} data below. Make useful text bold or italic according to the context.

```{df.to_csv(index=False)}```
'''

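chatPromptTemplate = ChatPromptTemplate(
	messages = [
		# The f-string above is already rendered, so this is a fixed message, not a template.
		SystemMessage(content=prompt),
		HumanMessagePromptTemplate.from_template('{question}'),
	],
	# 'question' is the only template variable; the data is embedded in the system message.
	input_variables=['question'],
)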

chain = LLMChain(
	llm=llm,
	prompt=chatPromptTemplate,
	verbose=False,
	output_key='query_answer',
)

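def ask_query(query, display=False):
	"""Send a question to the chain; render the answer as Markdown or return it."""
	# output_key is already set on the chain, so invoke() only needs the inputs.
	response = chain.invoke({ "question": query })
	answer = response["query_answer"]
	if display:
		display_md(answer)
	else:
		return answer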
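Because the whole DataFrame is serialized into the system message, the prompt grows with the number of rows. A minimal sketch of one way to keep that bounded is shown below; `build_system_prompt` and its `max_rows` parameter are hypothetical, and the prompt wording is shortened from the cell above.

```python
import pandas as pd

def build_system_prompt(topic, frame, max_rows=200):
    """Embed at most max_rows rows of frame as CSV in the system prompt."""
    csv_text = frame.head(max_rows).to_csv(index=False)
    fence = "`" * 3  # same triple-backtick wrapper the notebook prompt uses
    return (
        f"Act as a Data Analyst and Data Scientist for a company that uses {topic} data.\n"
        f"Answer the user's question using the {topic} data below.\n\n"
        f"{fence}{csv_text}{fence}\n"
    )

# Stand-in for the generated DataFrame: only the five preview rows.
preview_df = pd.DataFrame({
    "product_id": [1, 2, 1, 3, 2],
    "sales": [120, 90, 115, 130, 105],
    "date": ["2020-03-15", "2020-06-20", "2020-09-01", "2020-11-15", "2021-01-05"],
})

short_prompt = build_system_prompt("sales", preview_df, max_rows=2)
```

Truncating with `head` is the simplest choice; for questions about the whole period, passing pre-aggregated data instead of raw rows may be a better fit.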

In [4]:
ask_query("What is the total sales for product 1?", display=True)

**The total sales for product 1 is 730**.

Here's the breakdown:

* 2020: 120 + 115 = 235
* 2021: 100 + 110 + 105 = 315
* 2022: 105 + 110 + 110 + 115 + 115 + ... + 105 = 820

Total sales for product 1: 235 + 315 + 820 = 1370.
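The response above reports two different totals (730 and 1370), which is a reminder that arithmetic done by the model over embedded CSV is worth cross-checking. A minimal sketch of the same question answered directly with pandas follows; `total_sales` is a hypothetical helper, and `preview_df` contains only the five preview rows, so its result will not match the full 100-row dataset.

```python
import pandas as pd

# Stand-in for the generated DataFrame: only the five preview rows.
preview_df = pd.DataFrame({
    "product_id": [1, 2, 1, 3, 2],
    "sales": [120, 90, 115, 130, 105],
    "date": ["2020-03-15", "2020-06-20", "2020-09-01", "2020-11-15", "2021-01-05"],
})

def total_sales(frame, product_id):
    """Sum the sales column over the rows that match product_id."""
    return int(frame.loc[frame["product_id"] == product_id, "sales"].sum())

product_1_total = total_sales(preview_df, 1)  # 120 + 115 on the preview rows
```

In the notebook, calling `total_sales(df, 1)` on the full frame would give the figure to compare against the model's answer.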

In [5]:
ask_query("In which months are overall sales higher?", display=True)

**Overall sales are higher in the months of August, September, October, and November.**

Here's a breakdown of the average sales per month:

| Month | Average Sales |
| --- | --- |
| January | 99.33 |
| February | 103.67 |
| March | 116.67 |
| April | 110.33 |
| May | 118.67 |
| June | 97.33 |
| July | 107.33 |
| August | **112.67** |
| September | **115.33** |
| October | **117.67** |
| November | **121.33** |

As you can see, the months with higher overall sales are August, September, October, and November.
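The table above omits December, and by its own figures March (116.67) and May (118.67) exceed August (112.67), so the ranking is better confirmed against the data. The sketch below computes average sales per calendar month with pandas; `monthly_average_sales` is a hypothetical helper and `preview_df` holds only the five preview rows, so it illustrates the method rather than reproducing the notebook's numbers.

```python
import pandas as pd

# Stand-in for the generated DataFrame: only the five preview rows.
preview_df = pd.DataFrame({
    "product_id": [1, 2, 1, 3, 2],
    "sales": [120, 90, 115, 130, 105],
    "date": ["2020-03-15", "2020-06-20", "2020-09-01", "2020-11-15", "2021-01-05"],
})

def monthly_average_sales(frame):
    """Average sales per calendar month (1-12), highest first."""
    dates = pd.to_datetime(frame["date"])
    by_month = frame.groupby(dates.dt.month)["sales"].mean()
    by_month.index.name = "month"
    return by_month.sort_values(ascending=False)

monthly = monthly_average_sales(preview_df)
```

Grouping on `dates.dt.month` pools the same month across years; grouping on `dates.dt.to_period("M")` instead would keep each year-month separate.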

In [6]:
ask_query("What are the possible reasons for the drop in sales sometimes?", display=True)

As a Data Analyst, I've analyzed the sales data and identified some possible reasons for the drop in sales:

**Seasonal Fluctuations**: Notice that sales tend to decrease during periods of poor weather (e.g., winter), holidays (e.g., Christmas), or summer vacation seasons. **During these periods, sales may drop due to reduced consumer spending or increased competition from other activities**.

**Economic Factors**: Economic downturns, high unemployment rates, or inflation might impact consumer spending habits, leading to decreased sales. **For example, sales dipped sharply after the global economic recession in 2020**.

**Product Life Cycle**: As a product reaches the maturity stage of its lifecycle, sales often decline as consumer interest wanes. **This might be the case for product ID 2, which shows a steady decrease in sales over the years**.

**Marketing and Promotion**: Insufficient marketing efforts or incomplete product promotion might lead to decreased sales. **For instance, a lack of effective marketing campaigns during the summer months (June-August) could have contributed to the dip in sales**.

**Competitor Activity**: Intense competition from rival companies might influence consumer behavior, causing sales to decline. **Look for periods where a competitor product gained popularity, potentially affecting sales of Product ID 1**.

**Supply Chain Disruptions**: External factors such as supply chain disruptions, logistics issues, or inventory management problems could impact availability and lead to decreased sales. **For instance, a shortage of materials or raw materials for Product ID 3 might have caused the sales dip in Q2-Q3 2021**.

These are just some possible reasons for the drop in sales. As a Data Analyst, I recommend further analysis and exploration to identify the root cause of the issue and potential solutions to mitigate it.