
## 🏪 Coffee Sales Analysis
---

### 1. Business Understanding ☕️

Before touching the data, ask:

* What problem am I solving?
* Who will use the insights?
* What decisions depend on the data?

**Example Questions (KPIs):**

* Which coffee type is most popular?
* How much revenue comes from each coffee type?
* Do sales peak during certain hours of the day?

**Your Goal:** Align the analysis with a *business question*.

---

In [None]:
# Step 1: Import Libraries

import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Step 2: Load Data 

# Load index_1.csv into a DataFrame
df = pd.read_csv("index_1.csv")

# View the first few rows
print(df.head())

### 2. Data Understanding 📊

Now, check what the dataset contains.

#### Step 1: Inspect the Data
* `df.head()` → first rows (quick look).
* `df.info()` → column types and non-null counts (reveals missing values).
* `df.describe()` → summary statistics (numeric columns).
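
A minimal sketch of these inspection calls. The `sample` DataFrame is a hypothetical stand-in for `index_1.csv` with made-up values, so the block runs on its own; with the real file you would call the same methods on `df`.

```python
import pandas as pd

# Hypothetical sample standing in for index_1.csv (all values are made up).
sample = pd.DataFrame({
    "date": ["2024-03-01", "2024-03-01", "2024-03-02", "2024-03-02"],
    "datetime": ["2024-03-01 10:15:00", "2024-03-01 12:19:00",
                 "2024-03-02 10:22:00", "2024-03-02 16:40:00"],
    "cash_type": ["card", "card", "cash", "card"],
    "card": ["ANON-0001", "ANON-0002", None, "ANON-0001"],
    "money": [38.7, 33.8, 40.0, 28.9],
    "coffee_name": ["Latte", "Americano with Milk", "Latte", "Americano"],
})

n_rows, n_cols = sample.shape    # number of rows and columns
missing = sample.isna().sum()    # missing values per column
dtypes = sample.dtypes           # data type of each column

print(f"{n_rows} rows, {n_cols} columns")
print(missing)
print(dtypes)
sample.info()                    # column types and non-null counts
print(sample.describe())         # summary statistics for numeric columns
```

In this made-up sample, the cash purchase has no `card` value, which is the kind of gap `isna().sum()` is meant to surface.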

#### Step 2: Understand Variables

* **date/datetime** → time of purchase.
* **cash\_type** → type of payment (card/cash).
* **card** → anonymized customer ID.
* **money** → transaction amount.
* **coffee\_name** → type of coffee bought.

**Your Task:** Relate these to your *business questions*.

---

In [None]:
# Step 3: Initial Exploration: understand the structure of the data.
# Know how many rows and columns are in the dataset. Check for missing values. 
# Understand the data types of each column.


### 3. Data Exploration 🔎

Now start exploring patterns.

#### Step 1: Univariate Exploration

Look at one variable at a time.


df['coffee_name'].value_counts()

*Shows how many times each coffee was sold.*

df['money'].hist()

*Histogram of transaction amounts.*
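
As a small extension of the calls above, the sketch below adds each coffee's share of sales and a numeric summary of `money`. It uses a hypothetical `sample` DataFrame with made-up values in place of `df`.

```python
import pandas as pd

# Hypothetical sample (made-up values) standing in for df.
sample = pd.DataFrame({
    "coffee_name": ["Latte", "Americano", "Latte", "Cappuccino", "Latte"],
    "money": [38.7, 28.9, 38.7, 38.7, 33.8],
})

# Number of sales per coffee, most frequent first
counts = sample["coffee_name"].value_counts()

# Same counts expressed as a fraction of all sales
shares = sample["coffee_name"].value_counts(normalize=True)

# Count, mean, min, max and quartiles of the transaction amounts
money_summary = sample["money"].describe()

print(counts)
print(shares)
print(money_summary)
```

Shares are often easier to communicate than raw counts when the number of transactions changes from period to period.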



In [None]:
# Explore one variable at a time. (columns)

#### Step 2: Bivariate Exploration

Compare two variables.


df.groupby('coffee_name')['money'].sum()

*Total revenue per coffee type.*

df.groupby('cash_type')['money'].sum()


*Revenue by payment method.*
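
One possible way to get more out of each `groupby` is to compute several aggregates at once. The sketch below assumes a hypothetical `sample` DataFrame with made-up values; the names `revenue`, `orders`, and `avg_ticket` are illustrative labels, not columns from the dataset.

```python
import pandas as pd

# Hypothetical sample (made-up values) standing in for df.
sample = pd.DataFrame({
    "coffee_name": ["Latte", "Americano", "Latte", "Cappuccino"],
    "cash_type": ["card", "card", "cash", "card"],
    "money": [30.0, 20.0, 30.0, 40.0],
})

# Revenue, number of orders, and average amount per coffee type
summary = (
    sample.groupby("coffee_name")["money"]
    .agg(revenue="sum", orders="count", avg_ticket="mean")
    .sort_values("revenue", ascending=False)
)

# Revenue by payment method, as totals and as a share of all revenue
by_payment = sample.groupby("cash_type")["money"].sum()
payment_share = by_payment / by_payment.sum()

print(summary)
print(payment_share)
```

Putting revenue next to order counts helps separate coffees that sell often from coffees that simply cost more.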

---

In [None]:
# Explore relationships between variables. (columns)

#### Step 3: Time-based Exploration

Since you have dates, explore trends.


df['date'] = pd.to_datetime(df['date'])
daily_sales = df.groupby('date')['money'].sum()
daily_sales.plot(kind='line')


*Trend of total sales over time.*
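
To look at hours of the day rather than whole days, a similar grouping can be done on the hour. This is a sketch that assumes the dataset has a `datetime` column holding the full timestamp of each purchase; the `sample` data and the `hour` column are hypothetical.

```python
import pandas as pd

# Hypothetical sample (made-up values) standing in for df.
sample = pd.DataFrame({
    "datetime": ["2024-03-01 10:15:00", "2024-03-01 10:40:00",
                 "2024-03-01 12:05:00", "2024-03-02 16:30:00"],
    "money": [30.0, 20.0, 40.0, 30.0],
})

# Parse the timestamps, then extract the hour of day (0-23)
sample["datetime"] = pd.to_datetime(sample["datetime"])
sample["hour"] = sample["datetime"].dt.hour

# Total revenue per hour of day, and the hour with the highest total
hourly_sales = sample.groupby("hour")["money"].sum()
peak_hour = hourly_sales.idxmax()

print(hourly_sales)
print("Peak hour:", peak_hour)
```

On the real data, `hourly_sales.plot(kind='bar')` would give a quick picture of the busiest hours.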


In [None]:
# Check the trend of sales over time. (rows)
# Look for any hidden patterns.

### 4. Data Presentation 📈

Now, summarize insights clearly.

* Use **bar charts** for categories (e.g., coffee types).
* Use **line charts** for time trends.
* Use **tables** for top 5 items or customers.
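
For the top-5 tables, one option is `nlargest` on grouped totals. The sketch below uses a hypothetical `sample` DataFrame with made-up values and assumes that `card` identifies a customer and is missing for cash purchases.

```python
import pandas as pd

# Hypothetical sample (made-up values) standing in for df.
sample = pd.DataFrame({
    "coffee_name": ["Latte", "Latte", "Americano", "Cappuccino",
                    "Espresso", "Cortado", "Cocoa"],
    "card": ["ANON-0001", "ANON-0002", None, "ANON-0001",
             "ANON-0003", "ANON-0002", "ANON-0001"],
    "money": [30.0, 30.0, 20.0, 35.0, 15.0, 25.0, 32.0],
})

# Top 5 coffee types by total revenue, as a two-column table
top_coffees = (
    sample.groupby("coffee_name")["money"].sum()
    .nlargest(5)
    .reset_index(name="revenue")
)

# Top 5 customers by total spend, skipping rows without a card ID
top_customers = (
    sample.dropna(subset=["card"])
    .groupby("card")["money"].sum()
    .nlargest(5)
)

print(top_coffees)
print(top_customers)
```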



#### Example: Coffee Popularity

```python
import matplotlib.pyplot as plt

# Bar chart of the number of sales per coffee type
df['coffee_name'].value_counts().plot(kind='bar')
plt.title("Most Popular Coffees")
plt.show()
```

In [None]:
# plot a bar chart for the most popular coffee types.

#### Example: Revenue Trend

daily_sales.plot(kind='line', figsize=(8,4), title="Daily Revenue")

In [None]:
# Plot a line chart for the revenue trend over time.

### Answer These:

- The most popular coffee is: `??` (replace after running `value_counts()`).
- Highest revenue comes from: `??`.
- Peak sales occur during: `??` (based on datetime analysis).
- Customers prefer: `??` (card or cash).

This is how we turn raw data into **business insights**!

## ✨ Tips

* Always start with **why** (business question).
* Use **pandas** for exploration, **matplotlib/seaborn** for visuals.
* Tell a **story** at the end, for example: "Latte and Hot Chocolate drive most revenue, with peak sales during midday."

---

## Business Objectives Template

### Primary Goal:
[What is the main business problem we're solving?]

### Success Metrics:
[How will we measure success?]

### Stakeholders:
[Who cares about these results?]

### Constraints:
[What limitations do we have? Time, budget, data quality?]

### Expected Outcomes:
[What decisions will be made based on this analysis?]