# Scenario 1: Retail Inventory Analysis

## My Prompt
You are a retail analysis expert. We are going to perform product-level inventory analysis using Python.

Use the following Python libraries: pandas, numpy, datetime, matplotlib, and seaborn.

You are given a DataFrame with the following columns:

- product_id
- category
- stock_level
- last_restock_date (string or datetime)
- sales_last_30_days
- supplier_lead_time (in days)
- unit_cost

I need you to generate Python code that:
1. Calculates **inventory turnover rate** (sales_last_30_days / stock_level)  
   - Handle division by zero and missing values appropriately.
2. Identifies **slow-moving products**
   - These are items with low turnover.
3. Flags **potential stockouts**
   - Use stock level, sales rate, and lead time to estimate which items may run out soon.
4. Creates **clear and labeled visualizations**
   - Include at least one bar chart and one scatter plot, using seaborn or matplotlib.

Please:
- Add all calculated columns to the existing DataFrame.
- Write clean, readable code with **docstrings and inline comments**.
- Do not assume the data is clean — handle nulls or formatting where needed.

Let me review and test the output. Then I’ll ask for refinements.


## Generated Explanation
📊 Inventory Analysis Plan

1. Data Preparation
   - Parse last_restock_date as datetime.
   - Handle missing values in relevant columns.
   - Replace 0s in stock_level with np.nan for safe division.
2. Inventory Turnover Rate
   - Formula: sales_last_30_days / stock_level
   - Handle stock_level == 0 or NaN to avoid division errors.
3. Identify Slow-Moving Products
   - Low turnover threshold (e.g., turnover < 0.5) → can be adjusted.
   - Add a boolean column is_slow_moving.
4. Flag Potential Stockouts
   - Estimate days of stock left: stock_level / (sales_last_30_days / 30)
   - Compare with supplier_lead_time.
   - If days left < lead time → potential stockout.
5. Visualizations
   - Bar chart: Average turnover per category.
   - Scatter plot: Stock level vs. sales with annotations/highlighting for stockouts.
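
The turnover and stockout formulas in the plan can be checked by hand on a few rows. The following is a minimal sketch on three made-up products; the values are illustrative only and do not come from any real dataset, while the column names follow the prompt.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows (illustrative values only)
sample = pd.DataFrame({
    "product_id": ["P1", "P2", "P3"],
    "stock_level": [10, 50, 0],
    "sales_last_30_days": [100, 10, 20],
    "supplier_lead_time": [5, 15, 10],
})

# Mask zero stock so the turnover division yields NaN rather than inf
safe_stock = sample["stock_level"].replace(0, np.nan)
sample["inventory_turnover"] = sample["sales_last_30_days"] / safe_stock

# Days of stock left = current stock / average daily sales
daily_sales = sample["sales_last_30_days"] / 30
sample["days_left"] = sample["stock_level"] / daily_sales

# At risk when the stock would run out before a new order could arrive
sample["potential_stockout"] = sample["days_left"] < sample["supplier_lead_time"]

print(sample)
```

With these numbers, P1 has about 3 days of stock against a 5-day lead time and is flagged, P2 has about 150 days against 15 and is not, and P3 has no stock at all, so its turnover is NaN and its days left are 0.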



In [1]:
# Generated Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

# Example DataFrame (replace this with your actual data)
# df = pd.read_csv("your_inventory_data.csv")

# -------------------------------
# 🧹 Data Cleaning & Preparation
# -------------------------------

def preprocess_inventory(df):
    """
    Cleans and prepares the inventory DataFrame for analysis.
    - Converts dates
    - Handles missing values
    - Ensures numeric types are correct
    """
    # Convert 'last_restock_date' to datetime; unparseable values become NaT
    df['last_restock_date'] = pd.to_datetime(df['last_restock_date'], errors='coerce')

    # Coerce numeric columns first so string values such as "0" are handled too
    cols_to_check = ['sales_last_30_days', 'stock_level', 'supplier_lead_time', 'unit_cost']
    df[cols_to_check] = df[cols_to_check].apply(pd.to_numeric, errors='coerce')

    # Replace 0 stock_level with NaN to avoid division by zero
    df['stock_level'] = df['stock_level'].replace(0, np.nan)

    # Drop rows with missing values (this also drops zero-stock products)
    df.dropna(subset=cols_to_check, inplace=True)
    
    return df

# -------------------------------
# 📈 Inventory Turnover Analysis
# -------------------------------

def add_turnover_column(df):
    """
    Calculates inventory turnover rate.
    turnover = sales_last_30_days / stock_level
    Adds a new column 'inventory_turnover'
    """
    df['inventory_turnover'] = df['sales_last_30_days'] / df['stock_level']
    return df

# -------------------------------
# 🐢 Identify Slow-Moving Products
# -------------------------------

def flag_slow_moving(df, threshold=0.5):
    """
    Flags slow-moving products based on low turnover.
    Adds a boolean column 'is_slow_moving'.
    """
    df['is_slow_moving'] = df['inventory_turnover'] < threshold
    return df

# -------------------------------
# ⚠️ Flag Potential Stockouts
# -------------------------------

def flag_potential_stockouts(df):
    """
    Flags products likely to stock out before restock arrives.
    Adds 'days_left' and 'potential_stockout' columns.
    """
    # Estimate daily sales rate
    df['daily_sales_rate'] = df['sales_last_30_days'] / 30
    
    # Estimate days of stock left
    df['days_left'] = df['stock_level'] / df['daily_sales_rate']
    
    # Flag potential stockouts where days_left < supplier lead time
    df['potential_stockout'] = df['days_left'] < df['supplier_lead_time']
    return df

# -------------------------------
# 📊 Visualizations
# -------------------------------

def visualize_inventory(df):
    """
    Creates visualizations:
    - Bar chart: Average turnover by category
    - Scatter plot: Stock level vs. Sales
    """
    plt.figure(figsize=(10, 6))
    # errorbar=None requires seaborn >= 0.12; older versions use ci=None
    sns.barplot(data=df, x='category', y='inventory_turnover', errorbar=None)
    plt.title('Average Inventory Turnover by Category')
    plt.ylabel('Inventory Turnover')
    plt.xlabel('Category')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

    plt.figure(figsize=(10, 6))
    sns.scatterplot(
        data=df,
        x='stock_level',
        y='sales_last_30_days',
        hue='potential_stockout',
        style='is_slow_moving',
        palette={True: 'red', False: 'green'}
    )
    plt.title('Stock Level vs. Sales (Last 30 Days)')
    plt.xlabel('Stock Level')
    plt.ylabel('Sales in Last 30 Days')
    plt.legend(title='Potential Stockout / Slow-Moving')
    plt.tight_layout()
    plt.show()

# -------------------------------
# 🧠 Putting it all together
# -------------------------------

def analyze_inventory(df):
    """
    Executes the full inventory analysis pipeline on the given DataFrame.
    Returns the enriched DataFrame.
    """
    df = preprocess_inventory(df)
    df = add_turnover_column(df)
    df = flag_slow_moving(df)
    df = flag_potential_stockouts(df)
    visualize_inventory(df)
    return df

# Example usage:
# df = analyze_inventory(df)


✅ Summary of New Columns Added to the DataFrame:

- inventory_turnover
- is_slow_moving (bool)
- daily_sales_rate
- days_left
- potential_stockout (bool)
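
As a hedged illustration of how these columns might be used downstream, the sketch below builds a small stand-in for the enriched DataFrame and derives a reorder list, a slow-mover list, and a per-category summary. The values are hypothetical, and the names `enriched`, `reorder_list`, `slow_movers`, and `summary` are illustrative rather than part of the generated code.

```python
import pandas as pd

# Stand-in for the output of analyze_inventory (hypothetical values)
enriched = pd.DataFrame({
    "product_id": [1, 3, 4],
    "category": ["A", "A", "C"],
    "inventory_turnover": [10.0, 0.2, 0.25],
    "is_slow_moving": [False, True, True],
    "days_left": [3.0, 150.0, 120.0],
    "potential_stockout": [True, False, False],
})

# Products to reorder first: at-risk items, most urgent at the top
reorder_list = (
    enriched[enriched["potential_stockout"]]
    .sort_values("days_left")
    [["product_id", "category", "days_left"]]
)

# Slow movers, slowest first
slow_movers = (
    enriched[enriched["is_slow_moving"]]
    .sort_values("inventory_turnover")
    [["product_id", "category", "inventory_turnover"]]
)

# Share of flagged products per category (the mean of a boolean column)
summary = enriched.groupby("category")[["is_slow_moving", "potential_stockout"]].mean()

print(reorder_list, slow_movers, summary, sep="\n\n")
```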

## Follow-up Prompts
- "Handle division by zero when calculating inventory turnover."
- "Visualize turnover by category and stock vs sales using seaborn."
- "Ensure the DataFrame is cleaned of nulls and formatting issues before analysis."
- "Can you use color coding in the scatterplot for 'potential_stockout'?"
- "Add a reusable main function that executes the full analysis pipeline."

In [3]:
#Final Solution
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

# Example DataFrame (replace this with your actual data)
# df = pd.read_csv("your_inventory_data.csv")

# -------------------------------
# 🧹 Data Cleaning & Preparation
# -------------------------------

def preprocess_inventory(df):
    """
    Cleans and prepares the inventory DataFrame for analysis.
    - Converts dates
    - Handles missing values
    - Ensures numeric types are correct
    """
    # Convert 'last_restock_date' to datetime; unparseable values become NaT
    df['last_restock_date'] = pd.to_datetime(df['last_restock_date'], errors='coerce')

    # Coerce numeric columns first so string values such as "0" are handled too
    cols_to_check = ['sales_last_30_days', 'stock_level', 'supplier_lead_time', 'unit_cost']
    df[cols_to_check] = df[cols_to_check].apply(pd.to_numeric, errors='coerce')

    # Replace 0 stock_level with NaN to avoid division by zero
    df['stock_level'] = df['stock_level'].replace(0, np.nan)

    # Drop rows with missing values (this also drops zero-stock products)
    df.dropna(subset=cols_to_check, inplace=True)

    return df

# -------------------------------
# 📈 Inventory Turnover Analysis
# -------------------------------

def add_turnover_column(df):
    """
    Calculates inventory turnover rate.
    turnover = sales_last_30_days / stock_level
    Adds a new column 'inventory_turnover'. Handles division by zero.
    """
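    # Mask zero stock as NaN so the division yields NaN instead of inf.
    # A membership test such as `x not in [0, np.nan]` does not reliably
    # detect NaN, so the check is done with a vectorized replace instead.
    safe_stock = df['stock_level'].replace(0, np.nan)
    df['inventory_turnover'] = df['sales_last_30_days'] / safe_stock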
    return df

# -------------------------------
# 🐢 Identify Slow-Moving Products
# -------------------------------

def flag_slow_moving(df, threshold=0.5):
    """
    Flags slow-moving products based on low turnover.
    Adds a boolean column 'is_slow_moving'.
    """
    df['is_slow_moving'] = df['inventory_turnover'] < threshold
    return df

# -------------------------------
# ⚠️ Flag Potential Stockouts
# -------------------------------

def flag_potential_stockouts(df):
    """
    Flags products likely to stock out before restock arrives.
    Adds 'days_left' and 'potential_stockout' columns.
    """
    # Estimate daily sales rate
    df['daily_sales_rate'] = df['sales_last_30_days'] / 30

    # Estimate days of stock left
    df['days_left'] = df['stock_level'] / df['daily_sales_rate']

    # Flag potential stockouts where days_left < supplier lead time
    df['potential_stockout'] = df['days_left'] < df['supplier_lead_time']
    return df

# -------------------------------
# 📊 Visualizations
# -------------------------------

def visualize_inventory(df):
    """
    Creates visualizations:
    - Bar chart: Average turnover by category
    - Scatter plot: Stock level vs. Sales
    """
    plt.figure(figsize=(10, 6))
    # errorbar=None requires seaborn >= 0.12; older versions use ci=None
    sns.barplot(data=df, x='category', y='inventory_turnover', errorbar=None)
    plt.title('Average Inventory Turnover by Category')
    plt.ylabel('Inventory Turnover')
    plt.xlabel('Category')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

    plt.figure(figsize=(10, 6))
    sns.scatterplot(
        data=df,
        x='stock_level',
        y='sales_last_30_days',
        hue='potential_stockout',
        style='is_slow_moving',
        palette={True: 'red', False: 'green'}
    )
    plt.title('Stock Level vs. Sales (Last 30 Days)')
    plt.xlabel('Stock Level')
    plt.ylabel('Sales in Last 30 Days')
    plt.legend(title='Potential Stockout / Slow-Moving')
    plt.tight_layout()
    plt.show()

# -------------------------------
# 🧠 Full Analysis Pipeline
# -------------------------------

def analyze_inventory(df):
    """
    Executes the full inventory analysis pipeline on the given DataFrame.
    Returns the enriched DataFrame.
    """
    df = preprocess_inventory(df)
    df = add_turnover_column(df)
    df = flag_slow_moving(df)
    df = flag_potential_stockouts(df)
    visualize_inventory(df)
    return df

# -------------------------------
# 🔄 Main Function for Reusability
# -------------------------------

def main():
    """
    Main entry point to load data and run the full inventory analysis.
    Replace the data loading section with your actual source.
    """
    # Load data
    # df = pd.read_csv("your_inventory_data.csv")
    # For now, creating a placeholder DataFrame
    df = pd.DataFrame({
        'product_id': [1, 2, 3, 4],
        'category': ['A', 'B', 'A', 'C'],
        'stock_level': [10, 0, 50, 20],
        'last_restock_date': ['2023-07-01', '2023-07-05', '2023-06-20', '2023-07-15'],
        'sales_last_30_days': [100, 20, 10, 5],
        'supplier_lead_time': [5, 10, 15, 20],
        'unit_cost': [5.0, 3.0, 8.0, 2.5]
    })

    result_df = analyze_inventory(df)
    print(result_df)

# Example usage:
# if __name__ == "__main__":
#     main()
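
Note that `preprocess_inventory` turns a `stock_level` of 0 into NaN and then drops rows containing NaN, so product 2 in the placeholder data never reaches the stockout check. If out-of-stock items should stay in the report, one possible variation is to keep those rows and mask zeros only inside the division. This is an assumption about the desired behavior, not part of the generated solution, and the helper name `turnover_keep_zero_stock` is hypothetical.

```python
import numpy as np
import pandas as pd

def turnover_keep_zero_stock(df):
    """Return a copy with turnover and stockout columns, keeping zero-stock rows."""
    out = df.copy()
    num_cols = ["stock_level", "sales_last_30_days", "supplier_lead_time"]
    out[num_cols] = out[num_cols].apply(pd.to_numeric, errors="coerce")

    # Only rows with genuinely missing inputs are dropped
    out = out.dropna(subset=num_cols)

    # Mask zeros for the division only; the stored stock_level stays 0
    safe_stock = out["stock_level"].replace(0, np.nan)
    out["inventory_turnover"] = out["sales_last_30_days"] / safe_stock

    # With no recent sales the ratio is undefined; treat that as unlimited cover
    daily = (out["sales_last_30_days"] / 30).replace(0, np.nan)
    out["days_left"] = (out["stock_level"] / daily).fillna(np.inf)

    out["potential_stockout"] = out["days_left"] < out["supplier_lead_time"]
    return out

# Same stock, sales, and lead-time values as the placeholder data in main()
demo = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "stock_level": [10, 0, 50, 20],
    "sales_last_30_days": [100, 20, 10, 5],
    "supplier_lead_time": [5, 10, 15, 20],
})
result = turnover_keep_zero_stock(demo)
print(result[["product_id", "inventory_turnover", "days_left", "potential_stockout"]])
```

In this variation product 2 is kept with a NaN turnover and 0 days left, so it is flagged as a potential stockout rather than silently removed.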


## Scenario 1: Discussion Questions
1. **How did different prompts handle date calculations?**

   The AI automatically converted last_restock_date into a datetime object using pd.to_datetime, with error handling for invalid formats. This only happened after I clarified that the column might not be clean. Without that clarification, earlier versions of the code did not convert or validate date fields. This shows that prompt specificity (mentioning unclean data and data types) directly influences the robustness of the code generated.
   
2. **What visualization approaches were suggested?**

   The AI used seaborn to create both a bar chart (inventory turnover by category) and a scatter plot (stock level vs. sales), which are appropriate choices for categorical and numeric comparisons. In a follow-up prompt, I asked the model to add color coding and legends to make the scatter plot more informative. Earlier prompt versions used basic matplotlib without styling, showing that prompt refinement improves visual clarity.

3. **How was error handling implemented?**

   After a follow-up prompt, the AI addressed potential division by zero in the inventory turnover calculation by replacing stock levels of zero with NaN. It also added data cleaning steps to convert numeric fields and drop rows with missing values. These additions were not present in the first version of the prompt, again emphasizing the importance of explicitly requesting data validation and error handling in prompts.
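
The coercion pattern described in the date and error-handling answers above can be seen in isolation on a few deliberately messy values. This is a minimal sketch with invented sample data; it relies only on `pd.to_datetime` and `pd.to_numeric` with `errors='coerce'`, the same calls used in the generated code.

```python
import pandas as pd

# Invented messy input: a bad date, a missing date, and non-numeric stock values
messy = pd.DataFrame({
    "last_restock_date": ["2023-07-01", "not a date", None],
    "stock_level": ["10", "n/a", "0"],
})

# Unparseable or missing dates become NaT instead of raising an error
messy["last_restock_date"] = pd.to_datetime(messy["last_restock_date"], errors="coerce")

# Non-numeric strings become NaN; numeric strings are converted to numbers
messy["stock_level"] = pd.to_numeric(messy["stock_level"], errors="coerce")

print(messy)
print(messy.isna().sum())
```

The string `"0"` only becomes a numeric 0 after `pd.to_numeric`, so any zero-stock check needs to run after the conversion rather than before it.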

