
# AI Use Cases Library — Starter Analysis Notebook

This notebook demonstrates **basic analysis workflows** with the AI Use Case dataset (**2,260 cases**).  
It is intentionally **minimal** for v1.0: load the dataset, run a few quick cuts, and plot simple visuals.

**Folders**  
- `../../data/use-cases.csv` — main dataset  
- `../../insights/` — curated written insights  
- `../../charts/` — PNG charts for quick browsing  
- `../../tools/analysis-scripts/` — this folder  

Back to [README](../../README.md) · See insights: [Trends](../../insights/trends-analysis.md) · [Vendor Comparison](../../insights/vendor-comparison.md)  

> Tip: Create a new branch for your experiments before committing changes.


## 1) Setup

In [None]:

import pandas as pd
import matplotlib.pyplot as plt

# Display options
pd.set_option("display.max_colwidth", 160)

# Load dataset (CSV). If you are testing locally, ensure this relative path is correct.
df = pd.read_csv("../../data/use-cases.csv")

print(df.shape)
df.head()
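The later cells read four specific columns, so a quick check right after loading can catch a renamed or missing column early. The sketch below is one possible way to do that; `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the sample rows are made up for illustration.

```python
import io
import pandas as pd

# Columns that the later cells of this notebook read from.
EXPECTED_COLUMNS = [
    "Use Case Industry",
    "Use Case Domain",
    "Tool/Technology",
    "Outcomes & Benefits",
]

def missing_columns(frame, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from `frame`."""
    return [name for name in expected if name not in frame.columns]

# Tiny in-memory stand-in for the real CSV (made-up row, illustration only).
sample_csv = io.StringIO(
    "Use Case Industry,Use Case Domain,Tool/Technology\n"
    "Healthcare,Operations,Azure OpenAI\n"
)
sample_df = pd.read_csv(sample_csv)
print(missing_columns(sample_df))  # ['Outcomes & Benefits']
```

With the real dataset, calling `missing_columns(df)` should return an empty list if the column names match the ones used in this notebook.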


## 2) Basic Overview

In [None]:

# Column names, then the 10 columns with the most missing values
print(df.columns.tolist())
df.isna().sum().sort_values(ascending=False).head(10)
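Raw missing-value counts are easier to judge next to the share of rows they represent. The following is a minimal sketch of such a summary; `na_summary` is a hypothetical helper and the demo frame is made up.

```python
import pandas as pd

def na_summary(frame):
    """Missing-value count and share of rows per column, most incomplete first."""
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "share": (counts / len(frame)).round(3),
    })
    return summary.sort_values("missing", ascending=False)

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Use Case Industry": ["Retail", None, "Healthcare", None],
    "Use Case Domain": ["Marketing", "Operations", None, "Finance"],
})
print(na_summary(demo))
```

On the real data this would be called as `na_summary(df).head(10)`.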


## 3) Cases by Industry

In [None]:

industry_counts = (
    df["Use Case Industry"]
      .fillna("Unspecified")
      .value_counts()
      .head(15)
      .sort_values(ascending=True)
)

plt.figure(figsize=(10,6))
industry_counts.plot(kind="barh")
plt.title("Top Industries by Case Count")
plt.xlabel("Cases")
plt.tight_layout()
plt.show()


## 4) Cases by Domain

In [None]:

domain_counts = (
    df["Use Case Domain"]
      .fillna("Unspecified")
      .value_counts()
      .head(15)
      .sort_values(ascending=True)
)

plt.figure(figsize=(10,6))
domain_counts.plot(kind="barh")
plt.title("Top Domains by Case Count")
plt.xlabel("Cases")
plt.tight_layout()
plt.show()
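The industry and domain cells follow the same fill, count, trim, and sort steps, so they could share one helper. This is a sketch only; `top_counts` is a hypothetical name, and the demo values are made up.

```python
import pandas as pd

def top_counts(frame, column, n=15, fill="Unspecified"):
    """Top-`n` value counts for `column`, sorted ascending for a horizontal bar plot."""
    return (
        frame[column]
        .fillna(fill)
        .value_counts()
        .head(n)
        .sort_values(ascending=True)
    )

# Made-up rows for illustration only.
demo = pd.DataFrame(
    {"Use Case Domain": ["Ops", "Ops", None, "Finance", "Ops", "Finance"]}
)
counts = top_counts(demo, "Use Case Domain", n=2)
print(counts)
```

The ascending sort comes last so that the largest category ends up at the top of a `barh` chart, which draws the first row at the bottom.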


## 5) Vendor Mentions (quick demo)

In [None]:

# Simple keyword searches in the Tool/Technology column; adjust the patterns as needed.
# Counts overlap: a case that names several vendors is counted once per vendor.
tools = df["Tool/Technology"].fillna("").str.lower()

metrics = {
    "Microsoft (Azure/Copilot)": tools.str.contains(r"\bazure\b|\bmicrosoft\b|\bcopilot\b", regex=True).sum(),
    "OpenAI (GPT/ChatGPT/o-series)": tools.str.contains(r"\bopenai\b|\bgpt\b|\bchatgpt\b|\bo[13]\b", regex=True).sum(),
    "Anthropic (Claude)": tools.str.contains(r"\banthropic\b|\bclaude\b", regex=True).sum(),
    "AWS (Bedrock/SageMaker)": tools.str.contains(r"\bamazon\b|\baws\b|\bbedrock\b|\bsagemaker\b", regex=True).sum(),
    "Google (Gemini/Vertex)": tools.str.contains(r"\bgoogle\b|\bgemini\b|\bvertex\b", regex=True).sum(),
    "IBM (watsonx)": tools.str.contains(r"\bibm\b|\bwatsonx\b|\bwatson\b", regex=True).sum(),
}

pd.Series(metrics).sort_values(ascending=False)
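Short tokens such as `o1` are easy to over-match in keyword patterns like these. The sketch below compares a pattern with and without a trailing `\b`, and shows what `\bgpt\b` does and does not catch. The sample strings are made up and are not taken from the dataset.

```python
import pandas as pd

# Made-up tool descriptions for illustration only.
tools_demo = pd.Series([
    "openai o1-mini",    # o-series model name
    "microsoft o365",    # starts with "o3" but is not an OpenAI model
    "gpt-4 via azure",   # hyphen after "gpt" counts as a word boundary
    "gpt4 fine-tune",    # digit after "gpt", so there is no word boundary
]).str.lower()

loose = tools_demo.str.contains(r"\bo[13]", regex=True)
strict = tools_demo.str.contains(r"\bo[13]\b", regex=True)
gpt_word = tools_demo.str.contains(r"\bgpt\b", regex=True)

print(loose.tolist())     # [True, True, False, False]
print(strict.tolist())    # [True, False, False, False]
print(gpt_word.tolist())  # [False, False, True, False]
```

Whether spellings like `gpt4` appear in the real column is not known here; checking a sample of matched and unmatched rows is a reasonable step before trusting the counts.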


## 6) Outcomes & Benefits (keyword scan)

In [None]:

outcomes = df["Outcomes & Benefits"].fillna("").str.lower()

keywords = {
    "Cost reduction": r"\bcost\b|\bspend\b|\breduce.*cost|\bsave money\b",
    "Time savings / Speed": r"\btime\b|\bfaster\b|\bspeed\b|\blatency\b|\bturnaround\b|\breduce.*time",
    "Productivity": r"\bproductivity\b|\bthroughput\b|\boutput\b",
    "Accuracy / Quality": r"\baccurac|\bquality\b|\bprecision\b|\berror\b|\bbug\b",
    "Automation": r"\bautomate|\bautonom|\bstraight-through\b",
    "Revenue / Growth": r"\brevenue\b|\bsales\b|\bconversion\b|\bgrowth\b",
    "Customer Satisfaction": r"\bcsat\b|\bsatisfaction\b|\bnps\b|\bexperience\b|\bcx\b",
    "Risk / Compliance": r"\brisk\b|\bcompliance\b|\bgovernance\b|\baudit\b|\bfraud\b",
}

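# Count the cases whose text matches each keyword group (a case can match several groups)
keyword_counts = pd.Series(
    {label: outcomes.str.contains(pattern, regex=True).sum() for label, pattern in keywords.items()}
)
keyword_counts.sort_values(ascending=True).plot(
    kind="barh", figsize=(10, 6), title="Outcomes & Benefits: cases matching each keyword group"
)
plt.xlabel("Cases mentioning keyword")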
plt.tight_layout()
plt.show()
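The keyword scan can also be kept as a per-case table of boolean flags, which makes it possible to report percentages and to see how many themes each case touches. This is a minimal sketch with made-up text and a reduced, hypothetical keyword set (`demo_keywords`), not the full dictionary from the cell above.

```python
import pandas as pd

# Made-up outcome descriptions for illustration only.
demo_outcomes = pd.Series([
    "reduced cost and faster turnaround",
    "higher revenue",
    None,
    "faster onboarding",
]).fillna("").str.lower()

# Reduced keyword set, assumed for this example.
demo_keywords = {
    "Cost reduction": r"\bcost\b",
    "Time savings / Speed": r"\bfaster\b|\bturnaround\b",
    "Revenue / Growth": r"\brevenue\b",
}

# One boolean column per theme, one row per case.
flags = pd.DataFrame({
    label: demo_outcomes.str.contains(pattern, regex=True)
    for label, pattern in demo_keywords.items()
})

share = flags.mean().mul(100).round(1)  # percent of cases matching each theme
themes_per_case = flags.sum(axis=1)     # number of themes matched by each case
print(share)
print(themes_per_case.tolist())         # [2, 1, 0, 1]
```

Because one case can match several themes, the percentages are not expected to add up to 100.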


## 7) Export helper (optional)

In [None]:

# Example: Save a filtered slice as CSV (uncomment to use)
# subset = df[df["Use Case Industry"] == "Healthcare"]
# subset.to_csv("healthcare_cases.csv", index=False)
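If several slices need exporting, the commented example could be wrapped in a small function. The sketch below assumes a case-insensitive, whitespace-tolerant match is wanted, which is a choice made here rather than something the dataset requires. `export_subset` is a hypothetical helper, and the demo writes to a temporary directory so that nothing is left behind.

```python
import tempfile
from pathlib import Path

import pandas as pd

def export_subset(frame, column, value, out_path):
    """Write rows where `column` equals `value` (ignoring case and outer spaces) to CSV.

    Returns the number of rows written.
    """
    normalized = frame[column].fillna("").str.strip().str.lower()
    subset = frame[normalized == value.strip().lower()]
    subset.to_csv(out_path, index=False)
    return len(subset)

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Use Case Industry": ["Healthcare", "healthcare ", "Retail", None],
    "Tool/Technology": ["A", "B", "C", "D"],
})

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "healthcare_cases.csv"
    written = export_subset(demo, "Use Case Industry", "Healthcare", out_file)
    round_trip = pd.read_csv(out_file)

print(written, round_trip.shape)  # 2 (2, 2)
```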



---

### Notes
- For deeper analysis, see `../../insights/` and the charts in `../../charts/`.
- Please keep analyses reproducible (pin environments, comment code meaningfully).
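
As one way to support pinned environments, the installed versions of the two libraries this notebook imports can be printed in a form that is easy to paste into a requirements file. This is a sketch using only the standard library; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

def installed_versions(packages=("pandas", "matplotlib")):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions().items():
    if version:
        print(f"{name}=={version}")
    else:
        print(f"# {name} is not installed")
```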

**Happy exploring!** 🚀  


