# Step 7: Analysis of Medications Involved

This notebook examines **which medications are most frequently involved** in
reported medication errors and how they relate to error patterns.

Goals:

- Identify the **Top 10 most frequently involved medications** (the `Medication 1` column).
- Visualize how often these medications appear in error reports.
- Explore a **heatmap of medications vs. top error patterns** to see whether
  certain drugs (e.g., Fentanyl, Ketamine) are tied to specific failure modes.

This mirrors the "product-level" analysis in the loan project, but here the
products are **high-risk medications**.
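
The goals above boil down to two pandas operations: counting the most frequent values of one column, then cross-tabulating them against the most frequent values of another. The sketch below shows that idea on a small made-up table. The helper name `top_n_crosstab`, the toy rows, the pattern labels, and the drug "Midazolam" are illustrative assumptions, not values from the real dataset; only the column names `Medication 1` and `Pattern Specifics` come from this notebook.

```python
import pandas as pd


def top_n_crosstab(df, row_col, col_col, n=10):
    """Cross-tabulate the n most frequent col_col values against the
    n most frequent row_col values found within that subset.

    Hypothetical helper for illustration only.
    """
    # Most frequent column values (value_counts ignores missing values)
    top_cols = df[col_col].value_counts().head(n).index
    subset = df[df[col_col].isin(top_cols)]

    # Most frequent row values, recomputed inside the subset
    top_rows = subset[row_col].value_counts().head(n).index
    subset = subset[subset[row_col].isin(top_rows)]

    table = pd.crosstab(subset[row_col], subset[col_col])
    # Keep frequency order on both axes; absent combinations become 0
    return table.reindex(index=top_rows, columns=top_cols, fill_value=0)


# Toy data (invented for this example)
toy = pd.DataFrame({
    "Medication 1": [
        "Fentanyl", "Fentanyl", "Fentanyl", "Fentanyl",
        "Ketamine", "Ketamine", "Ketamine",
        "Midazolam", None,
    ],
    "Pattern Specifics": [
        "Wrong dose", "Wrong dose", "Wrong dose", "Omission",
        "Wrong route", "Wrong route", "Wrong dose",
        "Omission", "Wrong dose",
    ],
})

table = top_n_crosstab(toy, "Pattern Specifics", "Medication 1", n=2)
print(table)
```

With `n=2`, the toy table keeps Fentanyl and Ketamine as columns and the two most common patterns among them as rows. The row with a missing medication is dropped, which is the same effect the `notna()` filter has in the cell below.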

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")

# ---------------------------------------------------------
# 1. Load data and filter for target certificates
# ---------------------------------------------------------
try:
    df_med = pd.read_excel('Krista 240726 Final.xlsx', sheet_name='Medication')
except FileNotFoundError:
    df_med = pd.read_csv('Krista 240726 Final.xlsx - Medication.csv')

target_certificates = ['AEL', 'GFL', 'MTC', 'REACH', 'AMR']
df_filtered = df_med[df_med['Source'].isin(target_certificates)].copy()

print("Filtered shape (target certificates only):", df_filtered.shape)

# ---------------------------------------------------------
# 2. Prepare the data: focus on Medication 1
# ---------------------------------------------------------
df_meds_analysis = df_filtered[df_filtered['Medication 1'].notna()].copy()

# Identify Top 10 medications
med_counts = df_meds_analysis['Medication 1'].value_counts()
top_10_meds = med_counts.head(10).index
print("Top 10 Medications Identified:")
print(list(top_10_meds))

# Filter to top-10 meds
df_top_meds = df_meds_analysis[df_meds_analysis['Medication 1'].isin(top_10_meds)]
print("Filtered shape (top 10 meds only):", df_top_meds.shape)

# ---------------------------------------------------------
# 3. Bar chart: frequency of errors by medication
# ---------------------------------------------------------
plt.figure(figsize=(12, 6))
ax = sns.countplot(
    data=df_top_meds,
    x='Medication 1',
    order=top_10_meds,
    palette='mako'  # note: seaborn >= 0.13 emits a FutureWarning when palette is set without hue
)

plt.title('Top 10 Medications Involved in Errors', fontsize=16)
plt.xlabel('Medication', fontsize=12)
plt.ylabel('Count of Errors', fontsize=12)
plt.xticks(rotation=45, ha='right')  # right-align so rotated labels sit under their bars
plt.grid(axis='y', linestyle='--', alpha=0.5)

# Add count labels above each bar
for p in ax.patches:
    ax.annotate(
        f'{int(p.get_height())}',
        (p.get_x() + p.get_width() / 2., p.get_height()),
        ha='center', va='center', xytext=(0, 9), textcoords='offset points'
    )

plt.tight_layout()
plt.show()

# ---------------------------------------------------------
# 4. Heatmap: Top 10 Medications vs. Top 10 Patterns
# ---------------------------------------------------------
# Recompute pattern counts within the top-meds subset
pattern_counts_step = df_top_meds['Pattern Specifics'].value_counts()
top_10_patterns = pattern_counts_step.head(10).index

# Filter for both Top Meds AND Top Patterns
df_heatmap_meds = df_top_meds[df_top_meds['Pattern Specifics'].isin(top_10_patterns)]

# Create cross-tab: Pattern Specifics x Medication 1
heatmap_med_data = pd.crosstab(
    df_heatmap_meds['Pattern Specifics'],
    df_heatmap_meds['Medication 1']
)

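# Reorder columns to match the sorted Top 10 meds.
# fill_value=0 keeps a medication with no rows in the top patterns as a zero
# column instead of introducing NaN.
heatmap_med_data = heatmap_med_data.reindex(columns=top_10_meds, fill_value=0)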

plt.figure(figsize=(12, 8))
sns.heatmap(
    heatmap_med_data,
    annot=True,
    fmt='g',
    cmap='Blues',
    linewidths=.5
)

plt.title('Heat Map: Top 10 Medications vs. Top Error Patterns', fontsize=16)
plt.xlabel('Medication', fontsize=12)
plt.ylabel('Error Pattern', fontsize=12)
plt.xticks(rotation=45, ha='right')  # right-align so rotated labels sit under their columns
plt.tight_layout()
plt.show()
