# Interpreting & Communicating Data-Driven Insights
This notebook demonstrates how to interpret model outputs and communicate them effectively for strategic decision-making in **sports analytics**.

We'll use a **synthetic cricket player performance dataset** and a **ticket pricing dataset** (from our hands-on project pack).

In [None]:
import pandas as pd
# Load sample datasets
player_df = pd.read_csv('cricket_t20_player_performance.csv')
ticket_df = pd.read_csv('ticket_sales_dynamic_pricing.csv')
player_df.head()

## Step 1: Build a Simple Predictive Model
We'll train a regression model to predict runs scored using player form and context, then interpret the results.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_absolute_error, r2_score
import numpy as np

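# Separate features from the target; match_id is an identifier, not a predictor
X = player_df.drop(columns=['runs_scored','match_id'])
y = player_df['runs_scored']
num_cols = X.select_dtypes(include=[np.number]).columns.tolist()
cat_cols = X.select_dtypes(exclude=[np.number]).columns.tolist()
# One-hot encode categorical columns; numeric columns (num_cols) pass through unchanged
pre = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'), cat_cols)
], remainder='passthrough')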

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

model = Pipeline([('pre', pre), ('reg', LinearRegression())])
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
mae = mean_absolute_error(y_te, pred)
r2 = r2_score(y_te, pred)
print({'MAE':mae, 'R2':r2})

### Interpretation:
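- **MAE** is the average absolute gap between predicted and actual runs, expressed in runs. For example, an MAE of 10 would mean predictions miss by about 10 runs on average.
- **R²** is the share of variance in runs scored that the model explains on the held-out test set: 1.0 is a perfect fit, and 0 is no better than always predicting the mean.
- We can use this to inform **selection decisions**: players predicted to score above a chosen threshold are candidates to stay in the lineup, with the MAE as a reminder of how far off any single prediction may be.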

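A raw MAE or R² is hard to judge without a reference point. The sketch below compares a model's errors against a naive baseline that always predicts the training mean. The arrays are made-up illustration values, not results from the dataset, and the helper names `mae` and `r2` are hypothetical; in the notebook you would pass `y_tr`, `y_te`, and `pred` instead.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss, in runs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Share of variance explained, relative to predicting the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Hypothetical values for illustration only
y_train_demo = np.array([10, 25, 40, 55, 20])
y_test_demo = np.array([12, 30, 45, 8])
model_pred_demo = np.array([15, 28, 40, 14])

# Naive baseline: predict the training mean for every test row
baseline_pred = np.full(len(y_test_demo), y_train_demo.mean())

model_mae = mae(y_test_demo, model_pred_demo)
baseline_mae = mae(y_test_demo, baseline_pred)
model_r2 = r2(y_test_demo, model_pred_demo)

print(f"Model MAE:    {model_mae:.2f} runs")
print(f"Baseline MAE: {baseline_mae:.2f} runs")
print(f"Model R2:     {model_r2:.3f}")
```

If the model's MAE is not clearly below the baseline's, the features add little beyond the historical average, and that is worth saying plainly to stakeholders.
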
## Step 2: Turn Predictions into Decisions
Let's build a small decision table from five randomly sampled rows of the test set.

In [None]:
sample = X_te.sample(5, random_state=42)
sample_pred = model.predict(sample)
decision_table = sample.copy()
decision_table['predicted_runs'] = np.round(sample_pred, 1)
# Recommend 'Retain' when predicted runs exceed the 35-run cut-off, otherwise 'Rotate'
decision_table['recommendation'] = np.where(decision_table['predicted_runs'] > 35, 'Retain', 'Rotate')
decision_table

### Communication:
Instead of saying *'The model predicted 42.3 runs'*, communicate in **actionable language**:

- "Player is likely to score **40+ runs** next match → retain in starting XI."
- "Player is likely to underperform (<30 runs) → consider rotating or giving rest."

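One way to produce this kind of wording consistently is a small helper that maps a numeric prediction to a message. The sketch below is illustrative only: `describe_prediction` and its `margin` parameter are hypothetical, the 35-run threshold is taken from the decision table above, and the 5-run buffer is an assumed value that could instead be informed by the model's MAE.

```python
def describe_prediction(predicted_runs, threshold=35.0, margin=5.0):
    """Turn a numeric prediction into a short recommendation.

    threshold: the 35-run cut-off used in the decision table.
    margin: hypothetical buffer around the cut-off; predictions inside
            the buffer are flagged as borderline instead of forced
            into 'retain' or 'rotate'.
    """
    upper = threshold + margin
    lower = threshold - margin
    if predicted_runs >= upper:
        return f"Likely to score {int(upper)}+ runs: retain in starting XI."
    if predicted_runs < lower:
        return f"Likely to score under {int(lower)} runs: consider rotating or resting."
    return "Prediction is close to the cut-off: treat as a judgment call for the coach."

for runs in (42.3, 36.0, 24.8):
    print(runs, "->", describe_prediction(runs))
```

Flagging borderline cases separately keeps the message honest: a prediction of 36 with a double-digit MAE is not strong evidence that a player will clear 35.
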
## Step 3: Business Insight Example – Ticket Pricing
We'll group the data into price bands and plot average tickets sold per band as a first look at how demand responds to price.

In [None]:
import matplotlib.pyplot as plt
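# Bands are right-inclusive, e.g. (300, 400]; prices outside (0, 1000] get no band and are dropped by groupby
ticket_df['price_band'] = pd.cut(ticket_df['price'], bins=[0,300,400,500,600,1000])
# observed=False keeps empty bands in the result and avoids the pandas FutureWarning for categorical groupers
band_stats = ticket_df.groupby('price_band', observed=False)['tickets_sold'].mean()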
band_stats.plot(kind='bar', title='Avg Tickets Sold by Price Band')
plt.ylabel('Average Tickets Sold')
plt.show()
band_stats

### Interpretation & Recommendation:
- Average tickets sold alone usually favors the cheapest band, so identify the price band where **revenue = price × tickets sold** is maximized.
- Communicate as, for example:
  > "Our analysis shows setting price around ₹450 yields the highest revenue, even though slightly fewer tickets are sold compared to cheaper bands."

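The chart above shows tickets sold, not revenue, so the revenue comparison needs one more step. The sketch below shows one way to do it on a tiny made-up table; the rows in `demo` are hypothetical illustration values, not figures from `ticket_sales_dynamic_pricing.csv`. In the notebook you would apply the same steps to `ticket_df`.

```python
import pandas as pd

# Hypothetical rows for illustration only
demo = pd.DataFrame({
    "price": [250, 280, 350, 380, 450, 470, 550, 700],
    "tickets_sold": [900, 880, 800, 760, 700, 690, 480, 300],
})

# Revenue per row, then the same price bands as in the cell above
demo["revenue"] = demo["price"] * demo["tickets_sold"]
demo["price_band"] = pd.cut(demo["price"], bins=[0, 300, 400, 500, 600, 1000])

# Average tickets sold and average revenue per band
band_summary = demo.groupby("price_band", observed=True).agg(
    avg_tickets=("tickets_sold", "mean"),
    avg_revenue=("revenue", "mean"),
)

# Band with the highest average revenue
best_band = band_summary["avg_revenue"].idxmax()
print(band_summary)
print("Highest average revenue band:", best_band)
```

Reporting both columns side by side makes the trade-off visible: the band with the highest revenue need not be the band that sells the most tickets.
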
## Key Takeaways
- **Interpretation:** Convert raw predictions into meaningful business/performance insights.
- **Communication:** Use clear language, visuals, and actionable recommendations.
- **Decision-Making:** Always link your insight to a specific action for coaches, managers, or stakeholders.