# Session 58: Interpreting Model Results

**Unit 5: Basics of Predictive Analytics**
**Hour: 58**
**Mode: Practical Lab**

---

### 1. Objective

This lab focuses on model interpretability. Building a model that performs well is only half the battle; we also need to understand **why** it makes the predictions it does. This is crucial for building trust with stakeholders and for ensuring our model is fair and logical.

We will inspect the coefficients of our trained Logistic Regression model to determine which features are most influential in predicting churn.

### 2. Setup

Let's recreate our Logistic Regression model.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load and prep data
url = 'https://raw.githubusercontent.com/IBM/telco-customer-churn-on-icp4d/master/data/Telco-Customer-Churn.csv'
df = pd.read_csv(url)
df_subset = df[['tenure', 'MonthlyCharges', 'Contract', 'Churn']].copy()
df_subset.dropna(inplace=True)

# Prep data for modeling
X = df_subset.drop('Churn', axis=1)
y = df_subset['Churn']
X_encoded = pd.get_dummies(X, columns=['Contract'], drop_first=True)
X_train, X_test, y_train, y_test = train_test_split(X_encoded, y, test_size=0.2, random_state=42)

# Fit model
log_model = LogisticRegression(max_iter=1000)
log_model.fit(X_train, y_train)

### 3. Understanding Logistic Regression Coefficients

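Like Linear Regression, a trained Logistic Regression model has one coefficient per feature. The difference is that each coefficient describes the change in the **log-odds** of the positive class for a one-unit increase in that feature, holding the other features constant, rather than a change in the predicted value itself.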

*   **Positive Coefficient:** An increase in this feature's value increases the **log-odds** (and thus the probability) of the outcome being the positive class ('Yes' for churn).
*   **Negative Coefficient:** An increase in this feature's value decreases the **log-odds** (and thus the probability) of the outcome being the positive class.

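The **magnitude** of the coefficient indicates the strength of the feature's influence per one-unit change in that feature. Because our features are on different scales (`tenure` in months, `MonthlyCharges` in dollars, the contract dummies as 0/1), raw magnitudes are not directly comparable across features unless the features are standardized first.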
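
To make the log-odds scale more concrete, here is a minimal sketch that converts a coefficient into an odds ratio and into probabilities. The `intercept` and `coef` values are hypothetical numbers chosen for illustration; they are not taken from the churn model.

```python
import math


def sigmoid(z: float) -> float:
    """Map a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for illustration only (not from the churn model).
intercept = -1.0
coef = -0.5  # coefficient of a single feature

# Log-odds for a feature value of 2, then after a one-unit increase to 3.
log_odds_before = intercept + coef * 2
log_odds_after = intercept + coef * 3

# exp(coefficient) is the factor by which the odds are multiplied
# for each one-unit increase in the feature.
odds_ratio = math.exp(coef)

p_before = sigmoid(log_odds_before)
p_after = sigmoid(log_odds_after)

print(f"Odds ratio per one-unit increase: {odds_ratio:.3f}")
print(f"Probability before: {p_before:.3f}, after: {p_after:.3f}")
```

A negative coefficient gives an odds ratio below 1, so the odds (and the probability) shrink as the feature grows. The change in probability depends on the starting point, which is why coefficients are usually read on the log-odds or odds-ratio scale.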

### 4. Extracting and Visualizing Coefficients

Let's get the coefficients from our trained model.

In [None]:
# The coefficients are stored in log_model.coef_. For a binary problem it is a 2D array
# with a single row, which describes the class in log_model.classes_[1] ('Yes' here).
coefficients = log_model.coef_[0]

# The feature names are the columns of our encoded X_train DataFrame
feature_names = X_train.columns

# Let's create a DataFrame to see them together
coef_df = pd.DataFrame({'Feature': feature_names, 'Coefficient': coefficients})

# Sort by absolute coefficient size (largest per-unit effect on the log-odds first)
coef_df['Abs_Coefficient'] = coef_df['Coefficient'].abs()
coef_df = coef_df.sort_values(by='Abs_Coefficient', ascending=False)

print(coef_df)
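
Before reading the signs, it helps to confirm which class the coefficients refer to. scikit-learn sorts the class labels, and for a binary problem `coef_[0]` describes `classes_[1]`. The sketch below checks this on a tiny made-up dataset; `X_demo`, `y_demo`, and `demo_model` are illustrative names, not part of the lab data.

```python
from sklearn.linear_model import LogisticRegression

# Made-up toy data: larger values of the single feature go with 'Yes'.
X_demo = [[0.0], [1.0], [2.0], [3.0]]
y_demo = ['No', 'No', 'Yes', 'Yes']

demo_model = LogisticRegression().fit(X_demo, y_demo)

# classes_ holds the labels in sorted order; coef_[0] refers to classes_[1].
print(demo_model.classes_)
print(demo_model.coef_[0])
```

Inspecting `log_model.classes_` in the lab notebook performs the same check on the churn model.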

**Interpretation:**
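*   `Contract_One year` and `Contract_Two year` have large **negative** coefficients. This means that being on one of these contracts **strongly decreases** the probability of churning compared to the baseline (Month-to-month), which is the category dropped by `drop_first=True`.
*   `tenure` has a **negative** coefficient. This means that as tenure increases, the probability of churning **decreases**. Its per-month value may look small next to the contract dummies, but it applies to every additional month, so the cumulative effect over a long tenure can be substantial.
*   `MonthlyCharges` has a small **positive** coefficient. This means that as monthly charges increase, the probability of churning **slightly increases** per additional dollar.

This is consistent with our EDA findings that **contract type** and **tenure** are strongly associated with churn. Note that the ranking by raw coefficient size reflects each feature's units, so it should not be read as a strict ranking of predictive power.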
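
One common way to make coefficient magnitudes comparable is to standardize the features before fitting. The following is a minimal sketch on synthetic data, not the lab's churn data: the column names `months` and `flag`, the generating coefficients, and the sample size are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in data: one wide-range feature and one 0/1 feature.
demo = pd.DataFrame({
    "months": rng.uniform(0, 72, n),
    "flag": rng.integers(0, 2, n).astype(float),
})

# Assumed "true" relationship used only to generate labels.
true_log_odds = 1.0 - 0.05 * demo["months"] - 1.0 * demo["flag"]
target = (rng.random(n) < 1 / (1 + np.exp(-true_log_odds))).astype(int)

# Model on raw features: coefficients are per original unit.
raw = LogisticRegression(max_iter=1000).fit(demo, target)

# Model on standardized features: coefficients are per standard deviation.
scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled.fit(demo, target)

comparison = pd.DataFrame({
    "Feature": demo.columns,
    "Raw_Coefficient": raw.coef_[0],
    "Standardized_Coefficient": scaled.named_steps["logisticregression"].coef_[0],
})
print(comparison)
```

In this synthetic setup the raw coefficient of `months` should be much smaller in magnitude than that of `flag`, while the standardized coefficients should reverse that order, because `months` varies over a far wider range. The same pipeline could be applied to `X_train` and `y_train` if a scale-independent ranking is needed.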

#### Visualizing the Coefficients

A bar chart is a great way to visualize these feature importances.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(10, 6))
sns.barplot(x='Coefficient', y='Feature', data=coef_df.sort_values(by='Coefficient', ascending=False))
plt.title('Feature Importance for Churn Prediction')
plt.xlabel('Coefficient Value (Log-Odds)')
plt.ylabel('Feature')
plt.axvline(0, color='black', lw=0.5)
plt.show()

### 5. Conclusion

In this lab, you looked "inside the black box" of your predictive model. You should now be able to:
1.  Understand that coefficients in a Logistic Regression model represent each feature's influence on the log-odds (and, through them, the probability) of the outcome.
2.  Extract the coefficients and feature names from a trained Scikit-learn model.
3.  Create a summary table and visualization to rank features by their importance.
4.  Use this information to explain *why* your model makes certain predictions, which is crucial for building trust and deriving actionable business insights.

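Our interpretation suggests that customers with low tenure and month-to-month contracts are the most likely to churn, which makes them a natural focus for retention efforts. The coefficients describe associations in the data rather than proven causes, so any retention strategy built on them should still be validated.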

**Next Session:** We will discuss how to package these findings into a clear and concise summary for a business audience.