
# 🤖 MGMT 467 - Unit 2 Lab 2: Prompt Studio — Feature Engineering & Beyond

**Date:** 2025-10-16  
This notebook continues from Task 5 onward, focusing on feature engineering and model iteration using AI-assisted prompt design.

You'll continue to:
- Generate SQL using prompt templates
- Build and test new features
- Retrain and evaluate your ML model
- Reflect on the effect of engineered features



## Task 5.0: Bucket a Continuous Feature

**🎯 Goal:** Group 'total_minutes' into categories: low, medium, high.  
**📌 Requirements:** Use CASE WHEN or IF statements to create 'watch_time_bucket'.

---

### 🧠 Prompt Template  
> Write SQL that creates a new column watch_time_bucket based on total_minutes thresholds (<100, 100–300, >300).

---

### 👩‍🏫 Example Prompt  
> Create a new column watch_time_bucket with values 'low', 'medium', or 'high' based on total_minutes.

---

### 🔍 Exploration  
How does churn rate vary across these buckets?


In [1]:
%%bigquery
CREATE OR REPLACE TABLE
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS
SELECT
  t.*,
  CASE
    WHEN t.total_minutes < 100 THEN 'low'
    WHEN t.total_minutes BETWEEN 100 AND 300 THEN 'medium'
    WHEN t.total_minutes > 300 THEN 'high'
    ELSE 'unknown'
  END AS watch_time_bucket
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features` AS t;


In [2]:
%%bigquery
SELECT
  watch_time_bucket,
  COUNT(*) AS total_customers,
  SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END) AS churned_customers,
  SAFE_DIVIDE(SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END), COUNT(*)) AS churn_rate
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
GROUP BY
  watch_time_bucket
ORDER BY
  CASE watch_time_bucket
    WHEN 'low' THEN 1
    WHEN 'medium' THEN 2
    WHEN 'high' THEN 3
    ELSE 4
  END;


Unnamed: 0,watch_time_bucket,total_customers,churned_customers,churn_rate
0,low,52,24,0.461538
1,medium,1145,268,0.234061
2,high,9103,882,0.096891


**Exploration Question: How does churn rate vary across these buckets?**

The query output shows the churn rate for each watch_time_bucket category:

1. Low watch time (less than 100 minutes): Has the highest churn rate at approximately 46.15% (24 of 52 customers).
2. Medium watch time (100 to 300 minutes): Shows a lower churn rate of about 23.41% (268 of 1,145 customers).
3. High watch time (more than 300 minutes): Has the lowest churn rate at approximately 9.69% (882 of 9,103 customers).

Churn falls steadily as watch time rises: customers with low total watch time are far more likely to churn than those with high watch time. The low bucket holds only 52 customers, so its rate is less precise than the others, but the consistent ordering across all three buckets suggests that watch time is a useful predictor of churn.
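
To make the boundary handling of the CASE expression explicit, here is a minimal Python sketch of the same rule. The function name is hypothetical and exists only for illustration; it assumes the thresholds used in the query above, where exactly 100 and exactly 300 both land in 'medium' and a missing value falls through to 'unknown'.

```python
import math
from typing import Optional


def watch_time_bucket(total_minutes: Optional[float]) -> str:
    """Mirror the SQL CASE: <100 low, 100 to 300 medium, >300 high."""
    # In SQL, comparisons against NULL (or NaN) are never true,
    # so such rows fall through to the ELSE branch.
    if total_minutes is None or math.isnan(total_minutes):
        return "unknown"
    if total_minutes < 100:
        return "low"
    if total_minutes <= 300:  # BETWEEN 100 AND 300 is inclusive on both ends
        return "medium"
    return "high"


# Illustrative values chosen to sit on or near the thresholds.
for minutes in (99.9, 100, 300, 300.5, None):
    print(minutes, "->", watch_time_bucket(minutes))
```

Checking the edge values this way is a quick guard against off-by-one mistakes when the thresholds are changed later.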




## Task 5.1: Create a Binary Flag Feature

**🎯 Goal:** Add a binary column flag_binge (1 if total_minutes > 500).  
**📌 Requirements:** Use IF logic to create a binary column in SQL.

---

### 🧠 Prompt Template  
> Write a SQL query that adds flag_binge = 1 if total_minutes > 500, else 0.

---

### 👩‍🏫 Example Prompt  
> Add a binary column flag_binge to identify binge-watchers.

---

### 🔍 Exploration  
Are binge-watchers more or less likely to churn?


In [3]:
%%bigquery
CREATE OR REPLACE TABLE
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS
SELECT
  t.*,
  CASE
    WHEN t.total_minutes > 500 THEN 1
    ELSE 0
  END AS flag_binge
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS t;


In [4]:
%%bigquery
SELECT
  column_name, data_type
FROM
  `heroic-trilogy-471119-k8.netflix.INFORMATION_SCHEMA.COLUMNS`
WHERE
  table_name = 'churn_features';


Unnamed: 0,column_name,data_type
0,region,STRING
1,plan_tier,STRING
2,age_band,STRING
3,avg_rating,FLOAT64
4,total_minutes,FLOAT64
5,churn_label,INT64


In [5]:
%%bigquery
SELECT
  flag_binge,
  COUNT(*) AS total_customers,
  SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END) AS churned_customers,
  SAFE_DIVIDE(SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END), COUNT(*)) AS churn_rate
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
GROUP BY
  flag_binge
ORDER BY
  flag_binge;


Unnamed: 0,flag_binge,total_customers,churned_customers,churn_rate
0,0,4213,721,0.171137
1,1,6087,453,0.074421


**Exploration Question: Are binge-watchers more or less likely to churn?**

The query output shows the churn rate for binge and non-binge watchers:

1. Non-binge-watchers (flag_binge = 0): Have a churn rate of approximately 17.1% (721 of 4,213 customers).
2. Binge-watchers (flag_binge = 1): Have a much lower churn rate of approximately 7.4% (453 of 6,087 customers).

Therefore, binge-watchers are less likely to churn; non-binge-watchers churn at roughly 2.3 times the rate.
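
The gap between the two groups can be summarized as a risk ratio. The sketch below recomputes the churn rates from the counts in the output above; the helper name is hypothetical, and the counts are copied from the query result rather than read from BigQuery.

```python
def churn_rate(churned: int, total: int) -> float:
    """Share of customers who churned; NaN when the group is empty."""
    return churned / total if total else float("nan")


# Counts taken from the flag_binge query output above.
non_binge_rate = churn_rate(721, 4213)
binge_rate = churn_rate(453, 6087)

# How many times more often non-binge-watchers churn than binge-watchers.
risk_ratio = non_binge_rate / binge_rate

print(f"non-binge churn rate: {non_binge_rate:.4f}")
print(f"binge churn rate:     {binge_rate:.4f}")
print(f"risk ratio:           {risk_ratio:.2f}")
```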


## Task 5.2: Create an Interaction Term

**🎯 Goal:** Create plan_region_combo by combining plan_tier and region.  
**📌 Requirements:** Use CONCAT or STRING functions.

---

### 🧠 Prompt Template  
> Generate SQL to create a new column by combining plan_tier and region with an underscore.

---

### 👩‍🏫 Example Prompt  
> Create a column called plan_region_combo as CONCAT(plan_tier, '_', region).

---

### 🔍 Exploration  
Which plan-region combos have highest churn?


In [6]:
%%bigquery
CREATE OR REPLACE TABLE
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS
SELECT
  t.*,
  CONCAT(t.plan_tier, '_', t.region) AS plan_region_combo
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS t;


In [8]:
%%bigquery
SELECT DISTINCT
  plan_region_combo
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
ORDER BY
  plan_region_combo;


Unnamed: 0,plan_region_combo
0,premium_Canada
1,premium_USA


In [9]:
%%bigquery
SELECT
  plan_region_combo,
  COUNT(*) AS total_customers,
  SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END) AS churned_customers,
  SAFE_DIVIDE(SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END), COUNT(*)) AS churn_rate
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
GROUP BY
  plan_region_combo
ORDER BY
  churn_rate DESC;


Unnamed: 0,plan_region_combo,total_customers,churned_customers,churn_rate
0,premium_Canada,3096,353,0.114018
1,premium_USA,7204,821,0.113964


**Exploration Question: Which plan-region combos have highest churn?**

The query output above lists every plan_region_combo value and its churn rate. Only two combinations exist in the data:

1. premium_Canada: Has a churn rate of approximately 11.40% (353 of 3,096 customers).
2. premium_USA: Has a churn rate of approximately 11.40% (821 of 7,204 customers).

premium_Canada ranks first, but only by about 0.005 percentage points (0.114018 vs. 0.113964), so the two rates are effectively identical. Because every customer is on the premium plan, plan_region_combo carries the same information as region, and neither combination stands out as more prone to churn.
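
One way to check whether two churn rates differ meaningfully is a two-proportion z-test. The following is a minimal sketch using only the standard library and the counts from the output above; the function name is hypothetical, and the normal approximation it relies on is an assumption that is reasonable for groups of this size.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple:
    """Return (z, two-sided p-value) for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # rate under the "no difference" hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# premium_Canada vs. premium_USA, counts from the query output above.
z, p_value = two_proportion_z(353, 3096, 821, 7204)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

A z statistic this close to zero indicates that the observed gap is well within what random variation alone would produce.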


## Task 5.3: Add Missingness Indicator Flags

**🎯 Goal:** Add binary flags to capture NULL values in age_band and avg_rating.  
**📌 Requirements:** Use IS NULL logic to create new flag columns.

---

### 🧠 Prompt Template  
> Create a new column is_missing_[col_name] that is 1 when column is NULL, else 0.

---

### 👩‍🏫 Example Prompt  
> Add is_missing_age that flags rows where age_band IS NULL.

---

### 🔍 Exploration  
Do missing values correlate with churn?


In [10]:
%%bigquery
CREATE OR REPLACE TABLE
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS
SELECT
  t.*,
  CASE WHEN t.age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
  CASE WHEN t.avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed` AS t;


In [13]:
%%bigquery
SELECT
  COUNT(*) AS null_age_band_count
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
WHERE
  age_band IS NULL;


Unnamed: 0,null_age_band_count
0,0


The null_age_band_count is equal to 0. This indicates that there are genuinely no missing values in the age_band column.

In [11]:
%%bigquery
SELECT
  is_missing_age_band,
  COUNT(*) AS total_customers,
  SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END) AS churned_customers,
  SAFE_DIVIDE(SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END), COUNT(*)) AS churn_rate
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
GROUP BY
  is_missing_age_band
ORDER BY
  is_missing_age_band;


Unnamed: 0,is_missing_age_band,total_customers,churned_customers,churn_rate
0,0,10300,1174,0.113981


In [12]:
%%bigquery
SELECT
  is_missing_avg_rating,
  COUNT(*) AS total_customers,
  SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END) AS churned_customers,
  SAFE_DIVIDE(SUM(CASE WHEN churn_label = 1 THEN 1 ELSE 0 END), COUNT(*)) AS churn_rate
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`
GROUP BY
  is_missing_avg_rating
ORDER BY
  is_missing_avg_rating;


Unnamed: 0,is_missing_avg_rating,total_customers,churned_customers,churn_rate
0,0,8935,911,0.101959
1,1,1365,263,0.192674


**Exploration Question: Do missing values correlate with churn?**

The two query outputs above show churn rates split by whether the age_band and avg_rating columns are missing. From the output, we can see that:

1. For is_missing_age_band: The query result only shows customers where is_missing_age_band is 0, meaning age_band is never missing. The churn rate for these customers is approximately 11.40%, which is simply the overall churn rate. The earlier NULL count confirmed that age_band has no missing values, so there is no missing group to compare against and no correlation can be assessed for this feature.

2. For is_missing_avg_rating: Customers with a non-missing avg_rating (is_missing_avg_rating = 0) have a churn rate of approximately 10.20% (911 of 8,935 customers). Customers with a missing avg_rating (is_missing_avg_rating = 1) have a much higher churn rate of approximately 19.27% (263 of 1,365 customers).

Hence, a missing avg_rating is clearly associated with churn in this data: customers without a recorded avg_rating churn at nearly twice the rate (about 1.9 times) of those with one.
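
The flag logic itself is simple enough to mirror outside SQL. Below is a small Python sketch of the is_missing_[col_name] pattern applied to rows held as dictionaries; the function name and the sample rows are hypothetical and serve only to illustrate the rule.

```python
def add_missing_flags(rows: list, columns: list) -> list:
    """Return copies of the rows with an is_missing_<col> flag per column."""
    flagged = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        for col in columns:
            # 1 when the value is absent or None (the SQL NULL case), else 0.
            new_row[f"is_missing_{col}"] = 1 if row.get(col) is None else 0
        flagged.append(new_row)
    return flagged


# Made-up sample rows, not taken from the churn_features table.
sample = [
    {"age_band": "25-34", "avg_rating": 4.2},
    {"age_band": "35-44", "avg_rating": None},
]

for row in add_missing_flags(sample, ["age_band", "avg_rating"]):
    print(row)
```

Keeping the original column alongside its flag lets a model use both the value and the fact that it was missing.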


## Task 5.4: Create Time-Based Features (Optional)

**🎯 Goal:** Add a column days_since_last_login.  
**📌 Requirements:** Use DATE_DIFF with CURRENT_DATE and last_login_date.

---

### 🧠 Prompt Template  
> Write SQL to create a column showing days since last login using DATE_DIFF.

---

### 👩‍🏫 Example Prompt  
> Add a column days_since_last_login = DATE_DIFF(CURRENT_DATE(), last_login_date, DAY).

---

### 🔍 Exploration  
Does login recency affect churn rate?
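
The schema listing for churn_features shown earlier does not include a last_login_date column, so this optional task cannot be run against that table as-is. As a hedged illustration of what DATE_DIFF(CURRENT_DATE(), last_login_date, DAY) computes, here is a minimal Python sketch; the function name is hypothetical, last_login_date is an assumed column, and the dates are made-up sample values.

```python
from datetime import date
from typing import Optional


def days_since_last_login(last_login_date: Optional[date], today: date) -> Optional[int]:
    """Whole days between the last login and today; None when no login is recorded."""
    if last_login_date is None:
        return None  # mirrors SQL, where DATE_DIFF with a NULL argument is NULL
    return (today - last_login_date).days


# Passing "today" explicitly keeps the example reproducible.
print(days_since_last_login(date(2025, 10, 1), today=date(2025, 10, 16)))
print(days_since_last_login(None, today=date(2025, 10, 16)))
```

Customers with no recorded login would need their own treatment, for example a missingness flag like the ones in Task 5.3.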



## Task 5.5: Assemble Enhanced Feature Table

**🎯 Goal:** Create churn_features_enhanced with all engineered columns.  
**📌 Requirements:** Include all prior features + engineered columns.

---

### 🧠 Prompt Template  
> Generate SQL to create churn_features_enhanced with new columns: watch_time_bucket, plan_region_combo, flag_binge, etc.

---

### 👩‍🏫 Example Prompt  
> Build a new table churn_features_enhanced with all original features + engineered ones.

---

### 🔍 Exploration  
Are row counts stable? Any NULLs introduced?


In [14]:
%%bigquery
CREATE OR REPLACE TABLE
  `heroic-trilogy-471119-k8.netflix.churn_features_enhanced` AS
SELECT
  * # Selects all columns, including all original and engineered features
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_bucketed`;


In [15]:
%%bigquery
SELECT
  (SELECT COUNT(*) FROM `heroic-trilogy-471119-k8.netflix.churn_features`) AS original_row_count,
  (SELECT COUNT(*) FROM `heroic-trilogy-471119-k8.netflix.churn_features_enhanced`) AS enhanced_row_count;


Unnamed: 0,original_row_count,enhanced_row_count
0,10300,10300


In [16]:
%%bigquery
SELECT
  COUNTIF(watch_time_bucket IS NULL) AS null_watch_time_bucket,
  COUNTIF(plan_region_combo IS NULL) AS null_plan_region_combo,
  COUNTIF(flag_binge IS NULL) AS null_flag_binge,
  COUNTIF(is_missing_age_band IS NULL) AS null_is_missing_age_band,
  COUNTIF(is_missing_avg_rating IS NULL) AS null_is_missing_avg_rating
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_enhanced`;


Unnamed: 0,null_watch_time_bucket,null_plan_region_combo,null_flag_binge,null_is_missing_age_band,null_is_missing_avg_rating
0,0,0,0,0,0


**Exploration Question: Are row counts stable? Any NULLs introduced?**

The two query outputs above check row-count stability and count NULL values in the engineered features. From the outputs, we can infer that:

1. Row Count Stability: The original_row_count and the enhanced_row_count are both 10,300. This confirms that no rows were lost or added during the feature engineering process.

2. NULLs Introduced: For all the newly engineered features (watch_time_bucket, plan_region_combo, flag_binge, is_missing_age_band, is_missing_avg_rating), the count of NULL values is 0, so every engineered column is fully populated.


Hence, the row counts are stable and no NULLs were introduced.


## Task 6: Retrain Model on Engineered Features

**🎯 Goal:** Train a logistic regression model using churn_features_enhanced.  
**📌 Requirements:** Use BQML logistic_reg model with new feature columns.

---

### 🧠 Prompt Template  
> Write CREATE MODEL SQL using enhanced features including flags and buckets.

---

### 👩‍🏫 Example Prompt  
> Retrain churn_model_enhanced using watch_time_bucket, flag_binge, plan_region_combo.

---

### 🔍 Exploration  
Does model accuracy improve?


In [17]:
%%bigquery
CREATE OR REPLACE MODEL
  `heroic-trilogy-471119-k8.netflix.churn_model_enhanced`
OPTIONS
  (model_type='logistic_reg',
    input_label_cols=['churn_label'])
AS
SELECT
  plan_tier,
  region,
  age_band,
  avg_rating,
  total_minutes,
  watch_time_bucket,
  flag_binge,
  plan_region_combo,
  is_missing_age_band,
  is_missing_avg_rating,
  churn_label
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features_enhanced`;


In [22]:
%%bigquery
SELECT
  accuracy
FROM
  ML.EVALUATE(MODEL `heroic-trilogy-471119-k8.netflix.churn_model_enhanced`);


Unnamed: 0,accuracy
0,0.89284


Below are the outputs of the Logistic Regression model trained in Tasks 0-4. This is the baseline model.

In [19]:
%%bigquery
CREATE OR REPLACE MODEL `heroic-trilogy-471119-k8.netflix.churn_model`
OPTIONS(model_type='logistic_reg', input_label_cols=['churn_label']) AS
SELECT
  region,
  plan_tier,
  age_band,
  avg_rating,
  total_minutes,
  churn_label
FROM
  `heroic-trilogy-471119-k8.netflix.churn_features`


In [24]:
%%bigquery
SELECT
  accuracy
FROM
  ML.EVALUATE(MODEL `heroic-trilogy-471119-k8.netflix.churn_model`);


Unnamed: 0,accuracy
0,0.885406


**Exploration Question: Does model accuracy improve?**

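The enhanced churn model has an accuracy of approximately 89.28%, and the baseline model has an accuracy of approximately 88.54%.

Accuracy is therefore about 0.74 percentage points higher after including the engineered features. Because only about 11.4% of customers churn, accuracy alone is a weak measure here: predicting that no one churns would already score close to 88.6%.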


## Task 7: Compare Model Performance

**🎯 Goal:** Compare base model vs enhanced model using ML.EVALUATE.  
**📌 Requirements:** Use same evaluation query for both models.

---

### 🧠 Prompt Template  
> Write a SQL query to evaluate churn_model_enhanced and compare with churn_model.

---

### 👩‍🏫 Example Prompt  
> Compare ML.EVALUATE output from both models side-by-side.

---

### 🔍 Exploration  
Which features made the most difference?


In [29]:
%%bigquery
SELECT
  'churn_model_enhanced' AS model_name,
  * # Selects all evaluation metrics for the enhanced model
FROM
  ML.EVALUATE(MODEL `heroic-trilogy-471119-k8.netflix.churn_model_enhanced`)

UNION ALL

SELECT
  'churn_model' AS model_name,
  * # Selects all evaluation metrics for the base model
FROM
  ML.EVALUATE(MODEL `heroic-trilogy-471119-k8.netflix.churn_model`);


Unnamed: 0,model_name,precision,recall,accuracy,f1_score,log_loss,roc_auc
0,churn_model_enhanced,0.0,0.0,0.89284,0.0,0.327298,0.641703
1,churn_model,0.0,0.0,0.885406,0.0,0.346475,0.637541


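**Reflection:** The churn_model_enhanced shows slightly higher accuracy (0.8928 vs. 0.8854) and ROC AUC (0.6417 vs. 0.6375) and lower log loss (0.3273 vs. 0.3465) than the base churn_model, a modest gain from feature engineering. Precision, recall, and F1 score are 0.0 for both models, however, which means neither model correctly flags a single churner at the classification threshold used by ML.EVALUATE. The accuracy figures therefore mostly reflect the roughly 88.6% share of non-churners, and ROC AUC and log loss are the more informative comparison. The two models may also have been evaluated on different automatically selected holdout rows, so small differences should be read cautiously.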
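
To see how a model can reach high accuracy with zero precision and recall, consider a classifier that predicts "no churn" for everyone. The sketch below computes the usual metrics from a confusion matrix; the function name is hypothetical, and the counts come from the full 10,300-row table (1,174 churners), not from the evaluation split that ML.EVALUATE actually used.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# "Nobody churns" classifier on the full table: every churner is a false negative.
metrics = classification_metrics(tp=0, fp=0, fn=1174, tn=9126)
print(metrics)
```

The accuracy of this trivial classifier is close to both reported accuracies, which is why a lower classification threshold or a metric such as ROC AUC is worth examining on imbalanced data.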

In [28]:
%%bigquery
SELECT
  processed_input, weight
FROM
  ML.WEIGHTS(MODEL `heroic-trilogy-471119-k8.netflix.churn_model_enhanced`)
ORDER BY
  ABS(weight) DESC;


Unnamed: 0,processed_input,weight
0,flag_binge,-0.37798
1,is_missing_avg_rating,0.337516
2,__INTERCEPT__,0.288571
3,avg_rating,-0.017199
4,total_minutes,-0.000654
5,is_missing_age_band,0.0
6,plan_tier,
7,region,
8,age_band,
9,watch_time_bucket,


**Exploration Question: Which features made the most difference?**

The above query output shows the weights of the features in the enhanced churn model. Each weight gives the direction of a feature's influence on the predicted log-odds of churn and its size per unit of that feature, so magnitudes are only directly comparable between features on the same scale (such as the 0/1 flags). From the output, we can infer that:

1. flag_binge (-0.377980): This feature has the largest absolute weight and is negative. This indicates that being a binge-watcher lowers the predicted likelihood of churn, which aligns with our earlier exploration (binge-watchers are less likely to churn).

2. is_missing_avg_rating (0.337516): This feature has the second largest absolute weight and is positive. This means that a missing average rating raises the predicted likelihood of churn, also consistent with our previous findings.

3. intercept (0.288571): This is the baseline log-odds of churn when all other features are zero or at their reference category.

4. avg_rating (-0.017199) and total_minutes (-0.000654): These features have very small weights per unit. For total_minutes the unit is a single minute, and customers differ by hundreds of minutes, so a small per-minute weight does not by itself mean a small overall effect.

5. is_missing_age_band (0.000000): As expected, this has a weight of zero. The column is 0 for every row because age_band has no missing values, so it cannot help separate churners from non-churners.

6. The categorical features (plan_tier, region, age_band, watch_time_bucket) show empty weight values in this output. ML.WEIGHTS reports categorical inputs per category in its category_weights column, which this query did not select, so their influence cannot be read from this table. plan_region_combo was also a model input but does not appear in the rows shown.

Hence, among the features whose weights are visible here, the engineered flags flag_binge and is_missing_avg_rating have the largest coefficients, which makes them the most likely drivers of the modest improvement over the baseline model.
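
Logistic regression weights are easier to interpret as odds ratios, obtained by exponentiating the weight times the change in the feature. The sketch below applies this to the weights listed above. It assumes the weights are expressed per original unit of each feature (the unstandardized form), and the 500-minute change is an illustrative choice borrowed from the flag_binge threshold; the function name is hypothetical.

```python
import math

# Weights copied from the ML.WEIGHTS output above.
weights = {
    "flag_binge": -0.37798,
    "is_missing_avg_rating": 0.337516,
    "avg_rating": -0.017199,
    "total_minutes": -0.000654,
}


def odds_ratio(weight: float, delta: float = 1.0) -> float:
    """Multiplicative change in the odds of churn when a feature rises by delta."""
    return math.exp(weight * delta)


# For the 0/1 flags, a one-unit change is the whole effect of the flag.
print(f"flag_binge:            {odds_ratio(weights['flag_binge']):.3f}")
print(f"is_missing_avg_rating: {odds_ratio(weights['is_missing_avg_rating']):.3f}")

# For total_minutes, one minute is tiny; 500 minutes is a more realistic change.
print(f"total_minutes (+1):    {odds_ratio(weights['total_minutes']):.4f}")
print(f"total_minutes (+500):  {odds_ratio(weights['total_minutes'], 500):.3f}")
```

Seen this way, a 500-minute increase in total_minutes shifts the odds by an amount comparable to the binge flag, which is why raw weight magnitudes should not be compared across features with different units.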

