
# 🤖 MGMT 467 - Unit 2 Lab 2: Prompt Studio — Feature Engineering & Beyond

**Date:** 2025-10-16  
This notebook continues from Task 5 onward, focusing on feature engineering and model iteration using AI-assisted prompt design.

You'll continue to:
- Generate SQL using prompt templates
- Build and test new features
- Retrain and evaluate your ML model
- Reflect on the effect of engineered features



## Task 5.0: Bucket a Continuous Feature

**🎯 Goal:** Group 'total_minutes' into categories: low, medium, high.  
**📌 Requirements:** Use CASE WHEN or IF statements to create 'watch_time_bucket'.

---

### 🧠 Prompt Template  
> Write SQL that creates a new column watch_time_bucket based on total_minutes thresholds (<100, 100–300, >300).

---

### 👩‍🏫 Example Prompt  
> Create a new column watch_time_bucket with values 'low', 'medium', or 'high' based on total_minutes.

---

### 🔍 Exploration  
How does churn rate vary across these buckets?


In [1]:
# prompt: Write an SQL query that creates a new column watch_time_bucket with values 'low', 'medium', or 'high' based on total_minutes thresholds (<100, 100–300, >300).

%%bigquery
SELECT
  *,
  CASE
    WHEN total_minutes < 100 THEN 'low'
    WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
    ELSE 'high'
  END AS watch_time_bucket
FROM
  `mgmt-467-55510.netflix.cleaned_features`;

Unnamed: 0,region,plan_tier,age_band,avg_rating,total_minutes,churn_label,watch_time_bucket
0,South America,Standard,45-54,4.1,950,,high
1,Europe,Standard,35-44,3.8,800,False,high
2,North America,Premium,25-34,4.5,1200,True,high
3,Asia,Basic,18-24,2.1,300,True,medium


In [2]:
#Exploring how churn rate varies across these buckets

%%bigquery
WITH ChurnedFeaturesWithWatchTime AS (
  SELECT
    *,
    CASE
      WHEN total_minutes < 100 THEN 'low'
      WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
      ELSE 'high'
    END AS watch_time_bucket
  FROM
    `mgmt-467-55510.netflix.cleaned_features`
)
SELECT
  watch_time_bucket,
  AVG(CAST(churn_label AS INT64)) AS churn_rate
FROM
  ChurnedFeaturesWithWatchTime
GROUP BY
  watch_time_bucket
ORDER BY
  churn_rate DESC;

Unnamed: 0,watch_time_bucket,churn_rate
0,medium,1.0
1,high,0.5


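The medium watch_time_bucket shows a higher churn rate (1.0) than the high watch_time_bucket (0.5). No rows fall into the low bucket, and with only four rows in the table this difference is illustrative rather than conclusive.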
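To see how few rows sit behind each rate, the bucket query can be re-run locally with row counts added. This is a minimal sketch, assuming Python's built-in `sqlite3` as a stand-in for BigQuery and using only the four rows shown in the output above. The function name `bucket_churn` and the `n_rows` / `n_labeled` columns are illustrative additions, not part of the lab.

```python
import sqlite3

# The four rows from the notebook output (churn_label: 1 = True, 0 = False, None = NULL).
ROWS = [
    ("South America", "Standard", "45-54", 4.1, 950, None),
    ("Europe", "Standard", "35-44", 3.8, 800, 0),
    ("North America", "Premium", "25-34", 4.5, 1200, 1),
    ("Asia", "Basic", "18-24", 2.1, 300, 1),
]


def bucket_churn(rows):
    """Churn rate per watch_time_bucket, plus how many rows back each rate."""
    conn = sqlite3.connect(":memory:")  # local stand-in for the BigQuery table
    conn.execute(
        "CREATE TABLE cleaned_features (region TEXT, plan_tier TEXT, age_band TEXT,"
        " avg_rating REAL, total_minutes INTEGER, churn_label INTEGER)"
    )
    conn.executemany("INSERT INTO cleaned_features VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          CASE
            WHEN total_minutes < 100 THEN 'low'
            WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
            ELSE 'high'
          END AS watch_time_bucket,
          COUNT(*) AS n_rows,               -- all rows in the bucket
          COUNT(churn_label) AS n_labeled,  -- rows with a non-NULL label
          AVG(churn_label) AS churn_rate    -- AVG ignores NULL labels
        FROM cleaned_features
        GROUP BY watch_time_bucket
        ORDER BY churn_rate DESC
        """
    ).fetchall()
    conn.close()
    return result


for row in bucket_churn(ROWS):
    print(row)
```

The medium rate rests on a single labeled row and the high rate on two, which is why the comparison should be treated cautiously.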


## Task 5.1: Create a Binary Flag Feature

**🎯 Goal:** Add a binary column flag_binge (1 if total_minutes > 500).  
**📌 Requirements:** Use IF logic to create a binary column in SQL.

---

### 🧠 Prompt Template  
> Write a SQL query that adds flag_binge = 1 if total_minutes > 500, else 0.

---

### 👩‍🏫 Example Prompt  
> Add a binary column flag_binge to identify binge-watchers.

---

### 🔍 Exploration  
Are binge-watchers more or less likely to churn?


In [3]:
# prompt: Add a binary column flag_binge using IF logic to identify binge-watchers (1 if total_minutes > 500, else 0)

%%bigquery
SELECT
  *,
  IF(total_minutes > 500, 1, 0) AS flag_binge
FROM
  `mgmt-467-55510.netflix.cleaned_features`;

Unnamed: 0,region,plan_tier,age_band,avg_rating,total_minutes,churn_label,flag_binge
0,South America,Standard,45-54,4.1,950,,1
1,Europe,Standard,35-44,3.8,800,False,1
2,North America,Premium,25-34,4.5,1200,True,1
3,Asia,Basic,18-24,2.1,300,True,0


In [4]:
# prompt: Are binge-watchers more or less likely to churn?

%%bigquery
WITH ChurnedFeaturesWithBingeFlag AS (
  SELECT
    *,
    IF(total_minutes > 500, 1, 0) AS flag_binge
  FROM
    `mgmt-467-55510.netflix.cleaned_features`
)
SELECT
  flag_binge,
  AVG(CAST(churn_label AS INT64)) AS churn_rate
FROM
  ChurnedFeaturesWithBingeFlag
GROUP BY
  flag_binge
ORDER BY
  flag_binge DESC;

Unnamed: 0,flag_binge,churn_rate
0,1,0.5
1,0,1.0


From these results, binge-watchers (churn rate 0.5) appear less likely to churn than non-binge-watchers (churn rate 1.0) in this dataset. This suggests that high engagement (more than 500 total minutes) may correlate with higher customer retention, although the sample is very small.
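Because flag_binge and watch_time_bucket are both derived from total_minutes, it can help to check how much they overlap before feeding both to a model. The following is a rough sketch, assuming `sqlite3` as a local stand-in for BigQuery and the four rows from the output above. `CASE` replaces BigQuery's `IF` here, and the helper name `bucket_vs_binge` is hypothetical.

```python
import sqlite3

# The four rows from the notebook output (churn_label: 1 = True, 0 = False, None = NULL).
ROWS = [
    ("South America", "Standard", "45-54", 4.1, 950, None),
    ("Europe", "Standard", "35-44", 3.8, 800, 0),
    ("North America", "Premium", "25-34", 4.5, 1200, 1),
    ("Asia", "Basic", "18-24", 2.1, 300, 1),
]


def bucket_vs_binge(rows):
    """Cross-tabulate watch_time_bucket against flag_binge with churn rates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cleaned_features (region TEXT, plan_tier TEXT, age_band TEXT,"
        " avg_rating REAL, total_minutes INTEGER, churn_label INTEGER)"
    )
    conn.executemany("INSERT INTO cleaned_features VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          CASE
            WHEN total_minutes < 100 THEN 'low'
            WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
            ELSE 'high'
          END AS watch_time_bucket,
          CASE WHEN total_minutes > 500 THEN 1 ELSE 0 END AS flag_binge,
          COUNT(*) AS n_rows,
          AVG(churn_label) AS churn_rate
        FROM cleaned_features
        GROUP BY watch_time_bucket, flag_binge
        ORDER BY flag_binge DESC
        """
    ).fetchall()
    conn.close()
    return result


for row in bucket_vs_binge(ROWS):
    print(row)
```

In this sample every binge-watcher is also in the high bucket, so the two features carry the same information here and their separate effects cannot be told apart.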


## Task 5.2: Create an Interaction Term

**🎯 Goal:** Create plan_region_combo by combining plan_tier and region.  
**📌 Requirements:** Use CONCAT or STRING functions.

---

### 🧠 Prompt Template  
> Generate SQL to create a new column by combining plan_tier and region with an underscore.

---

### 👩‍🏫 Example Prompt  
> Create a column called plan_region_combo as CONCAT(plan_tier, '_', region).

---

### 🔍 Exploration  
Which plan-region combos have highest churn?


In [5]:
# prompt: Create a column called plan_region_combo as CONCAT(plan_tier, '_', region). Basically, creating this column by combining plan_tier and region with an underscore

%%bigquery
SELECT
  *,
  CONCAT(plan_tier, '_', region) AS plan_region_combo
FROM
  `mgmt-467-55510.netflix.cleaned_features`;

Unnamed: 0,region,plan_tier,age_band,avg_rating,total_minutes,churn_label,plan_region_combo
0,South America,Standard,45-54,4.1,950,,Standard_South America
1,Europe,Standard,35-44,3.8,800,False,Standard_Europe
2,North America,Premium,25-34,4.5,1200,True,Premium_North America
3,Asia,Basic,18-24,2.1,300,True,Basic_Asia


In [6]:
# prompt: Which plan-region combos have highest churn?

%%bigquery
WITH FeaturesWithCombo AS (
  SELECT
    *,
    CONCAT(plan_tier, '_', region) AS plan_region_combo
  FROM
    `mgmt-467-55510.netflix.cleaned_features`
)
SELECT
  plan_region_combo,
  AVG(CAST(churn_label AS INT64)) AS churn_rate
FROM
  FeaturesWithCombo
GROUP BY
  plan_region_combo
ORDER BY
  churn_rate DESC;

Unnamed: 0,plan_region_combo,churn_rate
0,Basic_Asia,1.0
1,Premium_North America,1.0
2,Standard_Europe,0.0
3,Standard_South America,


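In these results, Basic_Asia and Premium_North America each show a churn rate of 1.0, while Standard_Europe shows 0.0. Standard_South America has no churn rate because its only row has a NULL churn_label. Each combination is backed by a single row, so these rates indicate where to look rather than establish that one combination is riskier than another.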
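The per-combination rates are easier to judge when the number of labeled rows is shown next to them. Below is a small sketch under the same assumptions as the queries above (four rows from the output, `sqlite3` standing in for BigQuery, `||` standing in for `CONCAT`). The function name `combo_churn` and the `n_labeled` column are illustrative.

```python
import sqlite3

# The four rows from the notebook output (churn_label: 1 = True, 0 = False, None = NULL).
ROWS = [
    ("South America", "Standard", "45-54", 4.1, 950, None),
    ("Europe", "Standard", "35-44", 3.8, 800, 0),
    ("North America", "Premium", "25-34", 4.5, 1200, 1),
    ("Asia", "Basic", "18-24", 2.1, 300, 1),
]


def combo_churn(rows):
    """Churn rate per plan_region_combo with the count of labeled rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cleaned_features (region TEXT, plan_tier TEXT, age_band TEXT,"
        " avg_rating REAL, total_minutes INTEGER, churn_label INTEGER)"
    )
    conn.executemany("INSERT INTO cleaned_features VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          plan_tier || '_' || region AS plan_region_combo,  -- SQLite spelling of CONCAT
          COUNT(churn_label) AS n_labeled,
          AVG(churn_label) AS churn_rate                    -- NULL when no labeled rows
        FROM cleaned_features
        GROUP BY plan_region_combo
        ORDER BY churn_rate DESC, plan_region_combo
        """
    ).fetchall()
    conn.close()
    return result


for row in combo_churn(ROWS):
    print(row)
```

A combination with `n_labeled` of 0 yields a NULL rate, which matches the empty churn_rate for Standard_South America in the output above.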


## Task 5.3: Add Missingness Indicator Flags

**🎯 Goal:** Add binary flags to capture NULL values in age_band and avg_rating.  
**📌 Requirements:** Use IS NULL logic to create new flag columns.

---

### 🧠 Prompt Template  
> Create a new column is_missing_[col_name] that is 1 when column is NULL, else 0.

---

### 👩‍🏫 Example Prompt  
> Add is_missing_age that flags rows where age_band IS NULL.

---

### 🔍 Exploration  
Do missing values correlate with churn?


In [7]:
# prompt: Create a new column is_missing_age_band and is_missing_avg_rating that is 1 when column is NULL, else 0

%%bigquery
SELECT
  *,
  CASE WHEN age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
  CASE WHEN avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating
FROM
  `mgmt-467-55510.netflix.cleaned_features`;

Unnamed: 0,region,plan_tier,age_band,avg_rating,total_minutes,churn_label,is_missing_age_band,is_missing_avg_rating
0,South America,Standard,45-54,4.1,950,,0,0
1,Europe,Standard,35-44,3.8,800,False,0,0
2,North America,Premium,25-34,4.5,1200,True,0,0
3,Asia,Basic,18-24,2.1,300,True,0,0


In [8]:
# prompt: Do missing values correlate with churn?

%%bigquery
WITH FeaturesWithMissingFlags AS (
  SELECT
    *,
    CASE WHEN age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
    CASE WHEN avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating
  FROM
    `mgmt-467-55510.netflix.cleaned_features`
)
SELECT
  is_missing_age_band,
  is_missing_avg_rating,
  AVG(CAST(churn_label AS INT64)) AS churn_rate
FROM
  FeaturesWithMissingFlags
GROUP BY
  is_missing_age_band, is_missing_avg_rating
ORDER BY
  churn_rate DESC;

Unnamed: 0,is_missing_age_band,is_missing_avg_rating,churn_rate
0,0,0,0.666667


Based on the current data, we cannot assess whether missing values in these columns correlate with churn, because neither age_band nor avg_rating contains any NULLs. All rows fall into the single group where both flags are 0.
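The single output row can be confirmed by counting how many rows land in each flag combination. This sketch again assumes `sqlite3` as a local substitute for BigQuery and the four rows shown above. The name `missing_flag_churn` and the count columns are hypothetical additions.

```python
import sqlite3

# The four rows from the notebook output (churn_label: 1 = True, 0 = False, None = NULL).
ROWS = [
    ("South America", "Standard", "45-54", 4.1, 950, None),
    ("Europe", "Standard", "35-44", 3.8, 800, 0),
    ("North America", "Premium", "25-34", 4.5, 1200, 1),
    ("Asia", "Basic", "18-24", 2.1, 300, 1),
]


def missing_flag_churn(rows):
    """Group by the two missingness flags and report counts and churn rate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cleaned_features (region TEXT, plan_tier TEXT, age_band TEXT,"
        " avg_rating REAL, total_minutes INTEGER, churn_label INTEGER)"
    )
    conn.executemany("INSERT INTO cleaned_features VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          CASE WHEN age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
          CASE WHEN avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating,
          COUNT(*) AS n_rows,
          COUNT(churn_label) AS n_labeled,
          AVG(churn_label) AS churn_rate
        FROM cleaned_features
        GROUP BY is_missing_age_band, is_missing_avg_rating
        """
    ).fetchall()
    conn.close()
    return result


print(missing_flag_churn(ROWS))
```

With only one group present, both flags are constant, so they cannot separate churners from non-churners in this sample.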


## Task 5.4: Create Time-Based Features (Optional)

**🎯 Goal:** Add a column days_since_last_login.  
**📌 Requirements:** Use DATE_DIFF with CURRENT_DATE and last_login_date.

---

### 🧠 Prompt Template  
> Write SQL to create a column showing days since last login using DATE_DIFF.

---

### 👩‍🏫 Example Prompt  
> Add a column days_since_last_login = DATE_DIFF(CURRENT_DATE(), last_login_date, DAY).

---

### 🔍 Exploration  
Does login recency affect churn rate?
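No query was run for this optional task. As a rough illustration of what the DATE_DIFF expression computes, the same arithmetic can be sketched in plain Python. The login dates below are hypothetical, the as-of date is taken from the notebook header, and the helper name `days_since_last_login` simply mirrors the target column.

```python
from datetime import date


def days_since_last_login(last_login_date, today=None):
    """Whole days from last_login_date to today, like DATE_DIFF(today, last_login_date, DAY)."""
    if last_login_date is None:
        return None  # SQL returns NULL when an argument is NULL
    today = today or date.today()
    return (today - last_login_date).days


# Hypothetical login dates, evaluated as of the notebook date (2025-10-16).
as_of = date(2025, 10, 16)
logins = [date(2025, 10, 1), date(2025, 9, 16), None]
recency = [days_since_last_login(d, as_of) for d in logins]
print(recency)  # [15, 30, None]
```

Passing a fixed as-of date instead of the current date keeps the feature reproducible when the notebook is re-run later.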



## Task 5.5: Assemble Enhanced Feature Table

**🎯 Goal:** Create churn_features_enhanced with all engineered columns.  
**📌 Requirements:** Include all prior features + engineered columns.

---

### 🧠 Prompt Template  
> Generate SQL to create churn_features_enhanced with new columns: watch_time_bucket, plan_region_combo, flag_binge, etc.

---

### 👩‍🏫 Example Prompt  
> Build a new table churn_features_enhanced with all original features + engineered ones.

---

### 🔍 Exploration  
Are row counts stable? Any NULLs introduced?


In [10]:
# prompt: Generate SQL to create churn_features_enhanced with new columns: watch_time_bucket, plan_region_combo, flag_binge, etc

%%bigquery
CREATE OR REPLACE TABLE `mgmt-467-55510.netflix.churn_features_enhanced` AS
SELECT
  *,
  CASE
    WHEN total_minutes < 100 THEN 'low'
    WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
    ELSE 'high'
  END AS watch_time_bucket,
  IF(total_minutes > 500, 1, 0) AS flag_binge,
  CONCAT(plan_tier, '_', region) AS plan_region_combo,
  CASE WHEN age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
  CASE WHEN avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating
FROM
  `mgmt-467-55510.netflix.cleaned_features`;

In [11]:
# prompt: check cleaned_features and churn_features_enhanced to see if all row counts are stable
%%bigquery
SELECT
  (SELECT COUNT(*) FROM `mgmt-467-55510.netflix.cleaned_features`) AS original_row_count,
  (SELECT COUNT(*) FROM `mgmt-467-55510.netflix.churn_features_enhanced`) AS enhanced_row_count;

Unnamed: 0,original_row_count,enhanced_row_count
0,4,4


In [12]:
# prompt: check cleaned_features and churn_features_enhanced for introduction of NULL values

%%bigquery
SELECT
  COUNTIF(region IS NULL) AS null_region,
  COUNTIF(plan_tier IS NULL) AS null_plan_tier,
  COUNTIF(age_band IS NULL) AS null_age_band,
  COUNTIF(avg_rating IS NULL) AS null_avg_rating,
  COUNTIF(total_minutes IS NULL) AS null_total_minutes,
  COUNTIF(churn_label IS NULL) AS null_churn_label,
  COUNTIF(watch_time_bucket IS NULL) AS null_watch_time_bucket,
  COUNTIF(flag_binge IS NULL) AS null_flag_binge,
  COUNTIF(plan_region_combo IS NULL) AS null_plan_region_combo,
  COUNTIF(is_missing_age_band IS NULL) AS null_is_missing_age_band,
  COUNTIF(is_missing_avg_rating IS NULL) AS null_is_missing_avg_rating
FROM
  `mgmt-467-55510.netflix.churn_features_enhanced`;

Unnamed: 0,null_region,null_plan_tier,null_age_band,null_avg_rating,null_total_minutes,null_churn_label,null_watch_time_bucket,null_flag_binge,null_plan_region_combo,null_is_missing_age_band,null_is_missing_avg_rating
0,0,0,0,0,0,1,0,0,0,0,0


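Yes, the row counts are stable: cleaned_features and churn_features_enhanced both contain 4 rows. No NULLs were introduced by the engineered columns. The one NULL reported (null_churn_label = 1) is the missing churn_label on the South America row, which already exists in cleaned_features.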
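The NULL check above only inspects the enhanced table, so it cannot by itself show whether a NULL is new or inherited. A before/after comparison makes that explicit. The sketch below assumes `sqlite3` as a local stand-in for BigQuery and the four rows from the earlier output. The function `null_audit` and its return structure are illustrative, not part of the lab.

```python
import sqlite3

# The four rows from the notebook output (churn_label: 1 = True, 0 = False, None = NULL).
ROWS = [
    ("South America", "Standard", "45-54", 4.1, 950, None),
    ("Europe", "Standard", "35-44", 3.8, 800, 0),
    ("North America", "Premium", "25-34", 4.5, 1200, 1),
    ("Asia", "Basic", "18-24", 2.1, 300, 1),
]
ORIGINAL = ["region", "plan_tier", "age_band", "avg_rating", "total_minutes", "churn_label"]
ENGINEERED = ["watch_time_bucket", "flag_binge", "plan_region_combo",
              "is_missing_age_band", "is_missing_avg_rating"]


def null_audit(rows):
    """Compare row counts and NULL counts before and after feature engineering."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cleaned_features (region TEXT, plan_tier TEXT, age_band TEXT,"
        " avg_rating REAL, total_minutes INTEGER, churn_label INTEGER)"
    )
    conn.executemany("INSERT INTO cleaned_features VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.execute(
        """
        CREATE TABLE churn_features_enhanced AS
        SELECT *,
          CASE WHEN total_minutes < 100 THEN 'low'
               WHEN total_minutes BETWEEN 100 AND 300 THEN 'medium'
               ELSE 'high' END AS watch_time_bucket,
          CASE WHEN total_minutes > 500 THEN 1 ELSE 0 END AS flag_binge,
          plan_tier || '_' || region AS plan_region_combo,
          CASE WHEN age_band IS NULL THEN 1 ELSE 0 END AS is_missing_age_band,
          CASE WHEN avg_rating IS NULL THEN 1 ELSE 0 END AS is_missing_avg_rating
        FROM cleaned_features
        """
    )

    def count(table, where="1=1"):
        return conn.execute(f"SELECT COUNT(*) FROM {table} WHERE {where}").fetchone()[0]

    audit = {
        "row_counts": (count("cleaned_features"), count("churn_features_enhanced")),
        # NULLs per original column: (before, after)
        "original_nulls": {c: (count("cleaned_features", f"{c} IS NULL"),
                               count("churn_features_enhanced", f"{c} IS NULL"))
                           for c in ORIGINAL},
        # NULLs in the new columns only exist after engineering
        "engineered_nulls": {c: count("churn_features_enhanced", f"{c} IS NULL")
                             for c in ENGINEERED},
    }
    conn.close()
    return audit


print(null_audit(ROWS))
```

If the before and after counts match for every original column and every engineered column reports zero, no NULLs were introduced.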


## Task 6: Retrain Model on Engineered Features

**🎯 Goal:** Train a logistic regression model using churn_features_enhanced.  
**📌 Requirements:** Use BQML logistic_reg model with new feature columns.

---

### 🧠 Prompt Template  
> Write CREATE MODEL SQL using enhanced features including flags and buckets.

---

### 👩‍🏫 Example Prompt  
> Retrain churn_model_enhanced using watch_time_bucket, flag_binge, plan_region_combo.

---

### 🔍 Exploration  
Does model accuracy improve?


In [13]:
# prompt: Retrain churn_model_enhanced using watch_time_bucket, flag_binge, plan_region_combo

%%bigquery
CREATE OR REPLACE MODEL `mgmt-467-55510.netflix.churn_model_enhanced`
OPTIONS(
  MODEL_TYPE='LOGISTIC_REG',
  input_label_cols=['churn_label']
) AS
SELECT
  churn_label,
  watch_time_bucket,
  flag_binge,
  plan_region_combo,
  is_missing_age_band,
  is_missing_avg_rating
FROM
  `mgmt-467-55510.netflix.churn_features_enhanced`
WHERE
  churn_label IS NOT NULL;

Comparing the base model (created in Tasks 0-4) with the enhanced one, there does not appear to be a significant improvement.


## Task 7: Compare Model Performance

**🎯 Goal:** Compare base model vs enhanced model using ML.EVALUATE.  
**📌 Requirements:** Use same evaluation query for both models.

---

### 🧠 Prompt Template  
> Write a SQL query to evaluate churn_model_enhanced and compare with churn_model.

---

### 👩‍🏫 Example Prompt  
> Compare ML.EVALUATE output from both models side-by-side.

---

### 🔍 Exploration  
Which features made the most difference?


In [14]:
# prompt: Write a SQL query to evaluate churn_model_enhanced and compare with churn_model.

%%bigquery
SELECT
  *
FROM
  ML.EVALUATE(MODEL `mgmt-467-55510.netflix.churn_model_enhanced`,
    (SELECT
      churn_label,
      watch_time_bucket,
      flag_binge,
      plan_region_combo,
      is_missing_age_band,
      is_missing_avg_rating
    FROM
      `mgmt-467-55510.netflix.churn_features_enhanced`
    WHERE
      churn_label IS NOT NULL)
  );

Unnamed: 0,precision,recall,accuracy,f1_score,log_loss,roc_auc
0,1.0,1.0,1.0,1.0,3e-06,1.0


In [16]:
# prompt: Get the model weights for churn_model_enhanced to see feature importance.

%%bigquery
SELECT
  *
FROM
  ML.WEIGHTS(MODEL `mgmt-467-55510.netflix.churn_model_enhanced`);

Unnamed: 0,processed_input,weight,category_weights
0,watch_time_bucket,,"[{'category': 'high', 'weight': 2.664535259100..."
1,flag_binge,-4.572876,[]
2,plan_region_combo,,"[{'category': 'Standard_Europe', 'weight': -12..."
3,is_missing_age_band,0.0,[]
4,is_missing_avg_rating,0.0,[]
5,__INTERCEPT__,4.572876,[]


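flag_binge (weight -4.57) and the Standard_Europe category of plan_region_combo (a weight beginning -12 in the truncated output) carry the largest negative weights shown, so the model associates both with a lower likelihood of churn. The category weights for watch_time_bucket are truncated, and the value shown for 'high' begins 2.66 with no negative sign, so the display does not support reading it as a churn-reducing weight. Both missingness flags have a weight of 0.0, consistent with neither column containing NULLs. Because the model was trained and evaluated on the same three labeled rows, the perfect scores above and these weights describe how the model fits this tiny sample rather than how well it would generalize.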
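To get a feel for what a weight of this size means, the intercept and the flag_binge weight can be pushed through the logistic function. This is only a partial sketch: it assumes the weights apply to the raw feature values (the default for ML.WEIGHTS), and it leaves out the categorical weights for watch_time_bucket and plan_region_combo because they are truncated in the output. The function names are hypothetical.

```python
import math

# Values copied from the ML.WEIGHTS output above.
INTERCEPT = 4.572876
W_FLAG_BINGE = -4.572876


def sigmoid(z):
    """Logistic function mapping a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def partial_churn_probability(flag_binge):
    """Probability implied by the intercept and flag_binge alone.

    Categorical contributions are omitted (truncated in the output),
    so this is not the model's actual prediction for any row.
    """
    return sigmoid(INTERCEPT + W_FLAG_BINGE * flag_binge)


print(partial_churn_probability(1))  # intercept and weight cancel: sigmoid(0) = 0.5
print(partial_churn_probability(0))  # intercept only: close to 1
```

The two weights cancel exactly for binge-watchers, which lines up with the 0.5 churn rate observed for flag_binge = 1 in Task 5.1.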


## 🤔 Chain-of-Thought Prompts: Feature Engineering

### 1. Why bucket continuous values like watch time?
- What patterns become clearer by using categories like "low", "medium", "high"?



### 2. What value do interaction terms (e.g., `plan_region_combo`) add?
- Could some plans behave differently in different regions?

### 3. What’s the purpose of binary flags like `flag_binge`?
- Can these capture unique behaviors not reflected in raw totals?

### 4. After evaluating the enhanced model:
- Which new features helped the most?
- Did any surprise you?

✍️ Write your responses in a text cell below or in a shared doc for discussion.


1. Bucketing continuous values simplifies complex data and improves interpretability. This method also lets us capture non-linear relationships and limits the influence of outliers. Using "low", "medium", and "high" categories ensures that trends are quickly identified and that we can draw actionable insights from them.

2. Interaction terms help capture relationships that may not be visible when each feature is considered independently. They let the effect of one feature on the target depend on the value of another, so the model can predict more accurately in context. Across regions, factors such as market maturity, cultural norms, and economic conditions can contribute to the observed churn rates. Regional data can highlight localized dynamics that would otherwise be missed.

3. Binary flags indicate whether a particular condition is met. They simplify model interpretation, capture non-linear relationships, and can mark missing data. A flag shows whether a row falls into a specific category and makes the difference between groups explicit, which helps when testing hypotheses.

4. I was really surprised by the insights I gained through the models. I found the difference with and without the interaction variables interesting and helpful for understanding customer retention.