### **Problem 4: Shift Scheduling Imbalance and Batching Gaps**

Production managers believe the scheduling is unbalanced. Investigate with a 30-day scope:

**Your tasks:**

1. Parse `Date Produced` to extract weekday and ISO calendar week.
2. Count total batches produced by shift and weekday.
3. Highlight shifts where output dropped by > 30% compared to the average.
4. Check if any shifts repeatedly missed full daily coverage (i.e., 0 entries for a date).
5. Group by `Operator` and count how many unique days they worked.
6. Identify any operator who worked more than 20 out of 30 days — flag for burnout.
7. Create a pivot table showing production volume by day and shift.

*Hint: Use `.dt.day_name()` and `.pivot_table()`.*

In [273]:
import pandas as pd
import numpy as np
import re

In [274]:
data = pd.read_csv('Spool_Manufacturing_Batch_Log.csv')

In [275]:
df = data.copy()  # read_csv already returns a DataFrame; work on a copy so the raw data stays untouched

In [276]:
# 1. Parse `Date Produced` to extract weekday and ISO calendar week.
df['Date Produced'] = pd.to_datetime(df['Date Produced'])  # Convert to datetime

In [277]:
df['Weekday Produced'] = df['Date Produced'].dt.day_name() # Extract weekday name

In [278]:
df['ISO Week'] = df['Date Produced'].dt.isocalendar().week # extract ISO calendar week

In [279]:
df[['Date Produced', 'Weekday Produced', 'ISO Week']].head()

,Date Produced,Weekday Produced,ISO Week
0,2025-05-01,Thursday,18
1,2025-05-01,Thursday,18
2,2025-05-01,Thursday,18
3,2025-05-01,Thursday,18
4,2025-05-01,Thursday,18
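
The parsing step can be checked on a tiny frame. This is a minimal sketch on a made-up three-row sample (the `sample` frame and the `ISO Year` column are illustrative and not part of the batch log). It shows that ISO weeks run Monday through Sunday, so a Sunday falls in the same ISO week as the preceding Thursday, which matters when weekdays are later displayed Sunday-first.

```python
import pandas as pd

# Hypothetical three-row sample; the real log has many more columns.
sample = pd.DataFrame({"Date Produced": ["2025-05-01", "2025-05-04", "2025-05-05"]})

# errors="coerce" turns unparseable strings into NaT instead of raising.
sample["Date Produced"] = pd.to_datetime(sample["Date Produced"], errors="coerce")

# isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns.
iso = sample["Date Produced"].dt.isocalendar()
sample["Weekday Produced"] = sample["Date Produced"].dt.day_name()
sample["ISO Year"] = iso["year"]
sample["ISO Week"] = iso["week"]

print(sample)
```

Keeping the ISO year next to the ISO week avoids ambiguity if the log ever spans a year boundary.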


In [280]:
# 2. Count total batches produced by shift and weekday.
total_batches_per_shift_weekday = df.groupby(['Shift', 'Weekday Produced'])['Batch ID'].size().unstack().fillna(0)

In [281]:
total_batches_per_shift_weekday.head()

Shift \ Weekday Produced,Friday,Monday,Saturday,Sunday,Thursday,Tuesday,Wednesday
Shift A,28,25,26,28,27,17,23
Shift B,34,28,23,24,28,24,30
Shift C,28,19,23,20,35,31,19


In [282]:
# Define desired weekday order
weekday_order = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

In [283]:
total_batches_per_shift_weekday = total_batches_per_shift_weekday.reindex(columns=weekday_order) # reorders columns

In [284]:
total_batches_per_shift_weekday.head()

Shift \ Weekday Produced,Sunday,Monday,Tuesday,Wednesday,Thursday,Friday,Saturday
Shift A,28,25,17,23,27,28,26
Shift B,24,28,24,30,28,34,23
Shift C,20,19,31,19,35,28,23


In [285]:
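# 3. Highlight shifts where output dropped by > 30% compared to the average.
# Interpretation used here: compare each shift's total batch count to the mean total across shifts.
output_per_shift = df.groupby('Shift')['Batch ID'].count()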

In [286]:
output_per_shift

Shift
Shift A    174
Shift B    191
Shift C    175
Name: Batch ID, dtype: int64

In [287]:
average_output_per_shift = output_per_shift.mean()

In [288]:
average_output_per_shift

np.float64(180.0)

In [289]:
dropped_average = average_output_per_shift * 0.7  # threshold: 30% below the average

In [290]:
dropped_average

np.float64(125.99999999999999)

In [291]:
flagged_shifts = output_per_shift[output_per_shift < dropped_average]

In [292]:
flagged_shifts

Series([], Name: Batch ID, dtype: int64)
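
Comparing 30-day totals can hide individual bad days. A possible finer-grained reading of task 3 is to compare each shift's daily batch count against that shift's own daily average. The sketch below uses a made-up toy log (the `toy` frame and its values are assumptions, not the real data) and flags date/shift pairs that fall more than 30% below the shift's mean.

```python
import pandas as pd

# Hypothetical toy log: one row per batch.
toy = pd.DataFrame({
    "Date Produced": pd.to_datetime(
        ["2025-05-01"] * 4 + ["2025-05-02"] * 4 + ["2025-05-03"] * 3
    ),
    "Shift": ["Shift A", "Shift A", "Shift B", "Shift B",
              "Shift A", "Shift A", "Shift B", "Shift B",
              "Shift A", "Shift A", "Shift B"],
    "Batch ID": range(11),
})

# Rows are dates, columns are shifts, values are batch counts.
daily = toy.groupby(["Date Produced", "Shift"])["Batch ID"].count().unstack(fill_value=0)

# Each shift's average daily output and the 30%-drop threshold.
shift_mean = daily.mean()
low_days = daily.lt(shift_mean * 0.7)  # the Series aligns with the shift columns

# Keep only the (date, shift) pairs that fell below the threshold.
flagged = daily.stack()[low_days.stack()]
print(flagged)
```

In this toy data only Shift B on 2025-05-03 is flagged: it produced 1 batch against a daily mean of about 1.67.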

In [293]:
# 4. Check if any shifts repeatedly missed full daily coverage (i.e., 0 entries for a date).
# Note: a date with no batches from any shift does not appear in this table at all.
daily_counts = df.groupby(['Date Produced', 'Shift'])['Batch ID'].count().unstack(fill_value=0)

In [294]:
missed_coverage = (daily_counts == 0)

In [295]:
missed_coverage_summary = missed_coverage.sum()

In [296]:
missed_coverage_summary

Shift
Shift A    0
Shift B    0
Shift C    0
dtype: int64

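In [297]:
# "Repeatedly" is taken here to mean more than 2 missed dates; a separate name keeps the task 3 result intact.
repeatedly_missed_shifts = missed_coverage_summary[missed_coverage_summary > 2]

In [298]:
repeatedly_missed_shifts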

Series([], dtype: int64)
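
The coverage check only sees dates that occur somewhere in the log. If a calendar day has no batches from any shift, it is missing from the grouped table and is never counted as a gap. A minimal sketch of one way to close that hole, on a made-up toy log (the `toy` frame is an assumption), is to reindex against the full calendar range before counting zeros.

```python
import pandas as pd

# Hypothetical toy log: nothing at all was produced on 2025-05-02.
toy = pd.DataFrame({
    "Date Produced": pd.to_datetime(
        ["2025-05-01", "2025-05-01", "2025-05-03", "2025-05-04", "2025-05-04"]
    ),
    "Shift": ["Shift A", "Shift B", "Shift A", "Shift A", "Shift B"],
    "Batch ID": range(5),
})

daily = toy.groupby(["Date Produced", "Shift"])["Batch ID"].count().unstack(fill_value=0)

# Every calendar day between the first and last production date.
all_dates = pd.date_range(toy["Date Produced"].min(), toy["Date Produced"].max(), freq="D")

# Dates absent from the log become rows of zeros.
daily_full = daily.reindex(all_dates, fill_value=0)

missed_days = (daily_full == 0).sum()
print(missed_days)
```

Without the reindex, this toy data would report 0 missed days for Shift A and 1 for Shift B; with it, the counts are 1 and 2.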

In [299]:
# 5. Group by `Operator` and count how many unique days they worked.
operator_unique_days_worked = df.groupby('Operator')['Date Produced'].nunique().sort_values(ascending=False)

In [300]:
operator_unique_days_worked

Operator
Paul Macdonald     29
Mary Anderson      28
David Pittman      28
Rita Graves        26
Patrick Meyer      26
Sherry Bryant      26
Kristen Cole       25
Jacqueline Bass    25
Amanda Anderson    24
Name: Date Produced, dtype: int64

In [301]:
# 6. Identify any operator who worked more than 20 out of 30 days — flag for burnout.
burnout_operators = operator_unique_days_worked.apply(lambda x: 'Burnout' if x > 20 else 'OK') # Series

In [302]:
burnout_operators

Operator
Paul Macdonald     Burnout
Mary Anderson      Burnout
David Pittman      Burnout
Rita Graves        Burnout
Patrick Meyer      Burnout
Sherry Bryant      Burnout
Kristen Cole       Burnout
Jacqueline Bass    Burnout
Amanda Anderson    Burnout
Name: Date Produced, dtype: object

In [303]:
# Build a DataFrame version so the flag can be merged back into df
burnout_operators_df = operator_unique_days_worked.to_frame(name='Days Worked')

In [304]:
burnout_operators_df

Operator,Days Worked
Paul Macdonald,29
Mary Anderson,28
David Pittman,28
Rita Graves,26
Patrick Meyer,26
Sherry Bryant,26
Kristen Cole,25
Jacqueline Bass,25
Amanda Anderson,24


In [305]:
burnout_operators_df['Burnout Flag'] = burnout_operators_df['Days Worked'].apply(lambda x: 'Burnout' if x > 20 else 'OK')

In [306]:
burnout_operators_df

Operator,Days Worked,Burnout Flag
Paul Macdonald,29,Burnout
Mary Anderson,28,Burnout
David Pittman,28,Burnout
Rita Graves,26,Burnout
Patrick Meyer,26,Burnout
Sherry Bryant,26,Burnout
Kristen Cole,25,Burnout
Jacqueline Bass,25,Burnout
Amanda Anderson,24,Burnout
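
The "20 out of 30 days" rule only makes sense if the counts come from a 30-day window, and the notebook does not filter the log by date. The sketch below, on made-up data (the operators `Ann` and `Bob`, the dates, and the choice of a window starting at the earliest date are all assumptions), restricts the log to 30 days first and then flags with `np.where` instead of a row-wise `apply`.

```python
import numpy as np
import pandas as pd

# Hypothetical operators: Ann works 22 days in May, Bob works 5 plus one day in June.
ann_days = pd.date_range("2025-05-01", periods=22, freq="D")
bob_days = pd.date_range("2025-05-01", periods=5, freq="D")
toy = pd.DataFrame({
    "Operator": ["Ann"] * len(ann_days) + ["Bob"] * (len(bob_days) + 1),
    "Date Produced": list(ann_days) + list(bob_days) + [pd.Timestamp("2025-06-10")],
})

# Assumed window: 30 days starting at the earliest date in the log (end is exclusive).
window_start = toy["Date Produced"].min()
window_end = window_start + pd.Timedelta(days=30)
in_window = toy[(toy["Date Produced"] >= window_start) & (toy["Date Produced"] < window_end)]

# Unique working days per operator inside the window.
summary = in_window.groupby("Operator")["Date Produced"].nunique().to_frame("Days Worked")

# Vectorized flag: more than 20 days worked marks a burnout risk.
summary["Burnout Flag"] = np.where(summary["Days Worked"] > 20, "Burnout", "OK")
print(summary)
```

Bob's June entry falls outside the window, so he is counted for 5 days rather than 6.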


In [307]:
# Merge days worked and burnout flag into the original df ('Operator' is the index name of burnout_operators_df)
df = df.merge(burnout_operators_df, on='Operator', how='left')

In [308]:
df.head()

,Batch ID,Date Produced,Material Type,Color,Production Line,Weight (g),Scrap Rate (%),Pass/Fail,Operator,Phone,Email,Shift,Machine Barcode,Lot Number,Weekday Produced,ISO Week,Days Worked,Burnout Flag
0,eb6221c8-f45a-49f6-8c0c-ee28f5a29fc0,2025-05-01,PLA,Black,Line 2,1024.84,1.79,Pass,Jacqueline Bass,001-988-061-3911x7775,haynesdavid@yahoo.com,Shift C,MCH-001,L9935,Thursday,18,25,Burnout
1,9748d109-45e1-4bb0-98af-53396946b791,2025-05-01,PLA,Red,Line 1,1032.38,4.28,Pass,Kristen Cole,300-905-2906x4997,theodore63@yahoo.com,Shift A,MCH-001,L4257,Thursday,18,25,Burnout
2,35de154c-67d6-4144-a9e2-8afe65353fb2,2025-05-01,ABS,Blue,Line 4,988.29,1.65,Pass,Sherry Bryant,001-741-699-1830x254,timothy04@knox.net,Shift C,MCH-001,L3615,Thursday,18,26,Burnout
3,f3aca5db-d6d8-4dd3-9831-10d47c1c2c08,2025-05-01,ABS,Black,Line 2,1078.96,3.15,Pass,Kristen Cole,300-905-2906x4997,theodore63@yahoo.com,Shift C,MCH-003,L2674,Thursday,18,25,Burnout
4,5299334b-a2bd-43b4-99ca-e5073e6bab53,2025-05-01,PLA,Black,Line 3,976.53,2.81,Pass,Kristen Cole,300-905-2906x4997,theodore63@yahoo.com,Shift A,MCH-001,L8527,Thursday,18,25,Burnout


In [309]:
# 7. Create a pivot table showing production volume by day and shift.
production_volume_by_day_shift = df.groupby(['Weekday Produced', 'Shift'])['Batch ID'].count().unstack(fill_value=0)

In [314]:
production_volume_by_day_shift = production_volume_by_day_shift.reindex(weekday_order)

In [315]:
production_volume_by_day_shift

Weekday Produced \ Shift,Shift A,Shift B,Shift C
Sunday,28,24,20
Monday,25,28,19
Tuesday,17,24,31
Wednesday,23,30,19
Thursday,27,28,35
Friday,28,34,28
Saturday,26,23,23
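
The hint suggests `.pivot_table()`, which gives the same counts as the `groupby`/`unstack` approach and also makes it easy to switch the measure. A minimal sketch on a made-up toy frame follows (the rows are invented; treating "production volume" as the sum of `Weight (g)` is one possible reading, not something the task confirms).

```python
import pandas as pd

weekday_order = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

# Hypothetical four-batch sample using the log's column names.
toy = pd.DataFrame({
    "Batch ID": ["b1", "b2", "b3", "b4"],
    "Weekday Produced": ["Monday", "Monday", "Sunday", "Friday"],
    "Shift": ["Shift A", "Shift A", "Shift B", "Shift A"],
    "Weight (g)": [1000.0, 1010.0, 990.0, 1005.0],
})

# Volume as number of batches per weekday and shift.
batch_counts = toy.pivot_table(
    index="Weekday Produced", columns="Shift",
    values="Batch ID", aggfunc="count", fill_value=0,
).reindex(weekday_order, fill_value=0)

# Alternative reading: volume as total weight produced.
total_weight = toy.pivot_table(
    index="Weekday Produced", columns="Shift",
    values="Weight (g)", aggfunc="sum", fill_value=0,
).reindex(weekday_order, fill_value=0)

print(batch_counts)
print(total_weight)
```

Passing `fill_value=0` to `reindex` keeps weekdays with no production visible as rows of zeros instead of `NaN`.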
