# Race Dashboard



This document describes the requirements and design decisions for the Race Dashboard built for the Sanofi Asset Efficiency challenge.  It explains what the dashboard includes, the data sources it requires, and the assumptions made in the design.  It also shows the steps for getting the data into the necessary format.

## Scope/objective

The objective is to display the documented metrics and categories on a dashboard, in the form that suits each best, for Sanofi staff to access.


## Metrics / categories
The metrics have been mapped into sectors to mimic the different sectors of a race track.  The metrics are:

- Race = 8 Laps = 8 Months
- Lap = Monthly Progress

- **Sector 1 = OEE Improvement**
- **Sector 2 = OEE Variability Improvement**
- **Sector 3 = Stoppage Reduction**
- **Sector 4 = Changeover Improvement**
- _Sector 5 = Most effective OEE application_
- _Sector 6 = Best Innovation_
- _Sector 7 = Most consistent OEE improvement progress_
- _Sector 8 = Collaboration_
- _Sector 9 = Team Spirit_

 
 


| Change log |
|:----------:|    


| Date | Initials | Comments |
|------|:---------|:---------|
| 2021-06-23 | MC | in leaderboard, replace NaN values in laptime calc with the max laptime for that lap |
| 2021-06-23 | MC | use race review dates for grouping data, rather than calendar months |
| 2021-06-24 | JB | missing OEE_Diff figures should default to 'OEE %' - OEE start point, not just 'OEE %' value |
| 2021-06-24 | JB | for sector one, multiply sum of OEE_Diff by -1.  Should have always been doing this. |
| 2021-06-25 | MC | in leaderboard, change prev_race_time calc to include all but last 2 cols, to handle new race cols as they arrive |
| 2021-06-28 | MC | correct dates in nominations spreadsheet, and merge on 1 row with a date within the review period |
| 2021-06-29 | MC | corrected nominations spreadsheet for Lisieux IWK - should be TR200 Packaging |
| 2021-06-30 | MC | populate missing OEE % values before calculating rolling_std using the OEE % column |
| 2021-06-30 | MC | remove start_changeover_calc merge as it's not being used any more, and was dropping rows |

In [252]:
import pandas as pd
import numpy as np
import datetime


# Viz libs
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns

# display options
# pd.options.display.float_format = "{:.2f}".format


## Read file and cleanse

- OEE.xlsx should contain the latest OEE data from QS, by week
- QSDashboard.xlsx has a list of the plants/sites taking part, with their original target OEE
- Unplanned_tech_loss.xlsx should contain the latest data from QS (my unplanned chart by line/week)
- changeover.xlsx should contain the changeover information from QlikSense

#### Cleaning required:   
- OEE % needs converting to numeric, coercing the nulls to NaN values
- The W53-2020 data looks unreliable: it causes duplicate indexes for date 2021-01-10, and we don't need the 2020 data anyway, so it is dropped
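As a minimal sketch of the numeric conversion, assuming a small made-up set of raw values (the "-" placeholder and the figures are illustrative, not taken from the QS export):

```python
import pandas as pd

# Hypothetical raw values as they might arrive from a spreadsheet export
raw = pd.Series(["0.41", "-", None, "0.52"])

# errors="coerce" turns anything that cannot be parsed into NaN
oee = pd.to_numeric(raw, errors="coerce")

print(oee.isna().tolist())
```

The unparseable entries end up as NaN, so they can be filled in a later step instead of breaking the calculations.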

In [253]:
# dir = "C:/Users/mark_/McLaren Technology Group/McLaren Accelerator - Sanofi - Sanofi/Data Analysis/"
# dir = "C:/Users/mark_/Sanofi/Sanofi x McLaren sharing - General/Race Dashboard data/"
dir = "C:/Users/mark_/Documents/McLaren2021/Sanofi/Race Dashboard data/"
output_dir = "C:/Users/mark_/Documents/McLaren2021/Sanofi/Race Dashboard data/"

# dir = 'C:/Users/james.blood/Documents/McLarenSanofi/McLarenSanofi/data/'

file = (dir + 'OEE.xlsx')
df_weekly = pd.read_excel(file)
file = (dir + 'QSDashboard.xlsx')
df_dash = pd.read_excel(file)
file = (dir + 'Unplanned_tech_loss.xlsx')
df_techloss = pd.read_excel(file)
file = (dir + 'changeover.xlsx')
df_changeover = pd.read_excel(file)
df_weekly = df_weekly.loc[df_weekly['Week'].str.contains('2021')]
df_techloss = df_techloss.loc[df_techloss['Week'].str.contains('2021')]
df_weekly['OEE %'] = pd.to_numeric(df_weekly['OEE %'], errors='coerce')
df_techloss.rename(columns={'Unplanned losses - %OEE':'Unplanned_tech_loss'}, inplace=True)
df_techloss['Unplanned_tech_loss'] = pd.to_numeric(df_techloss['Unplanned_tech_loss'], errors='coerce')
df_changeover.rename(columns={'Change over losses - %OEE':'Changeover'}, inplace=True)
df_changeover['Changeover'] = pd.to_numeric(df_changeover['Changeover'], errors='coerce')
# don't use their progress figure as it's a static val
# df_dash.rename(columns={'⇗ OEE% progress':'OEE% progress'}, inplace=True)

Create a datetime from the week number.  Each date is the Sunday that ends the week.

In [254]:
df_weekly['WeekOfYear'] = pd.to_numeric(df_weekly['Week'].str[1:3])
df_weekly['Year'] = pd.to_numeric(df_weekly['Week'].str[4:])
dates = df_weekly.Year*100+df_weekly.WeekOfYear
df_weekly['Date'] = pd.to_datetime(dates.astype(str) + '0', format='%Y%W%w')
# df_weekly.drop(columns=['Year','WeekOfYear'], inplace=True)
df_weekly.head()

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date
11,W01-2021,C2 Packaging Line,0.16897,1,2021,2021-01-10
12,W01-2021,C9 Packaging Line,,1,2021,2021-01-10
13,W01-2021,GAMMA1,0.406686,1,2021,2021-01-10
14,W01-2021,IMA C80/2,0.510044,1,2021,2021-01-10
15,W01-2021,L18 Packaging Line,0.173736,1,2021,2021-01-10
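A small worked example of the week-to-date conversion, using three week labels that appear in this notebook.  The variable names are illustrative only; the format string is the same one used in the cell above.

```python
import pandas as pd

# Week labels in the same "Wnn-YYYY" shape as the Week column
weeks = pd.Series(["W01-2021", "W13-2021", "W26-2021"])

week_no = pd.to_numeric(weeks.str[1:3])   # characters 1-2 hold the week number
year = pd.to_numeric(weeks.str[4:])       # everything after the dash is the year

# Build "YYYYWW" and append "0" so %w reads the day as Sunday
keys = (year * 100 + week_no).astype(str) + "0"
dates = pd.to_datetime(keys, format="%Y%W%w")

print(dates.dt.strftime("%Y-%m-%d").tolist())
# ['2021-01-10', '2021-04-04', '2021-07-04']
```

With %W, weeks start on a Monday, so day 0 (Sunday) is the last day of the week.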


In [255]:
#merge the 2 dataframes to get the start OEE
df_weekly = df_weekly.merge(df_dash[['Plant','Line', 'OEE  Start point','OEE% Target (2022)']],on='Line')

In [256]:
df_weekly = df_weekly.merge(df_techloss[['Line', 'Week', 'Unplanned_tech_loss']],on=['Line','Week'])

In [257]:
df_weekly = df_weekly.merge(df_changeover[['Line','Week','Changeover']])

#### Start Changeover

*Is this still needed if we are using Changeover_Diff, which will naturally reward increases / decreases in changeover?*

A start changeover value isn't provided, so we calculate our own start point using the average changeover for each line in 2021 up to the end of April 2021.  This needs to be done before we drop the early 2021 rows.

This was then merged into the df_weekly dataframe on Line.  The merge is now commented out (see the change log entry for 2021-06-30), as it was no longer being used and was dropping rows.

In [258]:
# start_changeover_calc = df_weekly[['Plant','Line','Changeover']][df_weekly['Date'] < '2021-04-30'].groupby(['Plant','Line']).mean().reset_index()
# start_changeover_calc.rename(columns={'Changeover':'start_changeover'}, inplace=True)
# df_weekly = df_weekly.merge(start_changeover_calc[['Line','start_changeover']])

In [259]:
df_weekly.loc[df_weekly['Date'] < '2021-04-30', ['Plant','Line','Changeover']]

Unnamed: 0,Plant,Line,Changeover
0,Maisons-Alfort,C2 Packaging Line,0.079648
1,Maisons-Alfort,C2 Packaging Line,0.066610
2,Maisons-Alfort,C2 Packaging Line,0.108895
3,Maisons-Alfort,C2 Packaging Line,0.098150
4,Maisons-Alfort,C2 Packaging Line,0.064475
...,...,...,...
289,Frankfurt,AL5 Packaging 1,0.122727
300,Frankfurt,AL6,0.184958
301,Frankfurt,AL6,0.101231
302,Frankfurt,AL6,0.056547


#### Dates for the Asset Challenge

The start date is fixed as 2021-04-01.  All rows in df_weekly before this date are removed.

The end date will move and act as a cutoff before each Race meeting.

In [260]:
df_weekly[['Date','OEE %']][df_weekly.Line.str.contains('LINE')]

Unnamed: 0,Date,OEE %
314,2021-05-02,0.0
315,2021-05-09,0.0
316,2021-05-16,0.372515
317,2021-05-23,0.441221
318,2021-05-30,0.385174
319,2021-06-06,0.314807
320,2021-06-13,0.518701
321,2021-06-20,0.36854
322,2021-06-27,0.566651
323,2021-07-04,0.602027


In [261]:
start_date = '2021-04-01'
df_weekly = df_weekly[df_weekly['Date'] > start_date].sort_values('Date')

# do we need this?  We now have race review dates
# end_date = '2021-07-15'
# df_weekly = df_weekly[df_weekly['Date'] < end_date].sort_values('Date')

### PCT_CHANGE
Using the pandas pct_change function with periods=4 gives the percentage change against the value 4 rows (4 weeks) earlier
- I believe we are doing this rolling average calculation within Tableau at the moment, so this isn't being used here

- Not sure whether this is required any more, so it has been removed for all categories (2021-06-30)

In [262]:
# df_weekly.sort_values(['Line','Date'], inplace = True)
# df_weekly['OEE_pct_chg'] = (df_weekly.groupby('Line')['OEE %']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly['techloss_pct_chg'] = (df_weekly.groupby('Line')['Unplanned_tech_loss']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly['Changeover_pct_chg'] = (df_weekly.groupby('Line')['Changeover']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly.head()

## Standard Deviation
Calculate std_dev and mean on a 4 week rolling basis.

Standard deviation is the square root of the variance, so there is no need to calculate both; var has been left out.

#### Populate missing OEE %
- Find the weekly min/max OEE % across all sites
- Merge those columns into df_weekly
- Fill any NaN with the min OEE % calculated for that week
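The steps above can be sketched on a tiny made-up frame.  This version uses groupby/transform instead of a separate min/max frame and merge, which is an alternative to the cell below rather than a copy of it; the lines and figures are invented.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-04-04"] * 3 + ["2021-04-11"] * 3),
    "Line": ["A", "B", "C"] * 2,
    "OEE %": [0.40, np.nan, 0.55, 0.30, 0.45, np.nan],
})

# Lowest OEE % reported by any line in the same week (NaN values are skipped)
week_min = df.groupby("Date")["OEE %"].transform("min")

# A line with no figure for the week gets that week's minimum
df["OEE %"] = df["OEE %"].fillna(week_min)

print(df)
```

Line B picks up 0.40 for the first week and line C picks up 0.30 for the second.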

In [263]:
df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['OEE %'].abs())
       .groupby(pd.Grouper(key='Date',freq='W'))['OEE %'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('Week'))
df_weekly_minmax.reset_index(inplace=True)
df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekMin','WeekMax']])
# assign the result back: a chained inplace fillna is deprecated in recent pandas
df_weekly['OEE %'] = df_weekly['OEE %'].fillna(df_weekly['WeekMin'])
df_weekly.drop(columns=['WeekMin','WeekMax'], inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover
0,W13-2021,C2 Packaging Line,0.458414,13,2021,2021-04-04,Maisons-Alfort,0.397503,0.470,0.465009,0.044140
1,W13-2021,L18 Packaging Line,0.465923,13,2021,2021-04-04,Tours,0.377683,0.547,0.208273,0.134271
2,W13-2021,GAMMA1,0.431186,13,2021,2021-04-04,SCOPPITO,0.418683,0.570,0.291389,0.209384
3,W13-2021,L25 Packaging Line,0.393740,13,2021,2021-04-04,Tours,0.351564,0.478,0.206618,0.132655
4,W13-2021,M18 Filling,0.046124,13,2021,2021-04-04,Frankfurt,0.443522,0.650,0.114818,0.789781
...,...,...,...,...,...,...,...,...,...,...,...
197,W26-2021,M18 Filling,0.000000,26,2021,2021-07-04,Frankfurt,0.443522,0.650,0.000000,0.943690
198,W26-2021,L25 Packaging Line,0.520840,26,2021,2021-07-04,Tours,0.351564,0.478,0.147637,0.113122
199,W26-2021,TR200 Packaging Line,0.000000,26,2021,2021-07-04,Lisieux,0.483505,0.650,,
200,W26-2021,C2 Packaging Line,0.156844,26,2021,2021-07-04,Maisons-Alfort,0.397503,0.470,0.461691,0.212684


In [264]:
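# transform keeps the original row index, so the result aligns with df_weekly
# .std() is the sample standard deviation, so a window of 1 row gives NaN
df_weekly['rolling_std'] = df_weekly.groupby('Line')['OEE %'].transform(lambda x : x.rolling(4, min_periods=1).std())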
df_weekly.head(5)

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover,rolling_std
0,W13-2021,C2 Packaging Line,0.458414,13,2021,2021-04-04,Maisons-Alfort,0.397503,0.47,0.465009,0.04414,
1,W13-2021,L18 Packaging Line,0.465923,13,2021,2021-04-04,Tours,0.377683,0.547,0.208273,0.134271,
2,W13-2021,GAMMA1,0.431186,13,2021,2021-04-04,SCOPPITO,0.418683,0.57,0.291389,0.209384,
3,W13-2021,L25 Packaging Line,0.39374,13,2021,2021-04-04,Tours,0.351564,0.478,0.206618,0.132655,
4,W13-2021,M18 Filling,0.046124,13,2021,2021-04-04,Frankfurt,0.443522,0.65,0.114818,0.789781,


In [265]:
file = (dir + 'Nominations Category Scoring.xlsx')
df_nom_sectors = pd.read_excel(file, sheet_name='Nomination scoring', usecols="A:H", parse_dates=['Date'])

In [266]:
# forward-fill the review date down the rows that follow it in the spreadsheet
df_nom_sectors['Date'] = df_nom_sectors['Date'].ffill()
df_nom_sectors = df_nom_sectors.fillna(0)

df_weekly = df_weekly.merge(df_nom_sectors[['Line','Plant','Date','Best Solution','Best Innovation','Improvement Iterations','Lessons and Sharing','Team Contribution and Spirit']], how='outer', on=['Date','Plant','Line'])
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.470,0.465009,0.044140,,,,,,
1,W13-2021,L18 Packaging Line,0.465923,13.0,2021.0,2021-04-04,Tours,0.377683,0.547,0.208273,0.134271,,,,,,
2,W13-2021,GAMMA1,0.431186,13.0,2021.0,2021-04-04,SCOPPITO,0.418683,0.570,0.291389,0.209384,,,,,,
3,W13-2021,L25 Packaging Line,0.393740,13.0,2021.0,2021-04-04,Tours,0.351564,0.478,0.206618,0.132655,,,,,,
4,W13-2021,M18 Filling,0.046124,13.0,2021.0,2021-04-04,Frankfurt,0.443522,0.650,0.114818,0.789781,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
227,,AL5 Packaging 1,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
228,,AL6,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
229,,M18 Filling,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
230,,M21 Filling,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0


## Create review dates
Create a Review_Date column for grouping the data later, so we only get the data we're interested in for each review.  Each row is tagged with the first review date on or after its Date.

In [267]:
def thurs_of_weekbefore(year, week):
    return datetime.date.fromisocalendar(year, week-1, 4)  # (year, week before (w-1), thursday)

review_weeks = [16, 20, 24, 28, 34, 38, 42, 47]
review_dates = []

for i in review_weeks:
    if i > 0:
        review_dates.append((thurs_of_weekbefore(2021,i)))

df_review_dates = pd.DataFrame(review_dates)
df_review_dates.rename(columns={0:'Review_Date'}, inplace=True)
df_review_dates['Review_Date'] = pd.to_datetime(df_review_dates.Review_Date)

# df_review_dates.info()
df_weekly = pd.merge_asof(df_weekly.sort_values('Date'), df_review_dates, left_on='Date', right_on='Review_Date', allow_exact_matches=True, direction='forward')
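To illustrate how rows are matched to a review date, here is a cut-down sketch with only the first three review weeks and four weekly dates.  The frame names are made up; the helper mirrors the one above.

```python
import datetime
import pandas as pd

def thurs_of_weekbefore(year, week):
    # ISO weekday 4 is Thursday; week - 1 is the week before the review week
    return datetime.date.fromisocalendar(year, week - 1, 4)

review_weeks = [16, 20, 24]
reviews = pd.DataFrame({
    "Review_Date": pd.to_datetime(
        [thurs_of_weekbefore(2021, w).isoformat() for w in review_weeks]
    )
})

weekly = pd.DataFrame({
    "Date": pd.to_datetime(["2021-04-04", "2021-04-11", "2021-04-18", "2021-05-16"])
})

# direction="forward" picks the first review date on or after each weekly date
tagged = pd.merge_asof(weekly.sort_values("Date"), reviews,
                       left_on="Date", right_on="Review_Date",
                       direction="forward")

print(tagged)
```

The first two weeks fall into the 2021-04-15 review, 2021-04-18 rolls forward to 2021-05-13, and 2021-05-16 to 2021-06-10.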


#### We need the diff between the weekly OEE % figures, and the Weekly Changeover figures
We need a week-on-week change figure for OEE progress and Changeover, otherwise we will have problems when we group and sum values later
- Build OEE_Diff from OEE %, sorted by Line and Date (there is only 1 row per Line per week)
- Find the diff between consecutive rows for each Line
- Fill NaN (the first row for each Line) with OEE % minus OEE Start point

Repeat the same logic for Changeover.  There will be more NaN values, as start_changeover wasn't provided for all lines, so Changeover_Diff fills them with 0.
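The OEE_Diff steps can be checked on a few rows.  The figures below are the AL6 and C2 Packaging Line values shown in the outputs of this notebook; the frame name weekly is just for this sketch.

```python
import pandas as pd

weekly = pd.DataFrame({
    "Line": ["C2 Packaging Line", "AL6", "AL6", "C2 Packaging Line", "AL6"],
    "Date": pd.to_datetime(["2021-04-04", "2021-04-04", "2021-04-11",
                            "2021-04-11", "2021-04-18"]),
    "OEE %": [0.458414, 0.367897, 0.360681, 0.530707, 0.334140],
    "OEE  Start point": [0.397503, 0.332657, 0.332657, 0.397503, 0.332657],
})

# Sort first so each diff compares a line with its own previous week
weekly = weekly.sort_values(["Line", "Date"])
step = weekly.groupby("Line")["OEE %"].diff()

# The first row of each line has no previous week, so measure it from the start point
weekly["OEE_Diff"] = step.fillna(weekly["OEE %"] - weekly["OEE  Start point"])

print(weekly[["Line", "Date", "OEE_Diff"]])
```

Note that the fallback applies to every NaN in the diff, so a week with a missing OEE % would also be measured from the start point.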

In [268]:
# this was calculating the wrong Diff - the first row of each site was looking at the previous site for all but the 1st calc
# needed to sort by Line and Date first 

# OEE_Diff = df_weekly.groupby(['Line',pd.Grouper(key='Date',freq='W')])['OEE %'].mean().reset_index()
# OEE_Diff["OEE_Diff"] = OEE_Diff["OEE %"].diff()
# df_weekly = df_weekly.merge(OEE_Diff[["Line","Date","OEE_Diff"]], on=(["Line","Date"]))

# df_weekly['OEE_Diff'].fillna(df_weekly['OEE %'] - df_weekly['OEE  Start point'], inplace=True)
# df_weekly[["Line","Date","OEE %","OEE_Diff"]].head(50).sort_values(by=['Line', 'Date'])

In [269]:
OEE_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','OEE %','OEE  Start point']]
OEE_Diff['OEE_Diff'] = OEE_Diff.groupby('Line')['OEE %'].diff().fillna(df_weekly['OEE %'] - df_weekly['OEE  Start point'])
df_weekly = df_weekly.merge(OEE_Diff[["Line","Date","OEE_Diff"]], on=(["Line","Date"]))
df_weekly[["Line","Date","OEE %","OEE_Diff"]].head(50).sort_values(by=['Line', 'Date'])

Unnamed: 0,Line,Date,OEE %,OEE_Diff
2,AL5 Packaging 1,2021-04-04,0.046124,-0.433569
17,AL5 Packaging 1,2021-04-11,0.259071,0.212947
31,AL5 Packaging 1,2021-04-18,0.449745,0.190675
41,AL5 Packaging 1,2021-04-25,0.642652,0.192906
1,AL6,2021-04-04,0.367897,0.03524
21,AL6,2021-04-11,0.360681,-0.007216
34,AL6,2021-04-18,0.33414,-0.026541
45,AL6,2021-04-25,0.309545,-0.024595
0,C2 Packaging Line,2021-04-04,0.458414,0.060911
23,C2 Packaging Line,2021-04-11,0.530707,0.072293


In [270]:
# Changeover_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover','start_changeover']]
Changeover_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover']]
# Changeover_Diff['Changeover_Diff'] = Changeover_Diff.groupby('Line')['Changeover'].diff().fillna(df_weekly['start_changeover'] - df_weekly['Changeover'])
Changeover_Diff['Changeover_Diff'] = Changeover_Diff.groupby('Line')['Changeover'].diff().fillna(0)
df_weekly = df_weekly.merge(Changeover_Diff[["Line","Date","Changeover_Diff"]], on=(["Line","Date"]))

In [271]:
Changeover_Diff[Changeover_Diff.Line.str.contains('AL5')]

Unnamed: 0,Line,Date,Changeover,Changeover_Diff
2,AL5 Packaging 1,2021-04-04,,0.0
17,AL5 Packaging 1,2021-04-11,,0.0
31,AL5 Packaging 1,2021-04-18,0.087039,0.0
41,AL5 Packaging 1,2021-04-25,0.122727,0.035688
60,AL5 Packaging 1,2021-05-02,0.063261,-0.059466
81,AL5 Packaging 1,2021-05-09,0.089293,0.026032
91,AL5 Packaging 1,2021-05-16,0.060573,-0.028721
103,AL5 Packaging 1,2021-05-23,,0.0
126,AL5 Packaging 1,2021-05-30,,0.0
136,AL5 Packaging 1,2021-06-06,0.088889,0.0


In [272]:
# Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover','start_changeover']]
Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover']]
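# transform keeps the original row index, so the result aligns with Changeover_mean
Changeover_mean['Changeover_rolling_mean'] = Changeover_mean.groupby('Line')['Changeover'].transform(lambda x : x.rolling(4, min_periods=1).mean())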
df_weekly = df_weekly.merge(Changeover_mean[["Line","Date","Changeover_rolling_mean"]], on=(["Line","Date"]))

In [273]:
Changeover_mean[Changeover_mean.Line.str.contains('LINE')]

Unnamed: 0,Line,Date,Changeover,Changeover_rolling_mean
52,LINE 01 - UHLMANN 1880,2021-05-02,0.0,0.0
70,LINE 01 - UHLMANN 1880,2021-05-09,0.0,0.0
86,LINE 01 - UHLMANN 1880,2021-05-16,0.016561,0.00552
105,LINE 01 - UHLMANN 1880,2021-05-23,0.040157,0.014179
121,LINE 01 - UHLMANN 1880,2021-05-30,0.024481,0.0203
131,LINE 01 - UHLMANN 1880,2021-06-06,0.205835,0.071758
152,LINE 01 - UHLMANN 1880,2021-06-13,0.025332,0.073951
163,LINE 01 - UHLMANN 1880,2021-06-20,0.023659,0.069827
176,LINE 01 - UHLMANN 1880,2021-06-27,0.018125,0.068238
200,LINE 01 - UHLMANN 1880,2021-07-04,0.02452,0.022909


#### Populate missing Unplanned Tech Loss

- Create weekly min/max cols for Unplanned tech loss from any site   
- Merge those columns into df_weekly   
- fill any NaN rows with the max Unplanned_tech_loss found for that week   

**this might be flawed!!** 

In [274]:
df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['Unplanned_tech_loss'].abs())
       .groupby(pd.Grouper(key='Date',freq='W'))['Unplanned_tech_loss'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('WeekUTL'))
df_weekly_minmax.reset_index(inplace=True)
df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekUTLMin','WeekUTLMax']])
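# assign the result back: a chained inplace fillna is deprecated in recent pandas
df_weekly['Unplanned_tech_loss'] = df_weekly['Unplanned_tech_loss'].fillna(df_weekly['WeekUTLMax'])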
df_weekly.drop(columns=['WeekUTLMin','WeekUTLMax'], inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,...,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit,Review_Date,OEE_Diff,Changeover_Diff,Changeover_rolling_mean
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.47,0.465009,...,,,,,,,2021-04-15,0.060911,0.0,0.044140
1,W13-2021,AL6,0.367897,13.0,2021.0,2021-04-04,Frankfurt,0.332657,0.45,0.342851,...,,,,,,,2021-04-15,0.035240,0.0,0.184958
2,W13-2021,AL5 Packaging 1,0.046124,13.0,2021.0,2021-04-04,Frankfurt,0.479693,0.50,0.465009,...,,,,,,,2021-04-15,-0.433569,0.0,
3,W13-2021,IMA C80/2,0.581304,13.0,2021.0,2021-04-04,SCOPPITO,0.451031,0.58,0.224561,...,,,,,,,2021-04-15,0.130272,0.0,0.070836
4,W13-2021,SUPPO Packaging Line,0.432432,13.0,2021.0,2021-04-04,Lisieux,0.353021,0.53,0.148267,...,,,,,,,2021-04-15,0.079411,0.0,0.178806
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
227,,L25 Packaging Line,,,,2021-09-12,Tours,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.178769
228,,L18 Packaging Line,,,,2021-09-12,Tours,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.111640
229,,M21 Filling,,,,2021-09-12,Frankfurt,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.220155
230,,C2 Packaging Line,,,,2021-09-12,Maisons-Alfort,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.212684


#### Populate missing Changeover 

In [275]:
df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['Changeover'].abs())
       .groupby(pd.Grouper(key='Date',freq='W'))['Changeover'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('WeekChangeover'))
df_weekly_minmax.reset_index(inplace=True)
df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekChangeoverMin','WeekChangeoverMax']])
# assign the result back: a chained inplace fillna is deprecated in recent pandas
df_weekly['Changeover'] = df_weekly['Changeover'].fillna(df_weekly['WeekChangeoverMax'])
df_weekly.drop(columns=['WeekChangeoverMin','WeekChangeoverMax'], inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,...,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit,Review_Date,OEE_Diff,Changeover_Diff,Changeover_rolling_mean
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.47,0.465009,...,,,,,,,2021-04-15,0.060911,0.0,0.044140
1,W13-2021,AL6,0.367897,13.0,2021.0,2021-04-04,Frankfurt,0.332657,0.45,0.342851,...,,,,,,,2021-04-15,0.035240,0.0,0.184958
2,W13-2021,AL5 Packaging 1,0.046124,13.0,2021.0,2021-04-04,Frankfurt,0.479693,0.50,0.465009,...,,,,,,,2021-04-15,-0.433569,0.0,
3,W13-2021,IMA C80/2,0.581304,13.0,2021.0,2021-04-04,SCOPPITO,0.451031,0.58,0.224561,...,,,,,,,2021-04-15,0.130272,0.0,0.070836
4,W13-2021,SUPPO Packaging Line,0.432432,13.0,2021.0,2021-04-04,Lisieux,0.353021,0.53,0.148267,...,,,,,,,2021-04-15,0.079411,0.0,0.178806
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
227,,L25 Packaging Line,,,,2021-09-12,Tours,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.178769
228,,L18 Packaging Line,,,,2021-09-12,Tours,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.111640
229,,M21 Filling,,,,2021-09-12,Frankfurt,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.220155
230,,C2 Packaging Line,,,,2021-09-12,Maisons-Alfort,,,,...,,0.0,0.0,0.0,0.0,0.0,2021-09-16,,0.0,0.212684


#### Populate missing start_changeover (NaN)
If we've got this far and you still don't have a start_changeover, then you're a new site and can have this week's Changeover value

These rows still have NaN Changeover values

In [277]:
df_weekly[['Line','Date','Changeover']][df_weekly['Changeover'].isna()]

Unnamed: 0,Line,Date,Changeover
202,M21 Filling,2021-08-15,
203,M18 Filling,2021-08-15,
204,AL6,2021-08-15,
205,M22 Filling,2021-08-15,
206,AL5 Packaging 1,2021-08-15,
207,SUPPO Packaging Line,2021-08-15,
208,TR200 Packaging Line,2021-08-15,
209,MEDISEAL PURAN,2021-08-15,
210,C2 Packaging Line,2021-08-15,
211,IMA C80/2,2021-08-15,


In [278]:
# df_weekly.start_changeover.fillna(df_weekly.Changeover, inplace=True)
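# assign the results back rather than using a chained inplace fillna
df_weekly['OEE  Start point'] = df_weekly['OEE  Start point'].fillna(df_weekly['OEE %'])
df_weekly['OEE% Target (2022)'] = df_weekly['OEE% Target (2022)'].fillna(0.65)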

### Calculating Sector times


The lap time is the sum of the calculated sector scores plus the pole position time from the F1 race data (e.g. 88 secs for Paul Ricard).



Sector [1-4] calculations   
**Sector 1**   
How much has your OEE increased / decreased?  Sum the difference between each week and multiply the total by -1.  An OEE increase then gives a negative figure, so a larger OEE increase is rewarded with a bigger reduction in lap time.

df_weekly['sector_1'] = df_weekly['OEE_Diff'].mul(-1)

OEE_Diff calculation
- Sort values by Line and Date
- Find the difference between each weekly OEE figure
- Fill NaN values from missing OEE figures with the weekly OEE minus OEE Start Point for that site


**Sector 2**   
How big was the standard deviation of your OEE % over the previous 4 weeks?  A smaller figure (less variability) adds less to your lap time.

df_weekly['sector_2'] = df_weekly['rolling_std']

rolling_std = rolling standard deviation of OEE % over the past 4 weeks for each Line
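A minimal sketch of the rolling standard deviation, assuming two invented lines A and B.  It uses the sample standard deviation with a 4 row window and min_periods=1, as in the rolling_std cell.

```python
import pandas as pd

oee = pd.DataFrame({
    "Line": ["A"] * 5 + ["B"] * 2,
    "OEE %": [0.2, 0.4, 0.6, 0.8, 1.0, 0.5, 0.7],
})

# Group by Line so a window never mixes rows from two different lines
oee["rolling_std"] = oee.groupby("Line")["OEE %"].transform(
    lambda s: s.rolling(4, min_periods=1).std()
)

print(oee)
```

The first row of each line is NaN, because a sample standard deviation needs at least two values.  That is one reason the sector NaN values are cleaned before summing.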


 
**Sector 3**   
We want to reduce Unplanned tech loss (recorded as % of OEE).  Unplanned tech loss is calculated within QlikSense, but values are sometimes missing.  Fill the missing values and then display the average Unplanned tech loss:

df_weekly['sector_3'] = df_weekly['Unplanned_tech_loss']


Populate missing unplanned tech loss:
- Create weekly min/max cols for Unplanned tech loss from any site
- Merge those columns into df_weekly
- Fill any NaN unplanned tech loss rows with the max Unplanned_tech_loss calculated for that week (bigger is worse)

 
**Sector 4**   
We're trying to reduce changeover time (recorded as % of OEE).  
A start changeover value isn't provided, so the original approach calculated a start point for each Line using the average changeover in 2021 up to 30 April 2021.  This merge has since been removed (see the change log entry for 2021-06-30) and is shown here for reference only:
   
start_changeover_calc = df_weekly[['Plant','Line','Changeover']][df_weekly['Date'] < '2021-04-30'].groupby(['Plant','Line']).mean().reset_index()
start_changeover_calc.rename(columns={'Changeover':'start_changeover'}, inplace=True)
df_weekly = df_weekly.merge(start_changeover_calc[['Line','start_changeover']])

Sector 4 now uses the 4 week rolling mean of Changeover for each Line:

df_weekly['sector_4'] = df_weekly['Changeover_rolling_mean']

Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line', 'Date', 'Changeover']]   
Changeover_mean['Changeover_rolling_mean'] = Changeover_mean.groupby('Line')['Changeover'].transform(lambda x : x.rolling(4, min_periods=1).mean())
df_weekly = df_weekly.merge(Changeover_mean[["Line","Date","Changeover_rolling_mean"]], on=(["Line","Date"]))



**Clean the sectors of NaN before summing them**   
Sometimes, when we haven't got enough information for the pct_change calcs, no values were coming through for the lap_time.  We should make sure there is a value in each of the sectors, otherwise a line gains an unfair advantage from not having data available.  Find all NaN values in sectors 1 to 4 and replace them with the mean for that column (sector).
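As a quick illustration of the mean fill, assuming two sector columns with invented values:

```python
import numpy as np
import pandas as pd

sectors = pd.DataFrame({
    "sector_1": [-0.06, np.nan, 0.02],
    "sector_2": [0.10, 0.20, np.nan],
})

# sectors.mean() gives one mean per column; fillna matches them up by column name
filled = sectors.fillna(sectors.mean())

print(filled)
```

The missing sector_1 value becomes -0.02 and the missing sector_2 value becomes 0.15, so a line with no data scores the same as the average line for that sector.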

**Sectors 5 - 9**   
These scores are taken from the Nomination process.  Read in the Nominations spreadsheet, merge any values we find with df_weekly, replace all NaN (missing) values with 0, and reduce the scores to 10% of their original value.  This value is then subtracted from the lap_time, so the better you do in the nominations, the more your lap_time is reduced.
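A small sketch of the scaling, assuming two of the nomination columns and made-up scores:

```python
import numpy as np
import pandas as pd

nominations = pd.DataFrame({
    "Best Solution": [3.0, np.nan, 5.0],
    "Best Innovation": [np.nan, 4.0, 2.0],
})

# Multiplying by -0.1 takes 10% of each score and makes it a lap time reduction;
# a missing nomination stays NaN until fillna(0) turns it into no adjustment
sectors = (nominations * -0.1).fillna(0)
sectors.columns = ["sector_5", "sector_6"]

print(sectors)
```

A score of 3 becomes -0.3, which takes 0.3 off the lap time once the sectors are summed.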

 
**lap_time**   
df_weekly['lap_time'] = df_weekly[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9']].sum(axis=1)

In [279]:
# df_weekly['sector_1'] = (df_weekly['WeekMax'] - df_weekly['OEE %'])
# df_weekly['sector_1'] = (df_weekly['OEE  Start point'] - df_weekly['OEE %'])
df_weekly['sector_1'] = df_weekly['OEE_Diff'].mul(-1)
df_weekly['sector_2'] = df_weekly['rolling_std']
df_weekly['sector_3'] = df_weekly['Unplanned_tech_loss']
# df_weekly['sector_4'] = (df_weekly['start_changeover'] - df_weekly['Changeover'])
# df_weekly['sector_4'] = (df_weekly['Changeover'] - df_weekly['start_changeover'])
df_weekly['sector_4'] = df_weekly['Changeover_rolling_mean']
# take 10% of the sector5-9 scores 
df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']] = df_weekly[['Best Solution','Best Innovation','Improvement Iterations','Lessons and Sharing','Team Contribution and Spirit']] * -0.1
df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']] = df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']].fillna(0)

# we'll use these in the absence of values for a sector
df_weekly[['sector_1','sector_2','sector_3','sector_4']] = df_weekly[['sector_1','sector_2','sector_3','sector_4']].fillna(df_weekly[['sector_1','sector_2','sector_3','sector_4']].mean())

#this will sum and handle the NaN
df_weekly['lap_time'] = df_weekly[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9']].sum(axis=1)

# now add the pole['Laptime'] from fastf1 to the lap_time adjustment we've created
# just use 88 secs rather than playing with timedeltas for now
# df_weekly['lap_time'] = pole['LapTime'] + pd.to_timedelta(df_weekly['lap_time'], unit='S')
# df_weekly['lap_time'] = 88 + df_weekly['lap_time']
df_weekly.groupby(['Line', pd.Grouper(key='Date', freq='W')])['lap_time'].sum()
# print (df_weekly['sector_1_time'] , df_weekly['sector_2_time'] , df_weekly['sector_3_time'], df_weekly['sector_4_time'])

Line                  Date      
AL5 Packaging 1       2021-04-04    1.159030
                      2021-04-11    0.477869
                      2021-04-18    0.413241
                      2021-04-25    0.302763
                      2021-05-02    0.667515
                                      ...   
TR200 Packaging Line  2021-06-20    0.635328
                      2021-06-27    0.297245
                      2021-07-04    1.729838
                      2021-08-15    0.522206
                      2021-09-12    0.409362
Name: lap_time, Length: 232, dtype: float64

#### Write out df_weekly to excel

In [280]:
df_weekly.to_excel(output_dir + "df_weekly_with_calcs.xlsx")

#### Monthly Calcs

Repeat the process for a df_monthly dataframe.  We will use this for calculating the Leader board.  
Group df_weekly by Review_Date so we get the right data for each review meeting.
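A minimal sketch of the grouping, with invented lap_time adjustments and the 88 second pole time used in this notebook (POLE_TIME is a name made up for this example):

```python
import pandas as pd

weekly = pd.DataFrame({
    "Review_Date": pd.to_datetime(["2021-04-15"] * 4 + ["2021-05-13"]),
    "Line": ["A", "A", "B", "B", "A"],
    "lap_time": [0.5, 0.3, 0.4, 0.2, -0.1],
})

POLE_TIME = 88  # seconds

# One row per review and line: the sum of that period's weekly adjustments
monthly = weekly.groupby(["Review_Date", "Line"])["lap_time"].sum().reset_index()
monthly["lap_time"] = monthly["lap_time"] + POLE_TIME

print(monthly)
```

Line A finishes the first review on 88.8 and line B on 88.6; a negative total, as for line A in the second review, brings the lap under the pole time.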

In [281]:
# df_monthly = df_weekly.groupby([pd.Grouper(key='Date',freq='M'),'Line'])[['start_changeover','OEE  Start point','OEE %','Unplanned_tech_loss','Changeover','rolling_std','techloss_pct_chg','Changeover_pct_chg']].mean().reset_index()
# df_monthly = df_weekly.groupby([pd.Grouper(key='Date',freq='M'),'Line']).lap_time.sum().reset_index()
df_monthly = df_weekly.groupby(['Review_Date','Line']).lap_time.sum().reset_index()
# change the name of review_date to save renaming all references to Date later
df_monthly = df_monthly.rename(columns={'Review_Date':'Date'})
df_monthly

Unnamed: 0,Date,Line,lap_time
0,2021-04-15,AL5 Packaging 1,1.636899
1,2021-04-15,AL6,1.154460
2,2021-04-15,C2 Packaging Line,0.969295
3,2021-04-15,C9 Packaging Line,1.037885
4,2021-04-15,GAMMA1,1.170176
...,...,...,...
83,2021-09-16,M21 Filling,0.591000
84,2021-09-16,M22 Filling,0.491664
85,2021-09-16,MEDISEAL PURAN,0.428794
86,2021-09-16,SUPPO Packaging Line,0.456799


In [282]:
# df_monthly_minmax = (df_weekly.assign(Data_Value=df_weekly['OEE %'].abs())
#        .groupby(pd.Grouper(key='Date',freq='M'))['OEE %'].agg([('Min' , 'min'), ('Max', 'max')])
#        .add_prefix('Month'))
# df_monthly_minmax.reset_index(inplace=True)
# df_monthly = df_monthly.merge(df_monthly_minmax[['Date','MonthMin','MonthMax']])
df_monthly['lap_time'] = df_monthly['lap_time'] + 88
df_monthly

Unnamed: 0,Date,Line,lap_time
0,2021-04-15,AL5 Packaging 1,89.636899
1,2021-04-15,AL6,89.154460
2,2021-04-15,C2 Packaging Line,88.969295
3,2021-04-15,C9 Packaging Line,89.037885
4,2021-04-15,GAMMA1,89.170176
...,...,...,...
83,2021-09-16,M21 Filling,88.591000
84,2021-09-16,M22 Filling,88.491664
85,2021-09-16,MEDISEAL PURAN,88.428794
86,2021-09-16,SUPPO Packaging Line,88.456799


### Leader board table

In [283]:
# filter using the end_date to stop picking up future dated nomination rows of zero I created when joining the s/s
# pivot = df_monthly[df_monthly['Date'] < end_date].pivot(index='Line', columns='Date', values='lap_time')
pivot = df_monthly.pivot(index='Line', columns='Date', values='lap_time')
pivot.reset_index(inplace=True)
# pivot creates NaN for lines with no data for a race review date
# populate each NaN value with the max for that column - so they get the max laptime for that race
# cols [1:] are the review date columns (everything after Line)
pivot.iloc[:,1:] = pivot.iloc[:,1:].fillna(pivot.iloc[:,1:].max())

# sum all the lap columns (skipping the Line label column) to get a race_time
pivot['race_time'] = pivot.iloc[:,1:].sum(axis=1)
# sum all but the last 2 cols (this lap and the race_time) to calc prev_race_time

# pivot['prev_race_time'] = pivot[pivot.columns[2]] + pivot[pivot.columns[3]]
pivot['prev_race_time'] = pivot.iloc[:,1:-2].sum(axis=1)

pivot = pivot.merge(df_dash[['Plant','Line']], on='Line')
pivot.sort_values('race_time', inplace=True)
pivot['position'] = np.arange(1,len(pivot) + 1)
pivot['gap_to_leader'] = pivot['race_time'] - pivot['race_time'].iloc[0]
pivot.sort_values('prev_race_time', inplace=True)
pivot['prev_position'] = np.arange(1,len(pivot) + 1)
pivot['Gain/Loss'] = pivot.prev_position - pivot.position
pivot.sort_values('race_time', inplace=True)
pivot['interval'] = pivot.race_time.diff()
pivot = pivot.merge(df_dash[['Line','OEE  Start point', '⇗ OEE% progress', 'OEE% Target (2022)']], on='Line')
pivot

Unnamed: 0,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,2021-08-19 00:00:00,2021-09-16 00:00:00,race_time,prev_race_time,Plant,position,gap_to_leader,prev_position,Gain/Loss,interval,OEE Start point,⇗ OEE% progress,OEE% Target (2022)
0,GAMMA1,89.170176,69.325767,68.226756,90.143947,88.607297,88.646505,494.120448,405.473943,SCOPPITO,1,0.0,1,0,,0.418683,0.085148,0.57
1,IMA C80/2,88.926865,69.256522,75.232693,89.520411,88.540948,88.540666,500.018105,411.477439,SCOPPITO,2,5.897657,2,0,5.897657,0.451031,0.043365,0.58
2,L18 Packaging Line,88.863912,74.226704,89.407996,89.262767,88.498359,88.482486,518.742224,430.259738,Tours,3,24.621775,3,0,18.724118,0.377683,0.086173,0.547
3,L25 Packaging Line,88.866388,74.908842,90.011979,89.722251,88.586427,88.549614,520.645501,432.095887,Tours,4,26.525053,4,0,1.903277,0.351564,0.001613,0.478
4,M22 Filling,88.970596,81.806345,84.85767,89.844474,88.572975,88.491664,522.543725,434.052061,Frankfurt,5,28.423277,5,0,1.898224,0.530068,0.12028,0.65
5,M21 Filling,89.756877,83.292606,84.767181,90.769388,88.531191,88.591,525.708243,437.117243,Frankfurt,6,31.587795,6,0,3.164518,0.599671,0.022006,0.65
6,M18 Filling,89.977078,82.596174,85.446232,90.866765,88.759629,88.935114,526.580993,437.645879,Frankfurt,7,32.460545,7,0,0.87275,0.443522,-0.010057,0.65
7,AL6,89.15446,90.323044,84.205598,89.779721,88.414162,88.43091,530.307895,441.876985,Frankfurt,8,36.187447,8,0,3.726902,0.332657,0.078541,0.45
8,LINE 01 - UHLMANN 1880,89.977078,88.542111,89.636847,89.083857,88.392947,88.392168,534.025008,445.63284,SUZANO,9,39.904559,9,0,3.717112,0.427854,0.015085,0.65
9,SUPPO Packaging Line,88.601992,89.937915,89.978621,89.681359,88.438778,88.456799,535.095463,446.638664,Lisieux,10,40.975015,10,0,1.070456,0.353021,-0.010676,0.53
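
The leaderboard logic above can be sketched on a small, made-up data set. The line names, dates and lap times below are hypothetical and only illustrate the steps: fill a missing lap with the slowest lap of that review date, total the laps into a race time, then derive position, gap to leader, interval and places gained or lost.

```python
import numpy as np
import pandas as pd

# Hypothetical lap times; line B has no lap for the first review date.
# Dates are kept as ISO strings purely to keep the sketch short.
laps = pd.DataFrame({
    "Line": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "Date": ["2021-04-15", "2021-05-13", "2021-06-10",
             "2021-05-13", "2021-06-10",
             "2021-04-15", "2021-05-13", "2021-06-10"],
    "lap_time": [89.0, 90.0, 88.0, 89.5, 88.5, 91.0, 87.5, 89.5],
})

board = laps.pivot(index="Line", columns="Date", values="lap_time")
lap_cols = list(board.columns)

# A missing lap is charged the slowest (max) lap time of that review date.
board = board.fillna(board.max())

# Race time is the sum of all laps; the previous race time leaves out the latest lap.
board["race_time"] = board[lap_cols].sum(axis=1)
board["prev_race_time"] = board[lap_cols[:-1]].sum(axis=1)

# Fastest race time first, then number the rows to get the position.
board = board.sort_values("race_time")
board["position"] = np.arange(1, len(board) + 1)
board["gap_to_leader"] = board["race_time"] - board["race_time"].iloc[0]
board["interval"] = board["race_time"].diff()

# Rank by the previous race time to see how many places each line gained or lost.
board["prev_position"] = board["prev_race_time"].rank(method="first").astype(int)
board["Gain/Loss"] = board["prev_position"] - board["position"]
print(board[["race_time", "position", "gap_to_leader", "Gain/Loss"]])
```

In this toy data line A leads the race but was second before the latest lap, so its Gain/Loss is +1. Line B is charged the slowest April lap (91.0) for its missing entry.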


#### Write this out for Tableau

In [284]:
pivot.to_csv(output_dir + "leaderboard.csv")

END OF PROCESSING - Sanity checks below

In [285]:
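# sanity check: rebuild the position after each lap by ranking the running race time (Apr, Apr+May, ...)
pivot.sort_values(pivot.columns[1], inplace=True)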
pivot['apr_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['may_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2,3]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['jun_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2,3,4]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['jly_position'] = np.arange(1,len(pivot) + 1)

In [286]:
# regex filter on column name contains Line, position or 2021
pivot.filter(regex='Line|position|2021')

Unnamed: 0,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,2021-08-19 00:00:00,2021-09-16 00:00:00,position,prev_position,apr_position,may_position,jun_position,jly_position
0,GAMMA1,89.170176,69.325767,68.226756,90.143947,88.607297,88.646505,1,1,10,2,1,1
1,IMA C80/2,88.926865,69.256522,75.232693,89.520411,88.540948,88.540666,2,2,5,1,2,2
2,L18 Packaging Line,88.863912,74.226704,89.407996,89.262767,88.498359,88.482486,3,3,3,3,3,3
3,L25 Packaging Line,88.866388,74.908842,90.011979,89.722251,88.586427,88.549614,4,4,4,4,4,4
4,M22 Filling,88.970596,81.806345,84.85767,89.844474,88.572975,88.491664,5,5,7,5,5,5
5,M21 Filling,89.756877,83.292606,84.767181,90.769388,88.531191,88.591,6,6,12,7,6,6
6,M18 Filling,89.977078,82.596174,85.446232,90.866765,88.759629,88.935114,7,7,13,6,7,7
7,AL6,89.15446,90.323044,84.205598,89.779721,88.414162,88.43091,8,8,9,14,8,8
8,LINE 01 - UHLMANN 1880,89.977078,88.542111,89.636847,89.083857,88.392947,88.392168,9,9,14,9,10,9
9,SUPPO Packaging Line,88.601992,89.937915,89.978621,89.681359,88.438778,88.456799,10,10,1,10,11,10
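
The sanity check above repeats the same sort-and-number step once per month. As a minimal sketch of an equivalent approach, assuming a table with one row per line and one column per review date, the running race time can be built with `cumsum` and ranked column by column. The data below is hypothetical.

```python
import pandas as pd

# Hypothetical lap times, one row per line and one column per review date.
laps = pd.DataFrame(
    {
        "2021-04-15": [89.0, 91.0, 90.0],
        "2021-05-13": [90.0, 87.0, 90.5],
        "2021-06-10": [88.0, 89.5, 86.0],
    },
    index=pd.Index(["A", "B", "C"], name="Line"),
)

# Running race time after each lap.
running_time = laps.cumsum(axis=1)

# Position of every line after each lap (1 = lowest running time).
# method="first" breaks ties by row order, which is an assumption of this sketch.
positions = running_time.rank(method="first").astype(int)
print(positions)
```

Each column of `positions` then plays the role of `apr_position`, `may_position` and so on, without re-sorting the leaderboard for every month.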


In [287]:
# df_weekly.groupby(['Review_Date','Line']).lap_time.sum().reset_index()
sectors = df_weekly.filter(regex='Line|sector|Review_Date')

In [288]:
# sectors[sectors.columns[[2,3,4,5]]].sum(axis=1)
sectors = df_weekly.filter(regex='Line|sector|Review_Date')
sectors = sectors.iloc[:,0:6]
sectors = sectors.groupby(['Line','Review_Date']).sum().reset_index()
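# add up the sector columns only; numeric_only skips the Line and Review_Date columns
sectors['time'] = sectors.sum(axis=1, numeric_only=True)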
sectors_pivot = sectors.pivot(index='Line', columns='Review_Date', values='time')
sectors_pivot.reset_index(inplace=True)
sectors_pivot
# sectors[sectors[sectors.columns[[2,3,4,5]]].sum(axis=1)
# sectors[sectors.Line.str.contains('AL5')]

Review_Date,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,2021-08-19 00:00:00,2021-09-16 00:00:00
0,AL5 Packaging 1,1.636899,1.931643,2.895951,2.434052,0.450605,0.494276
1,AL6,1.15446,2.323044,1.805598,1.779721,0.414162,0.43091
2,C2 Packaging Line,0.969295,2.451874,2.261138,2.572525,0.534653,0.58353
3,C9 Packaging Line,1.037885,1.747841,2.361488,1.831886,0.439962,0.42491
4,GAMMA1,1.170176,1.725767,1.926756,2.143947,0.607297,0.646505
5,IMA C80/2,0.926865,1.656522,1.732693,1.520411,0.540948,0.540666
6,L18 Packaging Line,0.863912,1.426704,1.407996,1.262767,0.498359,0.482486
7,L25 Packaging Line,0.866388,2.108842,2.011979,1.722251,0.586427,0.549614
8,LINE 01 - UHLMANN 1880,,0.542111,1.636847,1.083857,0.392947,0.392168
9,M18 Filling,1.977078,2.596174,2.646232,2.866765,0.759629,0.935114


In [289]:
# iloc[:,1:] selects every column after Line; fill each NaN with the max for that review date
sectors_pivot.iloc[:,1:] = sectors_pivot.iloc[:,1:].fillna(sectors_pivot.iloc[:,1:].max())


In [290]:
sectors_pivot.sort_values(sectors_pivot.columns[1], inplace=True)
sectors_pivot['apr_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['may_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2,3]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['jun_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2,3,4]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['jly_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot

Review_Date,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,2021-08-19 00:00:00,2021-09-16 00:00:00,apr_position,temp_race_time,may_position,jun_position,jly_position
6,L18 Packaging Line,0.863912,1.426704,1.407996,1.262767,0.498359,0.482486,3,4.961379,1,1,1
8,LINE 01 - UHLMANN 1880,1.977078,0.542111,1.636847,1.083857,0.392947,0.392168,13,5.239893,3,3,2
5,IMA C80/2,0.926865,1.656522,1.732693,1.520411,0.540948,0.540666,5,5.836491,5,4,3
13,SUPPO Packaging Line,0.601992,1.937915,1.978621,1.681359,0.438778,0.456799,1,6.199886,4,5,4
11,M22 Filling,0.970596,1.806345,2.05767,1.844474,0.572975,0.491664,7,6.679085,6,7,5
7,L25 Packaging Line,0.866388,2.108842,2.011979,1.722251,0.586427,0.549614,4,6.70946,9,8,6
4,GAMMA1,1.170176,1.725767,1.926756,2.143947,0.607297,0.646505,10,6.966647,8,6,7
3,C9 Packaging Line,1.037885,1.747841,2.361488,1.831886,0.439962,0.42491,8,6.9791,7,9,8
1,AL6,1.15446,2.323044,1.805598,1.779721,0.414162,0.43091,9,7.062823,12,10,9
14,TR200 Packaging Line,0.72702,1.591164,1.381941,3.366632,0.522206,0.409362,2,7.066758,2,2,10


In [291]:
df_nom_sectors.Line.unique()

array(['L18 Packaging Line', 'L25 Packaging Line',
       'LINE 01 - UHLMANN 1880', 'MEDISEAL PURAN', 'GAMMA1', 'IMA C80/2',
       'C2 Packaging Line', 'C9 Packaging Line', 'TR200 Packaging Line',
       'SUPPO Packaging Line', 'AL5 Packaging 1', 'AL6', 'M18 Filling',
       'M21 Filling', 'M22 Filling'], dtype=object)

In [292]:
aggregations = {
    'sector_1':'sum',
    'sector_2':'mean',
    'sector_3':'sum',
    'sector_4':'mean',
    'sector_5':'sum',
    'sector_6':'sum',
    'sector_7':'sum',
    'sector_8':'sum',
    'sector_9':'sum',
    'lap_time':'sum'
}
all_sectors = df_weekly.groupby(['Review_Date','Line'])[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9','lap_time']].agg(aggregations).reset_index()

In [293]:
all_sectors[all_sectors.Review_Date == '2021-07-08']

Unnamed: 0,Review_Date,Line,sector_1,sector_2,sector_3,sector_4,sector_5,sector_6,sector_7,sector_8,sector_9,lap_time
43,2021-07-08,AL5 Packaging 1,0.0,0.180824,1.423505,0.071813,0.0,0.0,0.0,0.0,0.0,2.434052
44,2021-07-08,AL6,0.136446,0.091301,1.032491,0.061395,0.0,0.0,0.0,0.0,0.0,1.779721
45,2021-07-08,C2 Packaging Line,0.271201,0.07211,1.68196,0.082731,0.0,0.0,0.0,0.0,0.0,2.572525
46,2021-07-08,C9 Packaging Line,0.010068,0.052656,1.363444,0.061938,0.0,0.0,0.0,0.0,0.0,1.831886
47,2021-07-08,GAMMA1,0.005434,0.107147,0.798803,0.22778,0.0,0.0,0.0,0.0,0.0,2.143947
48,2021-07-08,IMA C80/2,-0.037165,0.039564,0.653713,0.186402,0.0,0.0,0.0,0.0,0.0,1.520411
49,2021-07-08,L18 Packaging Line,-0.009593,0.038636,0.458949,0.164717,0.0,0.0,0.0,0.0,0.0,1.262767
50,2021-07-08,L25 Packaging Line,-0.164782,0.038427,0.7499,0.245856,0.0,0.0,0.0,0.0,0.0,1.722251
51,2021-07-08,LINE 01 - UHLMANN 1880,-0.28722,0.098874,0.740656,0.058731,0.0,0.0,0.0,0.0,0.0,1.083857
52,2021-07-08,M18 Filling,0.537044,0.191487,0.574535,0.247309,0.0,0.0,0.0,0.0,0.0,2.866765
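
The `aggregations` dictionary above applies a different function to each column: `sector_2` and `sector_4` are averaged over the weeks in a review period, while the other sectors and `lap_time` are summed. A minimal sketch of that pattern, using hypothetical weekly values:

```python
import pandas as pd

# Hypothetical weekly sector values for two lines within one review period.
weekly = pd.DataFrame({
    "Review_Date": ["2021-07-08"] * 4,
    "Line": ["A", "A", "B", "B"],
    "sector_1": [0.25, 0.5, 0.0, 0.125],
    "sector_2": [0.25, 0.75, 0.5, 1.0],
})

# One aggregation per column: sector_1 is summed, sector_2 is averaged.
per_review = (
    weekly.groupby(["Review_Date", "Line"])
    .agg({"sector_1": "sum", "sector_2": "mean"})
    .reset_index()
)
print(per_review)
```

One consequence worth keeping in mind when reading the table above: because two sectors are shown as means, the sector columns in `all_sectors` are not expected to add up to `lap_time`. The `lap_time` figures do appear to match the summed sectors in `sectors_pivot` (for example 2.434052 for AL5 Packaging 1 on 2021-07-08).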


In [294]:
df_weekly.loc[df_weekly.Line.str.contains('LINE'), ['Line','OEE %','rolling_std','Unplanned_tech_loss','Changeover_rolling_mean','Review_Date','Date']]

Unnamed: 0,Line,OEE %,rolling_std,Unplanned_tech_loss,Changeover_rolling_mean,Review_Date,Date
52,LINE 01 - UHLMANN 1880,0.0,,0.0,0.0,2021-05-13,2021-05-02
70,LINE 01 - UHLMANN 1880,0.0,0.0,0.0,0.0,2021-05-13,2021-05-09
86,LINE 01 - UHLMANN 1880,0.372515,0.215072,0.337516,0.00552,2021-06-10,2021-05-16
105,LINE 01 - UHLMANN 1880,0.441221,0.236574,0.203047,0.014179,2021-06-10,2021-05-23
121,LINE 01 - UHLMANN 1880,0.385174,0.202037,0.281732,0.0203,2021-06-10,2021-05-30
131,LINE 01 - UHLMANN 1880,0.314807,0.051869,0.31205,0.071758,2021-06-10,2021-06-06
152,LINE 01 - UHLMANN 1880,0.518701,0.086351,0.117504,0.073951,2021-07-08,2021-06-13
163,LINE 01 - UHLMANN 1880,0.36854,0.086634,0.080061,0.069827,2021-07-08,2021-06-20
176,LINE 01 - UHLMANN 1880,0.566651,0.119715,0.233553,0.068238,2021-07-08,2021-06-27
200,LINE 01 - UHLMANN 1880,0.602027,0.102797,0.309537,0.022909,2021-07-08,2021-07-04
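
In the output above, each weekly `Date` carries the next race review date on or after it (for example 2021-05-16 maps to 2021-06-10). This section does not show how `Review_Date` is assigned, so the following is only one possible way to reproduce that mapping, using `pd.merge_asof` with a forward search. The `weekly` and `reviews` frames are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical weekly rows and race review dates.
weekly = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2021-05-02", "2021-05-09", "2021-05-16", "2021-06-06", "2021-06-13"]
    ),
    "OEE %": [0.0, 0.0, 0.37, 0.31, 0.52],
})
reviews = pd.DataFrame({
    "Review_Date": pd.to_datetime(["2021-05-13", "2021-06-10", "2021-07-08"]),
})

# direction="forward" picks the first review date on or after each weekly date.
# Both frames must be sorted on their key columns.
tagged = pd.merge_asof(
    weekly.sort_values("Date"),
    reviews.sort_values("Review_Date"),
    left_on="Date",
    right_on="Review_Date",
    direction="forward",
)
print(tagged)
```

Grouping `tagged` by `Review_Date` then gives one group per race review period rather than per calendar month.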


In [305]:
df_weekly[['Line','Review_Date','Changeover_rolling_mean']].groupby(['Line','Review_Date']).sum().reset_index()

Unnamed: 0,Line,Review_Date,Changeover_rolling_mean
0,AL5 Packaging 1,2021-04-15,0.000000
1,AL5 Packaging 1,2021-05-13,0.373511
2,AL5 Packaging 1,2021-06-10,0.304670
3,AL5 Packaging 1,2021-07-08,0.287251
4,AL5 Packaging 1,2021-08-19,0.079760
...,...,...,...
83,TR200 Packaging Line,2021-05-13,0.632965
84,TR200 Packaging Line,2021-06-10,0.584134
85,TR200 Packaging Line,2021-07-08,0.543412
86,TR200 Packaging Line,2021-08-19,0.151361
