# Race Dashboard



This document describes the requirements and design decisions for the Race Dashboard for the Sanofi Asset Efficiency challenge.  It explains what is included on the dashboard, the data sources required, and the assumptions made in the design.  It also shows the steps for getting the data into the necessary format.

## Scope/objective

The objective is to display the documented metrics and categories on a dashboard, in whichever form suits each best, for Sanofi staff to access.


## Metrics / categories
The metrics have been mapped into sectors to mimic the different sectors of a race track.  The metrics are:

- Race = 8 Laps = 8 Months
- Lap = Monthly Progress

- **Sector 1 = OEE Improvement**
- **Sector 2 = OEE Variability Improvement**
- **Sector 3 = Stoppage Reduction**
- **Sector 4 = Changeover Improvement**
- _Sector 5 = Most effective OEE application_
- _Sector 6 = Best Innovation_
- _Sector 7 = Most consistent OEE improvement progress_
- _Sector 8 = Collaboration_
- _Sector 9 = Team Spirit_

 
 


#### Change log

| Date | Initials | Comments |
|------|:---------|:---------|
| 2021-06-23 | MC | in leaderboard, replace NaN values in laptime calc with the max laptime for that lap |
| 2021-06-23 | MC | use race review dates for grouping data, rather than calendar months |
| 2021-06-24 | JB | missing OEE_Diff figures should default to 'OEE %' - OEE start point, not just 'OEE %' value |
| 2021-06-24 | JB | for sector one, multiply sum of OEE_Diff by -1.  (was doing this in Tableau calcs but too late for lap_time calc). |
| 2021-06-25 | MC | in leaderboard, change prev_race_time calc to include all but last 2 cols, to handle new race cols as they arrive |
| 2021-06-28 | MC | correct dates in nominations spreadsheet, and merge on 1 row with a date within the review period |
| 2021-06-29 | MC | corrected nominations spreadsheet for Lisieux IWK - should be TR200 Packaging |
| 2021-06-30 | MC | populate missing OEE % values before calculating rolling_std using the OEE % column |
| 2021-06-30 | MC | remove start_changeover_calc merge as it's not being used any more, and was dropping rows |
| 2021-07-06 | MC | populate missing OEE % values with OEE from previous week for that site. |

In [48]:
import pandas as pd
import numpy as np
import datetime


# Viz libs
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns

# display options
# pd.options.display.float_format = "{:.2f}".format


## Read file and cleanse

- OEE.xlsx should contain the latest OEE data from QS, by week   
- QSDashboard.xlsx has a list of plants/sites taking part, with original target OEE  
- Unplanned_tech_loss.xlsx should contain the latest data from QS (my unplanned chart by line/week)   
- changeover.xlsx should contain the changeover information from QlikSense

#### Cleaning required:   
- OEE % needs converting to numeric, coercing the nulls to NaN values
- The W53-2020 data looks unreliable: it causes duplicate indexes for date 2021-01-10, and we don't need 2020 data anyway, so it is dropped

In [49]:
# this dir should point at wherever your Sanofi McLaren directory is syncing
dir = "C:/Users/mark_/Sanofi/Sanofi x McLaren sharing - General/Race Dashboard data/"

# this is a local copy of sanofi share for when I don't trust Microsoft syncing!
# dir = "C:/Users/mark_/Documents/McLaren2021/Sanofi/Race Dashboard data/"

# write out the files wherever you like, as long as the same file is used within Tableau.
output_dir = "C:/Users/mark_/Documents/McLaren2021/Sanofi/Race Dashboard data/"

# dir = 'C:/Users/james.blood/Documents/McLarenSanofi/McLarenSanofi/data/'

file = (dir + 'OEE.xlsx')
df_weekly = pd.read_excel(file)
file = (dir + 'QSDashboard.xlsx')
df_dash = pd.read_excel(file)
file = (dir + 'Unplanned_tech_loss.xlsx')
df_techloss = pd.read_excel(file)
file = (dir + 'changeover.xlsx')
df_changeover = pd.read_excel(file)
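# keep 2021 rows only; .copy() so the column assignments below act on an independent frame
df_weekly = df_weekly.loc[df_weekly['Week'].str.contains('2021')].copy()
df_techloss = df_techloss.loc[df_techloss['Week'].str.contains('2021')].copy()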
df_weekly['OEE %'] = pd.to_numeric(df_weekly['OEE %'], errors='coerce')
df_techloss.rename(columns={'Unplanned losses - %OEE':'Unplanned_tech_loss'}, inplace=True)
df_techloss['Unplanned_tech_loss'] = pd.to_numeric(df_techloss['Unplanned_tech_loss'], errors='coerce')
df_changeover.rename(columns={'Change over losses - %OEE':'Changeover'}, inplace=True)
df_changeover['Changeover'] = pd.to_numeric(df_changeover['Changeover'], errors='coerce')
# don't use their progress figure as it's a static val
# df_dash.rename(columns={'⇗ OEE% progress':'OEE% progress'}, inplace=True)

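Create a datetime from the week number.  Appending '0' and parsing it with `%w` gives the Sunday that ends each week, so W01-2021 becomes 2021-01-10.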

In [50]:
df_weekly['WeekOfYear'] = pd.to_numeric(df_weekly['Week'].str[1:3])
df_weekly['Year'] = pd.to_numeric(df_weekly['Week'].str[4:])
dates = df_weekly.Year*100+df_weekly.WeekOfYear
df_weekly['Date'] = pd.to_datetime(dates.astype(str) + '0', format='%Y%W%w')
# df_weekly.drop(columns=['Year','WeekOfYear'], inplace=True)
df_weekly.head()

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date
11,W01-2021,C2 Packaging Line,0.16897,1,2021,2021-01-10
12,W01-2021,C9 Packaging Line,,1,2021,2021-01-10
13,W01-2021,GAMMA1,0.406686,1,2021,2021-01-10
14,W01-2021,IMA C80/2,0.510044,1,2021,2021-01-10
15,W01-2021,L18 Packaging Line,0.173736,1,2021,2021-01-10
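
The week-label conversion above can be tried in isolation.  This is a minimal sketch, assuming the labels always look like `W01-2021`; the `weeks` sample and the `week_ending` name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample of 'Week' labels in the same W<week>-<year> form as the source data
weeks = pd.Series(["W01-2021", "W13-2021", "W26-2021"])

week_of_year = pd.to_numeric(weeks.str[1:3])   # characters 1-2 hold the week number
year = pd.to_numeric(weeks.str[4:])            # everything after the hyphen is the year

# %W numbers weeks from the first Monday of the year; the trailing "0" is read
# by %w as Sunday, so each label maps to the Sunday that closes that week.
week_ending = pd.to_datetime((year * 100 + week_of_year).astype(str) + "0",
                             format="%Y%W%w")
print(week_ending.dt.strftime("%Y-%m-%d").tolist())
```

Note that `%W` is not the ISO week number.  The two happen to coincide for 2021 because 4 January 2021 is a Monday, but that should be re-checked before reusing this for another year.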


In [51]:
#merge the 2 dataframes to get the start OEE
df_weekly = df_weekly.merge(df_dash[['Plant','Line', 'OEE  Start point','OEE% Target (2022)']],on='Line')

In [52]:
df_weekly = df_weekly.merge(df_techloss[['Line', 'Week', 'Unplanned_tech_loss']],on=['Line','Week'])

In [53]:
# name the join keys explicitly rather than relying on the shared column names
df_weekly = df_weekly.merge(df_changeover[['Line','Week','Changeover']], on=['Line','Week'])
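
These merges are inner joins, so any Line/Week without a match in the other file disappears without warning.  A minimal sketch of one way to spot such rows, using made-up miniature frames (`weekly`, `techloss`) rather than the real data:

```python
import pandas as pd

# Hypothetical miniature stand-ins for df_weekly and df_techloss
weekly = pd.DataFrame({"Line": ["AL6", "AL6", "GAMMA1"],
                       "Week": ["W13-2021", "W14-2021", "W13-2021"],
                       "OEE %": [0.37, 0.36, 0.49]})
techloss = pd.DataFrame({"Line": ["AL6", "GAMMA1"],
                         "Week": ["W13-2021", "W13-2021"],
                         "Unplanned_tech_loss": [0.34, 0.14]})

# A left merge with indicator=True keeps every weekly row and labels where it matched
check = weekly.merge(techloss, on=["Line", "Week"], how="left", indicator=True)

# Rows marked 'left_only' are the ones an inner merge would have dropped
unmatched = check.loc[check["_merge"] == "left_only", ["Line", "Week"]]
print(unmatched)
```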

#### Start Changeover

*Not sure this is needed if we are using Changeover_Diff, which will naturally reward increases / decreases in changeover?*

A start changeover value isn't provided, so the plan was to calculate our own start point using the average changeover for each site in 2021 up to April 2021.  This needs to be done before we drop the early 2021 rows.

This was then merged into the df_weekly dataframe as a loose join.  The merge is now commented out (see change log, 2021-06-30).

In [54]:
# start_changeover_calc = df_weekly[['Plant','Line','Changeover']][df_weekly['Date'] < '2021-04-30'].groupby(['Plant','Line']).mean().reset_index()
# start_changeover_calc.rename(columns={'Changeover':'start_changeover'}, inplace=True)
# df_weekly = df_weekly.merge(start_changeover_calc[['Line','start_changeover']])

#### Dates for the Asset Challenge

Start Date is fixed as 2021-04-01.  Remove all the rows from df_weekly before this date.

Review Date will move and act as a cutoff for each Race meeting.

In [55]:
start_date = '2021-04-01'
df_weekly = df_weekly[df_weekly['Date'] > start_date].sort_values('Date')

# this is needed to prevent including additional values in the race time.
review_date = '2021-07-08'
# df_weekly = df_weekly[df_weekly['Date'] < end_date].sort_values('Date')

### PCT_CHANGE
Using the pandas `pct_change` method with periods=4 gives a 4-week (4 previous rows) rolling percentage change figure.
- I believe we are doing this rolling average calculation within Tableau at the moment, so this isn't being used here

- Not sure whether this is required any more - removing for all categories (2021-06-30)

In [56]:
# df_weekly.sort_values(['Line','Date'], inplace = True)
# df_weekly['OEE_pct_chg'] = (df_weekly.groupby('Line')['OEE %']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly['techloss_pct_chg'] = (df_weekly.groupby('Line')['Unplanned_tech_loss']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly['Changeover_pct_chg'] = (df_weekly.groupby('Line')['Changeover']
#                                    .apply(pd.Series.pct_change, periods=4))
# df_weekly.head()
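
For reference, a minimal sketch of what a per-Line `pct_change` with `periods=4` produces.  The `demo` frame and its values are made up, and this is not part of the current pipeline.

```python
import pandas as pd

# Hypothetical data: two Lines with five weekly OEE readings each, already in date order
demo = pd.DataFrame({"Line": ["A"] * 5 + ["B"] * 5,
                     "OEE %": [0.50, 0.52, 0.54, 0.56, 0.55,
                               0.40, 0.40, 0.40, 0.40, 0.30]})

# Compare each row with the row 4 weeks earlier for the same Line;
# the first 4 rows of every Line have nothing to compare against and stay NaN
demo["OEE_pct_chg"] = demo.groupby("Line")["OEE %"].pct_change(periods=4)
print(demo)
```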

#### Populate missing OEE %
2021-06-07 - change of plan! Populating with the min OEE value does not work when some Lines record 0% OEE.  Missing values can just be delayed data collection, which sorts itself out later.  So, until proper data arrives:

- Use the previous week's value (ffill) for that Line.  This will still leave NaN if they existed in the first row(s).  
- Populate the remaining NaN with 'OEE  Start point'
- ~~Find the weekly min/max OEE % from any site~~   
- ~~Merge those columns into df_weekly~~   
- ~~fill any NaN with the min OEE we calc'd for that week~~    


In [58]:
# df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['OEE %'].abs())
#        .groupby(pd.Grouper(key='Date',freq='W'))['OEE %'].agg([('Min' , 'min'), ('Max', 'max')])
#        .add_prefix('Week'))
# df_weekly_minmax.reset_index(inplace=True)
# df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekMin','WeekMax']])
# df_weekly['OEE %'].fillna(df_weekly.WeekMin, inplace=True)
# df_weekly.drop(columns=['WeekMin','WeekMax'], inplace=True)
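# forward-fill missing OEE % within each Line, in date order (results align back on the index)
df_weekly['OEE %'] = df_weekly.sort_values(['Line','Date']).groupby(['Line'])['OEE %'].ffill()
# anything still missing (leading rows for a Line) falls back to that row's OEE start point
df_weekly['OEE %'] = df_weekly['OEE %'].fillna(df_weekly['OEE  Start point'])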
# df_weekly[df_weekly['OEE %'].isna()]
df_weekly[df_weekly.Line.str.contains('TR200')]

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover
282,W13-2021,TR200 Packaging Line,0.596432,13,2021,2021-04-04,Lisieux,0.483505,0.65,0.163573,0.075189
283,W14-2021,TR200 Packaging Line,0.515187,14,2021,2021-04-11,Lisieux,0.483505,0.65,0.127162,0.366955
284,W15-2021,TR200 Packaging Line,0.65547,15,2021,2021-04-18,Lisieux,0.483505,0.65,0.14568,0.130089
285,W16-2021,TR200 Packaging Line,0.50113,16,2021,2021-04-25,Lisieux,0.483505,0.65,0.217642,0.036253
286,W17-2021,TR200 Packaging Line,0.521534,17,2021,2021-05-02,Lisieux,0.483505,0.65,0.203961,0.158523
287,W18-2021,TR200 Packaging Line,0.555848,18,2021,2021-05-09,Lisieux,0.483505,0.65,0.148352,0.143708
288,W19-2021,TR200 Packaging Line,0.531703,19,2021,2021-05-16,Lisieux,0.483505,0.65,0.225213,0.144646
289,W20-2021,TR200 Packaging Line,0.596425,20,2021,2021-05-23,Lisieux,0.483505,0.65,0.16966,0.196068
290,W21-2021,TR200 Packaging Line,0.51705,21,2021,2021-05-30,Lisieux,0.483505,0.65,0.251404,0.163872
291,W22-2021,TR200 Packaging Line,0.769493,22,2021,2021-06-06,Lisieux,0.483505,0.65,0.158634,0.057577
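
The two-step fill can be seen on a small example.  This is a sketch with made-up values in a hypothetical `demo` frame: Line A has a gap in the middle (filled from the previous week), while Line B is missing its first week (filled from the start point).

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names follow df_weekly
demo = pd.DataFrame({
    "Line": ["A", "A", "A", "B", "B"],
    "Date": pd.to_datetime(["2021-04-04", "2021-04-11", "2021-04-18",
                            "2021-04-04", "2021-04-11"]),
    "OEE %": [0.50, np.nan, 0.55, np.nan, 0.40],
    "OEE  Start point": [0.45, 0.45, 0.45, 0.35, 0.35],
})

demo = demo.sort_values(["Line", "Date"])
# Step 1: carry the previous week's value forward within each Line
demo["OEE %"] = demo.groupby("Line")["OEE %"].ffill()
# Step 2: a Line with no earlier value falls back to its OEE start point
demo["OEE %"] = demo["OEE %"].fillna(demo["OEE  Start point"])
print(demo["OEE %"].tolist())
```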


## Standard Deviation
Calculate std_dev and mean on a 4 week rolling basis

Standard deviation is the square root of the variance, so there is no need to calculate both; the variance has been left out.

In [59]:
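# 4-week rolling sample std dev per Line; transform keeps the original index so the result assigns cleanly
df_weekly['rolling_std'] = df_weekly.groupby('Line')['OEE %'].transform(lambda x: x.rolling(4, min_periods=1).std())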
df_weekly.head(5)

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover,rolling_std
12,W13-2021,C2 Packaging Line,0.458414,13,2021,2021-04-04,Maisons-Alfort,0.397503,0.47,0.465009,0.04414,
228,W13-2021,M22 Filling,0.649336,13,2021,2021-04-04,Frankfurt,0.530068,0.65,0.234635,0.035164,
255,W13-2021,SUPPO Packaging Line,0.432432,13,2021,2021-04-04,Lisieux,0.353021,0.53,0.148267,0.178806,
174,W13-2021,M18 Filling,0.046124,13,2021,2021-04-04,Frankfurt,0.443522,0.65,0.114818,0.789781,
39,W13-2021,C9 Packaging Line,0.371845,13,2021,2021-04-04,Maisons-Alfort,0.528518,0.53,0.229297,0.081871,
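
The first row for each Line has no rolling_std because a sample standard deviation needs at least two observations.  A minimal sketch on made-up numbers (the `demo` frame is hypothetical):

```python
import pandas as pd

# Hypothetical data: one Line, five weekly OEE readings in date order
demo = pd.DataFrame({"Line": ["A"] * 5,
                     "OEE %": [0.50, 0.60, 0.40, 0.50, 0.70]})

# Window of up to 4 weeks, allowed to start with a single week (min_periods=1)
demo["rolling_std"] = (demo.groupby("Line")["OEE %"]
                           .transform(lambda s: s.rolling(4, min_periods=1).std()))
print(demo)
```

Here the first value is NaN and the second is the standard deviation of 0.50 and 0.60, roughly 0.0707.  Those NaN rows are later replaced by the column mean when the sectors are cleaned.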


In [60]:
file = (dir + 'Nominations Category Scoring.xlsx')
df_nom_sectors = pd.read_excel(file, sheet_name='Nomination scoring', usecols="A:H", parse_dates=['Date'])

In [61]:
# carry each review date down to the rows beneath it
df_nom_sectors['Date'] = df_nom_sectors['Date'].ffill()
df_nom_sectors = df_nom_sectors.fillna(0)

df_weekly = df_weekly.merge(df_nom_sectors[['Line','Plant','Date','Best Solution','Best Innovation','Improvement Iterations','Lessons and Sharing','Team Contribution and Spirit']],how='outer', on=['Date','Plant','Line'])
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,Changeover,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.47,0.465009,0.044140,,,,,,
1,W13-2021,M22 Filling,0.649336,13.0,2021.0,2021-04-04,Frankfurt,0.530068,0.65,0.234635,0.035164,,,,,,
2,W13-2021,SUPPO Packaging Line,0.432432,13.0,2021.0,2021-04-04,Lisieux,0.353021,0.53,0.148267,0.178806,,,,,,
3,W13-2021,M18 Filling,0.046124,13.0,2021.0,2021-04-04,Frankfurt,0.443522,0.65,0.114818,0.789781,,,,,,
4,W13-2021,C9 Packaging Line,0.371845,13.0,2021.0,2021-04-04,Maisons-Alfort,0.528518,0.53,0.229297,0.081871,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
242,,AL5 Packaging 1,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
243,,AL6,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
244,,M18 Filling,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0
245,,M21 Filling,,,,2021-09-12,Frankfurt,,,,,,0.0,0.0,0.0,0.0,0.0


## Create review dates
Create a Review_Date column for grouping the data later, so we only get the data we're interested in for each review.

In [62]:
def thurs_of_weekbefore(year, week):
    # Thursday (ISO weekday 4) of the ISO week before `week`
    return datetime.date.fromisocalendar(year, week - 1, 4)

review_weeks = [16, 20, 24, 28, 34, 38, 42, 47]
review_dates = [thurs_of_weekbefore(2021, w) for w in review_weeks]

df_review_dates = pd.DataFrame({'Review_Date': pd.to_datetime(review_dates)})

# df_review_dates.info()
df_weekly = pd.merge_asof(df_weekly.sort_values('Date'), df_review_dates, left_on='Date', right_on='Review_Date', allow_exact_matches=True, direction='forward')
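
`merge_asof` with `direction='forward'` tags each weekly row with the next review date on or after it.  A minimal sketch, assuming only the first two review dates (2021-04-15 and 2021-05-13) and a made-up `weekly` frame:

```python
import pandas as pd

# The first two review Thursdays produced by the cell above
review_dates = pd.DataFrame({"Review_Date": pd.to_datetime(["2021-04-15", "2021-05-13"])})

# Hypothetical week-ending dates; merge_asof needs both sides sorted on their keys
weekly = pd.DataFrame({"Date": pd.to_datetime(["2021-04-04", "2021-04-18", "2021-05-09"])})

# direction='forward' picks the first Review_Date that is >= Date
tagged = pd.merge_asof(weekly, review_dates,
                       left_on="Date", right_on="Review_Date",
                       direction="forward")
print(tagged)
```

Rows dated after the last review date get NaT, which is why future-dated nomination rows drop out once the frame is filtered on Review_Date.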


#### We need the diff between the weekly OEE % figures, and the Weekly Changeover figures
We need week-on-week differences for OEE % and Changeover; otherwise we will have problems when we group and sum values later.
- Sort by Line and Date (there is only 1 row per Line per week)   
- Find the diff between consecutive OEE % rows for each Line to create OEE_Diff   
- Fill the NaN on the first row for each Line with OEE % minus OEE Start point   

Repeat the same logic for Changeover.  No start_changeover is available, so the first row for each Line is filled with 0.

In [63]:
# this was calculating the wrong Diff - the first row of each site was looking at the previous site for all but the 1st calc
# needed to sort by Line and Date first 

# OEE_Diff = df_weekly.groupby(['Line',pd.Grouper(key='Date',freq='W')])['OEE %'].mean().reset_index()
# OEE_Diff["OEE_Diff"] = OEE_Diff["OEE %"].diff()
# df_weekly = df_weekly.merge(OEE_Diff[["Line","Date","OEE_Diff"]], on=(["Line","Date"]))

# df_weekly['OEE_Diff'].fillna(df_weekly['OEE %'] - df_weekly['OEE  Start point'], inplace=True)
# df_weekly[["Line","Date","OEE %","OEE_Diff"]].head(50).sort_values(by=['Line', 'Date'])

After merging with the nominations spreadsheet we have future-dated rows that get in the way of the calcs later.  Get rid of them using the review_date set earlier.

In [64]:
df_weekly = df_weekly[df_weekly['Review_Date'] <= review_date]

In [65]:
OEE_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','OEE %','OEE  Start point']]
OEE_Diff['OEE_Diff'] = OEE_Diff.groupby('Line')['OEE %'].diff().fillna(df_weekly['OEE %'] - df_weekly['OEE  Start point'])
df_weekly = df_weekly.merge(OEE_Diff[["Line","Date","OEE_Diff"]], on=(["Line","Date"]))
df_weekly[["Line","Date","OEE %","OEE_Diff"]].head(50).sort_values(by=['Line', 'Date'])

Unnamed: 0,Line,Date,OEE %,OEE_Diff
12,AL5 Packaging 1,2021-04-04,0.479693,0.0
25,AL5 Packaging 1,2021-04-11,0.479693,0.0
30,AL5 Packaging 1,2021-04-18,0.449745,-0.029948
3,AL6,2021-04-04,0.367897,0.03524
16,AL6,2021-04-11,0.360681,-0.007216
31,AL6,2021-04-18,0.33414,-0.026541
45,AL6,2021-04-25,0.309545,-0.024595
0,C2 Packaging Line,2021-04-04,0.458414,0.060911
19,C2 Packaging Line,2021-04-11,0.530707,0.072293
38,C2 Packaging Line,2021-04-18,0.410676,-0.120032
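
The AL6 rows above can be reproduced by hand.  A minimal sketch using the AL6 figures shown in this notebook (start point 0.332657), held in a hypothetical `demo` frame:

```python
import pandas as pd

# AL6 values taken from the outputs in this notebook
demo = pd.DataFrame({
    "Line": ["AL6"] * 3,
    "Date": pd.to_datetime(["2021-04-04", "2021-04-11", "2021-04-18"]),
    "OEE %": [0.367897, 0.360681, 0.334140],
    "OEE  Start point": [0.332657] * 3,
}).sort_values(["Line", "Date"])

# Week-on-week change per Line; the first week has no predecessor,
# so it is measured against the OEE start point instead
demo["OEE_Diff"] = (demo.groupby("Line")["OEE %"].diff()
                        .fillna(demo["OEE %"] - demo["OEE  Start point"]))
print(demo[["Date", "OEE %", "OEE_Diff"]])
```

Summing OEE_Diff over any run of weeks then gives the latest OEE % minus the start point, which is what sector 1 rewards.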


In [66]:
# Changeover_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover','start_changeover']]
Changeover_Diff = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover']]
# Changeover_Diff['Changeover_Diff'] = Changeover_Diff.groupby('Line')['Changeover'].diff().fillna(df_weekly['start_changeover'] - df_weekly['Changeover'])
Changeover_Diff['Changeover_Diff'] = Changeover_Diff.groupby('Line')['Changeover'].diff().fillna(0)
df_weekly = df_weekly.merge(Changeover_Diff[["Line","Date","Changeover_Diff"]], on=(["Line","Date"]))

In [67]:
Changeover_Diff[Changeover_Diff.Line.str.contains('TR200')]

Unnamed: 0,Line,Date,Changeover,Changeover_Diff
4,TR200 Packaging Line,2021-04-04,0.075189,0.0
15,TR200 Packaging Line,2021-04-11,0.366955,0.291765
26,TR200 Packaging Line,2021-04-18,0.130089,-0.236865
48,TR200 Packaging Line,2021-04-25,0.036253,-0.093836
55,TR200 Packaging Line,2021-05-02,0.158523,0.12227
69,TR200 Packaging Line,2021-05-09,0.143708,-0.014815
96,TR200 Packaging Line,2021-05-16,0.144646,0.000938
106,TR200 Packaging Line,2021-05-23,0.196068,0.051422
118,TR200 Packaging Line,2021-05-30,0.163872,-0.032196
137,TR200 Packaging Line,2021-06-06,0.057577,-0.106295


In [68]:
# Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover','start_changeover']]
Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line','Date','Changeover']]
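# 4-week rolling mean per Line; transform keeps the original index so the result assigns cleanly
Changeover_mean['Changeover_rolling_mean'] = Changeover_mean.groupby('Line')['Changeover'].transform(lambda x: x.rolling(4, min_periods=1).mean())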
df_weekly = df_weekly.merge(Changeover_mean[["Line","Date","Changeover_rolling_mean"]], on=(["Line","Date"]))

In [69]:
Changeover_mean[Changeover_mean.Line.str.contains('TR200')]

Unnamed: 0,Line,Date,Changeover,Changeover_rolling_mean
4,TR200 Packaging Line,2021-04-04,0.075189,0.075189
15,TR200 Packaging Line,2021-04-11,0.366955,0.221072
26,TR200 Packaging Line,2021-04-18,0.130089,0.190744
48,TR200 Packaging Line,2021-04-25,0.036253,0.152122
55,TR200 Packaging Line,2021-05-02,0.158523,0.172955
69,TR200 Packaging Line,2021-05-09,0.143708,0.117143
96,TR200 Packaging Line,2021-05-16,0.144646,0.120783
106,TR200 Packaging Line,2021-05-23,0.196068,0.160736
118,TR200 Packaging Line,2021-05-30,0.163872,0.162074
137,TR200 Packaging Line,2021-06-06,0.057577,0.140541


#### Populate missing Unplanned Tech Loss

- Create weekly min/max cols for Unplanned tech loss from any site   
- Merge those columns into df_weekly   
- fill any NaN rows with the max Unplanned_tech_loss found for that week   

**this might be flawed!!** 

In [70]:
df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['Unplanned_tech_loss'].abs())
       .groupby(pd.Grouper(key='Date',freq='W'))['Unplanned_tech_loss'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('WeekUTL'))
df_weekly_minmax.reset_index(inplace=True)
df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekUTLMin','WeekUTLMax']])
# assign the result back rather than using inplace=True on a column selection
df_weekly['Unplanned_tech_loss'] = df_weekly['Unplanned_tech_loss'].fillna(df_weekly.WeekUTLMax)
df_weekly.drop(columns=['WeekUTLMin','WeekUTLMax'], inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,...,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit,Review_Date,OEE_Diff,Changeover_Diff,Changeover_rolling_mean
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.470,0.465009,...,,,,,,,2021-04-15,0.060911,0.000000,0.044140
1,W13-2021,L18 Packaging Line,0.465923,13.0,2021.0,2021-04-04,Tours,0.377683,0.547,0.208273,...,,,,,,,2021-04-15,0.088240,0.000000,0.134271
2,W13-2021,IMA C80/2,0.581304,13.0,2021.0,2021-04-04,SCOPPITO,0.451031,0.580,0.224561,...,,,,,,,2021-04-15,0.130272,0.000000,0.070836
3,W13-2021,AL6,0.367897,13.0,2021.0,2021-04-04,Frankfurt,0.332657,0.450,0.342851,...,,,,,,,2021-04-15,0.035240,0.000000,0.184958
4,W13-2021,TR200 Packaging Line,0.596432,13.0,2021.0,2021-04-04,Lisieux,0.483505,0.650,0.163573,...,,,,,,,2021-04-15,0.112928,0.000000,0.075189
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
197,W26-2021,GAMMA1,0.491885,26.0,2021.0,2021-07-04,SCOPPITO,0.418683,0.570,0.143339,...,0.055086,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.111172,0.000000,0.219199
198,W26-2021,C9 Packaging Line,0.406623,26.0,2021.0,2021-07-04,Maisons-Alfort,0.528518,0.530,0.452948,...,0.081720,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.082118,-0.057612,0.064467
199,W26-2021,L18 Packaging Line,0.411426,26.0,2021.0,2021-07-04,Tours,0.377683,0.547,0.185620,...,0.053623,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.081630,0.037749,0.183051
200,W26-2021,L25 Packaging Line,0.381273,26.0,2021.0,2021-07-04,Tours,0.351564,0.478,0.192098,...,0.019916,0.0,0.0,0.0,0.0,0.0,2021-07-08,0.036620,-0.031079,0.252406
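
The same fill can be written without the temporary min/max columns.  This is an alternative sketch rather than the notebook's method: it assumes every row for a given week shares the same Date (true for the week-ending Sundays built earlier), and the `demo` frame is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three Lines over two weeks, with two missing readings
demo = pd.DataFrame({
    "Line": ["A", "B", "C", "A", "B", "C"],
    "Date": pd.to_datetime(["2021-04-04"] * 3 + ["2021-04-11"] * 3),
    "Unplanned_tech_loss": [0.10, np.nan, 0.30, 0.20, 0.25, np.nan],
})

# Worst (largest) loss recorded by any Line in the same week, broadcast to every row
week_max = demo.groupby("Date")["Unplanned_tech_loss"].transform("max")

# A missing reading is given that week's worst value, so missing data is never an advantage
demo["Unplanned_tech_loss"] = demo["Unplanned_tech_loss"].fillna(week_max)
print(demo["Unplanned_tech_loss"].tolist())
```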


#### Populate missing Changeover 

This doesn't really work!  Sometimes the max Changeover value for the week is 100%, and nothing in my calc ensures that OEE + Changeover + unplanned tech loss stays below 100% ...

...but I don't think we're using/relying on the Changeover figure for sector calcs here or in Tableau.  Going to remove these rows below.

In [71]:
df_weekly_minmax = (df_weekly.assign(Data_Value=df_weekly['Changeover'].abs())
       .groupby(pd.Grouper(key='Date',freq='W'))['Changeover'].agg([('Min' , 'min'), ('Max', 'max')])
       .add_prefix('WeekChangeover'))
df_weekly_minmax.reset_index(inplace=True)
df_weekly = df_weekly.merge(df_weekly_minmax[['Date','WeekChangeoverMin','WeekChangeoverMax']])
# assign the result back rather than using inplace=True on a column selection
df_weekly['Changeover'] = df_weekly['Changeover'].fillna(df_weekly.WeekChangeoverMax)
df_weekly.drop(columns=['WeekChangeoverMin','WeekChangeoverMax'], inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,...,rolling_std,Best Solution,Best Innovation,Improvement Iterations,Lessons and Sharing,Team Contribution and Spirit,Review_Date,OEE_Diff,Changeover_Diff,Changeover_rolling_mean
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.470,0.465009,...,,,,,,,2021-04-15,0.060911,0.000000,0.044140
1,W13-2021,L18 Packaging Line,0.465923,13.0,2021.0,2021-04-04,Tours,0.377683,0.547,0.208273,...,,,,,,,2021-04-15,0.088240,0.000000,0.134271
2,W13-2021,IMA C80/2,0.581304,13.0,2021.0,2021-04-04,SCOPPITO,0.451031,0.580,0.224561,...,,,,,,,2021-04-15,0.130272,0.000000,0.070836
3,W13-2021,AL6,0.367897,13.0,2021.0,2021-04-04,Frankfurt,0.332657,0.450,0.342851,...,,,,,,,2021-04-15,0.035240,0.000000,0.184958
4,W13-2021,TR200 Packaging Line,0.596432,13.0,2021.0,2021-04-04,Lisieux,0.483505,0.650,0.163573,...,,,,,,,2021-04-15,0.112928,0.000000,0.075189
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
197,W26-2021,GAMMA1,0.491885,26.0,2021.0,2021-07-04,SCOPPITO,0.418683,0.570,0.143339,...,0.055086,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.111172,0.000000,0.219199
198,W26-2021,C9 Packaging Line,0.406623,26.0,2021.0,2021-07-04,Maisons-Alfort,0.528518,0.530,0.452948,...,0.081720,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.082118,-0.057612,0.064467
199,W26-2021,L18 Packaging Line,0.411426,26.0,2021.0,2021-07-04,Tours,0.377683,0.547,0.185620,...,0.053623,0.0,0.0,0.0,0.0,0.0,2021-07-08,-0.081630,0.037749,0.183051
200,W26-2021,L25 Packaging Line,0.381273,26.0,2021.0,2021-07-04,Tours,0.351564,0.478,0.192098,...,0.019916,0.0,0.0,0.0,0.0,0.0,2021-07-08,0.036620,-0.031079,0.252406


#### Populate missing start_changeover (NaN)
If we've got this far and you still don't have a start_changeover, then you're a new site and can have this week's Changeover value

These rows still have NaN Changeover values

In [72]:
df_weekly[['Line','Date','Changeover']][df_weekly['Changeover'].isna()]

Unnamed: 0,Line,Date,Changeover


In [73]:
# df_weekly.start_changeover.fillna(df_weekly.Changeover, inplace=True)
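# assign the results back rather than using inplace=True on a column selection
df_weekly['OEE  Start point'] = df_weekly['OEE  Start point'].fillna(df_weekly['OEE %'])
df_weekly['OEE% Target (2022)'] = df_weekly['OEE% Target (2022)'].fillna(0.65)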

### Calculating Sector times


The lap time is the sum of the calculated sector scores plus the pole position time from the F1 race data (e.g. for Paul Ricard it was 88 secs).



Sector [1-4] calculations   
**Sector 1**   
How much has your OEE increased / decreased?  Sum the week-on-week differences and multiply the total by -1.  An OEE increase therefore gives a negative figure that is subtracted from your lap time, so that a larger OEE increase is rewarded with a bigger reduction in lap time.

df_weekly['sector_1'] = df_weekly['OEE_Diff'].mul(-1)

OEE_Diff calculation
- Sort values by Line and Date
- Find the difference between each weekly OEE figure
- ~~Fill NaN values from missing OEE figures with the weekly OEE minus OEE Start Point for that site~~   
- any missing OEE figures will already be populated with previous week's value or the start OEE value


**Sector 2** 
How big was the rolling standard deviation of your OEE % over the previous 4 weeks?  Less variability adds less to your lap time.

df_weekly['sector_2'] = df_weekly['rolling_std']

rolling_std = rolling std deviation for past 4 weeks for each site


 
**Sector 3**
We want to reduce Unplanned tech loss (recorded as % of OEE).  Unplanned tech loss is calculated within QlikSense, but values are sometimes missing.  Fill the missing values and then display the average Unplanned tech loss:

df_weekly['sector_3'] = df_weekly['Unplanned_tech_loss']


Populate missing unplanned tech loss:
- Create weekly min/max cols for Unplanned tech loss from any site
- Merge those columns into df_weekly
- fill any NaN unplanned tech loss rows with the max Unplanned_tech_loss found for that week (bigger is worse)

 
**Sector 4**
We're trying to reduce changeover time (recorded as % of OEE).  
A start changeover value isn't provided.  We originally calculated our own start point for each Line using the average changeover in 2021 up to 30 April 2021, but that merge was removed on 2021-06-30 (see change log) because it was no longer used and was dropping rows.  The original calculation was:
   
start_changeover_calc = df_weekly[['Plant','Line','Changeover']][df_weekly['Date'] < '2021-04-30'].groupby(['Plant','Line']).mean().reset_index()
start_changeover_calc.rename(columns={'Changeover':'start_changeover'}, inplace=True)
df_weekly = df_weekly.merge(start_changeover_calc[['Line','start_changeover']])



df_weekly['sector_4'] = df_weekly['Changeover_rolling_mean']

Changeover_mean = df_weekly.sort_values(by=['Line', 'Date'])[['Line', 'Date', 'Changeover']]   
Changeover_mean['Changeover_rolling_mean'] = Changeover_mean.groupby('Line')['Changeover'].apply(lambda x : x.rolling(4,1).mean())
df_weekly = df_weekly.merge(Changeover_mean[["Line","Date","Changeover_rolling_mean"]], on=(["Line","Date"]))



**Clean the sectors of NaN before summing them**   
Sometimes, when we haven't got enough information for the calcs (pct_change originally), we were getting no values coming through for the lap_time.  We should make sure there is a value in each of the sectors; otherwise a site gains an unfair advantage by not having data available.  Find all NaN values in sectors 1-4 and replace them with the mean for that column (sector).

**Sectors 5 - 9**   
These scores are taken from the Nomination process.  Read in the Nominations spreadsheet, merge any values we find with df_weekly, replace all NaN (missing) values with 0, and reduce the scores to 10% of their original value.  This value is then subtracted from the lap_time, so the better you do in the nominations, the more your lap_time is reduced.

 
**lap_time**   
df_weekly['lap_time'] = df_weekly[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9']].sum(axis=1)

In [74]:
# df_weekly['sector_1'] = (df_weekly['WeekMax'] - df_weekly['OEE %'])
# df_weekly['sector_1'] = (df_weekly['OEE  Start point'] - df_weekly['OEE %'])
df_weekly['sector_1'] = df_weekly['OEE_Diff'].mul(-1)
df_weekly['sector_2'] = df_weekly['rolling_std']
df_weekly['sector_3'] = df_weekly['Unplanned_tech_loss']
# df_weekly['sector_4'] = (df_weekly['start_changeover'] - df_weekly['Changeover'])
# df_weekly['sector_4'] = (df_weekly['Changeover'] - df_weekly['start_changeover'])
df_weekly['sector_4'] = df_weekly['Changeover_rolling_mean']
# take 10% of the sector5-9 scores 
df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']] = df_weekly[['Best Solution','Best Innovation','Improvement Iterations','Lessons and Sharing','Team Contribution and Spirit']] * -0.1
df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']] = df_weekly[['sector_5','sector_6','sector_7','sector_8','sector_9']].fillna(0)

# we'll use these in the absence of values for a sector
df_weekly[['sector_1','sector_2','sector_3','sector_4']] = df_weekly[['sector_1','sector_2','sector_3','sector_4']].fillna(df_weekly[['sector_1','sector_2','sector_3','sector_4']].mean())

#this will sum and handle the NaN
df_weekly['lap_time'] = df_weekly[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9']].sum(axis=1)

# now add the pole['Laptime'] from fastf1 to the lap_time adjustment we've created
# just use 88 secs rather than playing with timedeltas for now
# df_weekly['lap_time'] = pole['LapTime'] + pd.to_timedelta(df_weekly['lap_time'], unit='S')
# df_weekly['lap_time'] = 88 + df_weekly['lap_time']
df_weekly.groupby(['Line', pd.Grouper(key='Date', freq='W')])['lap_time'].sum()
# print (df_weekly['sector_1_time'] , df_weekly['sector_2_time'] , df_weekly['sector_3_time'], df_weekly['sector_4_time'])

Line                  Date      
AL5 Packaging 1       2021-04-04    0.708660
                      2021-04-11    0.540053
                      2021-04-18    0.449241
                      2021-04-25    0.134643
                      2021-05-02    0.593747
                                      ...   
TR200 Packaging Line  2021-06-06    0.162536
                      2021-06-13    0.704221
                      2021-06-20    0.635328
                      2021-06-27    0.297245
                      2021-07-04    0.892840
Name: lap_time, Length: 202, dtype: float64
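
A minimal worked sketch of how one weekly lap_time adjustment is assembled.  The values are made up, only one nomination sector is included for brevity, and `demo` and `measured` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs for two weekly rows; column names follow df_weekly
demo = pd.DataFrame({
    "OEE_Diff": [0.05, -0.02],
    "rolling_std": [np.nan, 0.04],          # first week has no rolling std yet
    "Unplanned_tech_loss": [0.20, 0.30],
    "Changeover_rolling_mean": [0.10, 0.15],
    "Best Innovation": [5.0, np.nan],       # nomination score, missing in week 2
})

demo["sector_1"] = demo["OEE_Diff"].mul(-1)            # OEE gain reduces lap time
demo["sector_2"] = demo["rolling_std"]
demo["sector_3"] = demo["Unplanned_tech_loss"]
demo["sector_4"] = demo["Changeover_rolling_mean"]
demo["sector_6"] = (demo["Best Innovation"] * -0.1).fillna(0)  # 10% of score, subtracted

# A missing measured sector takes the column mean, so missing data gives no advantage
measured = ["sector_1", "sector_2", "sector_3", "sector_4"]
demo[measured] = demo[measured].fillna(demo[measured].mean())

demo["lap_time"] = demo[measured + ["sector_6"]].sum(axis=1)
print(demo["lap_time"].round(2).tolist())
```

Week 1 works out as -0.05 + 0.04 + 0.20 + 0.10 - 0.50 = -0.21, and week 2 as 0.02 + 0.04 + 0.30 + 0.15 + 0 = 0.51.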

#### Write out df_weekly to excel

In [75]:
df_weekly.to_excel(output_dir + "df_weekly_with_calcs.xlsx")

#### Monthly Calcs

Repeat the process for a df_monthly dataframe.  We will use this for calculating the leader board.  
Group df_weekly by Review_Date so we get the right data for each review meeting.

In [76]:
# df_monthly = df_weekly.groupby([pd.Grouper(key='Date',freq='M'),'Line'])[['start_changeover','OEE  Start point','OEE %','Unplanned_tech_loss','Changeover','rolling_std','techloss_pct_chg','Changeover_pct_chg']].mean().reset_index()
# df_monthly = df_weekly.groupby([pd.Grouper(key='Date',freq='M'),'Line']).lap_time.sum().reset_index()
df_monthly = df_weekly.groupby(['Review_Date','Line']).lap_time.sum().reset_index()
# change the name of review_date to save renaming all references to Date later
df_monthly = df_monthly.rename(columns={'Review_Date':'Date'})
df_monthly['lap_time'] = df_monthly['lap_time'] + 88
df_monthly.head()

Unnamed: 0,Date,Line,lap_time
0,2021-04-15,AL5 Packaging 1,89.248712
1,2021-04-15,AL6,89.137846
2,2021-04-15,C2 Packaging Line,88.952682
3,2021-04-15,C9 Packaging Line,89.021271
4,2021-04-15,GAMMA1,89.153563
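
A small sketch of the roll-up from weekly adjustments to one lap per review, on made-up numbers.  `POLE_TIME`, `weekly` and `laps` are names chosen for this illustration.

```python
import pandas as pd

POLE_TIME = 88  # seconds; the Paul Ricard pole time used in this notebook

# Hypothetical weekly lap_time adjustments, already tagged with their review date
weekly = pd.DataFrame({
    "Line": ["A", "A", "B", "B", "A", "B"],
    "Review_Date": pd.to_datetime(["2021-04-15"] * 4 + ["2021-05-13"] * 2),
    "lap_time": [0.50, 0.25, 0.75, 0.25, 0.50, 1.00],
})

# One row per review and Line: the weekly adjustments summed, plus the pole time
laps = weekly.groupby(["Review_Date", "Line"])["lap_time"].sum().reset_index()
laps["lap_time"] += POLE_TIME
print(laps)
```

Because the weekly values are summed, a review period with more weeks in it accumulates a larger adjustment than a shorter one.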


### Leader board table

In [78]:
# filter using the end_date to stop picking up future dated nomination rows of zero I created when joining the s/s
# pivot = df_monthly[df_monthly['Date'] < end_date].pivot(index='Line', columns='Date', values='lap_time')
pivot = df_monthly.pivot(index='Line', columns='Date', values='lap_time')
pivot.reset_index(inplace=True)
# pivot creates NaN for Lines with no data for a given race review date
# populate each NaN value with the max for that column - so they get the max laptime for that race
# cols [1:] are all the lap columns after Line
pivot.iloc[:,1:] = pivot.iloc[:,1:].fillna(pivot.iloc[:,1:].max())

# sum the lap columns (skipping the Line column) to get a race_time
pivot['race_time'] = pivot.iloc[:,1:].sum(axis=1)
# sum all but the last 2 cols (this lap and the race_time) to calc prev_race_time

# pivot['prev_race_time'] = pivot[pivot.columns[2]] + pivot[pivot.columns[3]]
pivot['prev_race_time'] = pivot.iloc[:,1:-2].sum(axis=1)

pivot = pivot.merge(df_dash[['Plant','Line']], on='Line')
pivot.sort_values('race_time', inplace=True)
pivot['position'] = np.arange(1,len(pivot) + 1)
pivot['gap_to_leader'] = pivot['race_time'] - pivot['race_time'].iloc[0]
pivot.sort_values('prev_race_time', inplace=True)
pivot['prev_position'] = np.arange(1,len(pivot) + 1)
pivot['Gain/Loss'] = pivot.prev_position - pivot.position
pivot.sort_values('race_time', inplace=True)
pivot['interval'] = pivot.race_time.diff()
pivot = pivot.merge(df_dash[['Line','OEE  Start point', '⇗ OEE% progress', 'OEE% Target (2022)']], on='Line')
pivot

Unnamed: 0,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,race_time,prev_race_time,Plant,position,gap_to_leader,prev_position,Gain/Loss,interval,OEE Start point,⇗ OEE% progress,OEE% Target (2022)
0,GAMMA1,89.153563,69.325767,68.226756,89.979194,316.68528,226.706086,SCOPPITO,1,0.0,1,0,,0.418683,0.085148,0.57
1,IMA C80/2,88.910252,69.256522,75.232693,89.674113,323.07358,233.399467,SCOPPITO,2,6.388299,2,0,6.388299,0.451031,0.043365,0.58
2,L18 Packaging Line,88.847299,74.226704,89.407996,89.425327,341.907326,252.481999,Tours,3,25.222045,3,0,18.833746,0.377683,0.086173,0.547
3,L25 Packaging Line,88.849775,74.908842,90.011979,89.862601,343.633196,253.770595,Tours,4,26.947916,4,0,1.725871,0.351564,0.001613,0.478
4,M22 Filling,88.953983,81.806345,84.85767,89.522488,345.140486,255.617998,Frankfurt,5,28.455206,5,0,1.50729,0.530068,0.12028,0.65
5,M21 Filling,89.248712,83.742378,84.767181,89.760825,347.519096,257.758271,Frankfurt,6,30.833816,6,0,2.37861,0.599671,0.022006,0.65
6,M18 Filling,89.960465,82.596174,85.446232,90.979137,348.982009,258.002872,Frankfurt,7,32.296728,7,0,1.462913,0.443522,-0.010057,0.65
7,AL6,89.137846,90.323044,84.205598,89.902144,353.568632,263.666488,Frankfurt,8,36.883352,8,0,4.586623,0.332657,0.078541,0.45
8,SUPPO Packaging Line,88.585379,89.604811,89.904755,89.318502,357.413446,268.094944,Lisieux,9,40.728166,10,1,3.844814,0.353021,-0.010676,0.53
9,LINE 01 - UHLMANN 1880,89.960465,88.525497,89.636847,89.547483,357.670293,268.122809,SUZANO,10,40.985013,11,1,0.256846,0.427854,0.015085,0.65
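
The leader board logic can be illustrated on a three-line toy table. This is only a sketch with invented lap times: line `C` has no data for the first lap, so it receives the slowest time recorded for that lap before the race time, position, gap to leader and interval are derived. The column names `lap_1` and `lap_2` stand in for the review date columns.

```python
import numpy as np
import pandas as pd

# Toy lap times per line; line "C" has no data for the first lap (values invented)
toy_pivot = pd.DataFrame({
    "Line": ["A", "B", "C"],
    "lap_1": [89.0, 90.0, np.nan],
    "lap_2": [88.5, 88.0, 88.25],
})

lap_cols = ["lap_1", "lap_2"]
# A missing lap is filled with the slowest (max) time recorded for that lap
toy_pivot[lap_cols] = toy_pivot[lap_cols].fillna(toy_pivot[lap_cols].max())

# Race time is the sum of the lap columns only
toy_pivot["race_time"] = toy_pivot[lap_cols].sum(axis=1)
toy_pivot = toy_pivot.sort_values("race_time").reset_index(drop=True)
toy_pivot["position"] = np.arange(1, len(toy_pivot) + 1)
# Gap to the line in first place, and gap to the line directly ahead
toy_pivot["gap_to_leader"] = toy_pivot["race_time"] - toy_pivot["race_time"].iloc[0]
toy_pivot["interval"] = toy_pivot["race_time"].diff()
print(toy_pivot)
```

The leader's `interval` is NaN because there is no line ahead of it, which matches the empty value in the first row of the output above.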


#### Write this out for Tableau

In [79]:
pivot.to_csv(output_dir + "leaderboard.csv")
pivot.to_excel(output_dir + "leaderboard.xlsx")

# END OF PROCESSING - Sanity checks below

In [80]:
pivot.sort_values(pivot.columns[1], inplace=True)
pivot['apr_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['may_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2,3]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['jun_position'] = np.arange(1,len(pivot) + 1)
pivot['temp_race_time'] = pivot[pivot.columns[[1,2,3,4]]].sum(axis=1)
pivot.sort_values(by='temp_race_time', inplace=True)
pivot['jly_position'] = np.arange(1,len(pivot) + 1)

In [81]:
# regex filter on column name contains Line, position or 2021
pivot.filter(regex='Line|position|2021')

Unnamed: 0,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,position,prev_position,apr_position,may_position,jun_position,jly_position
0,GAMMA1,89.153563,69.325767,68.226756,89.979194,1,1,10,2,1,1
1,IMA C80/2,88.910252,69.256522,75.232693,89.674113,2,2,5,1,2,2
2,L18 Packaging Line,88.847299,74.226704,89.407996,89.425327,3,3,3,3,3,3
3,L25 Packaging Line,88.849775,74.908842,90.011979,89.862601,4,4,4,4,4,4
4,M22 Filling,88.953983,81.806345,84.85767,89.522488,5,5,7,5,5,5
5,M21 Filling,89.248712,83.742378,84.767181,89.760825,6,6,11,7,6,6
6,M18 Filling,89.960465,82.596174,85.446232,90.979137,7,7,13,6,7,7
7,AL6,89.137846,90.323044,84.205598,89.902144,8,8,9,15,8,8
8,SUPPO Packaging Line,88.585379,89.604811,89.904755,89.318502,9,10,1,8,10,9
9,LINE 01 - UHLMANN 1880,89.960465,88.525497,89.636847,89.547483,10,11,14,10,11,10
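
The sanity check above re-sorts the table once per lap. One possible alternative, sketched here with invented values and not the method used in this notebook, is to take a cumulative sum across the lap columns and rank each column. Note that `rank(method="first")` breaks ties by row order, which may differ from the tie order produced by repeated sorting.

```python
import pandas as pd

# Toy lap times indexed by line (values invented)
toy_laps = pd.DataFrame(
    {"lap_1": [89.0, 88.0, 90.0], "lap_2": [70.0, 89.0, 88.0]},
    index=pd.Index(["A", "B", "C"], name="Line"),
)

# Running race time after each lap
cumulative = toy_laps.cumsum(axis=1)
# Rank lines within each column: 1 = lowest cumulative time
positions = cumulative.rank(axis=0, method="first").astype(int)
positions.columns = [f"{c}_position" for c in positions.columns]
print(positions)
```

Here line `A` is second after the first lap and moves into first place after the second, the same kind of movement the `apr_position` to `jly_position` columns are meant to expose.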


In [33]:
# df_weekly.groupby(['Review_Date','Line']).lap_time.sum().reset_index()
sectors = df_weekly.filter(regex='Line|sector|Review_Date')

In [34]:
# sectors[sectors.columns[[2,3,4,5]]].sum(axis=1)
sectors = df_weekly.filter(regex='Line|sector|Review_Date')
sectors = sectors.iloc[:,0:6]
sectors = sectors.groupby(['Line','Review_Date']).sum().reset_index()
# sum only the numeric sector columns; Line and Review_Date are excluded
sectors['time'] = sectors.sum(axis=1, numeric_only=True)
sectors_pivot = sectors.pivot(index='Line', columns='Review_Date', values='time')
sectors_pivot.reset_index(inplace=True)
sectors_pivot
# sectors[sectors[sectors.columns[[2,3,4,5]]].sum(axis=1)
# sectors[sectors.Line.str.contains('AL5')]

Review_Date,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00
0,AL5 Packaging 1,1.248712,1.725754,2.495004,1.559946
1,AL6,1.137846,2.323044,1.805598,1.902144
2,C2 Packaging Line,0.952682,2.451874,2.261138,2.622942
3,C9 Packaging Line,1.021271,1.747841,2.361488,2.130804
4,GAMMA1,1.153563,1.725767,1.926756,1.979194
5,IMA C80/2,0.910252,1.656522,1.732693,1.674113
6,L18 Packaging Line,0.847299,1.426704,1.407996,1.425327
7,L25 Packaging Line,0.849775,2.108842,2.011979,1.862601
8,LINE 01 - UHLMANN 1880,,0.525497,1.636847,1.547483
9,M18 Filling,1.960465,2.596174,2.646232,2.979137


In [35]:
# fill NaN in every column after Line with that column's max, as for the leader board
sectors_pivot.iloc[:,1:] = sectors_pivot.iloc[:,1:].fillna(sectors_pivot.iloc[:,1:].max())


In [36]:
sectors_pivot.sort_values(sectors_pivot.columns[1], inplace=True)
sectors_pivot['apr_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['may_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2,3]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['jun_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot['temp_race_time'] = sectors_pivot[sectors_pivot.columns[[1,2,3,4]]].sum(axis=1)
sectors_pivot.sort_values(by='temp_race_time', inplace=True)
sectors_pivot['jly_position'] = np.arange(1,len(sectors_pivot) + 1)
sectors_pivot.drop(columns='temp_race_time', inplace=True)
sectors_pivot

Review_Date,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,apr_position,may_position,jun_position,jly_position
6,L18 Packaging Line,0.847299,1.426704,1.407996,1.425327,3,2,1,1
13,SUPPO Packaging Line,0.585379,1.604811,1.904755,1.318502,1,1,3,2
8,LINE 01 - UHLMANN 1880,1.960465,0.525497,1.636847,1.547483,13,4,4,3
5,IMA C80/2,0.910252,1.656522,1.732693,1.674113,5,5,5,4
14,TR200 Packaging Line,0.710407,1.591164,1.381941,2.529634,2,3,2,5
11,M22 Filling,0.953983,1.806345,2.05767,1.522488,7,6,7,6
4,GAMMA1,1.153563,1.725767,1.926756,1.979194,10,8,6,7
12,MEDISEAL PURAN,1.960465,1.019996,1.838256,1.971539,15,11,8,8
7,L25 Packaging Line,0.849775,2.108842,2.011979,1.862601,4,9,9,9
0,AL5 Packaging 1,1.248712,1.725754,2.495004,1.559946,11,10,12,10


In [37]:
pivot.filter(regex='Line|position|2021').sort_values('jun_position')

Unnamed: 0,Line,2021-04-15 00:00:00,2021-05-13 00:00:00,2021-06-10 00:00:00,2021-07-08 00:00:00,position,prev_position,apr_position,may_position,jun_position,jly_position
0,GAMMA1,89.153563,69.325767,68.226756,89.979194,1,1,10,2,1,1
1,IMA C80/2,88.910252,69.256522,75.232693,89.674113,2,2,5,1,2,2
2,L18 Packaging Line,88.847299,74.226704,89.407996,89.425327,3,3,3,3,3,3
3,L25 Packaging Line,88.849775,74.908842,90.011979,89.862601,4,4,4,4,4,4
4,M22 Filling,88.953983,81.806345,84.85767,89.522488,5,5,7,5,5,5
5,M21 Filling,89.248712,83.742378,84.767181,89.760825,6,6,11,7,6,6
6,M18 Filling,89.960465,82.596174,85.446232,90.979137,7,7,13,6,7,7
7,AL6,89.137846,90.323044,84.205598,89.902144,8,8,9,15,8,8
10,TR200 Packaging Line,88.710407,89.591164,89.381941,90.529634,11,9,2,9,9,11
8,SUPPO Packaging Line,88.585379,89.604811,89.904755,89.318502,9,10,1,8,10,9


In [38]:
aggregations = {
    'sector_1':'sum',
    'sector_2':'mean',
    'sector_3':'mean',
    'sector_4':'mean',
    'sector_5':'sum',
    'sector_6':'sum',
    'sector_7':'sum',
    'sector_8':'sum',
    'sector_9':'sum',
    'lap_time':'sum'
}
all_sectors = df_weekly.groupby(['Review_Date','Line'])[['sector_1','sector_2','sector_3','sector_4','sector_5','sector_6','sector_7','sector_8','sector_9','lap_time']].agg(aggregations).reset_index()
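
A small sketch of how a per-column aggregation dictionary behaves, using invented values and only two sector columns: `sector_1` is summed over the review period while `sector_2` is averaged, mirroring the mix of `sum` and `mean` in the cell above.

```python
import pandas as pd

# Toy weekly sector values for one line over two review periods (values invented)
toy = pd.DataFrame({
    "Review_Date": ["2021-04-15"] * 2 + ["2021-05-13"] * 2,
    "Line": ["A"] * 4,
    "sector_1": [0.5, -0.25, 0.25, 0.25],
    "sector_2": [0.5, 1.0, 0.25, 0.75],
})

# Each column gets its own aggregation function
toy_sectors = (
    toy.groupby(["Review_Date", "Line"])
    .agg({"sector_1": "sum", "sector_2": "mean"})
    .reset_index()
)
print(toy_sectors)
```

Columns that are not listed in the dictionary are dropped from the result, so every sector that should reach the output needs an entry.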

In [39]:
all_sectors[all_sectors.Review_Date == '2021-07-08'].sort_values('sector_1')

Unnamed: 0,Review_Date,Line,sector_1,sector_2,sector_3,sector_4,sector_5,sector_6,sector_7,sector_8,sector_9,lap_time
43,2021-07-08,AL5 Packaging 1,-0.469051,0.191344,0.241631,0.074274,0.0,0.0,0.0,0.0,0.0,1.559946
56,2021-07-08,SUPPO Packaging Line,-0.17283,0.072115,0.24424,0.056478,0.0,0.0,0.0,0.0,0.0,1.318502
50,2021-07-08,L25 Packaging Line,-0.025216,0.021244,0.19859,0.252119,0.0,0.0,0.0,0.0,0.0,1.862601
47,2021-07-08,GAMMA1,-0.008305,0.074553,0.194326,0.227996,0.0,0.0,0.0,0.0,0.0,1.979194
48,2021-07-08,IMA C80/2,0.024126,0.043614,0.178797,0.190086,0.0,0.0,0.0,0.0,0.0,1.674113
51,2021-07-08,LINE 01 - UHLMANN 1880,0.030771,0.105944,0.210129,0.063105,0.0,0.0,0.0,0.0,0.0,1.547483
54,2021-07-08,M22 Filling,0.04329,0.112922,0.127066,0.129812,0.0,0.0,0.0,0.0,0.0,1.522488
53,2021-07-08,M21 Filling,0.054741,0.156167,0.137801,0.132553,0.0,0.0,0.0,0.0,0.0,1.760825
49,2021-07-08,L18 Packaging Line,0.104408,0.043782,0.114678,0.171769,0.0,0.0,0.0,0.0,0.0,1.425327
46,2021-07-08,C9 Packaging Line,0.105945,0.060942,0.380177,0.065095,0.0,0.0,0.0,0.0,0.0,2.130804


In [40]:
all_sectors['lap_time'] = all_sectors['lap_time'] + 88

In [41]:
all_sectors

Unnamed: 0,Review_Date,Line,sector_1,sector_2,sector_3,sector_4,sector_5,sector_6,sector_7,sector_8,sector_9,lap_time
0,2021-04-15,AL5 Packaging 1,0.0,0.048821,0.429527,0.146007,0.0,0.0,0.0,0.0,0.0,89.248712
1,2021-04-15,AL6,-0.028024,0.051373,0.367537,0.164026,0.0,0.0,0.0,0.0,0.0,89.137846
2,2021-04-15,C2 Packaging Line,-0.133204,0.074381,0.429527,0.039035,0.0,0.0,0.0,0.0,0.0,88.952682
3,2021-04-15,C9 Packaging Line,0.269447,0.088693,0.225816,0.061403,0.0,0.0,0.0,0.0,0.0,89.021271
4,2021-04-15,GAMMA1,0.028993,0.063492,0.305332,0.19346,0.0,0.0,0.0,0.0,0.0,89.153563
5,2021-04-15,IMA C80/2,0.017397,0.10103,0.244322,0.101075,0.0,0.0,0.0,0.0,0.0,88.910252
6,2021-04-15,L18 Packaging Line,-0.033218,0.068275,0.236805,0.135179,0.0,0.0,0.0,0.0,0.0,88.847299
7,2021-04-15,L25 Packaging Line,0.004575,0.065351,0.208968,0.148281,0.0,0.0,0.0,0.0,0.0,88.849775
8,2021-04-15,M18 Filling,-0.019499,0.196217,0.177578,0.616187,0.0,0.0,0.0,0.0,0.0,89.960465
9,2021-04-15,M21 Filling,0.0,0.048821,0.429527,0.146007,0.0,0.0,0.0,0.0,0.0,89.248712


In [42]:
df_weekly[['Line','OEE %','rolling_std','Unplanned_tech_loss','Changeover_rolling_mean','Review_Date','Date']][df_weekly.Line.str.contains('M18')]

Unnamed: 0,Line,OEE %,rolling_std,Unplanned_tech_loss,Changeover_rolling_mean,Review_Date,Date
9,M18 Filling,0.046124,,0.114818,0.789781,2021-04-15,2021-04-04
18,M18 Filling,0.463021,0.294791,0.240339,0.442593,2021-04-15,2021-04-11
32,M18 Filling,0.386723,0.221973,0.069673,0.425201,2021-05-13,2021-04-18
47,M18 Filling,0.35242,0.183226,0.485988,0.327211,2021-05-13,2021-04-25
63,M18 Filling,0.530336,0.079604,0.311511,0.137063,2021-05-13,2021-05-02
78,M18 Filling,0.562696,0.103957,0.227898,0.122545,2021-05-13,2021-05-09
92,M18 Filling,0.698908,0.142578,0.108069,0.024941,2021-06-10,2021-05-16
98,M18 Filling,0.0,0.307459,0.0,0.26663,2021-06-10,2021-05-23
116,M18 Filling,0.191951,0.323388,0.080336,0.41598,2021-06-10,2021-05-30
130,M18 Filling,0.537044,0.318333,0.216673,0.416194,2021-06-10,2021-06-06


In [85]:
# temporarily flip the sign of sector_1 for the highlighted view
df_weekly['sector_1'] = df_weekly['sector_1'].mul(-1)
# note: this first assignment is overwritten by the pivot on the next line
highlighted_sectors = df_weekly[['Line','Review_Date','Changeover_rolling_mean']].groupby(['Line','Review_Date']).sum().reset_index()
highlighted_sectors = df_weekly.pivot(index='Line', columns='Date', values=['sector_1','sector_2','sector_3','sector_4']).style.highlight_max(color = 'purple', axis = 0)
# highlighted_sectors['sector_1'] = df_weekly.pivot(index='Line', columns='Date', values='sector_1').style.highlight_max(color = 'purple', axis = 0)
# restore the original sign of sector_1
df_weekly['sector_1'] = df_weekly['sector_1'].mul(-1)
highlighted_sectors

Unnamed: 0_level_0,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_1,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_2,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_3,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4,sector_4
Date,2021-04-04 00:00:00,2021-04-11 00:00:00,2021-04-18 00:00:00,2021-04-25 00:00:00,2021-05-02 00:00:00,2021-05-09 00:00:00,2021-05-16 00:00:00,2021-05-23 00:00:00,2021-05-30 00:00:00,2021-06-06 00:00:00,2021-06-13 00:00:00,2021-06-20 00:00:00,2021-06-27 00:00:00,2021-07-04 00:00:00,2021-04-04 00:00:00,2021-04-11 00:00:00,2021-04-18 00:00:00,2021-04-25 00:00:00,2021-05-02 00:00:00,2021-05-09 00:00:00,2021-05-16 00:00:00,2021-05-23 00:00:00,2021-05-30 00:00:00,2021-06-06 00:00:00,2021-06-13 00:00:00,2021-06-20 00:00:00,2021-06-27 00:00:00,2021-07-04 00:00:00,2021-04-04 00:00:00,2021-04-11 00:00:00,2021-04-18 00:00:00,2021-04-25 00:00:00,2021-05-02 00:00:00,2021-05-09 00:00:00,2021-05-16 00:00:00,2021-05-23 00:00:00,2021-05-30 00:00:00,2021-06-06 00:00:00,2021-06-13 00:00:00,2021-06-20 00:00:00,2021-06-27 00:00:00,2021-07-04 00:00:00,2021-04-04 00:00:00,2021-04-11 00:00:00,2021-04-18 00:00:00,2021-04-25 00:00:00,2021-05-02 00:00:00,2021-05-09 00:00:00,2021-05-16 00:00:00,2021-05-23 00:00:00,2021-05-30 00:00:00,2021-06-06 00:00:00,2021-06-13 00:00:00,2021-06-20 00:00:00,2021-06-27 00:00:00,2021-07-04 00:00:00
Line,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2,Unnamed: 23_level_2,Unnamed: 24_level_2,Unnamed: 25_level_2,Unnamed: 26_level_2,Unnamed: 27_level_2,Unnamed: 28_level_2,Unnamed: 29_level_2,Unnamed: 30_level_2,Unnamed: 31_level_2,Unnamed: 32_level_2,Unnamed: 33_level_2,Unnamed: 34_level_2,Unnamed: 35_level_2,Unnamed: 36_level_2,Unnamed: 37_level_2,Unnamed: 38_level_2,Unnamed: 39_level_2,Unnamed: 40_level_2,Unnamed: 41_level_2,Unnamed: 42_level_2,Unnamed: 43_level_2,Unnamed: 44_level_2,Unnamed: 45_level_2,Unnamed: 46_level_2,Unnamed: 47_level_2,Unnamed: 48_level_2,Unnamed: 49_level_2,Unnamed: 50_level_2,Unnamed: 51_level_2,Unnamed: 52_level_2,Unnamed: 53_level_2,Unnamed: 54_level_2,Unnamed: 55_level_2,Unnamed: 56_level_2
AL5 Packaging 1,0.0,0.0,-0.029948,0.192906,-0.136848,-0.062193,0.063953,0.0,0.0,-0.507565,0.330179,0.136351,0.01676,-0.014238,0.097643,0.0,0.01729,0.087615,0.085253,0.092469,0.083941,0.031694,0.031977,0.253782,0.239303,0.230223,0.224082,0.071769,0.465009,0.394046,0.314964,0.135051,0.280638,0.302882,0.230673,0.619956,0.4947,0.0,0.333841,0.173684,0.192306,0.266691,0.146007,0.146007,0.087039,0.104883,0.091009,0.09058,0.083964,0.071042,0.074933,0.074731,0.074752,0.064235,0.079034,0.079075
AL6,0.03524,-0.007216,-0.026541,-0.024595,0.042794,0.075993,-0.027837,0.0988,-0.149572,0.126385,-0.104511,0.063769,-0.048738,-0.078298,0.097643,0.005102,0.017776,0.026647,0.02265,0.051255,0.052506,0.061416,0.062384,0.068884,0.074493,0.058049,0.047659,0.052392,0.342851,0.392222,0.516086,0.486667,0.49131,0.399435,0.432331,0.288794,0.286623,0.271771,0.206465,0.290337,0.354232,0.396381,0.184958,0.143095,0.114245,0.106977,0.078945,0.078701,0.086895,0.076246,0.08513,0.080396,0.075543,0.069819,0.055182,0.053813
C2 Packaging Line,0.060911,0.072293,-0.120032,-0.038125,0.045678,0.071465,-0.060302,-0.061913,-0.00833,0.068896,0.039284,-0.032762,0.0,-0.223237,0.097643,0.051119,0.060433,0.068162,0.068111,0.04888,0.048225,0.050177,0.060863,0.037918,0.051397,0.045457,0.017736,0.118093,0.465009,0.394046,0.456647,0.604006,0.514876,0.449509,0.479049,0.50177,0.4947,0.421654,0.434789,0.406865,0.378615,0.685652,0.04414,0.033929,0.042407,0.037359,0.031083,0.029387,0.014546,0.016415,0.027476,0.046696,0.054718,0.076029,0.080278,0.056597
C9 Packaging Line,-0.156673,-0.112774,0.185507,0.064867,-0.003218,0.026828,-0.174445,0.048091,0.112681,-0.006812,0.083021,-0.045966,-0.060883,-0.082118,0.097643,0.079743,0.093471,0.107335,0.117687,0.037776,0.079719,0.082222,0.085362,0.079486,0.077638,0.037802,0.04661,0.08172,0.229297,0.222335,0.394748,0.393027,0.380134,0.448504,0.444345,0.619956,0.362147,0.386026,0.354641,0.334504,0.378615,0.452948,0.081871,0.040935,0.02729,0.020468,0.0,0.001385,0.024889,0.036109,0.061528,0.079214,0.05571,0.069295,0.070908,0.064467
GAMMA1,0.012503,-0.041496,0.068978,0.01315,0.10614,-0.070194,-0.028489,-0.006388,-0.076577,0.08727,0.116359,0.003118,0.0,-0.111172,0.097643,0.029342,0.034725,0.036291,0.077791,0.053462,0.048383,0.048109,0.047643,0.041369,0.084034,0.099853,0.059237,0.055086,0.291389,0.319276,0.138837,0.200669,0.179874,0.163928,0.196914,0.193325,0.179267,0.180546,0.13015,0.1252,0.378615,0.143339,0.209384,0.177536,0.239314,0.249749,0.224872,0.24433,0.212056,0.214613,0.269997,0.27035,0.257434,0.233924,0.201427,0.219199
IMA C80/2,0.130272,-0.147669,0.017881,-0.045296,0.126901,-0.023979,0.039114,0.007142,-0.12347,0.07265,0.010669,0.031821,0.029993,-0.096609,0.097643,0.104418,0.080592,0.07769,0.054607,0.05723,0.064034,0.020455,0.056612,0.056713,0.051457,0.048656,0.032795,0.041548,0.224561,0.264082,0.238785,0.180133,0.16404,0.162985,0.185422,0.149106,0.28292,0.213088,0.188204,0.187384,0.145926,0.193675,0.070836,0.131315,0.158429,0.178621,0.188748,0.190167,0.176568,0.166666,0.182659,0.173885,0.191808,0.184578,0.184157,0.199801
L18 Packaging Line,0.08824,-0.055022,-0.009504,-0.043839,0.122455,0.010762,-0.000284,-0.033991,-0.032848,0.092181,-0.049611,0.073659,-0.046826,-0.08163,0.097643,0.038906,0.034836,0.044526,0.05067,0.063867,0.064961,0.016095,0.032122,0.040168,0.038149,0.051868,0.031486,0.053623,0.208273,0.265337,0.226192,0.239612,0.141736,0.143958,0.093607,0.196155,0.187603,0.143274,0.098353,0.083866,0.090875,0.18562,0.134271,0.136087,0.126963,0.138506,0.147048,0.148664,0.177399,0.162827,0.16065,0.158194,0.161486,0.172588,0.169953,0.183051
L25 Packaging Line,0.042176,-0.046751,-0.159143,0.105899,0.143051,-0.080778,-0.032293,0.022129,0.053744,-0.043539,-0.020987,0.017593,-0.00801,0.03662,0.097643,0.033058,0.107939,0.088484,0.104036,0.105016,0.06164,0.049343,0.031859,0.031859,0.028297,0.027422,0.009343,0.019916,0.206618,0.211318,0.152292,0.133694,0.15241,0.107606,0.089513,0.209447,0.250883,0.126578,0.224615,0.176019,0.201629,0.192098,0.132655,0.163907,0.258599,0.273209,0.299018,0.335569,0.320791,0.301482,0.273366,0.26526,0.233708,0.246058,0.276305,0.252406
LINE 01 - UHLMANN 1880,,,,,-0.427854,0.0,0.372515,0.068707,-0.056047,-0.070367,0.203893,-0.15016,0.198111,-0.282615,,,,,0.097643,0.0,0.215072,0.236574,0.202037,0.051869,0.086351,0.086634,0.119715,0.131077,,,,,0.0,0.0,0.337516,0.203047,0.281732,0.31205,0.117504,0.080061,0.233553,0.409397,,,,,0.0,0.0,0.00552,0.014179,0.0203,0.071758,0.073951,0.069827,0.068238,0.040405
M18 Filling,-0.397398,0.416897,-0.076297,-0.034303,0.177917,0.032359,0.136212,-0.698908,0.191951,0.345093,0.101833,0.023002,-0.041439,-0.62044,0.097643,0.294791,0.221973,0.183226,0.079604,0.103957,0.142578,0.307459,0.323388,0.318333,0.297577,0.217207,0.054387,0.320648,0.114818,0.240339,0.069673,0.485988,0.311511,0.227898,0.108069,0.0,0.080336,0.216673,0.222576,0.200207,0.198006,0.0,0.789781,0.442593,0.425201,0.327211,0.137063,0.122545,0.024941,0.26663,0.41598,0.416194,0.426343,0.185797,0.039446,0.2799


In [427]:
all_sectors.pivot(index='Line', columns='Review_Date', values='lap_time')

Review_Date,2021-04-15,2021-05-13,2021-06-10,2021-07-08
Line,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
AL5 Packaging 1,89.248972,89.725754,90.495004,89.559946
AL6,89.137919,90.323044,84.205598,89.849765
C2 Packaging Line,88.952755,90.451874,90.261138,90.594185
C9 Packaging Line,89.021344,89.747841,90.361488,90.130804
GAMMA1,89.153635,69.325767,68.226756,89.979194
IMA C80/2,88.910325,69.256522,75.232693,89.680834
L18 Packaging Line,88.847371,74.226704,89.407996,89.427519
L25 Packaging Line,88.849848,74.908842,90.011979,89.862601
LINE 01 - UHLMANN 1880,,88.52557,89.636847,89.547483
M18 Filling,89.960538,82.596174,85.446232,90.979137


Add a weekly running `race_time` to `df_weekly`, used only for the gapminder presentation.

In [44]:
weekly_race_time = df_weekly.groupby(['Line','Date'])['lap_time'].sum().groupby('Line').cumsum().reset_index()
weekly_race_time.rename(columns={'lap_time' : 'race_time'}, inplace=True)
weekly_race_time['race_time'] = weekly_race_time['race_time'] + 88
df_weekly = df_weekly.merge(weekly_race_time[['Line','Date','race_time']], on=(['Line','Date']))
# df_weekly.drop(columns={'race_time'}, inplace=True)
df_weekly

Unnamed: 0,Week,Line,OEE %,WeekOfYear,Year,Date,Plant,OEE Start point,OEE% Target (2022),Unplanned_tech_loss,...,sector_2,sector_3,sector_4,sector_5,sector_6,sector_7,sector_8,sector_9,lap_time,race_time
0,W13-2021,C2 Packaging Line,0.458414,13.0,2021.0,2021-04-04,Maisons-Alfort,0.397503,0.470,0.465009,...,0.097643,0.465009,0.044140,0.0,0.0,0.0,0.0,0.0,0.545881,88.545881
1,W13-2021,L18 Packaging Line,0.465923,13.0,2021.0,2021-04-04,Tours,0.377683,0.547,0.208273,...,0.097643,0.208273,0.134271,0.0,0.0,0.0,0.0,0.0,0.351947,88.351947
2,W13-2021,IMA C80/2,0.581304,13.0,2021.0,2021-04-04,SCOPPITO,0.451031,0.580,0.224561,...,0.097643,0.224561,0.070836,0.0,0.0,0.0,0.0,0.0,0.262768,88.262768
3,W13-2021,AL6,0.367897,13.0,2021.0,2021-04-04,Frankfurt,0.332657,0.450,0.342851,...,0.097643,0.342851,0.184958,0.0,0.0,0.0,0.0,0.0,0.590211,88.590211
4,W13-2021,TR200 Packaging Line,0.596432,13.0,2021.0,2021-04-04,Lisieux,0.483505,0.650,0.163573,...,0.097643,0.163573,0.075189,0.0,0.0,0.0,0.0,0.0,0.223478,88.223478
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
197,W26-2021,GAMMA1,0.491885,26.0,2021.0,2021-07-04,SCOPPITO,0.418683,0.570,0.143339,...,0.055086,0.143339,0.219199,-0.0,-0.0,-0.0,-0.0,-0.0,0.528796,52.685280
198,W26-2021,C9 Packaging Line,0.406623,26.0,2021.0,2021-07-04,Maisons-Alfort,0.528518,0.530,0.452948,...,0.081720,0.452948,0.064467,-0.0,-0.0,-0.0,-0.0,-0.0,0.681253,95.261404
199,W26-2021,L18 Packaging Line,0.411426,26.0,2021.0,2021-07-04,Tours,0.377683,0.547,0.185620,...,0.053623,0.185620,0.183051,-0.0,-0.0,-0.0,-0.0,-0.0,0.503925,77.907326
200,W26-2021,L25 Packaging Line,0.381273,26.0,2021.0,2021-07-04,Tours,0.351564,0.478,0.192098,...,0.019916,0.192098,0.252406,-0.0,-0.0,-0.0,-0.0,-0.0,0.427800,79.633196
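
The running total above can be sketched on toy data (line names and values are invented). The weekly lap times are summed per line and date, accumulated within each line, and the 88 second base is added once to the running total.

```python
import pandas as pd

# Toy weekly lap times for two lines over two weeks (values invented)
toy_weekly = pd.DataFrame({
    "Line": ["A", "B", "A", "B"],
    "Date": pd.to_datetime(["2021-04-04", "2021-04-04", "2021-04-11", "2021-04-11"]),
    "lap_time": [0.5, 0.25, 0.25, 0.5],
})

# Sum per (Line, Date), then accumulate week by week within each line
toy_race = (
    toy_weekly.groupby(["Line", "Date"])["lap_time"].sum()
    .groupby(level="Line").cumsum()
    .reset_index()
    .rename(columns={"lap_time": "race_time"})
)
# The base is added once to the running total, not once per week
toy_race["race_time"] += 88
print(toy_race)
```

Since the base is added only once here but once per lap in `df_monthly`, this weekly `race_time` is not directly comparable with the `race_time` column on the leader board.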


In [46]:
gapminder_weekly = df_weekly[['Line','Date','sector_1','sector_2','sector_3','sector_4','lap_time','race_time']]
gapminder_weekly.to_excel(dir +'gapfinder_weekly.xlsx', index=False)

In [47]:
df_weekly[['Line','Date','race_time']][df_weekly.Line.str.contains('TR200')].sort_values(['Line','Date'])

Unnamed: 0,Line,Date,race_time
4,TR200 Packaging Line,2021-04-04,88.223478
15,TR200 Packaging Line,2021-04-11,88.710407
26,TR200 Packaging Line,2021-04-18,88.976982
48,TR200 Packaging Line,2021-04-25,89.573466
55,TR200 Packaging Line,2021-05-02,90.001912
69,TR200 Packaging Line,2021-05-09,90.301571
96,TR200 Packaging Line,2021-05-16,90.69446
106,TR200 Packaging Line,2021-05-23,90.993436
118,TR200 Packaging Line,2021-05-30,91.520976
137,TR200 Packaging Line,2021-06-06,91.683513
