# AL6 Alarm Data Analysis

This notebook imports the individual alarm files and creates one standardised dataframe. It then pulls in the alarms that Sanofi has identified as 'significant' so that we can focus on those specifically.

The output from this notebook is:

- alarms.csv - all the significant alarms merged together
- devicestau_jams.csv - label machine jams
- devicemangel_shortage.csv - label machine shortages

The device* alarm files are based on information from Sanofi on what causes the Labeler machine state of 'suspend'.


The order of the AL6 Line:

Each product is first ASSEMBLED, then LABELED, then PACKAGED and placed into a CARTON. Finally, the cartons are PALLETIZED.   
It goes from Assembly (Harro Hoefliger) to Labeler (Krones) to Packaging (Schubert Verpacker) to Cartoner (Pester Umverpacker) to Palletizer (Pester).




|IP_TAG Name   |Machine|German|Desired Speed|Alarm File location|
|--------------|-------|------|-------------|-------------------| 
|36630901_SPEED|Assembly|Montage|32 used but 35|Y:\E00_Solostar\E6_Assembly_Line_6\E63_Montage\CSV|   
|36640901_SPEED|Labeler|Etikettierer|500|Y:\E00_Solostar\E6_Assembly_Line_6\E64_Etikettierer\AuditTrail|   
|36650901_SPEED|Packaging (Pacemaker!)|Kartonierer|450 (3pcs) or 500 (5pcs)|Y:\E00_Solostar\E6_Assembly_Line_6\E65_Kartonierer\40 - Reports|   
|36680901_SPEED|Cartoner|Endverpacker (UVP)||Y:\E00_Solostar\E6_Assembly_Line_6\E68_Endverpacker\UVP|   
|36680902_SPEED|Palletizer|Endverpacker (PAL)||Y:\E00_Solostar\E6_Assembly_Line_6\E68_Endverpacker\PAL|       



In [1]:
import pandas as pd
import numpy as np
import datetime
import os
import xlrd
import re

# this needs a settings.cfg file in the same directory
# my modules
import set_config
from common_functions import create_shift_category
from common_functions import create_df_from_file

# call set_config
dir_sanofi_share = set_config.ConfigSectionMap("SectionOne")['sanofi']
dir_local = set_config.ConfigSectionMap("SectionOne")['local']
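
The `set_config` module itself is not shown in this notebook. As a rough sketch only, a helper of this kind could be built on the standard `configparser` module, assuming `settings.cfg` is an INI file with a `SectionOne` section holding `sanofi` and `local` keys. The function name `config_section_map`, its body and the placeholder directories below are assumptions for illustration, not the project's actual code.

```python
import configparser
import os
import tempfile


def config_section_map(section, cfg_path):
    """Return the options of one INI section as a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it parsed; an empty list means
    # the settings file could not be found.
    if not parser.read(cfg_path, encoding="utf-8"):
        raise FileNotFoundError(cfg_path)
    return dict(parser[section])


# Example settings file; both directory values are placeholders.
example_cfg = "[SectionOne]\nsanofi = /data/sanofi_share\nlocal = /data/local\n"

with tempfile.TemporaryDirectory() as tmp:
    cfg_file = os.path.join(tmp, "settings.cfg")
    with open(cfg_file, "w", encoding="utf-8") as fh:
        fh.write(example_cfg)
    settings = config_section_map("SectionOne", cfg_file)

print(settings["sanofi"], settings["local"])
```

Keeping the share and local paths in a config file means the notebook can run on different machines without editing the code cells.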


## AL6_Endverpacker - Cartoner and Palletizer

The Endverpacker folder contains both PAL (Palletizer) and UVP (Cartoner) alarm files, and we want to load both.

Dir: Y:\E00_Solostar\E6_Assembly_Line_6\E68_Endverpacker\PAL    
Dir: Y:\E00_Solostar\E6_Assembly_Line_6\E68_Endverpacker\UVP

These files are converted from .VAA files (Raza has a VBA script to convert them) into csv and then we can read them:

```

ID	State	StateID	StateText	TimeIn	TimeUserAck	TimePLCAck	MessageID	MessageText	Occur	GroupID	GroupText	Priority	Parameter1	Parameter2	Group	Class	HistoricalID	HasNotes	SortBuffer	TimeInUTC	User	Machine
579	0	$2524	MGG	31/07/2021 23:57	30/12/1899 00:00:00	01/08/2021 00:30	$9051	Maschine wartet auf Produkte	1	$8879	Warnung Maschine	0	$7326	12	42	5	1	0	Alarmhistory_1-8-21--12-0	31/07/2021 21:57		FRAM28556
579	4	$2522	MGK	01/08/2021 00:30	30/12/1899 00:00:00	30/12/1899 00:00:00	$9051	Maschine wartet auf Produkte	1	$8879	Warnung Maschine	0	$7326	12	42	5	2	0	Alarmhistory_1-8-21--12-0	31/07/2021 22:30		FRAM28556
579	0	$2524	MGG	01/08/2021 00:30	30/12/1899 00:00:00	01/08/2021 00:31	$9051	Maschine wartet auf Produkte	1	$8879	Warnung Maschine	0	$7326	12	42	5	3	0	Alarmhistory_1-8-21--12-0	31/07/2021 22:30		FRAM28556
828	4	$2522	MGK	01/08/2021 00:32	30/12/1899 00:00:00	30/12/1899 00:00:00	$9300	Roboter: Palette voll	1	$8896	Warnung Roboter: Kartonhandling	0	$8323	10004	59	5	4	0	Alarmhistory_1-8-21--12-0	31/07/2021 22:32		FRAM28556

```

In [2]:
# PAL alarms - Palletizer

folder = 'Alarms_data'
subfolder = 'AL6_Endverpacker'
subsubfolder = 'PAL'
path = os.path.join(dir_sanofi_share, folder, subfolder, subsubfolder)

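Files = []
frames = []


for filename in os.listdir( path ):
    if filename.endswith('.csv'):
        Files.append(filename)

Files.sort(key=str.lower)

# Loop through all the files and collect one dataframe per file.
for filename in Files:
    filepath = os.path.join(path, filename)
    # print (path)
    # 'mbcs' is the Windows ANSI code page, so this only runs on Windows
    df = pd.read_csv(filepath, encoding='mbcs')
    df['Filename'] = filename
    frames.append(df)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
PAL_df = pd.concat(frames) if frames else pd.DataFrame()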


# C:\Users\mark_\Sanofi\Sanofi x McLaren sharing - General\Frankfurt sprint\SFD\Alarms_data\AL6_Endverpacker
PAL_df['Machine'] = 'Palletizer'

In [3]:
# UVP alarms - Cartoner

folder = 'Alarms_data'
subfolder = 'AL6_Endverpacker'
subsubfolder = 'UVP'
path = os.path.join(dir_sanofi_share, folder, subfolder, subsubfolder)

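Files = []
frames = []


for filename in os.listdir( path ):
    if filename.endswith('.csv'):
        Files.append(filename)

Files.sort(key=str.lower)

# Loop through all the files and collect one dataframe per file.
for filename in Files:
    filepath = os.path.join(path, filename)
    # print (path)
    # 'mbcs' is the Windows ANSI code page, so this only runs on Windows
    df = pd.read_csv(filepath, encoding='mbcs')
    df['Filename'] = filename
    frames.append(df)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
UVP_df = pd.concat(frames) if frames else pd.DataFrame()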


# C:\Users\mark_\Sanofi\Sanofi x McLaren sharing - General\Frankfurt sprint\SFD\Alarms_data\AL6_Endverpacker
UVP_df['Machine'] = 'Cartoner'
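
The PAL and UVP cells repeat the same list, sort, read and tag steps. As a sketch, those steps could be folded into one helper. `load_alarm_folder` and its parameters are hypothetical names introduced here, and the two demonstration files are made up. Unlike the cells above, the sketch matches the file suffix case-insensitively and resets the row index.

```python
import os
import tempfile

import pandas as pd


def load_alarm_folder(path, machine, suffix=".csv", **read_csv_kwargs):
    """Read every matching file in `path` into one DataFrame (hypothetical helper)."""
    filenames = sorted(
        (f for f in os.listdir(path) if f.lower().endswith(suffix)),
        key=str.lower,
    )
    frames = []
    for filename in filenames:
        frame = pd.read_csv(os.path.join(path, filename), **read_csv_kwargs)
        frame["Filename"] = filename  # keep track of the source file
        frames.append(frame)
    if not frames:
        return pd.DataFrame()
    combined = pd.concat(frames, ignore_index=True)
    combined["Machine"] = machine
    return combined


# Tiny demonstration with two made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("b.csv", "ID,MessageText\n2,beta\n"),
                       ("A.csv", "ID,MessageText\n1,alpha\n")]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write(text)
    demo_df = load_alarm_folder(tmp, machine="Palletizer")

print(demo_df)
```

Extra keyword arguments such as `sep=';'` or `encoding='cp1252'` are passed straight through to `pd.read_csv`, so the same helper could in principle serve the other machines as well.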

## AL6_Etikettierer - Labeler

Dir: Y:\E00_Solostar\E6_Assembly_Line_6\E64_Etikettierer\AuditTrail

There are PDF files and AuditTrail csv files, which hold the same information.
- status 0 = start of alarm
- status 3 = end of alarm

The problem is that alarm IDs are reused repeatedly, and the same ID can show several status 3 rows without a status 0 row that initiates the alarm.

The csv files look like this:

```
Nummer;Projekt;Status;Prio;Quali;Datum;Zeit;UTC Versatz;Meldetext;ID;BMK;Parameterwert;Einheit;Alt/Neu-Wert;Benutzer;Auftrag;Charge
00000001;BAS;0;4;L;15.08.2021;22:00:00;+02:00;01-01 Solostar PEN;;TYPE;0001;;N;Bediener;80758594;1F7908A

```

Approach:
- read all the csv files in from the folder   
- derive Start by combining the Datum and Zeit columns
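
One way to cope with reused IDs and unmatched status 3 rows is to walk the events in time order and remember the open start per ID. The following is a minimal sketch of that idea, not the method used later in this notebook. The function name and the tuple layout are assumptions; the first four events are taken from the Labeler data shown further down, and the last one is made up to show an orphaned end.

```python
from datetime import datetime


def pair_alarm_events(events):
    """Pair each status-0 event with the next status-3 event of the same ID.

    `events` is an iterable of (timestamp, alarm_id, status) tuples.
    Status-3 events with no open start are returned separately as orphans.
    """
    open_starts = {}  # alarm_id -> timestamp of the currently open alarm
    pairs, orphans = [], []
    for timestamp, alarm_id, status in sorted(events):
        if status == 0:
            # Assumption: a repeated start replaces the earlier open start.
            open_starts[alarm_id] = timestamp
        elif status == 3:
            start = open_starts.pop(alarm_id, None)
            if start is None:
                orphans.append((timestamp, alarm_id))
            else:
                seconds = (timestamp - start).total_seconds()
                pairs.append((alarm_id, start, timestamp, seconds))
    return pairs, orphans


fmt = "%d.%m.%Y %H:%M:%S"
events = [
    (datetime.strptime("06.01.2021 01:12:26", fmt), 366, 0),
    (datetime.strptime("06.01.2021 01:12:35", fmt), 366, 3),
    (datetime.strptime("06.01.2021 01:13:19", fmt), 364, 0),
    (datetime.strptime("06.01.2021 01:14:43", fmt), 364, 3),
    (datetime.strptime("06.01.2021 01:16:00", fmt), 364, 3),  # made-up orphan end
]
pairs, orphans = pair_alarm_events(events)
print(pairs)
print(orphans)
```

Collecting the orphans separately makes it possible to count how many end events have no matching start, instead of dropping them silently.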


In [4]:
Files = []
frames = []

folder = 'Alarms_data'
subfolder = 'AL6_Etikettierer'
path = os.path.join(dir_sanofi_share, folder, subfolder)

for filename in os.listdir( path ):
    if filename.endswith('.csv'):
        Files.append(filename)

Files.sort(key=str.lower)

# Loop through all the files and collect one dataframe per file.
for filename in Files:
    filepath = os.path.join(path, filename)
    # print (path)
    df = pd.read_csv(filepath, sep=';', encoding='utf-16')
    df['Filename'] = filename
    frames.append(df)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
Etikettierer_df = pd.concat(frames) if frames else pd.DataFrame()

Etikettierer_df['Machine'] = 'Labeler'

# convert dates to datetime format
Etikettierer_df['Start'] = pd.to_datetime(Etikettierer_df['Datum'] + "."+ Etikettierer_df['Zeit'], format='%d.%m.%Y.%H:%M:%S')

# drop the rows with NaN values in ID - thought it was causing problems when trying to merge on ID, can't calc duration for them anyway, and I don't think they are relevant alarms
Etikettierer_df = Etikettierer_df[Etikettierer_df['ID'].notnull()]


## AL6_Montage - Assembly

Dir: Y:\E00_Solostar\E6_Assembly_Line_6\E63_Montage\CSV

- Saved as Text files but standard 'csv' files separated by semi-colon   
- Selecting just the files that begin with 'A' as they appear to be the warnings and alarms.    
- They don't have a header record   
- Don't know what the 'C*' files are   

``` 

Warnung;2021-08-05 02:00:09;2021-08-05 02:00:10;1342;1091-1B14 Standby: Teil nicht auf Abholposition Spur 14;0..Flt[1342];
Warnung;2021-08-05 02:00:09;2021-08-05 02:00:10;1343;1091-1B15 Standby: Teil nicht auf Abholposition Spur 15;0..Flt[1343];
Warnung;2021-08-05 02:00:23;2021-08-05 02:03:48;1198;1462-7B1 Standby: Max. Stau Abführband erreicht;0..Flt[1198];
Warnung;2021-08-05 02:04:20;2021-08-05 02:04:33;1198;1462-7B1 Standby: Max. Stau Abführband erreicht;0..Flt[1198];

```
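
As a quick check of the layout, the sketch below parses two of the sample rows with `pandas`. The column names are assumptions, since the files carry no header. Each line ends with a trailing `;`, which produces an empty seventh column. In the sample rows the fourth field matches the number inside `Flt[...]`, so it looks like a fault code rather than a duration; this is an observation from the sample only.

```python
import io

import pandas as pd

# Two rows copied from the sample above; note the trailing ';' on every line.
sample = (
    "Warnung;2021-08-05 02:00:23;2021-08-05 02:03:48;1198;"
    "1462-7B1 Standby: Max. Stau Abführband erreicht;0..Flt[1198];\n"
    "Warnung;2021-08-05 02:04:20;2021-08-05 02:04:33;1198;"
    "1462-7B1 Standby: Max. Stau Abführband erreicht;0..Flt[1198];\n"
)

# Column names are assumptions; the files themselves have no header record.
columns = ["Type", "Start", "End", "Code", "Message Text", "Fault Ref", "Trailing"]

montage_sample = pd.read_csv(
    io.StringIO(sample), sep=";", header=None, names=columns,
    parse_dates=["Start", "End"],
)
# Elapsed time between the start and end timestamps, in seconds.
montage_sample["Elapsed (s)"] = (
    montage_sample["End"] - montage_sample["Start"]
).dt.total_seconds()
print(montage_sample[["Start", "End", "Code", "Elapsed (s)"]])
```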

In [5]:
subfolder = 'AL6_Montage'

Files = []
frames = []
path = os.path.join(dir_sanofi_share, folder, subfolder)

for filename in os.listdir( path ):
    if filename.startswith('A'):
        filepath = os.path.join(path, filename)
        # skip empty files, which read_csv cannot parse
        if os.path.getsize( filepath ) > 0:
            Files.append(filename)

Files.sort(key=str.lower)

# Loop through all the files and collect one dataframe per file.
for filename in Files:
    filepath = os.path.join(path, filename)
    # print (path)
    df = pd.read_csv(filepath, sep=';', encoding='cp1252', header=None)
    df['Filename'] = filename
    frames.append(df)

# DataFrame.append was removed in pandas 2.0; concatenate once instead
Montage_df = pd.concat(frames) if frames else pd.DataFrame()

Montage_df.columns=['Type','Start','End','Duration','Message Text','Some Code','Not sure','Filename']
Montage_df['Machine'] = 'Assembly'

## AL6_Kartonierer - Packaging

Dir: Y:\E00_Solostar\E6_Assembly_Line_6\E65_Kartonierer\40 - Reports

- Saved as PDF and xlsx files in a report format - warnings and alarms are written to sections if they occurred in that reporting period   
- Selecting just the xlsx files to work with.    
- iterate over each file, find the 'Alarme' heading and append each following row to the dataframe until we reach the next blank row
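
The scan described in the last bullet can be sketched on plain Python rows, without reading a real report. The helper names are hypothetical, the rows mimic the report layout, and the final 'Warnungen' row is made up to show that later sections are ignored. Each timestamp cell holds the time, a newline and then the date, so it is reordered before parsing.

```python
from datetime import datetime


def extract_alarm_rows(rows):
    """Collect rows between an 'Alarme' heading and the next blank row."""
    in_alarms = False
    collected = []
    for row in rows:
        first = str(row[0])
        if first == "":
            in_alarms = False   # a blank row closes the section
        elif "Alarme" in first:
            in_alarms = True    # the heading row opens the section
        elif in_alarms and first != "Meldungstext":
            collected.append(row)
    return collected


def parse_report_timestamp(cell):
    """Parse a cell holding a time, a newline and a date, e.g. 11:22:47 / 28.08.2021."""
    time_part, date_part = cell.split("\n")
    return datetime.strptime(f"{date_part} {time_part}", "%d.%m.%Y %H:%M:%S")


rows = [
    ("", "", ""),
    ("Alarme", "", ""),
    ("Meldungstext", "Beginn", "Ende"),
    (" =A+52-AP-U1# AS-i Fehler", "11:22:47\n28.08.2021", "11:23:22\n28.08.2021"),
    ("Druckluft überprüfen (Druck zu gering) (=A+05-SP48)",
     "11:23:28\n28.08.2021", "11:23:40\n28.08.2021"),
    ("", "", ""),
    ("Warnungen", "", ""),  # made-up later section that must be ignored
]

alarms = [
    (text.strip(), parse_report_timestamp(start), parse_report_timestamp(end))
    for text, start, end in extract_alarm_rows(rows)
]
for text, start, end in alarms:
    print(text, (end - start).total_seconds())
```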


``` 

					
Alarme					
Meldungstext	Beginn	Ende			
Waage: Ausdrucke können nicht mehr lokal gespeichert werden. Bitte legen Sie ein Speichermedium ein	"10:31:09
28.08.2021"	"11:24:19
28.08.2021"			
 =A+52-AP-U1# AS-i Fehler	"11:22:47
28.08.2021"	"11:23:22
28.08.2021"			
Druckluft überprüfen (Druck zu gering) (=A+05-SP48)	"11:23:28
28.08.2021"	"11:23:40
28.08.2021"			
```


In [6]:
subfolder = 'AL6_Kartonierer'

records = []

# Loop through all the files.

path = os.path.join (dir_sanofi_share, folder, subfolder)
for filename in os.listdir( path ):
    # only the xlsx reports are read; skip the PDF copies
    if not filename.lower().endswith('.xlsx'):
        continue

    filepath = os.path.join(path, filename)
    df = pd.read_excel(filepath, na_filter=False)
    # df.set_index('Unnamed: 0', inplace=True)

    # reset for every file so that the flag is always defined and a section
    # left open in one report cannot leak into the next
    alarm = False

    for i, row in df.iterrows():
        first_cell = row.iloc[0]
        if 'Alarme' in str(first_cell):
            alarm = True
        if first_cell == '':
            alarm = False
        if alarm:
            records.append({'Message Text': first_cell,
                            'Start': row.iloc[1],
                            'End': row.iloc[2],
                            'Filename': filename})

# build the dataframe once; DataFrame.append was removed in pandas 2.0
Kartonierer_df = pd.DataFrame(records)


Kartonierer_df = Kartonierer_df[Kartonierer_df['Message Text'].str.contains('Alarme|Meldu') == False]
# Kartonierer_alarms['Start'].replace('\n',' ', inplace=True)
Kartonierer_df['Start'] = Kartonierer_df['Start'].str[9:21] + ' ' + Kartonierer_df['Start'].str[0:8]
Kartonierer_df['End'] = Kartonierer_df['End'].str[9:21] + ' ' + Kartonierer_df['End'].str[0:8]

Kartonierer_df['Machine'] = 'Packaging'
Kartonierer_df.head()


Unnamed: 0,End,Filename,Message Text,Start,Machine
2,30.03.2021 13:44:44,2691_20210401_114604_AutomaticBatchFinalReport...,Seidenader nicht bereit,30.03.2021 13:44:15,Packaging
3,30.03.2021 13:47:26,2691_20210401_114604_AutomaticBatchFinalReport...,"F4_403# Aufnahmefehler (=E+54-29SV1, =E+54-29SV3)",30.03.2021 13:47:15,Packaging
4,30.03.2021 14:03:54,2691_20210401_114604_AutomaticBatchFinalReport...,Druckluft überprüfen (Druck zu gering) (=A+05-...,30.03.2021 14:03:48,Packaging
5,30.03.2021 14:05:44,2691_20210401_114604_AutomaticBatchFinalReport...,Druckluft überprüfen (Druck zu gering) (=A+05-...,30.03.2021 14:05:33,Packaging
6,30.03.2021 14:08:16,2691_20210401_114604_AutomaticBatchFinalReport...,Druckluft überprüfen (Druck zu gering) (=A+05-...,30.03.2021 14:08:10,Packaging


Calculate the alarm duration and create a common duration column

In [7]:
from datetime import datetime

#Calculate duration in seconds and add as a column to Endverpacker PAL and UVP. Rows without a duration are recorded as 'N/A'
# PAL
a=pd.to_datetime(PAL_df['TimePLCAck'], dayfirst=True)
b=pd.to_datetime(PAL_df['TimeIn'], dayfirst=True)
c=a-b
d=c.dt.total_seconds()
e=d.where(d >0, 'N/A')
PAL_df['Duration']=e

#Convert start time columns to datetime format
PAL_df['TimeIn']=pd.to_datetime(PAL_df['TimeIn'], dayfirst = True)
#Give a universal column name for Start Time (Start)
PAL_df=PAL_df.rename(columns={"TimeIn" : "Start"})


#Etikettierer does not include durations. A column has still been created for merging, filled with 'N/A' values.
Etikettierer_df['Duration']='N/A'

#Calculate duration in seconds and add as a column to Montage. Rows without a duration are recorded as 'N/A'
a=pd.to_datetime(Montage_df['End'])
b=pd.to_datetime(Montage_df['Start'])
c=a-b
d=c.dt.total_seconds()
e=d.where(d >0, 'N/A')
Montage_df['Duration']=e


# UVP
a=pd.to_datetime(UVP_df['TimePLCAck'], dayfirst=True)
b=pd.to_datetime(UVP_df['TimeIn'], dayfirst=True)
c=a-b
d=c.dt.total_seconds()
e=d.where(d >0, 'N/A')
UVP_df['Duration']=e

#Convert start time columns to datetime format
UVP_df['TimeIn']=pd.to_datetime(UVP_df['TimeIn'], dayfirst = True)
#Give a universal column name for Start Time (Start)
UVP_df=UVP_df.rename(columns={"TimeIn" : "Start"})





f=Etikettierer_df['Datum'] + "."+ Etikettierer_df['Zeit']
g=pd.to_datetime(f,format='%d.%m.%Y.%H:%M:%S')
Etikettierer_df['Start']=g

Montage_df['Start']=pd.to_datetime(Montage_df['Start'])


#Etikettierer_df=Etikettierer_df.rename(columns={"Zeit" : "Start"})

Kartonierer_df['Start'] = pd.to_datetime(Kartonierer_df['Start'], dayfirst=True)
# some end dates are missing so avoid the error with errors='coerce'
Kartonierer_df['End'] = pd.to_datetime(Kartonierer_df['End'], dayfirst=True, errors='coerce')
c = Kartonierer_df['End'] - Kartonierer_df['Start']
d = c.dt.total_seconds()
Kartonierer_df['Duration'] = d
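
Storing the string 'N/A' next to numbers turns `Duration` into an object column, which makes later arithmetic awkward. As an alternative sketch, missing durations can be kept as `NaN` so the column stays numeric. The rows below are taken from the PAL sample; treating `30/12/1899 00:00:00` as a "not acknowledged" placeholder is an assumption based on those rows.

```python
import pandas as pd

sample = pd.DataFrame({
    "TimeIn": ["31/07/2021 23:57", "01/08/2021 00:30", "01/08/2021 00:30"],
    "TimePLCAck": ["01/08/2021 00:30", "30/12/1899 00:00:00", "01/08/2021 00:31"],
})

# Assumption: this value is a placeholder for "never acknowledged".
SENTINEL = "30/12/1899 00:00:00"

start = pd.to_datetime(sample["TimeIn"], format="%d/%m/%Y %H:%M")
# Blank out the placeholder first, then parse; unparseable values become NaT.
ack_text = sample["TimePLCAck"].where(sample["TimePLCAck"] != SENTINEL)
ack = pd.to_datetime(ack_text, format="%d/%m/%Y %H:%M", errors="coerce")

duration = (ack - start).dt.total_seconds()
# Keep only positive durations; everything else stays NaN.
sample["Duration"] = duration.where(duration > 0)
print(sample)
```

With a float column, aggregations such as `sum()` or `mean()` skip the missing values without any extra filtering.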


Create a universal column name for Message Text

In [8]:
PAL_df=PAL_df.rename(columns={"MessageText" : "Message Text"})
UVP_df=UVP_df.rename(columns={"MessageText" : "Message Text"})
Etikettierer_df=Etikettierer_df.rename(columns={"Meldetext" : "Message Text"})


Create and populate master alarms dataframe, keeping the useful data

In [9]:
Alarms_df=pd.DataFrame(columns=['Message Text','Start','Duration','Filename','Machine'])
Alarms_df=pd.concat([Alarms_df, Etikettierer_df, PAL_df, UVP_df, Montage_df, Kartonierer_df], join="inner")
Alarms_df=Alarms_df.reset_index()
Alarms_df.drop_duplicates(inplace=True)

A list of unique alarm messages has been extracted, translated and saved as a CSV file.  Read this in and translate the German message to English.
(The translations file is produced by Raza)

In [10]:
filename = 'translations.csv'
folder='IP21'
path = os.path.join(dir_local, folder, filename)
df = pd.read_csv(path ,encoding='UTF-8', header=None, index_col=0)
# df = pd.read_csv(r'C:\Users\Raza-PC\Documents\McLaren\Sanofi\Sanofi\translations.csv',encoding='UTF-8',header=None,index_col=0)
translation=df.to_dict()
# Alarms_df['Message Text'] = Alarms_df['Message Text'].replace(translation[1])
Alarms_df['Message Text (English)'] = Alarms_df['Message Text'].replace(translation[1])
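
A dictionary lookup like this leaves any message without a translation untouched, so gaps in the translations file are easy to miss. The sketch below, on a made-up two-entry table, uses `Series.map` to find the untranslated messages and then falls back to the German text. It is an illustration of the idea rather than a replacement for the cell above.

```python
import pandas as pd

# Made-up two-entry translation table standing in for translations.csv.
translation = {
    "Die Etikettenrolle ist leer": "The label roll is empty",
    "Seidenader nicht bereit": "Seidenader not ready",
}
messages = pd.Series([
    "Seidenader nicht bereit",
    "Die Etikettenrolle ist leer",
    "Roboter: Palette voll",  # deliberately missing from the table
])

translated = messages.map(translation)            # NaN where no translation exists
untranslated = messages[translated.isna()].unique()
english = translated.fillna(messages)             # fall back to the German text

print(untranslated)
print(english.tolist())
```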


In [11]:
Alarms_df['Message Text (English)'][Alarms_df['Machine'] == 'Packaging']

1725707                                 Seidenader not ready
1725708    F4_403 # recording error (= E + 54-29SV1, = E ...
1725709    Check compressed air (pressure too low) (= A +...
1725710    Check compressed air (pressure too low) (= A +...
1725711    Check compressed air (pressure too low) (= A +...
                                 ...                        
1803392    [Q] F2_206 # Recording error booklet B gripper...
1803393    [Q] F2_206 # Recording error booklet B gripper...
1803394    [Q] F2_206 # Recording error booklet B gripper...
1803395    [Q] VT_608 # Sensor error folding box adhesive...
1803396    [Q] F2_208 # Folding box 1.1 not correctly fil...
Name: Message Text (English), Length: 77690, dtype: object

In [12]:
# this gets rid of (drops) the 'index' col and moves the newly created 'Message Text (English)' to the first col
cols = Alarms_df.columns.tolist()
cols = cols[-1:] + cols[1:-1]
Alarms_df = Alarms_df[cols]

In [13]:
folder = 'Alarms_data'
filename = 'significant_alarm_messages.xlsx'
path = os.path.join(dir_sanofi_share, folder, filename)

sheets = []

for sheet_name in ['Etikettierer','Endverpacker','Montage','Kartonierer']:

    temp_df = pd.read_excel(path, sheet_name = sheet_name)
    sheets.append(temp_df)

# DataFrame.append was removed in pandas 2.0; concatenate the sheets once instead
Significant_alarms = pd.concat(sheets)

In [14]:
Significant_alarms.rename(columns={'Alarm Message (English)':'Message Text (English)'}, inplace=True)
Significant_alarms_merged = Alarms_df.merge(Significant_alarms[['Message Text (English)','Significant']], on='Message Text (English)', how='outer')
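
An outer merge keeps rows from both sides, including significant messages that never occur in the alarm data. Those rows have no `Machine`, `Start` or `Filename`. A small sketch on made-up frames shows how `indicator=True` exposes where each row came from, and how a left merge would keep only rows that exist in the alarm data.

```python
import pandas as pd

alarms = pd.DataFrame({
    "Message Text (English)": ["The label roll is empty", "Seidenader not ready",
                               "The label roll is empty"],
    "Machine": ["Labeler", "Packaging", "Labeler"],
})
significant = pd.DataFrame({
    "Message Text (English)": ["The label roll is empty", "Never raised alarm"],
    "Significant": ["Yes", "Yes"],
})

# The _merge column reports 'both', 'left_only' or 'right_only' for every row.
merged = alarms.merge(significant, on="Message Text (English)",
                      how="outer", indicator=True)
print(merged["_merge"].value_counts())

# A left merge keeps every alarm row and adds the flag where one exists.
left_merged = alarms.merge(significant, on="Message Text (English)", how="left")
print(len(left_merged))
```

Because `count()` only counts non-null values, a `Significant` count lower than the row count for a machine simply means that some of its alarms are not flagged as significant.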

In [15]:
Significant_alarms_merged.groupby(['Machine']).count()

Unnamed: 0_level_0,Message Text (English),Message Text,Start,Duration,Filename,Significant
Machine,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Assembly,859262,859262,859262,859262,859262,839597
Cartoner,280379,280379,280379,280379,280379,109461
Labeler,378737,378737,378737,378737,378737,364539
Packaging,77690,77690,77690,77669,77690,57191
Palletizer,207329,207329,207329,207329,207329,206216


In the next few cells I am trying to identify the start and end of Alarms for Etikettierer machine, which doesn't provide alarm duration.

In [154]:
# Significant_alarms_merged = Significant_alarms_merged[Significant_alarms_merged.duplicated()]
# Etikettierer_df[Etikettierer_df['Start'] == '2021-01-07 01:35:42']
Etikettierer_df.iloc[140:150]

Unnamed: 0,Nummer,Projekt,Status,Prio,Quali,Datum,Zeit,UTC Versatz,Message Text,ID,...,Parameterwert,Einheit,Alt/Neu-Wert,Benutzer,Auftrag,Charge,Filename,Machine,Start,Duration
141,142,MMA,0,2,L,06.01.2021,01:12:26,+01:00,Devicemangel im Einlauf,366.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:12:26,
142,143,MMA,3,2,L,06.01.2021,01:12:35,+01:00,Devicemangel im Einlauf,366.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:12:35,
143,144,MMA,0,3,L,06.01.2021,01:13:19,+01:00,Devicesperre wurde geschlossen durch: Transpor...,364.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:13:19,
144,145,MMA,3,3,L,06.01.2021,01:14:43,+01:00,Devicesperre wurde geschlossen durch: Transpor...,364.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:14:43,
145,146,MMA,0,1,Q,06.01.2021,01:15:02,+01:00,GMP: Plausibilitätsfehler Codekontrolle Aggreg...,4057.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:02,
146,147,MMA,0,1,Q,06.01.2021,01:15:02,+01:00,GMP: Plausibilitätsfehler Druckkontrolle Aggre...,4058.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:02,
147,148,MMA,0,3,L,06.01.2021,01:15:02,+01:00,Devicesperre wurde geschlossen durch: Transpor...,364.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:02,
148,149,AC1,0,2,L,06.01.2021,01:15:02,+01:00,Ein Etikett konnte nicht richtig gemessen werden.,1571.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:02,
149,150,MMA,3,1,Q,06.01.2021,01:15:07,+01:00,GMP: Plausibilitätsfehler Druckkontrolle Aggre...,4058.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:07,
150,151,AC1,3,2,L,06.01.2021,01:15:07,+01:00,Ein Etikett konnte nicht richtig gemessen werden.,1571.0,...,,,,Bediener,80742695,1F885A,AL6Audit Trail 000000 2021-01-06 K747B14.csv,Labeler,2021-01-06 01:15:07,


There are many duplicate rows in the merged data; remove them here.

In [None]:

# drop_duplicates removes the repeats; indexing with duplicated() would keep only the repeats
Significant_alarms_merged = Significant_alarms_merged.drop_duplicates(keep='last')
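
`DataFrame.duplicated` and `DataFrame.drop_duplicates` are easy to mix up: the first returns a boolean mask that marks repeated rows, the second returns the frame without them. A small sketch on made-up rows:

```python
import pandas as pd

demo = pd.DataFrame({
    "Message Text": ["A", "A", "B", "C", "C", "C"],
    "Start": ["01:00", "01:00", "02:00", "03:00", "03:00", "03:00"],
})

repeat_mask = demo.duplicated(keep="last")         # True for every copy except the last
only_repeats = demo[repeat_mask]                   # just the surplus copies
deduplicated = demo.drop_duplicates(keep="last")   # one row per distinct combination

print(repeat_mask.tolist())
print(deduplicated)
```

Indexing with the mask selects the surplus copies, while indexing with `~repeat_mask` gives the same rows as `drop_duplicates(keep="last")`.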

In [24]:

# write out the Significant_alarms_merged file to the local IP21 folder
folder='IP21'
filename = 'alarms.csv'
path = os.path.join(dir_local, folder, filename)
Significant_alarms_merged.to_csv(path)

In [25]:
Significant_alarms_merged[Significant_alarms_merged['Message Text (English)'].str.contains('The label roll is empty')]

Unnamed: 0,Message Text (English),Message Text,Start,Duration,Filename,Machine,Significant
338865,The label roll is empty,Die Etikettenrolle ist leer,2021-01-07 01:35:12,,AL6Audit Trail 000000 2021-01-07 K747B14.csv,AL6_Etikettierer,Yes
338866,The label roll is empty,Die Etikettenrolle ist leer,2021-01-07 01:35:25,,AL6Audit Trail 000000 2021-01-07 K747B14.csv,AL6_Etikettierer,Yes
338867,The label roll is empty,Die Etikettenrolle ist leer,2021-01-08 23:25:51,,AL6Audit Trail 000000 2021-01-09 K747B14.csv,AL6_Etikettierer,Yes
338868,The label roll is empty,Die Etikettenrolle ist leer,2021-01-08 23:26:35,,AL6Audit Trail 000000 2021-01-09 K747B14.csv,AL6_Etikettierer,Yes
338869,The label roll is empty,Die Etikettenrolle ist leer,2021-01-09 00:14:04,,AL6Audit Trail 000000 2021-01-09 K747B14.csv,AL6_Etikettierer,Yes
...,...,...,...,...,...,...,...
340574,The label roll is empty,Die Etikettenrolle ist leer,2021-06-24 19:18:24,,AL6Audit Trail 210001 2021-06-24 K747B14.csv,AL6_Etikettierer,Yes
340575,The label roll is empty,Die Etikettenrolle ist leer,2021-09-07 19:53:23,,AL6Audit Trail 210001 2021-09-07 K747B14.csv,AL6_Etikettierer,Yes
340576,The label roll is empty,Die Etikettenrolle ist leer,2021-09-07 19:59:34,,AL6Audit Trail 210001 2021-09-07 K747B14.csv,AL6_Etikettierer,Yes
340577,The label roll is empty,Die Etikettenrolle ist leer,2021-04-28 19:20:54,,AL6Audit Trail 211922 2021-04-28 K747B14.csv,AL6_Etikettierer,Yes


In [162]:
dates = Montage_df[(Montage_df.Start >= '2021-08-01 00:29:00') & (Montage_df.End <= '2021-08-02 00:00:00')]
dates[dates['Message Text'].str.contains('Devicemangel|Devicestau')].sort_values('Start')

Unnamed: 0,Type,Start,End,Duration,Message Text,Some Code,Not sure,Filename,Machine


In [182]:
Etikettierer_df.reset_index(inplace=True)

Devicemangel alarms - get status 0 and 3 rows only

In [206]:
Etikettierer_df[Etikettierer_df['Message Text'].str.contains('Devicemangel')]
# df_changeover3 = pd.DataFrame({'start':df_changeover2.IP_TREND_TIME.iloc[::2].values, 'end':df_changeover2.IP_TREND_TIME.iloc[1::2].values, 'time_diff_mins':df_changeover2.time_diff_mins.iloc[1::2].values})

Devicemangel = Etikettierer_df[(Etikettierer_df['Message Text'].str.contains('Devicemangel')) & (Etikettierer_df.Status.isin([0,3]))]

# Status = pd.DataFrame({'start':Etikettierer_df.Start.iloc[::2].values, 'end':Etikettierer_df.Start.iloc[1::2].values})
Etikettierer_df[Etikettierer_df['Message Text'].str.contains('Devicemangel')].groupby(['Message Text','Status']).count()

Unnamed: 0_level_0,Unnamed: 1_level_0,Start,Nummer,Projekt,Prio,Quali,Zeit,UTC Versatz,ID,BMK,Parameterwert,Einheit,Alt/Neu-Wert,Benutzer,Auftrag,Charge,Filename,Machine,Duration
Message Text,Status,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1
Devicemangel im Einlauf,0,27971,27971,27971,27971,27971,27971,27971,27971,0,0,0,0,27971,27971,27971,27971,27971,27971
Devicemangel im Einlauf,3,28104,28104,28104,28104,28104,28104,28104,28104,0,0,0,0,28104,28104,28104,28104,28104,28104


In [194]:
Etikettierer_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 378737 entries, 0 to 378736
Data columns (total 20 columns):
 #   Column         Non-Null Count   Dtype         
---  ------         --------------   -----         
 0   Start          378737 non-null  datetime64[ns]
 1   Nummer         378737 non-null  int64         
 2   Projekt        378737 non-null  object        
 3   Status         378737 non-null  int64         
 4   Prio           378737 non-null  int64         
 5   Quali          378737 non-null  object        
 6   Zeit           378737 non-null  object        
 7   UTC Versatz    378737 non-null  object        
 8   Message Text   378737 non-null  object        
 9   ID             378737 non-null  object        
 10  BMK            108060 non-null  object        
 11  Parameterwert  3582 non-null    object        
 12  Einheit        0 non-null       object        
 13  Alt/Neu-Wert   3582 non-null    object        
 14  Benutzer       378737 non-null  object        
 15  

In [270]:
# assign returns a new dataframe, which avoids the SettingWithCopyWarning raised when writing to a slice
Devicemangel = Devicemangel.assign(Auftrag=pd.to_numeric(Devicemangel['Auftrag'], errors='coerce'))


In [286]:
# sort into a new dataframe rather than in place, which avoids the SettingWithCopyWarning
Devicemangel = Devicemangel.sort_values(['Start','Auftrag','Status'])
# keep status 0 rows that are immediately followed by a status 3 row,
# and status 3 rows that are immediately preceded by a status 0 row
starts = Devicemangel[(Devicemangel.Status == 0) & (Devicemangel.Status.shift(-1) == 3)]
ends = Devicemangel[(Devicemangel.Status == 3) & (Devicemangel.Status.shift(1) == 0)]
Devicemangel2 = pd.concat([starts, ends])
# Devicemangel


In [310]:
Devicemangel2.sort_values(['Start','Auftrag','Status'], inplace=True)
Devicemangel3 = pd.DataFrame({'start':Devicemangel2.Start.iloc[::2].values, 'end':Devicemangel2.Start.iloc[1::2].values, 'Name':'Device shortage in the inlet'})

In [363]:
Devicemangel3['Duration'] = (Devicemangel3['end'] - Devicemangel3['start']).dt.total_seconds()
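
The pairing relies on `shift`: a status 0 row is kept only when the next row is status 3, and a status 3 row only when the previous row is status 0, so every kept start has exactly one kept end. The sketch below replays that logic on a short sequence. The two clean pairs use timestamps from the Devicemangel output, and the stray rows in between are made up. It builds the start and end columns directly instead of interleaving and slicing.

```python
import pandas as pd

events = pd.DataFrame({
    "Start": pd.to_datetime([
        "2021-01-05 06:18:50", "2021-01-05 06:22:42",  # clean 0 -> 3 pair
        "2021-01-05 07:00:00",                          # made-up stray end
        "2021-01-05 08:00:00",                          # made-up start with no end
        "2021-01-05 08:46:39", "2021-01-05 08:51:33",  # clean 0 -> 3 pair
    ]),
    "Status": [0, 3, 3, 0, 0, 3],
}).sort_values("Start")

next_status = events["Status"].shift(-1)
prev_status = events["Status"].shift(1)
starts = events[(events["Status"] == 0) & (next_status == 3)]
ends = events[(events["Status"] == 3) & (prev_status == 0)]

# Both selections have the same length, so they line up row by row.
paired = pd.DataFrame({
    "start": starts["Start"].to_numpy(),
    "end": ends["Start"].to_numpy(),
})
paired["Duration"] = (paired["end"] - paired["start"]).dt.total_seconds()
print(paired)
```

Unmatched rows are dropped silently by this approach, so comparing `len(events)` with `2 * len(paired)` gives a quick measure of how much data was discarded.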

In [325]:
Devicestau = Etikettierer_df[(Etikettierer_df['Message Text'].str.contains('Devicestau')) & (Etikettierer_df.Status.isin([0,3]))]

In [326]:
# sort into a new dataframe rather than in place, which avoids the SettingWithCopyWarning
Devicestau = Devicestau.sort_values(['Start','Auftrag','Status'])
# keep status 0 rows that are immediately followed by a status 3 row,
# and status 3 rows that are immediately preceded by a status 0 row
starts = Devicestau[(Devicestau.Status == 0) & (Devicestau.Status.shift(-1) == 3)]
ends = Devicestau[(Devicestau.Status == 3) & (Devicestau.Status.shift(1) == 0)]
Devicestau2 = pd.concat([starts, ends])


In [332]:
Devicestau2.sort_values(['Start','Auftrag','Status'], inplace=True)
Devicestau3 = pd.DataFrame({'start':Devicestau2.Start.iloc[::2].values, 'end':Devicestau2.Start.iloc[1::2].values, 'Name':'Device jam in the outlet'})

In [357]:
Devicestau3['Duration'] = (Devicestau3['end'] - Devicestau3['start']).dt.total_seconds()

In [358]:
Devicestau3.set_index('start', inplace=True)

In [367]:
# write out to csv
folder='IP21_data'
filename = 'devicestau_jams.csv'
path = os.path.join(dir_sanofi_share, folder, filename)
Devicestau3.to_csv(path)

filename = 'devicemangel_shortage.csv'
path = os.path.join(dir_sanofi_share, folder, filename)
Devicemangel3.to_csv(path)


In [361]:
Devicestau2.Status.iloc[::2].values

array([0, 0, 0, ..., 0, 0, 0], dtype=int64)

In [368]:
Devicemangel3

Unnamed: 0,start,end,Name,Duration
0,2021-01-05 06:18:50,2021-01-05 06:22:42,Device shortage in the inlet,232.0
1,2021-01-05 08:46:39,2021-01-05 08:51:33,Device shortage in the inlet,294.0
2,2021-01-05 09:22:21,2021-01-05 10:20:44,Device shortage in the inlet,3503.0
3,2021-01-05 10:53:37,2021-01-05 10:58:16,Device shortage in the inlet,279.0
4,2021-01-05 11:30:36,2021-01-05 11:31:55,Device shortage in the inlet,79.0
...,...,...,...,...
27954,2021-11-05 09:00:41,2021-11-05 09:02:47,Device shortage in the inlet,126.0
27955,2021-11-05 09:03:52,2021-11-05 09:11:10,Device shortage in the inlet,438.0
27956,2021-11-05 09:11:16,2021-11-05 09:18:39,Device shortage in the inlet,443.0
27957,2021-11-05 09:19:02,2021-11-05 09:20:30,Device shortage in the inlet,88.0
