# SUD Box tool: Initial Data Loading

## Goals:
1. Build the initial data model for the tool:
    * Define key tables from the model and links between them
2. Perform the initial data load into the tables
3. Identify data-related issues and ways to address them



### TODO

1.

### Questions
1.

## Setup



### Imports

In [1]:
# standard library
import argparse
import logging
import os
import sys
from datetime import datetime, timedelta
from os import getcwd
from os.path import join, split
from typing import List, Tuple

# third-party packages
import pandas as pd
import pyodbc
import pytz
from tomlkit import parse, dumps, loads

# project modules
import historian
import queries

from sud_tools import *
from sud_utils import *
from tool_utils import *



### Reading in the configuration file


In [2]:
cfg_file_nm = 'etl_config.toml'
cfg_file_path = join(os.getcwd(), cfg_file_nm)
# check that the config file exists before reading it
if not os.path.isfile(cfg_file_path):
    raise FileNotFoundError(f'No config file was found at: {cfg_file_path}')

# read and parse the config file, closing the file handle afterwards
with open(cfg_file_path, encoding='utf-8') as cfg_file:
    cfg = loads(cfg_file.read())

In [3]:
tags_list_path = cfg['job']['path_to_tag_list']
print(f'Path to tag list table: {tags_list_path}')

NonExistentKey: 'Key "path_to_tag_list" does not exist.'
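
The `NonExistentKey` error above means that the `[job]` table of `etl_config.toml` has no `path_to_tag_list` entry. A minimal sketch of checking the required keys up front, so that every missing entry is reported at once. The helper name `missing_job_keys` and the sample dictionary are illustrative and not part of the project code; the list of required keys is simply the three `cfg['job'][...]` lookups used in this notebook. The function is written against a plain mapping; a tomlkit document is dictionary-like, so it should apply to `cfg` as well, but that is an assumption to verify.

```python
from typing import Iterable, List, Mapping


def missing_job_keys(cfg: Mapping, required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent from cfg['job']."""
    job_cfg = cfg.get('job', {})
    return [key for key in required if key not in job_cfg]


# Keys this notebook reads from the [job] table of the config file.
REQUIRED_JOB_KEYS = ['path_to_tag_list', 'path_to_line_recipe', 'path_to_reject_cause']

# Hypothetical parsed config that lacks path_to_tag_list.
sample_cfg = {'job': {'path_to_line_recipe': 'line_recipe.xlsx',
                      'path_to_reject_cause': 'reject_cause.xlsx'}}

missing = missing_job_keys(sample_cfg, REQUIRED_JOB_KEYS)
print(f'Missing [job] keys: {missing}')
```

Running such a check right after the config is loaded makes the failure visible before any cell depends on the missing path.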

### Setting up the connection to DatalabDB

In [None]:
# datalab integration
datalab_cfg = cfg['datalab_db'].copy()

# setting up run configuration with parameters from cfg file
datalab_db_conn = get_mssql_conn_string(**datalab_cfg['connection'])
datalab_db_access_url = str(datalab_cfg['access_token_url']['access_token_url'])
datalab_db_access_token = get_azure_sql_db_access_token(datalab_db_access_url)
print(f'Datalab connection parameters: {datalab_db_conn}')

## Designing data model

>**FACT TABLES**

**SUD_BOX_REJECT** - Fact table containing reject quantities for every reject cause
1. site_id
2. line_id
3. datetime - 30 min granularity
4. line_recipe
5. line_state_id
6. agile_flag
7. project_flag
8. reject_cause_id
9. reject_qty

**SUD_BOX_TOTALS** - Fact table containing the number of boxes produced/extracted
1. site_id
2. line_id
3. datetime - 30 min granularity
4. recipe_id
5. line_state_id
6. agile_flag
7. project_flag
8. good_covers_qty (coming from "produced" tags)
9. good_bases_qty (coming from "produced" tags)
10. produced_covers_qty (coming from "extracted" tags)
11. produced_bases_qty (coming from "extracted" tags)
<!-- 10. rejected_boxes_qty -->

> **DIMENSION TABLES**

1. **SUD_SITES** - already available in DATALAB_DB
2. **SUD_LINES** - already available in DATALAB_DB
3. **SUD_LINE_STATE** - already available in DATALAB_DB
4. **LINE_RECIPE_DIM** - line recipe dimension table. Extracted from SharePoint via Power BI. For now we need to emulate it, based on the table from the URD.
    * line_recipe - value coming from the line recipe tag
    * product_segment
    * recipe_size_nm
    * recipe_cd
    * n_boxes_per_case
    * n_cases_end_of_line
5. **BOX_SIZE_DIM** - box size dimension table. To be created in Power BI
    * size_cd
    * size_nm
    * n_boxes_per_case
6. **REJECT_CAUSE_DIM** - we will need to emulate this by taking the data from the table in the URD and creating a csv/xlsx file
    * id 
    * tag_nm - is the name of the tag without LXXX_ part
    * cause
    * reason
    * station
    * type
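
The link between a fact table and a dimension table can be sketched with pandas: `SUD_BOX_REJECT.reject_cause_id` points at `REJECT_CAUSE_DIM.id`. The rows below are made up purely for illustration (ids, quantities and timestamps are not real data), and only a subset of the columns listed above is used.

```python
import pandas as pd

# Illustrative rows only: ids, quantities and timestamps are made up.
sud_box_reject = pd.DataFrame({
    'site_id': [1, 1, 1],
    'line_id': [10, 10, 10],
    'datetime': pd.to_datetime(['2023-01-01 00:00', '2023-01-01 00:00',
                                '2023-01-01 00:30']),
    'reject_cause_id': [2, 3, 2],
    'reject_qty': [5, 1, 4],
})

reject_cause_dim = pd.DataFrame({
    'id': [2, 3],
    'tag_nm': ['Cover_TransportBelt_CheckExternal_0_0_Rejected_n',
               'Cover_TransportBelt_CheckInternal_0_0_Rejected_n'],
    'cause': ['Check External', 'Check Internal'],
    'station': ['Transport Belt', 'Transport Belt'],
})

# Fact-to-dimension link: SUD_BOX_REJECT.reject_cause_id -> REJECT_CAUSE_DIM.id.
# validate='many_to_one' raises if the dimension key is not unique.
rejects_with_cause = sud_box_reject.merge(
    reject_cause_dim, how='left', left_on='reject_cause_id', right_on='id',
    validate='many_to_one',
)

# Total rejects per cause across the sample 30-minute slots.
qty_per_cause = rejects_with_cause.groupby('cause')['reject_qty'].sum()
print(qty_per_cause)
```

A left join keeps every fact row even when a cause id has no match in the dimension, which makes missing dimension entries easy to spot as empty `cause` values.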
    

    






## Preparing LINE_RECIPE_DIM

In [11]:
line_recipe_path = cfg['job']['path_to_line_recipe']
print(f'Path to line recipe table: {line_recipe_path}')

# checking if the line recipe path exists
if not os.path.exists(line_recipe_path):
    print(f'INFO: Did not find line recipe table at {line_recipe_path}.')
    raise ValueError('Was not able to find line recipe table. Aborting...')
else:
    line_recipe_df = pd.read_excel(line_recipe_path)
    print(f'Shape of the line recipe table: {line_recipe_df.shape}')

Path to line recipe table: F:\Ecosystem Non-OneDrive\Development Area\MVP048 SUD Box tool\Data\Input\line_recipe.xlsx
Shape of the line recipe table: (20, 6)


In [6]:
line_recipe_df.columns

Index(['line_recipe_id', 'product_segment', 'recipe_size_nm', 'recipe_cd',
       'n_boxes_per_case', 'n_cases_end_of_line'],
      dtype='object')

In [7]:
# remove non-breaking spaces (\xa0) from every column of the table
nbsp_cols = ['line_recipe_id', 'product_segment', 'recipe_size_nm', 'recipe_cd',
             'n_boxes_per_case', 'n_cases_end_of_line']
line_recipe_df = line_recipe_df.assign(
    **{col: line_recipe_df[col].str.replace('\xa0', '') for col in nbsp_cols}
)
line_recipe_df

Unnamed: 0,line_recipe_id,product_segment,recipe_size_nm,recipe_cd,n_boxes_per_case,n_cases_end_of_line
0,1,S100_CTray_4CC_Shrink,S100,S1,4,160.0
1,5,S100_Ttray_4CC,S100,S1,4,128.0
2,6,S100_Utray_8cc,S100,S1,8,
3,7,S100_4CC_ShrinkOnly,S100,S1,4,160.0
4,11,S115_CTray_4CC_Shrink,S115,S2,4,128.0
5,13,S115_Ttray_4CC,S115,S2,4,112.0
6,14,S115_Utray_8cc,S115,S2,8,
7,15,S115_4CC_ShrinkOnly,S115,S2,4,128.0
8,21,M115_CTray_4CC_Shrink,M115,M1,4,
9,22,M115_Ttray_4CC,M115,M1,4,84.0
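
The `\xa0` cleanup applied to this table is needed again for the reject cause table further down. One way to avoid repeating it column by column is a small reusable helper. The function `strip_nbsp` and the sample frame are hypothetical and only sketch the idea; they are not part of `sud_utils` or `tool_utils`.

```python
import pandas as pd


def strip_nbsp(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with non-breaking spaces removed from string values."""
    cleaned = df.copy()
    for col in cleaned.columns:
        # Non-string values (numbers, missing values) are passed through unchanged.
        cleaned[col] = cleaned[col].map(
            lambda v: v.replace('\xa0', '') if isinstance(v, str) else v
        )
    return cleaned


# Made-up sample mimicking text cells that carry \xa0 residue.
sample = pd.DataFrame({
    'line_recipe_id': ['1\xa0', '5\xa0'],
    'recipe_cd': ['S1\xa0', 'S1'],
    'n_cases_end_of_line': ['160\xa0', None],
})
cleaned = strip_nbsp(sample)
print(cleaned)
```

Unlike `.str.replace`, which requires a text column, this version can be applied to a whole frame that mixes text and numeric columns.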


In [8]:
line_recipe_df = line_recipe_df.assign(line_recipe_id =(pd.to_numeric(line_recipe_df['line_recipe_id'])))
line_recipe_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 6 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   line_recipe_id       20 non-null     int64 
 1   product_segment      20 non-null     object
 2   recipe_size_nm       20 non-null     object
 3   recipe_cd            20 non-null     object
 4   n_boxes_per_case     20 non-null     object
 5   n_cases_end_of_line  15 non-null     object
dtypes: int64(1), object(5)
memory usage: 1.1+ KB


In [9]:
line_recipe_df = line_recipe_df.assign(n_boxes_per_case =(pd.to_numeric(line_recipe_df['n_boxes_per_case'])))
line_recipe_df = line_recipe_df.assign(n_cases_end_of_line =(pd.to_numeric(line_recipe_df['n_cases_end_of_line'])))
line_recipe_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 6 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   line_recipe_id       20 non-null     int64  
 1   product_segment      20 non-null     object 
 2   recipe_size_nm       20 non-null     object 
 3   recipe_cd            20 non-null     object 
 4   n_boxes_per_case     20 non-null     int64  
 5   n_cases_end_of_line  15 non-null     float64
dtypes: float64(1), int64(2), object(3)
memory usage: 1.1+ KB
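
`n_cases_end_of_line` ends up as `float64` because 5 of the 20 rows are empty, and a plain integer column cannot hold missing values. If whole-number semantics matter downstream, one option is pandas' nullable `Int64` dtype. This is a sketch on made-up values, not something the notebook does.

```python
import pandas as pd

# Made-up values shaped like the n_cases_end_of_line column:
# numeric strings with one entry missing.
raw = pd.Series(['160', '128', None, '84'], dtype='object',
                name='n_cases_end_of_line')

# The missing value forces a floating-point result.
as_float = pd.to_numeric(raw)

# The nullable integer dtype keeps whole numbers and marks the gap as <NA>.
as_nullable_int = as_float.astype('Int64')

print(as_float.dtype, as_nullable_int.dtype)
print(as_nullable_int.tolist())
```

Whether the target table should store a missing count as NULL or as a default value is a separate modelling decision.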


In [11]:
# raw string so that the backslashes in the Windows path are not treated as escapes
line_recipe_df.to_excel(r'..\Data\initial loading\LINE_RECIPE_DIM.xlsx', index=False)

## Preparing REJECT_CAUSE_DIM

In [4]:
reject_cause_path = cfg['job']['path_to_reject_cause']
print(f'Path to reject cause table: {reject_cause_path}')

# checking if the reject cause path exists
if not os.path.exists(reject_cause_path):
    print(f'INFO: Did not find reject cause table at {reject_cause_path}.')
    raise ValueError('Was not able to find reject cause table. Aborting...')
else:
    reject_cause_df = pd.read_excel(reject_cause_path)
    print(f'Shape of the reject cause table: {reject_cause_df.shape}')

Path to reject cause table: F:\Ecosystem Non-OneDrive\Development Area\MVP048 SUD Box tool\Data\Input\reject_cause.xlsx
Shape of the reject cause table: (96, 7)


In [5]:
reject_cause_df.head()

Unnamed: 0,id,tag_nm,type,machine,station,reason,cause
0,1,LXXX_Cover_Extraction_Turret_CartonsNotExtract...,Cover,Cover machine,Transport Belt,Not extracted,Not extracted
1,2,LXXX_Cover_TransportBelt_CheckExternal_0_0_Rej...,Cover,Cover machine,Transport Belt,Transport belt,Check External
2,3,LXXX_Cover_TransportBelt_CheckInternal_0_0_Rej...,Cover,Cover machine,Transport Belt,Transport belt,Check Internal
3,4,LXXX_Cover_TransportBelt_RobotTrack_Skipped_0_...,Cover,Cover machine,Transport Belt,Transport belt,Robot track
4,5,LXXX_Cover_Forming_PatchErectionCheck_Bad_Coun...,Cover,Cover machine,Patch Control,Patch reject,Patch Erection Check Bad Height


In [7]:
reject_cause_df.loc[reject_cause_df['id'] == 1]

Unnamed: 0,id,tag_nm,type,machine,station,reason,cause
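
The empty result is consistent with `id` being read as text rather than as a number (the `info()` output further down reports it as `object`), possibly with trailing `\xa0` characters, so the comparison with the integer `1` matches nothing. A small illustration on made-up values:

```python
import pandas as pd

# Made-up frame where id was read as text with a trailing non-breaking space.
df = pd.DataFrame({
    'id': pd.Series(['1\xa0', '2\xa0'], dtype='object'),
    'type': ['Cover', 'Cover'],
})

# Text compared with an integer never matches, so the filter is empty.
before = df.loc[df['id'] == 1]

# After removing \xa0 and converting to numbers, the same filter finds the row.
df = df.assign(id=pd.to_numeric(df['id'].str.replace('\xa0', '')))
after = df.loc[df['id'] == 1]

print(len(before), len(after))
```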


In [8]:
reject_cause_df.columns

Index(['id', 'tag_nm', 'type', 'machine', 'station', 'reason', 'cause'], dtype='object')

In [9]:
reject_cause_df.columns = reject_cause_df.columns.str.strip()
reject_cause_df.columns

Index(['id', 'tag_nm', 'type', 'machine', 'station', 'reason', 'cause'], dtype='object')

In [10]:
reject_cause_df = reject_cause_df.assign(tag_nm=(reject_cause_df['tag_nm'].str.split('_',n=1,expand=True)[1]))
reject_cause_df

Unnamed: 0,id,tag_nm,type,machine,station,reason,cause
0,1,Cover_Extraction_Turret_CartonsNotExtracted_0_...,Cover,Cover machine,Transport Belt,Not extracted,Not extracted
1,2,Cover_TransportBelt_CheckExternal_0_0_Rejected_n,Cover,Cover machine,Transport Belt,Transport belt,Check External
2,3,Cover_TransportBelt_CheckInternal_0_0_Rejected_n,Cover,Cover machine,Transport Belt,Transport belt,Check Internal
3,4,Cover_TransportBelt_RobotTrack_Skipped_0_Rejec...,Cover,Cover machine,Transport Belt,Transport belt,Robot track
4,5,Cover_Forming_PatchErectionCheck_Bad_Counter_R...,Cover,Cover machine,Patch Control,Patch reject,Patch Erection Check Bad Height
...,...,...,...,...,...,...,...
91,92,Upack_CaseCheckweigher_Reject_Weight_NOKReject...,Cover+Base,Phase 2,Case CheckWeigher,Minibea,Weight NOK
92,93,Upack_OLCP_1DCheck_0_NOKRejected_n,Cover+Base,Phase 2,Case CheckWeigher,Minibea,OLCP 1D NOK
93,94,Upack_CaseCheckweigher_Reject_LeakerCheck_0Rej...,Cover+Base,Phase 2,Case CheckWeigher,Minibea,
94,95,Base_Machine_Holder_Not_free_iTrak_Exit_Counte...,Base,Base machine,Station 1,Holder Not Free,Holder Not Free


In [11]:
reject_cause_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 7 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   id       96 non-null     object
 1   tag_nm   96 non-null     object
 2   type     96 non-null     object
 3   machine  96 non-null     object
 4   station  96 non-null     object
 5   reason   94 non-null     object
 6   cause    95 non-null     object
dtypes: object(7)
memory usage: 5.4+ KB


In [39]:
reject_cause_df['tag_nm']

0     Cover_Extraction_Turret_CartonsNotExtracted_0_...
1      Cover_TransportBelt_CheckExternal_0_0_Rejected_n
2      Cover_TransportBelt_CheckInternal_0_0_Rejected_n
3     Cover_TransportBelt_RobotTrack_Skipped_0_Rejec...
4     Cover_Forming_PatchErectionCheck_Bad_Counter_R...
                            ...                        
92    Upack_CaseCheckweigher_Reject_Weight_NOKReject...
93                   Upack_OLCP_1DCheck_0_NOKRejected_n
94    Upack_CaseCheckweigher_Reject_LeakerCheck_0Rej...
95    Base_Machine_Holder_Not_free_iTrak_Exit_Counte...
96    Cover_Forming_PatchErectionCheck_Bad_on_Empty_...
Name: tag_nm, Length: 97, dtype: object

In [12]:
reject_cause_df = reject_cause_df[['id', 'tag_nm','cause','reason','station','type','machine']]
reject_cause_df

Unnamed: 0,id,tag_nm,cause,reason,station,type,machine
0,1,Cover_Extraction_Turret_CartonsNotExtracted_0_...,Not extracted,Not extracted,Transport Belt,Cover,Cover machine
1,2,Cover_TransportBelt_CheckExternal_0_0_Rejected_n,Check External,Transport belt,Transport Belt,Cover,Cover machine
2,3,Cover_TransportBelt_CheckInternal_0_0_Rejected_n,Check Internal,Transport belt,Transport Belt,Cover,Cover machine
3,4,Cover_TransportBelt_RobotTrack_Skipped_0_Rejec...,Robot track,Transport belt,Transport Belt,Cover,Cover machine
4,5,Cover_Forming_PatchErectionCheck_Bad_Counter_R...,Patch Erection Check Bad Height,Patch reject,Patch Control,Cover,Cover machine
...,...,...,...,...,...,...,...
91,92,Upack_CaseCheckweigher_Reject_Weight_NOKReject...,Weight NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
92,93,Upack_OLCP_1DCheck_0_NOKRejected_n,OLCP 1D NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
93,94,Upack_CaseCheckweigher_Reject_LeakerCheck_0Rej...,,Minibea,Case CheckWeigher,Cover+Base,Phase 2
94,95,Base_Machine_Holder_Not_free_iTrak_Exit_Counte...,Holder Not Free,Holder Not Free,Station 1,Base,Base machine


In [13]:
len(reject_cause_df['tag_nm'])

96

In [15]:
reject_cause_df['tag_nm'].tolist()

['Cover_Extraction_Turret_CartonsNotExtracted_0_Actual_n',
 'Cover_TransportBelt_CheckExternal_0_0_Rejected_n',
 'Cover_TransportBelt_CheckInternal_0_0_Rejected_n',
 'Cover_TransportBelt_RobotTrack_Skipped_0_Rejected_n',
 'Cover_Forming_PatchErectionCheck_Bad_Counter_Rejected_n',
 'Cover_Forming_PatchErectionCheck_BadAngle_Counter_Rejected_n',
 'Cover_Forming_PatchErectionCheck_NoFeedback_Counter_Rejected_n',
 'Cover_TransportBelt_HoleFlap_0_Check_Rejected_n',
 'Cover_Former1_ExternalGlueCamera_Bad_0_Rejected_n',
 'Cover_Former1_ExternalGlueCamera_NoFeedback_0_Rejected_n',
 'Cover_Former1_InternalGlueCamera_Bad_0_Rejected_n',
 'Cover_Former1_InternalGlueCamera_NoFeedback_0_Rejected_n',
 'Cover_Former1_ Glue_Dry _0_Rejected_n',
 'Cover_Former1_External_CoverNotSeenBeforeGlue_0_Rejected_n',
 'Cover_Former1_Internal_CoverNotSeenBeforeGlue_0_Rejected_n',
 'Cover_Former1_RobotHead1_FlapCheck1_0_Rejected_n',
 'Cover_Former1_RobotHead1_FlapCheck2_0_Rejected_n',
 'Cover_Former1_RobotHead2_Flap

In [16]:
reject_cause_df['tag_nm'].str.replace('\xa0','').tolist()

['Cover_Extraction_Turret_CartonsNotExtracted_0_Actual_n',
 'Cover_TransportBelt_CheckExternal_0_0_Rejected_n',
 'Cover_TransportBelt_CheckInternal_0_0_Rejected_n',
 'Cover_TransportBelt_RobotTrack_Skipped_0_Rejected_n',
 'Cover_Forming_PatchErectionCheck_Bad_Counter_Rejected_n',
 'Cover_Forming_PatchErectionCheck_BadAngle_Counter_Rejected_n',
 'Cover_Forming_PatchErectionCheck_NoFeedback_Counter_Rejected_n',
 'Cover_TransportBelt_HoleFlap_0_Check_Rejected_n',
 'Cover_Former1_ExternalGlueCamera_Bad_0_Rejected_n',
 'Cover_Former1_ExternalGlueCamera_NoFeedback_0_Rejected_n',
 'Cover_Former1_InternalGlueCamera_Bad_0_Rejected_n',
 'Cover_Former1_InternalGlueCamera_NoFeedback_0_Rejected_n',
 'Cover_Former1_ Glue_Dry _0_Rejected_n',
 'Cover_Former1_External_CoverNotSeenBeforeGlue_0_Rejected_n',
 'Cover_Former1_Internal_CoverNotSeenBeforeGlue_0_Rejected_n',
 'Cover_Former1_RobotHead1_FlapCheck1_0_Rejected_n',
 'Cover_Former1_RobotHead1_FlapCheck2_0_Rejected_n',
 'Cover_Former1_RobotHead2_Flap
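
Removing `\xa0` does not touch ordinary spaces: `'Cover_Former1_ Glue_Dry _0_Rejected_n'` in the list above still contains two. Whether those spaces belong to the real tag name or are residue from the source table is not established here, so the sketch below only flags such names for review. It assumes nothing beyond a series of tag name strings.

```python
import pandas as pd

# A few tag names taken from the list above.
tags = pd.Series([
    'Cover_TransportBelt_CheckExternal_0_0_Rejected_n',
    'Cover_Former1_ Glue_Dry _0_Rejected_n',
    'Cover_Former1_RobotHead1_FlapCheck1_0_Rejected_n',
], name='tag_nm')

# Flag names that contain any whitespace character (space, tab, ...).
has_whitespace = tags.str.contains(r'\s', regex=True)
suspicious = tags[has_whitespace].tolist()
print(suspicious)
```

Flagged names can then be compared against the historian tag list before deciding whether to strip the spaces.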

In [17]:
reject_cause_df['station'].tolist()

['Transport Belt',
 'Transport Belt',
 'Transport Belt',
 'Transport Belt',
 'Patch Control',
 'Patch Control',
 'Patch Control',
 'Transport Belt',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 1',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Former 2',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 1',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 2',
 'Station 3',

In [18]:
# remove non-breaking spaces (\xa0) from the text columns
text_cols = ['tag_nm', 'cause', 'reason', 'station', 'type', 'machine']
reject_cause_df = reject_cause_df.assign(
    **{col: reject_cause_df[col].str.replace('\xa0', '') for col in text_cols}
)
reject_cause_df

Unnamed: 0,id,tag_nm,cause,reason,station,type,machine
0,1,Cover_Extraction_Turret_CartonsNotExtracted_0_...,Not extracted,Not extracted,Transport Belt,Cover,Cover machine
1,2,Cover_TransportBelt_CheckExternal_0_0_Rejected_n,Check External,Transport belt,Transport Belt,Cover,Cover machine
2,3,Cover_TransportBelt_CheckInternal_0_0_Rejected_n,Check Internal,Transport belt,Transport Belt,Cover,Cover machine
3,4,Cover_TransportBelt_RobotTrack_Skipped_0_Rejec...,Robot track,Transport belt,Transport Belt,Cover,Cover machine
4,5,Cover_Forming_PatchErectionCheck_Bad_Counter_R...,Patch Erection Check Bad Height,Patch reject,Patch Control,Cover,Cover machine
...,...,...,...,...,...,...,...
91,92,Upack_CaseCheckweigher_Reject_Weight_NOKReject...,Weight NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
92,93,Upack_OLCP_1DCheck_0_NOKRejected_n,OLCP 1D NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
93,94,Upack_CaseCheckweigher_Reject_LeakerCheck_0Rej...,,Minibea,Case CheckWeigher,Cover+Base,Phase 2
94,95,Base_Machine_Holder_Not_free_iTrak_Exit_Counte...,Holder Not Free,Holder Not Free,Station 1,Base,Base machine


In [19]:
reject_cause_df = reject_cause_df.assign(id = reject_cause_df['id'].str.replace('\xa0',''))
reject_cause_df

Unnamed: 0,id,tag_nm,cause,reason,station,type,machine
0,1,Cover_Extraction_Turret_CartonsNotExtracted_0_...,Not extracted,Not extracted,Transport Belt,Cover,Cover machine
1,2,Cover_TransportBelt_CheckExternal_0_0_Rejected_n,Check External,Transport belt,Transport Belt,Cover,Cover machine
2,3,Cover_TransportBelt_CheckInternal_0_0_Rejected_n,Check Internal,Transport belt,Transport Belt,Cover,Cover machine
3,4,Cover_TransportBelt_RobotTrack_Skipped_0_Rejec...,Robot track,Transport belt,Transport Belt,Cover,Cover machine
4,5,Cover_Forming_PatchErectionCheck_Bad_Counter_R...,Patch Erection Check Bad Height,Patch reject,Patch Control,Cover,Cover machine
...,...,...,...,...,...,...,...
91,92,Upack_CaseCheckweigher_Reject_Weight_NOKReject...,Weight NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
92,93,Upack_OLCP_1DCheck_0_NOKRejected_n,OLCP 1D NOK,Minibea,Case CheckWeigher,Cover+Base,Phase 2
93,94,Upack_CaseCheckweigher_Reject_LeakerCheck_0Rej...,,Minibea,Case CheckWeigher,Cover+Base,Phase 2
94,95,Base_Machine_Holder_Not_free_iTrak_Exit_Counte...,Holder Not Free,Holder Not Free,Station 1,Base,Base machine


In [20]:
#reject_cause_df.groupby('tag_nm').describe()

In [21]:
reject_cause_df = reject_cause_df.assign(id =(pd.to_numeric(reject_cause_df['id'])))
reject_cause_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 7 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   id       96 non-null     int64 
 1   tag_nm   96 non-null     object
 2   cause    95 non-null     object
 3   reason   94 non-null     object
 4   station  96 non-null     object
 5   type     96 non-null     object
 6   machine  96 non-null     object
dtypes: int64(1), object(6)
memory usage: 5.4+ KB
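
Before the dimension is written out, a few checks can catch the issues visible in the `info()` output (missing `reason` and `cause` values) as well as duplicated keys. The sketch below uses made-up rows and a hypothetical helper `check_dimension`; it is one possible set of checks, not part of the notebook.

```python
import pandas as pd


def check_dimension(df: pd.DataFrame, key_cols: list) -> dict:
    """Summarise duplicate keys and missing values in a dimension table."""
    return {
        # Key columns in which at least one value occurs more than once.
        'duplicated_keys': [c for c in key_cols if df[c].duplicated().any()],
        # Missing-value counts, restricted to columns that have any.
        'null_counts': {c: int(n) for c, n in df.isna().sum().items() if n > 0},
    }


# Made-up rows: one duplicated tag_nm and one missing cause.
sample_dim = pd.DataFrame({
    'id': [1, 2, 3],
    'tag_nm': ['Tag_A', 'Tag_B', 'Tag_B'],
    'cause': ['Not extracted', None, 'Check Internal'],
})

report = check_dimension(sample_dim, key_cols=['id', 'tag_nm'])
print(report)
```

A duplicated `tag_nm` would matter here because the tag name is what links historian readings to a reject cause.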


In [22]:
#line_recipe_df['recipe_size_nm'].tolist()

In [23]:
# raw string so that the backslashes in the Windows path are not treated as escapes
reject_cause_df.to_excel(r'..\Data\initial loading\REJECT_CAUSE_DIM.xlsx', index=False)

### Getting the full tag list

In [18]:
# path to the spreadsheet listing all tags
all_tags = cfg['job']['path_to_tag_list']
print(f'Path to all tags table: {all_tags}')

# checking if the tag list path exists
if not os.path.exists(all_tags):
    print(f'INFO: Did not find the tag list at {all_tags}.')
    raise ValueError('Was not able to find the tag list. Aborting...')
else:
    all_tags_df = pd.read_excel(all_tags)
    print(f'Shape of the tag list table: {all_tags_df.shape}')

NonExistentKey: 'Key "path_to_tag_list" does not exist.'

In [19]:
all_tags_df

NameError: name 'all_tags_df' is not defined

In [14]:
all_tags_df.columns

Index(['Reject Station ', 'TagName ', 'site', 'enabled'], dtype='object')

In [15]:
all_tags_df.columns = all_tags_df.columns.str.strip()
all_tags_df.columns

Index(['Reject Station', 'TagName', 'site', 'enabled'], dtype='object')

In [16]:
all_tags_df['TagName'].tolist()

['LXXX_Cover_General_ExtractedCartons_Total_Counter_Actual_n\xa0',
 'LXXX_Cover_General_ProducedCovers_Total_Counter_Actual_n\xa0',
 'LXXX_Cover_General_RejectedCovers_0_Counter_Actual_n\xa0',
 'LXXX_Cover_Reshipper_good_inserted_n\xa0',
 'LXXX_Cover_Reshipper_bad_rejected_n\xa0',
 'LXXX_Cover_Extraction_Turret_CartonsNotExtracted_0_Actual_n\xa0',
 'LXXX_Cover_TransportBelt_CheckExternal_0_0_Rejected_n\xa0',
 'LXXX_Cover_TransportBelt_CheckInternal_0_0_Rejected_n\xa0',
 'LXXX_Cover_TransportBelt_RobotTrack_Skipped_0_Rejected_n\xa0',
 'LXXX_Cover_Forming_PatchErectionCheck_Bad_Counter_Rejected_n\xa0',
 'LXXX_Cover_Forming_PatchErectionCheck_BadAngle_Counter_Rejected_n\xa0',
 'LXXX_Cover_Forming_PatchErectionCheck_NoFeedback_Counter_Rejected_n\xa0',
 'LXXX_Cover_Reshipper_BCR_NotRead_Rejected_n\xa0',
 'LXXX_Cover_Reshipper_BCR_MisMatch_Rejected_n\xa0',
 'LXXX_Cover_Reshipper_BCR_NoFeedback_Rejected_n\xa0',
 'LXXX_Cover_TransportBelt_HoleFlap_0_Check_Rejected_n\xa0',
 'LXXX_Cover_Former1_

In [17]:
all_tags_df = all_tags_df.assign(TagName = all_tags_df['TagName'].str.replace('\xa0',''))
all_tags_df

Unnamed: 0,Reject Station,TagName,site,enabled
0,Total Cover Extracted,LXXX_Cover_General_ExtractedCartons_Total_Coun...,Urlati,1
1,Total Cover Produced,LXXX_Cover_General_ProducedCovers_Total_Counte...,Urlati,1
2,Total Cover Rejected,LXXX_Cover_General_RejectedCovers_0_Counter_Ac...,Urlati,1
3,Cover reshipper,LXXX_Cover_Reshipper_good_inserted_n,Urlati,0
4,Cover reshipper,LXXX_Cover_Reshipper_bad_rejected_n,Urlati,0
...,...,...,...,...
261,Pack Counter 2,LXXX_UPack_For_Proficy_Produced_Cases,Amiens,1
262,LineRecipe,LXXX_Upack_LineRecipe,Amiens,1
263,AgileControl,LXXX_Upack_Primary_General_Speed_AgileControl_...,Amiens,1
264,Station 1,LXXX_Base_Machine_Holder_Not_free_iTrak_Exit_C...,Amiens,1


In [18]:
all_tags_df = all_tags_df[['TagName', 'site', 'enabled']]
all_tags_df

Unnamed: 0,TagName,site,enabled
0,LXXX_Cover_General_ExtractedCartons_Total_Coun...,Urlati,1
1,LXXX_Cover_General_ProducedCovers_Total_Counte...,Urlati,1
2,LXXX_Cover_General_RejectedCovers_0_Counter_Ac...,Urlati,1
3,LXXX_Cover_Reshipper_good_inserted_n,Urlati,0
4,LXXX_Cover_Reshipper_bad_rejected_n,Urlati,0
...,...,...,...
261,LXXX_UPack_For_Proficy_Produced_Cases,Amiens,1
262,LXXX_Upack_LineRecipe,Amiens,1
263,LXXX_Upack_Primary_General_Speed_AgileControl_...,Amiens,1
264,LXXX_Base_Machine_Holder_Not_free_iTrak_Exit_C...,Amiens,1


In [20]:


path_to_file = join(split(getcwd())[0], 'Data', 'Input', 'all_tag_list.xlsx')
all_tags_df.to_excel(path_to_file, index=False)
