Jun 1, 2021

# Tan's Requests:

## Request 1: Join RACM #1 and RACM #2 to compare if there are changes in start-date, end-date or both and split into multiple csv files

## Request 2: Add GIM database to the joint RACM data from request 1
  
## Request 3: Join RACM #1 to GIM; Join RACM #2 to GIM
* Have the option to export one csv with all information or split into multiple csvs - one for each country

## Plan:
### 1. Create functions that can be called to perform a specific task 
Functions: 
    a. To create "Key" by joining the Country Code, UPN and ConFig columns. Reason: within one file, each row has a unique combination of these values; if a record appears twice after the two files are combined, there is a change in either the start or end date. 
    b. To set index to Key for dataframe
    c. Join the 2020 dataframe to the 2021 dataframe
    d. Compare the Start Date and End Date between the two datasets 
    e. Separate data for each country
    f. Export data to csv. Each country will have its own file (92), plus one master file
### 2. Request 1: Join RACM dataset 1 and RACM dataset 2 to compare if start date and/or end date changed and split into multiple csv files
    a. Import datasets 
    b. Create a column Key to join with the second dataset.
    c. Create a column IPN by combining UPN with ConFig. This will be a temporary product registration code.
    d. Retain RACM string data for process checking
    e. Remove the empty column
    f. Change the date format
    g. Remove the leading "0" from UPN so that it can be set as index for joining GIM data. GIM["CatalogNumber"] does not have the leading "0". 
    h. Join the two datasets. 
    i. Export the data as one file
    j. Export the data as multiple files (one per country)
    k. Create a dataframe for only rows that have different start-date and/or end_date
### 3. Request 2: Add GIM database to the joint RACM data from request 1
    a. Import GIM data
    b. Change the Catalog Number column to string
    c. Set CatalogNumber as index but also keep the CatalogNumber column in the data
    d. Select columns of interest
    e. Left join GIM with the combined RACM so RACM does not get dropped in case GIM data is missing
    f. Export the data as one file
### 4. Request 3: Join RACM #1 to GIM; Join RACM #2 to GIM
Both RACM datasets are joined to GIM individually by the following steps:
    a. RACM: Set UPN as index
    b. GIM: Set CatalogNumber as index
    c. Join the RACM with GIM by index
    d. Export data as individual files

# 2. List of Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os, math
from collections import Counter

# 3. List of Functions

In [2]:
def date_change(ddmmyyyy):
    # This function changes a dd/mm/yyyy string to mm/dd/yyyy format
    x = ddmmyyyy.split("/")
    return x[1]+"/"+x[0]+"/"+x[2]
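
For a whole column, the same day/month swap can be done without a row-by-row `apply`. The sketch below is an alternative, not the notebook's method; `swap_day_month` is a hypothetical name. It deliberately keeps the dates as strings, because the sentinel `31/12/9999` lies outside the range of nanosecond-resolution pandas timestamps, so parsing it with `pd.to_datetime` may fail.

```python
import pandas as pd

def swap_day_month(dates: pd.Series) -> pd.Series:
    """Rewrite dd/mm/yyyy strings as mm/dd/yyyy; values that do not match are left unchanged."""
    # The three capture groups are day, month, year; the replacement reorders the first two
    return dates.str.replace(r"^(\d{2})/(\d{2})/(\d{4})$", r"\2/\1/\3", regex=True)

sample = pd.Series(["05/04/2021", "31/12/9999"])
swapped = swap_day_month(sample)
print(swapped.tolist())
```

Unlike `date_change`, a value that is not in dd/mm/yyyy form passes through untouched instead of raising an error.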

In [3]:
def read_csv(file_path):
    df = pd.read_csv(file_path, sep='|', names=['Country Code', 'UPN', 'ConFig', 'Empty', 'Start Date', 'End Date'])
    # Create Key and IPN columns ("|" separates UPN from ConFig in the Key so it can be split again later)
    df["Key"] = df["Country Code"] + df["UPN"] + "|" + df["ConFig"]
    df["IPN"] = df["UPN"] + '-' + df["ConFig"]
    # Add the original data
    df_ = pd.read_csv(file_path, names = ["Original Data"])
    df["Original Data"] = df_["Original Data"]
    del df["Empty"]
    # Change the date format
    df["Start Date"] = df["Start Date"].apply(date_change)
    df["End Date"] = df["End Date"].apply(date_change)
    # Remove "0" in front of any UPN that starts with "0" before combining with GIM columns because GIM[Catalog]\
    # does not have the leading "0". 
    df["UPN"] = [s.lstrip("0") for s in df["UPN"]]
    # Set key as index and drop the key
    df = df.set_index("Key", drop = True)
    return df
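
The loader above reads each file twice, once for the parsed columns and once for the raw lines. A minimal sketch of a single-pass variant follows, assuming the input is available as text; `load_racm` and `COLUMNS` are hypothetical names, and the two sample lines are taken from the outputs shown in this notebook. `dtype=str` and `keep_default_na=False` are additions not present in the original code.

```python
import io
import pandas as pd

COLUMNS = ["Country Code", "UPN", "ConFig", "Empty", "Start Date", "End Date"]

def load_racm(text: str) -> pd.DataFrame:
    """Parse pipe-delimited RACM text and keep each raw line next to its parsed row."""
    raw_lines = [line for line in text.splitlines() if line.strip()]
    df = pd.read_csv(
        io.StringIO("\n".join(raw_lines)),
        sep="|",
        names=COLUMNS,
        dtype=str,              # keeps UPNs such as "077.0082" from being parsed as numbers
        keep_default_na=False,  # stops a literal "NA" field from being read as missing
    )
    df["Original Data"] = raw_lines
    df = df.drop(columns="Empty")
    df["Key"] = df["Country Code"] + df["UPN"] + "|" + df["ConFig"]
    df["IPN"] = df["UPN"] + "-" + df["ConFig"]
    # GIM catalog numbers carry no leading "0", so strip it after Key and IPN are built
    df["UPN"] = df["UPN"].str.lstrip("0")
    return df.set_index("Key")

sample_text = "AE|077.0082|C1||05/04/2021|17/03/2023\nAE|80030|C2||15/06/2016|31/12/9999\n"
racm = load_racm(sample_text)
print(racm[["UPN", "IPN", "Original Data"]])
```

Taking the raw lines from the same text as the parsed columns means the "Original Data" column cannot drift out of step with the rows it describes.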

In [4]:
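def choose_country_code(row):
    # After the outer join, a key missing from the older file is NaN (not ""), so test with pd.isna as well
    if pd.isna(row["Country Code_20"]) or row["Country Code_20"]=="":
        return row["Country Code_21"]
    else:
        return row["Country Code_20"]

def choose_UPN(row):
    if pd.isna(row["UPN_20"]) or row["UPN_20"]=="":
        return row["UPN_21"]
    else:
        return row["UPN_20"]

def choose_IPN(row):
    if pd.isna(row["IPN_20"]) or row["IPN_20"]=="":
        return row["IPN_21"]
    else:
        return row["IPN_20"]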

def frequency_graph(s):
    """
    This function will generate a bar graph. X: Country, Y: Frequency (number of times the country code occurs in the data)
    input : s: pandas.Series
    """
    x = Counter(list(s))
    y = list(x.items())
    y.sort(key=lambda x: -x[1])
    plt.figure(figsize=(25,5))
    plt.bar([val[0] for val in y], [val[1] for val in y])
    plt.xlabel('Country Code')
    plt.ylabel('Frequency')
    plt.grid(True)
    plt.show()
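
The same counts can also be obtained from pandas directly. A minimal sketch, assuming only that the input is a Series of country codes; `country_frequencies` is a hypothetical helper and the sample codes are made up for illustration.

```python
import pandas as pd

def country_frequencies(codes: pd.Series) -> pd.Series:
    """Count rows per country code, most frequent first."""
    return codes.value_counts()

codes = pd.Series(["AT", "AE", "AT", "US", "AT", "US"])
freq = country_frequencies(codes)
print(freq)
# To draw the bar graph (requires matplotlib):
# freq.plot.bar(figsize=(25, 5), xlabel="Country Code", ylabel="Frequency", grid=True)
```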

# 4. Import Data

## 4.1 RACM Data #1

Create a dataframe for each file, label the columns, and remove the empty column

In [5]:
# Import and create column names ## OLDEST FILE
RACM_A = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\RACMUPDATE_31DEC2021.txt", sep='|', names=['Country Code', 'UPN', 'ConFig', 'Empty', 'Start Date', 'End Date'])
# Create Key and IPN columns
RACM_A["Key"] = RACM_A["Country Code"]+ RACM_A["UPN"]+ "|" + RACM_A["ConFig"]
RACM_A["IPN"] = RACM_A["UPN"] + '-' + RACM_A["ConFig"]
# Add the original data
RACM_A_ = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\RACMUPDATE_31DEC2021.txt", names = ["Original Data"])
RACM_A["Original Data"] = RACM_A_["Original Data"]
del RACM_A["Empty"]
# Change the date format
RACM_A["Start Date"] = RACM_A["Start Date"].apply(date_change)
RACM_A["End Date"] = RACM_A["End Date"].apply(date_change)
# Remove "0" in front of any UPN that starts with "0" before combining with GIM columns because GIM[Catalog] does not have \
# the leading "0". 
RACM_A["UPN"] = [s.lstrip("0") for s in RACM_A["UPN"]]
# Set key as index and drop the key
RACM_A = RACM_A.set_index("Key", drop = True)
RACM_A

Key,Country Code,UPN,ConFig,Start Date,End Date,IPN,Original Data
AE077.0082|C1,AE,77.0082,C1,04/05/2021,03/17/2023,077.0082-C1,AE|077.0082|C1||05/04/2021|17/03/2023
AR077.0082|C1,AR,77.0082,C1,12/23/2020,12/31/9999,077.0082-C1,AR|077.0082|C1||23/12/2020|31/12/9999
AT077.0082|C1,AT,77.0082,C1,12/23/2020,05/26/2025,077.0082-C1,AT|077.0082|C1||23/12/2020|26/05/2025
AT077.0082|C2,AT,77.0082,C2,10/28/2021,12/31/9999,077.0082-C2,AT|077.0082|C2||28/10/2021|31/12/9999
AU077.0082|C1,AU,77.0082,C1,12/23/2020,12/31/9999,077.0082-C1,AU|077.0082|C1||23/12/2020|31/12/9999
...,...,...,...,...,...,...,...
TRSSUP300STR|S2,TR,SSUP300STR,S2,12/13/2021,12/31/9999,SSUP300STR-S2,TR|SSUP300STR|S2||13/12/2021|31/12/9999
USSSUP300STR|S1,US,SSUP300STR,S1,10/19/2020,12/31/9999,SSUP300STR-S1,US|SSUP300STR|S1||19/10/2020|31/12/9999
USSSUP300STR|S2,US,SSUP300STR,S2,04/08/2021,12/31/9999,SSUP300STR-S2,US|SSUP300STR|S2||08/04/2021|31/12/9999
XISSUP300STR|S2,XI,SSUP300STR,S2,05/20/2021,12/31/9999,SSUP300STR-S2,XI|SSUP300STR|S2||20/05/2021|31/12/9999


## 4.2 RACM Data #2

In [6]:
# Import and create column names ## NEWEST FILE
RACM_B = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\RACMUPDATE_21MAR2022.txt", sep='|', names=['Country Code', 'UPN', 'ConFig', 'Empty', 'Start Date', 'End Date'])
# Create Key and IPN columns
RACM_B["Key"] = RACM_B["Country Code"] + RACM_B["UPN"] + "|" + RACM_B["ConFig"]
RACM_B["IPN"] = RACM_B["UPN"] + '-' + RACM_B["ConFig"]
# Add the original data
RACM_B_ = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\RACMUPDATE_21MAR2022.txt", names = ["Original Data"])
RACM_B["Original Data"] = RACM_B_["Original Data"]
del RACM_B["Empty"]
# Change the date format
RACM_B["Start Date"] = RACM_B["Start Date"].apply(date_change)
RACM_B["End Date"] = RACM_B["End Date"].apply(date_change)
# Remove "0" in front of any UPN that starts with "0" before combining with GIM columns because GIM[Catalog] does not have the leading "0". 
RACM_B["UPN"] = [s.lstrip("0") for s in RACM_B["UPN"]]
# Set key as index and drop the key
RACM_B = RACM_B.set_index("Key", drop = True)
RACM_B

Key,Country Code,UPN,ConFig,Start Date,End Date,IPN,Original Data
AE077.0082|C1,AE,77.0082,C1,04/05/2021,01/26/2025,077.0082-C1,AE|077.0082|C1||05/04/2021|26/01/2025
AR077.0082|C1,AR,77.0082,C1,12/23/2020,12/31/9999,077.0082-C1,AR|077.0082|C1||23/12/2020|31/12/9999
AT077.0082|C1,AT,77.0082,C1,12/23/2020,05/26/2025,077.0082-C1,AT|077.0082|C1||23/12/2020|26/05/2025
AT077.0082|C2,AT,77.0082,C2,10/28/2021,12/31/9999,077.0082-C2,AT|077.0082|C2||28/10/2021|31/12/9999
AU077.0082|C1,AU,77.0082,C1,12/23/2020,12/31/9999,077.0082-C1,AU|077.0082|C1||23/12/2020|31/12/9999
...,...,...,...,...,...,...,...
TRSSUP300STR|S2,TR,SSUP300STR,S2,12/13/2021,12/31/9999,SSUP300STR-S2,TR|SSUP300STR|S2||13/12/2021|31/12/9999
USSSUP300STR|S1,US,SSUP300STR,S1,10/19/2020,12/31/9999,SSUP300STR-S1,US|SSUP300STR|S1||19/10/2020|31/12/9999
USSSUP300STR|S2,US,SSUP300STR,S2,04/08/2021,12/31/9999,SSUP300STR-S2,US|SSUP300STR|S2||08/04/2021|31/12/9999
XISSUP300STR|S2,XI,SSUP300STR,S2,05/20/2021,12/31/9999,SSUP300STR-S2,XI|SSUP300STR|S2||20/05/2021|31/12/9999


## 4.3 GIM Data

In [7]:
# Import GIM data
GIM = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\GIM_21MAR2022.csv", sep=',')
#Change the Catalog Number column to string
GIM["CatalogNumber_index"] = GIM["CatalogNumber"].astype(str)
#Set CatalogNumber as index but also keep the CatalogNumber column in the data
GIM = GIM.set_index(["CatalogNumber_index"], drop = False)
# Select columns of interest
GIM = GIM[["ItemId", "CatalogNumber", "ItemType", "LongDescription"]]
GIM['ItemId']=GIM['ItemId']

GIM

#GIM['CatalogNumber'].unique()
#for i in GIM['CatalogNumber']:
    #print(i)


CatalogNumber_index,ItemId,CatalogNumber,ItemType,LongDescription
H965100430,1076194,H965100430,12-Disposable,GUIDER 7F PRE-SHAPED 40 90CM
H965100440,1076195,H965100440,12-Disposable,GUIDER 8F PRE-SHAPED 40 90CM
H965100480,1076196,H965100480,12-Disposable,GUIDER 8F 90CM MULTI PURPOSE
H965100500,1076197,H965100500,12-Disposable,GUIDER 6F STRAIGHT 90CM
H965100510,1076198,H965100510,12-Disposable,GUIDER 7F STRAIGHT 90CM
...,...,...,...,...
FDE52535,4348136,FDE52535,11-Implant,SURPASS EVOLVE ELITE 5.25MM X 35MM - CE
INC-15123-125,4349125,INC-15123-125,,
INC-15123-146,4349126,INC-15123-146,,
INC-15123-153,4349127,INC-15123-153,,


In [56]:
### Print GIM
# path = "C:\\Users\\kpham\\Desktop\\QB_Project\\GIM_code.csv"
# GIM.to_csv(path, index = True)

# 6. Request 1: Join RACM dataset 1 and RACM dataset 2 to compare if start date and/or end date changed. Split into multiple csv files.

## 6.1 Combine 2020 and 2021 Data

Purpose: to create a linkage between GIM and RACM. The GIM ItemId, ItemType and LongDescription are to be added to RACM as additional columns.

In [8]:
# Join two dataframes (RACM_A & RACM_B) with RACM_A on the left and RACM_B on the right
RACM_AB = RACM_A.join(RACM_B, how='outer', lsuffix="_20", rsuffix="_21")

#################################

# Add a new column to capture the Start_Date change in 2020 & 2021
RACM_AB['Start Date Changed'] = (RACM_AB['Start Date_20']!=RACM_AB['Start Date_21'])

# Add a new column to capture the End_Date change in 2020 & 2021
RACM_AB['End Date Changed'] = (RACM_AB['End Date_20']!=RACM_AB['End Date_21'])

# Add a new column to capture if a product was removed from 2021, but existed in 2020
RACM_AB['Product Removed'] = RACM_AB['UPN_21'].isna()

# Add a new column to capture if a product was added to 2021, but was not in 2020
RACM_AB['Product Added'] = RACM_AB['UPN_20'].isna()

#Add a new column to capture if the raw data has changed between two files
RACM_AB['RAW Data Changed'] = (RACM_AB['Original Data_20'] != RACM_AB['Original Data_21'])

# Select only rows with either start_date change, end_date change or both. This is for information only.
df_RACM_AB_Comparison = RACM_AB[RACM_AB['Start Date Changed']|RACM_AB['End Date Changed']]

####################################
# Create a column named Index to retain the key when exporting the data. Note: the index is dropped from the export if to_csv is called with index=False
RACM_AB['Index'] = RACM_AB.index
RACM_AB["Index"] = RACM_AB["Index"].astype(str)

#Create Country Code column that have data from both 2020 and 2021
RACM_AB["Country Code"] = RACM_AB["Index"].apply(lambda y: y[0:2])

RACM_AB["Index"] = RACM_AB["Index"].apply(lambda y: y[2:])
RACM_AB


Key,Country Code_20,UPN_20,ConFig_20,Start Date_20,End Date_20,IPN_20,Original Data_20,Country Code_21,UPN_21,ConFig_21,...,End Date_21,IPN_21,Original Data_21,Start Date Changed,End Date Changed,Product Removed,Product Added,RAW Data Changed,Index,Country Code
AE077.0082|C1,AE,77.0082,C1,04/05/2021,03/17/2023,077.0082-C1,AE|077.0082|C1||05/04/2021|17/03/2023,AE,77.0082,C1,...,01/26/2025,077.0082-C1,AE|077.0082|C1||05/04/2021|26/01/2025,False,True,False,False,True,077.0082|C1,AE
AE077.0193|C1,AE,77.0193,C1,04/05/2021,07/23/2021,077.0193-C1,AE|077.0193|C1||05/04/2021|23/07/2021,AE,77.0193,C1,...,01/26/2025,077.0193-C1,AE|077.0193|C1||05/04/2021|26/01/2025,False,True,False,False,True,077.0193|C1,AE
AE077.0193|C3,,,,,,,,AE,77.0193,C3,...,01/26/2025,077.0193-C3,AE|077.0193|C3||31/01/2022|26/01/2025,True,True,False,True,True,077.0193|C3,AE
AE80030|C2,AE,80030,C2,06/15/2016,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,AE,80030,C2,...,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,False,False,False,False,False,80030|C2,AE
AE80031|C2,AE,80031,C2,06/15/2016,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,AE,80031,C2,...,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,False,False,False,False,False,80031|C2,AE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ZASSTD300STR|S2,ZA,SSTD300STR,S2,12/14/2021,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,ZA,SSTD300STR,S2,...,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSTD300STR|S2,ZA
ZASSUP215PRE|S2,ZA,SSUP215PRE,S2,12/14/2021,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,ZA,SSUP215PRE,S2,...,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP215PRE|S2,ZA
ZASSUP215STR|S2,ZA,SSUP215STR,S2,12/14/2021,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,ZA,SSUP215STR,S2,...,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP215STR|S2,ZA
ZASSUP300PRE|S2,ZA,SSUP300PRE,S2,12/14/2021,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,ZA,SSUP300PRE,S2,...,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP300PRE|S2,ZA
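
In the output above, the row that exists only in the newer file (`AE077.0193|C3`) is flagged as both "Start Date Changed" and "End Date Changed", because comparing a missing value with `!=` always yields True. If added or removed products should not count as date changes, the comparison can be restricted to keys present in both files. The sketch below is one possible way to do that, not the notebook's method; it uses three rows taken from the outputs above, and `in_old`, `in_new` and `present_in_both` are hypothetical names.

```python
import pandas as pd

old = pd.DataFrame(
    {"Start Date": ["04/05/2021", "06/15/2016"], "End Date": ["03/17/2023", "12/31/9999"]},
    index=pd.Index(["AE077.0082|C1", "AE80030|C2"], name="Key"),
)
new = pd.DataFrame(
    {"Start Date": ["04/05/2021", "01/31/2022", "06/15/2016"],
     "End Date": ["01/26/2025", "01/26/2025", "12/31/9999"]},
    index=pd.Index(["AE077.0082|C1", "AE077.0193|C3", "AE80030|C2"], name="Key"),
)

both = old.join(new, how="outer", lsuffix="_20", rsuffix="_21")

# A key is present in a file when its columns from that file are not missing
in_old = both["Start Date_20"].notna()
in_new = both["Start Date_21"].notna()
present_in_both = in_old & in_new

both["Product Added"] = ~in_old & in_new
both["Product Removed"] = in_old & ~in_new
# Only compare dates where the key exists on both sides
both["Start Date Changed"] = present_in_both & (both["Start Date_20"] != both["Start Date_21"])
both["End Date Changed"] = present_in_both & (both["End Date_20"] != both["End Date_21"])
print(both[["Product Added", "Product Removed", "Start Date Changed", "End Date Changed"]])
```

With this variant, each row falls into exactly one of "added", "removed", "changed" or "unchanged", which may make the per-country exports easier to filter.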


In [9]:

#RACM_AB["Index"] = RACM_AB["Index"].apply(lambda x: x.lstrip("0"))
RACM_AB["Index_GIM"] = RACM_AB["Index"].apply(lambda z: z.split("|", 1)[0])
RACM_AB["Index_GIM"] = [s.lstrip("0") for s in RACM_AB["Index_GIM"]] # Remove the leading "0" since GIM does not have "0"
RACM_AB = RACM_AB.set_index("Index_GIM")
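
The two steps in the cell above (take the part of the key before the first "|", then drop leading zeros) can also be written with pandas string methods. A minimal sketch on three key values of the form shown in the "Index" column; `gim_index` is a hypothetical name.

```python
import pandas as pd

keys = pd.Series(["077.0082|C1", "80030|C2", "SSUP300STR|S2"])

# Split once on "|" (a single-character pattern is treated literally),
# keep the UPN part, then strip the leading zeros that GIM does not carry
gim_index = keys.str.split("|", n=1).str[0].str.lstrip("0")
print(gim_index.tolist())
```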

In [10]:
RACM_AB

Index_GIM,Country Code_20,UPN_20,ConFig_20,Start Date_20,End Date_20,IPN_20,Original Data_20,Country Code_21,UPN_21,ConFig_21,...,End Date_21,IPN_21,Original Data_21,Start Date Changed,End Date Changed,Product Removed,Product Added,RAW Data Changed,Index,Country Code
77.0082,AE,77.0082,C1,04/05/2021,03/17/2023,077.0082-C1,AE|077.0082|C1||05/04/2021|17/03/2023,AE,77.0082,C1,...,01/26/2025,077.0082-C1,AE|077.0082|C1||05/04/2021|26/01/2025,False,True,False,False,True,077.0082|C1,AE
77.0193,AE,77.0193,C1,04/05/2021,07/23/2021,077.0193-C1,AE|077.0193|C1||05/04/2021|23/07/2021,AE,77.0193,C1,...,01/26/2025,077.0193-C1,AE|077.0193|C1||05/04/2021|26/01/2025,False,True,False,False,True,077.0193|C1,AE
77.0193,,,,,,,,AE,77.0193,C3,...,01/26/2025,077.0193-C3,AE|077.0193|C3||31/01/2022|26/01/2025,True,True,False,True,True,077.0193|C3,AE
80030,AE,80030,C2,06/15/2016,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,AE,80030,C2,...,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,False,False,False,False,False,80030|C2,AE
80031,AE,80031,C2,06/15/2016,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,AE,80031,C2,...,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,False,False,False,False,False,80031|C2,AE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SSTD300STR,ZA,SSTD300STR,S2,12/14/2021,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,ZA,SSTD300STR,S2,...,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSTD300STR|S2,ZA
SSUP215PRE,ZA,SSUP215PRE,S2,12/14/2021,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,ZA,SSUP215PRE,S2,...,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP215PRE|S2,ZA
SSUP215STR,ZA,SSUP215STR,S2,12/14/2021,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,ZA,SSUP215STR,S2,...,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP215STR|S2,ZA
SSUP300PRE,ZA,SSUP300PRE,S2,12/14/2021,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,ZA,SSUP300PRE,S2,...,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,False,False,False,False,False,SSUP300PRE|S2,ZA


In [11]:
# RACM_AB["IPN"] = RACM_AB["Index"].apply(lambda z: z.replace("|", "-"))#.apply(lambda y: y[2:]).apply(lambda z: z.replace("|", "-"))
# RACM_AB

In [12]:
#######edit with Kieu/Tan 06/02/2021#########

# Create IPN and UPN columns that have data from both 2020 and 2021
RACM_AB["IPN"] = RACM_AB["Index"].apply(lambda z: z.replace("|", "-"))#.apply(lambda y: y[2:]).apply(lambda z: z.replace("|", "-"))

RACM_AB["UPN"] = RACM_AB["Index"].apply(lambda z: z.split("|", 1)[0])#.apply(lambda x: x.lstrip("0"))
RACM_AB["ConFig"] = RACM_AB["Index"].apply(lambda z: z.split("|", 1)[1])
#RACM_AB.drop(columns = ["IPN_20", "UPN_20", "UPN_21", "IPN_21"], inplace = True)

RACM_AB['AB_Index'] = RACM_AB.index

# Get a list of all unique countries for exporting by each country individually for graphing
#RACM_AB['Country Code'] = RACM_AB.apply(choose_country_code, axis=1)
country_list = RACM_AB["Country Code"].unique()
print(len(country_list))
country_list

RACM_AB['Country Code'] = RACM_AB['Country Code'].astype(str)
List_of_Countries = RACM_AB['Country Code'].unique()

# Change data from object to string to retain the leading "0"
RACM_AB['UPN_20'] = RACM_AB['UPN_20'].apply(str)
RACM_AB['UPN_21'] = RACM_AB['UPN_21'].apply(str)
RACM_AB['IPN_20'] = RACM_AB['IPN_20'].apply(str)
RACM_AB['IPN_21'] = RACM_AB['IPN_21'].apply(str)

RACM_AB

92


Index_GIM,Country Code_20,UPN_20,ConFig_20,Start Date_20,End Date_20,IPN_20,Original Data_20,Country Code_21,UPN_21,ConFig_21,...,End Date Changed,Product Removed,Product Added,RAW Data Changed,Index,Country Code,IPN,UPN,ConFig,AB_Index
77.0082,AE,77.0082,C1,04/05/2021,03/17/2023,077.0082-C1,AE|077.0082|C1||05/04/2021|17/03/2023,AE,77.0082,C1,...,True,False,False,True,077.0082|C1,AE,077.0082-C1,077.0082,C1,77.0082
77.0193,AE,77.0193,C1,04/05/2021,07/23/2021,077.0193-C1,AE|077.0193|C1||05/04/2021|23/07/2021,AE,77.0193,C1,...,True,False,False,True,077.0193|C1,AE,077.0193-C1,077.0193,C1,77.0193
77.0193,,,,,,,,AE,77.0193,C3,...,True,False,True,True,077.0193|C3,AE,077.0193-C3,077.0193,C3,77.0193
80030,AE,80030,C2,06/15/2016,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,AE,80030,C2,...,False,False,False,False,80030|C2,AE,80030-C2,80030,C2,80030
80031,AE,80031,C2,06/15/2016,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,AE,80031,C2,...,False,False,False,False,80031|C2,AE,80031-C2,80031,C2,80031
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SSTD300STR,ZA,SSTD300STR,S2,12/14/2021,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,ZA,SSTD300STR,S2,...,False,False,False,False,SSTD300STR|S2,ZA,SSTD300STR-S2,SSTD300STR,S2,SSTD300STR
SSUP215PRE,ZA,SSUP215PRE,S2,12/14/2021,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,ZA,SSUP215PRE,S2,...,False,False,False,False,SSUP215PRE|S2,ZA,SSUP215PRE-S2,SSUP215PRE,S2,SSUP215PRE
SSUP215STR,ZA,SSUP215STR,S2,12/14/2021,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,ZA,SSUP215STR,S2,...,False,False,False,False,SSUP215STR|S2,ZA,SSUP215STR-S2,SSUP215STR,S2,SSUP215STR
SSUP300PRE,ZA,SSUP300PRE,S2,12/14/2021,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,ZA,SSUP300PRE,S2,...,False,False,False,False,SSUP300PRE|S2,ZA,SSUP300PRE-S2,SSUP300PRE,S2,SSUP300PRE
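
The plan calls for one csv per country, using the list of unique country codes built above. A minimal sketch of that export with `groupby` follows; `export_by_country` and the `RACM_<code>.csv` file-name pattern are hypothetical, and the demo writes to a temporary directory rather than to the paths used in this notebook.

```python
import tempfile
from pathlib import Path

import pandas as pd

def export_by_country(df: pd.DataFrame, out_dir: Path, column: str = "Country Code") -> list:
    """Write one CSV per distinct value of `column`; return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for code, group in df.groupby(column):
        path = out_dir / f"RACM_{code}.csv"  # hypothetical file-name pattern
        group.to_csv(path, index=True)
        written.append(path)
    return written

demo = pd.DataFrame(
    {"Country Code": ["AE", "AE", "ZA"], "IPN": ["077.0082-C1", "80030-C2", "SSUP300PRE-S2"]},
    index=pd.Index(["77.0082", "80030", "SSUP300PRE"], name="Index_GIM"),
)

with tempfile.TemporaryDirectory() as tmp:
    paths = export_by_country(demo, Path(tmp))
    names = sorted(p.name for p in paths)
    ae_rows = len(pd.read_csv(paths[0]))  # groupby sorts by key, so the first file is AE

print(names, ae_rows)
```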


# 7. Request 2: Add GIM database to the joint RACM data from request 1

## 7.1 Join two dataframes (RACM_AB & GIM)

In [13]:
GIM

CatalogNumber_index,ItemId,CatalogNumber,ItemType,LongDescription
H965100430,1076194,H965100430,12-Disposable,GUIDER 7F PRE-SHAPED 40 90CM
H965100440,1076195,H965100440,12-Disposable,GUIDER 8F PRE-SHAPED 40 90CM
H965100480,1076196,H965100480,12-Disposable,GUIDER 8F 90CM MULTI PURPOSE
H965100500,1076197,H965100500,12-Disposable,GUIDER 6F STRAIGHT 90CM
H965100510,1076198,H965100510,12-Disposable,GUIDER 7F STRAIGHT 90CM
...,...,...,...,...
FDE52535,4348136,FDE52535,11-Implant,SURPASS EVOLVE ELITE 5.25MM X 35MM - CE
INC-15123-125,4349125,INC-15123-125,,
INC-15123-146,4349126,INC-15123-146,,
INC-15123-153,4349127,INC-15123-153,,


In [14]:
RACM_AB

Index_GIM,Country Code_20,UPN_20,ConFig_20,Start Date_20,End Date_20,IPN_20,Original Data_20,Country Code_21,UPN_21,ConFig_21,...,End Date Changed,Product Removed,Product Added,RAW Data Changed,Index,Country Code,IPN,UPN,ConFig,AB_Index
77.0082,AE,77.0082,C1,04/05/2021,03/17/2023,077.0082-C1,AE|077.0082|C1||05/04/2021|17/03/2023,AE,77.0082,C1,...,True,False,False,True,077.0082|C1,AE,077.0082-C1,077.0082,C1,77.0082
77.0193,AE,77.0193,C1,04/05/2021,07/23/2021,077.0193-C1,AE|077.0193|C1||05/04/2021|23/07/2021,AE,77.0193,C1,...,True,False,False,True,077.0193|C1,AE,077.0193-C1,077.0193,C1,77.0193
77.0193,,,,,,,,AE,77.0193,C3,...,True,False,True,True,077.0193|C3,AE,077.0193-C3,077.0193,C3,77.0193
80030,AE,80030,C2,06/15/2016,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,AE,80030,C2,...,False,False,False,False,80030|C2,AE,80030-C2,80030,C2,80030
80031,AE,80031,C2,06/15/2016,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,AE,80031,C2,...,False,False,False,False,80031|C2,AE,80031-C2,80031,C2,80031
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SSTD300STR,ZA,SSTD300STR,S2,12/14/2021,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,ZA,SSTD300STR,S2,...,False,False,False,False,SSTD300STR|S2,ZA,SSTD300STR-S2,SSTD300STR,S2,SSTD300STR
SSUP215PRE,ZA,SSUP215PRE,S2,12/14/2021,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,ZA,SSUP215PRE,S2,...,False,False,False,False,SSUP215PRE|S2,ZA,SSUP215PRE-S2,SSUP215PRE,S2,SSUP215PRE
SSUP215STR,ZA,SSUP215STR,S2,12/14/2021,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,ZA,SSUP215STR,S2,...,False,False,False,False,SSUP215STR|S2,ZA,SSUP215STR-S2,SSUP215STR,S2,SSUP215STR
SSUP300PRE,ZA,SSUP300PRE,S2,12/14/2021,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,ZA,SSUP300PRE,S2,...,False,False,False,False,SSUP300PRE|S2,ZA,SSUP300PRE-S2,SSUP300PRE,S2,SSUP300PRE


In [16]:
#Join two dataframes(RACM_AB & GIM) with RACM_AB on the left and GIM on the right

RACM_AB_GIM = RACM_AB.join(GIM)
RACM_AB_GIM["index"] = RACM_AB_GIM.index

RACM_AB_GIM.info()

<class 'pandas.core.frame.DataFrame'>
Index: 448718 entries, 100FPP to SSUP300STR
Data columns (total 30 columns):
 #   Column              Non-Null Count   Dtype 
---  ------              --------------   ----- 
 0   Country Code_20     447179 non-null  object
 1   UPN_20              448718 non-null  object
 2   ConFig_20           447179 non-null  object
 3   Start Date_20       447179 non-null  object
 4   End Date_20         447179 non-null  object
 5   IPN_20              448718 non-null  object
 6   Original Data_20    447179 non-null  object
 7   Country Code_21     448686 non-null  object
 8   UPN_21              448718 non-null  object
 9   ConFig_21           448686 non-null  object
 10  Start Date_21       448686 non-null  object
 11  End Date_21         448686 non-null  object
 12  IPN_21              448718 non-null  object
 13  Original Data_21    448686 non-null  object
 14  Start Date Changed  448718 non-null  bool  
 15  End Date Changed    448718 non-null  bool  
 16
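
`DataFrame.join` defaults to a left join, so RACM rows without a GIM match are kept with empty GIM columns. To see how many rows went unmatched, and to guard against a repeated catalog number in GIM multiplying RACM rows, the same join can be expressed with `merge`. This is a sketch on toy data (one GIM row and three RACM rows modelled on values shown in this notebook), not the notebook's own code; `unmatched` is a hypothetical name.

```python
import pandas as pd

racm = pd.DataFrame(
    {"Country Code": ["AR", "AT", "AE"]},
    index=pd.Index(["100FPP", "100FPP", "77.0082"], name="Index_GIM"),
)
gim = pd.DataFrame(
    {"ItemId": [1111606], "ItemType": ["12-Disposable"]},
    index=pd.Index(["100FPP"], name="CatalogNumber_index"),
)

joined = racm.merge(
    gim,
    how="left",
    left_index=True,
    right_index=True,
    validate="many_to_one",  # raises MergeError if a catalog number repeats in gim
    indicator=True,          # adds a "_merge" column: "both" or "left_only"
)

# Rows marked "left_only" are RACM records with no GIM entry
unmatched = int((joined["_merge"] == "left_only").sum())
print(len(joined), unmatched)
```

The row count of the result equals the RACM row count whenever the `many_to_one` check passes, which gives a quick sanity check before exporting.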

## 7.2 Export data as a single file

In [17]:
path = r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\21MAR2022\RACM_AB_31DEC2021_21MAR2022.csv"
RACM_AB_GIM.to_csv(path, index = True)

RACM_AB_GIM.info()

<class 'pandas.core.frame.DataFrame'>
Index: 448718 entries, 100FPP to SSUP300STR
Data columns (total 30 columns):
 #   Column              Non-Null Count   Dtype 
---  ------              --------------   ----- 
 0   Country Code_20     447179 non-null  object
 1   UPN_20              448718 non-null  object
 2   ConFig_20           447179 non-null  object
 3   Start Date_20       447179 non-null  object
 4   End Date_20         447179 non-null  object
 5   IPN_20              448718 non-null  object
 6   Original Data_20    447179 non-null  object
 7   Country Code_21     448686 non-null  object
 8   UPN_21              448718 non-null  object
 9   ConFig_21           448686 non-null  object
 10  Start Date_21       448686 non-null  object
 11  End Date_21         448686 non-null  object
 12  IPN_21              448718 non-null  object
 13  Original Data_21    448686 non-null  object
 14  Start Date Changed  448718 non-null  bool  
 15  End Date Changed    448718 non-null  bool  
 16

(index),Country Code,UPN,ConFig,Start Date,End Date,IPN,Original Data,ItemId,CatalogNumber,ItemType,LongDescription,index
AE077.0082|C1,AE,77.0082,C1,04/05/2021,01/26/2025,077.0082-C1,AE|077.0082|C1||05/04/2021|26/01/2025,,,,,AE077.0082|C1
AE077.0193|C1,AE,77.0193,C1,04/05/2021,01/26/2025,077.0193-C1,AE|077.0193|C1||05/04/2021|26/01/2025,,,,,AE077.0193|C1
AE077.0193|C3,AE,77.0193,C3,01/31/2022,01/26/2025,077.0193-C3,AE|077.0193|C3||31/01/2022|26/01/2025,,,,,AE077.0193|C3
AE80030|C2,AE,80030,C2,06/15/2016,12/31/9999,80030-C2,AE|80030|C2||15/06/2016|31/12/9999,,,,,AE80030|C2
AE80031|C2,AE,80031,C2,06/15/2016,12/31/9999,80031-C2,AE|80031|C2||15/06/2016|31/12/9999,,,,,AE80031|C2
...,...,...,...,...,...,...,...,...,...,...,...,...
ZASSTD300STR|S2,ZA,SSTD300STR,S2,12/14/2021,12/31/9999,SSTD300STR-S2,ZA|SSTD300STR|S2||14/12/2021|31/12/9999,,,,,ZASSTD300STR|S2
ZASSUP215PRE|S2,ZA,SSUP215PRE,S2,12/14/2021,12/31/9999,SSUP215PRE-S2,ZA|SSUP215PRE|S2||14/12/2021|31/12/9999,,,,,ZASSUP215PRE|S2
ZASSUP215STR|S2,ZA,SSUP215STR,S2,12/14/2021,12/31/9999,SSUP215STR-S2,ZA|SSUP215STR|S2||14/12/2021|31/12/9999,,,,,ZASSUP215STR|S2
ZASSUP300PRE|S2,ZA,SSUP300PRE,S2,12/14/2021,12/31/9999,SSUP300PRE-S2,ZA|SSUP300PRE|S2||14/12/2021|31/12/9999,,,,,ZASSUP300PRE|S2


# 8. Request 3: Join RACM #1 to GIM; Join RACM #2 to GIM

## 8.1 Reset Index

In [14]:
#Reset index
RACM_A = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\09MAR2022\RACMUPDATE_9MAR2022.txt", sep='|', names=['Country Code', 'UPN', 'ConFig', 'Empty', 'Start Date', 'End Date'])

#Take away leading "0" from the UPN. GIM data does not have leading "0" in their CatalogNumber(UPN)
RACM_A["UPN"]= RACM_A["UPN"].apply(lambda x: x.lstrip("0"))
RACM_A["Original Data"] = RACM_A_["Original Data"]

#Reset the index. This will help join with GIM data 
RACM_A.set_index("UPN",inplace = True)


RACM_A

UPN,Country Code,ConFig,Empty,Start Date,End Date,Original Data
77.0082,AE,C1,,05/04/2021,26/01/2025,AE|077.0082|C1||05/04/2021|26/01/2025
77.0082,AR,C1,,23/12/2020,31/12/9999,AR|077.0082|C1||23/12/2020|31/12/9999
77.0082,AT,C1,,23/12/2020,26/05/2025,AT|077.0082|C1||23/12/2020|26/05/2025
77.0082,AT,C2,,28/10/2021,31/12/9999,AT|077.0082|C2||28/10/2021|31/12/9999
77.0082,AU,C1,,23/12/2020,31/12/9999,AU|077.0082|C1||23/12/2020|31/12/9999
...,...,...,...,...,...,...
SSUP300STR,TR,S2,,13/12/2021,31/12/9999,TR|SSUP300STR|S2||13/12/2021|31/12/9999
SSUP300STR,US,S1,,19/10/2020,31/12/9999,US|SSUP300STR|S1||19/10/2020|31/12/9999
SSUP300STR,US,S2,,08/04/2021,31/12/9999,US|SSUP300STR|S2||08/04/2021|31/12/9999
SSUP300STR,XI,S2,,20/05/2021,31/12/9999,XI|SSUP300STR|S2||20/05/2021|31/12/9999


## 8.2 Join RACM Dataset # 1 to GIM

In [15]:
RACM_A_GIM = RACM_A.join(GIM)
RACM_A_GIM


(index),Country Code,ConFig,Empty,Start Date,End Date,Original Data,ItemId,CatalogNumber,ItemType,LongDescription
100FPP,AR,S1,,04/11/2014,31/12/9999,AR|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AR,S2,,06/11/2014,31/12/9999,AR|100FPP|S2||06/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AT,S1,,04/11/2014,31/12/9999,AT|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AU,S1,,04/11/2014,31/12/9999,AU|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,BE,S1,,04/11/2014,31/12/9999,BE|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
...,...,...,...,...,...,...,...,...,...,...
SSUP300STR,TR,S2,,13/12/2021,31/12/9999,TR|SSUP300STR|S2||13/12/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,US,S1,,19/10/2020,31/12/9999,US|SSUP300STR|S1||19/10/2020|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,US,S2,,08/04/2021,31/12/9999,US|SSUP300STR|S2||08/04/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,XI,S2,,20/05/2021,31/12/9999,XI|SSUP300STR|S2||20/05/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM


In [38]:
## 8.3 Join RACM Dataset # 2 to GIM

In [77]:
#Reset index
RACM_B = pd.read_csv(r"C:\Users\kpham\Desktop\RACM_DataCodeFiles\04JAN2022\RACM_B\RACMUPDATE_31DEC2021.txt", sep='|', names=['Country Code', 'UPN', 'ConFig', 'Empty', 'Start Date', 'End Date'])

#Remove the leading "0" from the UPN. GIM data does not have a leading "0" in its CatalogNumber (UPN)
RACM_B["UPN"]= RACM_B["UPN"].apply(lambda x: x.lstrip("0"))

####Added on 05JAN2022 so that the original data shows up in the output
RACM_B["Original Data"] = RACM_B_["Original Data"]
###########


#Set UPN as the index. This will help join with GIM data 
RACM_B.set_index("UPN",inplace = True)


RACM_B

UPN,Country Code,ConFig,Empty,Start Date,End Date,Original Data
77.0082,AE,C1,,05/04/2021,17/03/2023,AE|077.0082|C1||05/04/2021|17/03/2023
77.0082,AR,C1,,23/12/2020,31/12/9999,AR|077.0082|C1||23/12/2020|31/12/9999
77.0082,AT,C1,,23/12/2020,26/05/2025,AT|077.0082|C1||23/12/2020|26/05/2025
77.0082,AT,C2,,28/10/2021,31/12/9999,AT|077.0082|C2||28/10/2021|31/12/9999
77.0082,AU,C1,,23/12/2020,31/12/9999,AU|077.0082|C1||23/12/2020|31/12/9999
...,...,...,...,...,...,...
SSUP300STR,TR,S2,,13/12/2021,31/12/9999,TR|SSUP300STR|S2||13/12/2021|31/12/9999
SSUP300STR,US,S1,,19/10/2020,31/12/9999,US|SSUP300STR|S1||19/10/2020|31/12/9999
SSUP300STR,US,S2,,08/04/2021,31/12/9999,US|SSUP300STR|S2||08/04/2021|31/12/9999
SSUP300STR,XI,S2,,20/05/2021,31/12/9999,XI|SSUP300STR|S2||20/05/2021|31/12/9999
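
The leading-zero step relies on every UPN being read as text. Below is a minimal sketch, using two made-up rows in the same pipe-delimited layout, that forces `UPN` to string on import and strips the zeros with the vectorized `.str.lstrip`. The sample rows and the `raw`, `cols` and `df` names are assumptions for illustration only.

```python
import io

import pandas as pd

# Two made-up pipe-delimited rows in the same layout as the RACM text file.
raw = (
    "AE|077.0082|C1||05/04/2021|17/03/2023\n"
    "US|00123|S1||19/10/2020|31/12/9999\n"
)
cols = ["Country Code", "UPN", "ConFig", "Empty", "Start Date", "End Date"]

# dtype=str on UPN stops pandas from parsing numeric-looking codes as numbers,
# which would silently drop the zeros and break the string method below.
df = pd.read_csv(io.StringIO(raw), sep="|", names=cols, dtype={"UPN": str})

# Vectorized equivalent of .apply(lambda x: x.lstrip("0"))
df["UPN"] = df["UPN"].str.lstrip("0")
print(df["UPN"].tolist())
```

Unlike the `apply` with a lambda, `.str.lstrip` passes missing values through as NaN instead of raising an error.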


In [78]:
RACM_B_GIM = RACM_B.join(GIM)
RACM_B_GIM

Unnamed: 0,Country Code,ConFig,Empty,Start Date,End Date,Original Data,ItemId,CatalogNumber,ItemType,LongDescription
100FPP,AR,S1,,04/11/2014,31/12/9999,AR|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AR,S2,,06/11/2014,31/12/9999,AR|100FPP|S2||06/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AT,S1,,04/11/2014,31/12/9999,AT|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,AU,S1,,04/11/2014,31/12/9999,AU|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
100FPP,BE,S1,,04/11/2014,31/12/9999,BE|100FPP|S1||04/11/2014|31/12/9999,1111606,100FPP,12-Disposable,SURPASS 3MM X 15MM - CE
...,...,...,...,...,...,...,...,...,...,...
SSUP300STR,TR,S2,,13/12/2021,31/12/9999,TR|SSUP300STR|S2||13/12/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,US,S1,,19/10/2020,31/12/9999,US|SSUP300STR|S1||19/10/2020|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,US,S2,,08/04/2021,31/12/9999,US|SSUP300STR|S2||08/04/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM
SSUP300STR,XI,S2,,20/05/2021,31/12/9999,XI|SSUP300STR|S2||20/05/2021|31/12/9999,4284913,SSUP300STR,12-Disposable,SYNCHRO SELECT-14 SUPPORT ST 300CM


In [69]:
GIM

CatalogNumber_index,ItemId,CatalogNumber,ItemType,LongDescription
H965100430,1076194,H965100430,12-Disposable,GUIDER 7F PRE-SHAPED 40 90CM
H965100440,1076195,H965100440,12-Disposable,GUIDER 8F PRE-SHAPED 40 90CM
H965100480,1076196,H965100480,12-Disposable,GUIDER 8F 90CM MULTI PURPOSE
H965100500,1076197,H965100500,12-Disposable,GUIDER 6F STRAIGHT 90CM
H965100510,1076198,H965100510,12-Disposable,GUIDER 7F STRAIGHT 90CM
...,...,...,...,...
FD17D35021,4347362,FD17D35021,,
FD17D35022,4347363,FD17D35022,,
FD17D35023,4347364,FD17D35023,,
FD17D35024,4347365,FD17D35024,,


## 8.4 Export RACM Data with GIM Data

In [17]:
# Export RACM + GIM Dataset A Single File
path = "C:\\Users\\kpham\\Desktop\\RACM_DataCodeFiles\\09MAR2022\\RACM_GIM_09MAR2022.csv"
RACM_A_GIM.to_csv(path, index = False)

In [49]:
# Export RACM + GIM Dataset B
path = "C:\\Users\\kpham\\Desktop\\RACM_11JUN2021\\Export\\RACM_B_GIM_10JUN2021.csv"
RACM_B_GIM.to_csv(path, index = False)

In [79]:
# Export RACM + GIM Dataset B Multiple Files
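for country in List_of_Countries:
    path = "C:\\Users\\kpham\\Desktop\\RACM_DataCodeFiles\\04JAN2022\\RACM_2021_SPLIT\\" + country + "_GIM.csv"
    # Keep only the rows for this country, then write them to their own file
    country_rows = RACM_B_GIM[RACM_B_GIM["Country Code"] == country]
    country_rows.to_csv(path, index = False)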

In [20]:
# Export RACM_AB_GIM Dataset Multiple Files
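for country in List_of_Countries:
    path = "C:\\Users\\kpham\\Desktop\\DL2_Comparision\\Export\\AB_GIM_CountrySplit\\" + country + "_GIM.csv"
    # Keep only the rows for this country, then write them to their own file
    country_rows = RACM_AB_GIM[RACM_AB_GIM["Country Code"] == country]
    country_rows.to_csv(path, index = False)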

In [None]:
# Export RACM Dataset #2 Multiple Files
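for country in List_of_Countries:
    path = "C:\\Users\\kcheung1\\Code\\Project - Quickbase\\Export_all\\" + country + ".csv"
    # Keep only the rows for this country, then write them to their own file
    country_rows = RAC_042021_GIM[RAC_042021_GIM["Country Code"] == country]
    country_rows.to_csv(path, index = False)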
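
The per-country export loops above filter the full dataframe once per country. An alternative sketch, assuming the same `Country Code` column, uses `groupby` to split the data in a single pass. The toy dataframe, the temporary output folder and the names `df`, `out_dir` and `written` are hypothetical stand-ins, not the real paths or data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for RACM_B_GIM; the values are illustrative only.
df = pd.DataFrame(
    {"Country Code": ["AR", "AR", "US"], "ConFig": ["S1", "S2", "S1"]},
    index=pd.Index(["100FPP", "100FPP", "SSUP300STR"], name="UPN"),
)

out_dir = Path(tempfile.mkdtemp())  # hypothetical output folder

# groupby yields one (country code, sub-dataframe) pair per country.
written = []
for country, subset in df.groupby("Country Code"):
    path = out_dir / f"{country}_GIM.csv"
    subset.to_csv(path, index=True)  # index=True keeps the UPN in the file
    written.append(path.name)

print(sorted(written))
```

Note that `index = False`, as used in the export cells above, leaves the UPN index out of the file. The sketch passes `index=True` so that each country file still carries the UPN.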

THE END