## Workday Data Issue Semi-automation Tool (for action D only)

This semi-automation tool facilitates the manual data deletion for Workday data issues.   
  
**Attention:** 
1. This tool only handles action type `D`.    
1. Action `U` is handled by another notebook.   
1. Actions `US`, `DS` and `R` are out of scope; no tool is used to handle them. 


**Steps**

+ Step 1: **Check** whether there are any open entries in the Box file. 
https://ibm.ent.box.com/file/731566115825

+ Step 2: **Validate** the CNUMs and field names
+ _Step 3: If there are open entries in the Box file, **create a Jira ticket** to handle them._  
https://jsw.ibm.com/browse/ODMODMR-91

+ _Step 4: **Update the Box file** spreadsheet with the correct Jira ticket number, and change the status from "Not Started" to "In Processing"_

+ Step 5: Read the spreadsheet and **generate the contents** for the Jira task (for action = 'D' and status = 'In Processing')

+ Step 6: Prepare **REK input** csv file

+ Step 7: **Upload the CSV** file to the MF server 

+ _Step 8: Execute the **RDSU load process** to load the REK table_
+ _Step 9: Execute the **CCQ change process** to perform the deletion_
+ _Step 10: **Confirm** the result_
+ _Step 11: Update the JIRA task and box spreadsheet to **close** the ticket_

In [None]:
# Parameters
user = 'C943511'  # the userid in the JCL ODMSUBRJ, can be changed to Hans's ID e.g.
user_email = 'IBMCN.JIANGLEI -'
#user = 'NL62958'  # the userid in the JCL ODMSUBRJ, can be changed to Hans's ID e.g.
file_id = '731566115825'  # the box file id which holds the request spreadsheet
tbname = 'ODMT_EMPLOYEE'  # you can use the odm developer view if you don't have the access to E01
#tbname = 'ODMH_EMPLOYEE'  # you can use the odm developer view if you don't have the access to E01
action = 'D'
file_id = '761862313652'  # test file; this assignment overrides the file_id set above


In [None]:
import sys
sys.path.append('/odm_modules')
from common_func import odm_conn
sys.path.append('/app')
#from BOX import box_oauth as box
from BOX import box_jwt as box
import pandas as pd
import numpy as np
import datetime 
def cell_format(v):
    """Normalize a cell value to a clean string."""
    if v is None:
        return ''
    if isinstance(v, bytes):
        return v.decode('utf-8')
    if isinstance(v, datetime.datetime):
        return str(v.date())
    if isinstance(v, int):
        return str(v)
    if isinstance(v, float):
        return str(int(v))
    return str(v).strip()

### Step 1 Read the file from the Box folder to check whether there are any open items

+ If there are open entries, show them and do some pre-validation before processing.
+ If there are no open entries, STOP here. 

In [None]:
client = box.get_box_client()
xlsx_file = client.file(file_id).get().name
print('File name for the data change request is:\n\t{}'.format(xlsx_file))
xlsx_content = client.file(file_id).content()
df = pd.read_excel(xlsx_content, sheet_name = 'Manual Action request', header =1 ).fillna('').applymap(cell_format)
df = df.loc[df.RCNUM != '',:]
request_df = df.loc[(df.status.isin(['Not Started', ''])) & (df.action == action)]
print('there are {} pending requests from Workday team'.format(request_df.shape[0]))
request_df
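
The filter in the cell above can be tried without Box access. The following is a minimal sketch, assuming a tiny in-memory frame in place of the 'Manual Action request' sheet; the sample rows and the helper name `open_requests` are hypothetical and only illustrate the selection logic.

```python
import pandas as pd

# Hypothetical rows standing in for the 'Manual Action request' sheet.
sample = pd.DataFrame({
    'RCNUM':  ['000001', '000002', '000003', ''],
    'action': ['D', 'U', 'D', 'D'],
    'status': ['Not Started', 'Not Started', 'Closed', ''],
})

def open_requests(df, action):
    """Return rows for `action` whose status is blank or 'Not Started'."""
    df = df.loc[df.RCNUM != '', :]  # drop rows without a CNUM
    return df.loc[df.status.isin(['Not Started', '']) & (df.action == action)]

pending = open_requests(sample, 'D')
print(pending.RCNUM.tolist())
```

Only the first sample row survives: the second has action `U`, the third is already closed, and the fourth has no CNUM.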

### Step 2 Validation (only validates tickets that are in status 'Not Started')

We perform the validation before raising the Jira task.  
+ Step 2.1 CNUM validation


In [None]:
# step 2.1 Cnum validation to check if the cnum could be found in ODM
cnums_from_file = set(request_df.RCNUM.unique())
cnum_list = ','.join(["'{}'".format(cnum.strip()) for cnum in cnums_from_file])
if len(cnums_from_file) !=0: 
    sql = 'SELECT RCNUM, DUPDATE, CACTIVE FROM ODMPRD.{} WHERE RCNUM IN ({})  '.format(tbname, cnum_list)
    #print(sql)
    with odm_conn.odm_adhoc('prod') as odmprd_adhoc:
        result = odmprd_adhoc(sql)
    result_df = pd.DataFrame(result)
    # an empty result may have no RCNUM column, so treat it as "nothing found"
    cnums_result = set(result_df.RCNUM.str.strip()) if not result_df.empty else set()
    diff = cnums_from_file - cnums_result
    if len(diff) != 0: 
        print('ERROR!!!!: The following CNUMs could not be found in the ODM database, please request the WD COE team to check:\n {}'.format(diff))
    else: 
        print('CNUM VALIDATION PASSED!')
        print('IMPACTED {} CNUMS :'.format(len(cnums_result)))
        for cnum in cnums_result: 
            print('\t{}'.format(cnum))

else:
    print('no pending requests in the spreadsheet!')
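
The validation above is essentially a set difference between the requested CNUMs and the CNUMs returned by the query. Here is a minimal sketch of that idea, assuming plain Python lists in place of the spreadsheet column and the database result; the function name `missing_cnums` and the sample CNUMs are hypothetical.

```python
def missing_cnums(requested, found):
    """Return requested CNUMs that are absent from the query result, sorted."""
    requested = {c.strip() for c in requested if c.strip()}
    found = {c.strip() for c in found}  # database values may be space-padded
    return sorted(requested - found)

requested = ['0A1234897', '0B5678897 ', '0C9012897']
found_in_db = ['0A1234897   ', '0C9012897']
missing = missing_cnums(requested, found_in_db)
print(missing)
```

An empty list means every requested CNUM was found; anything else should go back to the WD COE team before a ticket is raised.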

### _Step 3:  Create the JIRA task to address data change request_

If there are open entries in the Box file, create a Jira ticket to handle them.   
Make sure the Jira task is under the following epic:   
https://jsw.ibm.com/browse/ODMODMR-91

**Please create separate Jira tasks for U and D.**


###  _Step 4: Update the box file spreadsheet_
+ Put the jira task number in the spreadsheet  
https://ibm.ent.box.com/file/731566115825
+ Put your name in the spreadsheet
+ Change the status from "Not Started" to "In Processing"

### Step 5: Read the spreadsheet and generate the contents for the Jira task (for action = 'D' and status = 'In Processing') 

In [None]:
# step 5.1 
# read all the ticket for deletion requests
client = box.get_box_client()
xlsx_file = client.file(file_id).get().name
xlsx_content = client.file(file_id).content()
df = pd.read_excel(xlsx_content, sheet_name = 'Manual Action request', header =1 ).fillna('').applymap(cell_format)
df = df.loc[df.RCNUM != '',:]
request_df = df.loc[(df.status == 'In Processing') & (df.action == action)]
print('there are {} pending {} requests from Workday team'.format(request_df.shape[0], action))
# get the Category information and task number
cat_df = pd.read_excel(xlsx_content, sheet_name = 1, header = 1, dtype = str).fillna('')
jira_nbrs = request_df.RTC_task_number.unique()
#rtc_nums = ['RA' + rtc_num.split('-')[1].zfill(5) for rtc_num in rtc_nums]
#rtc_num = rtc_nums[0]
#print('\nthe jira number is {}'.format(rtc_num))
request_df

In [None]:
print('there are {} jira ticket(s) for {} action'.format(len(jira_nbrs), action))
for jira_nbr in jira_nbrs: 
    print('https://jsw.ibm.com/browse/{}'.format(jira_nbr))
if len(jira_nbrs) >= 2: 
    while True: 
        print('which ticket should be handled this time? ({})'.format(', '.join(jira_nbrs)))
        jira_nbr = input()
        if jira_nbr in jira_nbrs:
            print('{} will be handled this time'.format(jira_nbr))
            break
        else: 
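            print('{} is not a jira ticket number in status "In Processing", please try again'.format(jira_nbr))
if len(jira_nbrs) == 1: 
    jira_nbr = jira_nbrs[0]
if len(jira_nbrs) == 0:
    # without this guard the next line fails with a NameError on jira_nbr
    raise RuntimeError('no "In Processing" {} requests found, nothing to handle'.format(action))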

rtc_num = 'RA' + jira_nbr.split('-')[1].zfill(5) 
request_df_current = request_df.loc[request_df.RTC_task_number == jira_nbr]
request_df_current
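
The `rtc_num` derivation above takes the numeric part of the Jira key and left-pads it with zeros to five digits. A minimal standalone sketch, using the epic key from this notebook purely as a sample input; the helper name `jira_to_rtc` is hypothetical.

```python
def jira_to_rtc(jira_nbr, prefix='RA', width=5):
    """Build the task id from the numeric part of a Jira key."""
    number = jira_nbr.split('-')[1]      # 'ODMODMR-91' -> '91'
    return prefix + number.zfill(width)  # '91' -> '00091'

rtc = jira_to_rtc('ODMODMR-91')
print(rtc)
```

Note that `zfill` only pads; a number longer than five digits is kept as it is.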
    

#### STEP 5.2 Copy the contents into the Jira task
Also download the Box spreadsheet and attach it to the Jira task.


#### STEP 5.3 Update the Jira task and request BPO approval before proceeding to step 6


In [None]:
content_template = '''
[Zendesk tickets]
{0}

[Issue Categories]
{1}

[CNUM to be deleted]
{2}

[Query before deletion] ({3} CNUMs in total)
{4}



&nbsp;

{5}
{6}
&nbsp;

&nbsp;


     {7} record(s) selected. 


---------------------------------------
We are going to use the existing REK process to perform the deletion; 
the deleted data will be backed up in 2 datasets. No dry run is needed. 
'''

In [None]:
#0
zen_list = '\n'.join(list(request_df_current.COE_zendesk_number.unique())) 
cat_dict = cat_df.loc[:, ['cat', 'description']].set_index('cat')['description'].to_dict()
cat_list = request_df_current.issue_category.unique()
# 1
cat_list_desc = '\n'.join([cat + ':\n' + cat_dict.get(cat, '')  for cat in cat_list])
#2
cnum_list = '\n'.join(list(request_df_current.RCNUM.unique())) 
#3

cnum_cnt = len(request_df_current.RCNUM.unique())  
#4
sql_before = '''select RCNUM
, CACTIVE
, DUPDATE
, CCOUNTRY
, CCOUNTRQ
, RSERNUM 
FROM ODMPRD.{} 
WHERE RCNUM IN ({})'''.format(tbname, ',\n'.join(["'{}'".format(cnum) for cnum in list(request_df_current.RCNUM.unique())])) 
with odm_conn.odm_adhoc('prod') as odmprd_adhoc: 
    result = odmprd_adhoc(sql_before)
    result_df = pd.DataFrame(result)
    #print(request_df)
    result_df = result_df.applymap(cell_format)

#5
col_list = ''.join(['|'+col for col in result_df.columns])

#6 
value_list = '\n'.join([''.join(['|'+item for item in row]) for row in result_df.values])
# 7
value_cnt = result_df.shape[0]
content = content_template.format(zen_list, cat_list_desc, cnum_list, cnum_cnt, sql_before, col_list, value_list, value_cnt)
print(content)
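
The `col_list` and `value_list` expressions above render the query result as a pipe-delimited table for the Jira description. A minimal sketch of that rendering, assuming a small in-memory frame of strings; the sample values and the helper name `to_pipe_table` are hypothetical.

```python
import pandas as pd

# Hypothetical query result; real values come from the ODM query.
result = pd.DataFrame({
    'RCNUM':   ['0A1234897', '0B5678897'],
    'CACTIVE': ['1', '0'],
})

def to_pipe_table(df):
    """Render a DataFrame of strings as a '|'-prefixed header and rows."""
    header = ''.join('|' + col for col in df.columns)
    rows = [''.join('|' + item for item in row) for row in df.values]
    return header, '\n'.join(rows)

header, body = to_pipe_table(result)
print(header)
print(body)
```

Every cell must already be a string, which is why the notebook applies `cell_format` to the result before building the table.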


-------------------------------------------------------
### STEP 6 Prepare REK input csv file

In [None]:
# step 6.1 unload the current REK table
cols_list = 'CMODEL	CCOUNTRY	CCOUNTRQ	RSERNUM	NCOUNTRY	NCOUNTRQ	NSERNUM	RRTCTASK	FACTION	CLANGUAG	FDISCONT	QSORTSEQ	TCOUNTRY'
cols_list = cols_list.split()
with odm_conn.odm_adhoc('prod') as odmprd_adhoc: 
    result = odmprd_adhoc('select  {} from odmprd.odmt_emp_keychange order by QSORTSEQ'.format(','.join(cols_list)))
    df_rek = pd.DataFrame(result)
df_rek['C'] = 'R'
df_rek = df_rek.reindex(columns = ['C', *cols_list])
next_seq = int(df_rek.QSORTSEQ.max()) + 1
df_rek
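
One thing to watch in the `next_seq` computation: if the driver happens to return `QSORTSEQ` as text, `.max()` compares strings, so `'9'` ranks above `'10'`. Whether that applies here depends on the column type the query returns, which this notebook does not show. The sketch below only illustrates the difference, using a hypothetical three-row frame.

```python
import pandas as pd

# Hypothetical REK rows where QSORTSEQ arrived as strings.
rek = pd.DataFrame({'QSORTSEQ': ['8', '9', '10']})

lexicographic_max = rek.QSORTSEQ.max()                # string comparison
numeric_max = int(pd.to_numeric(rek.QSORTSEQ).max())  # numeric comparison
next_seq = numeric_max + 1
print(lexicographic_max, numeric_max, next_seq)
```

Converting with `pd.to_numeric` before taking the maximum gives the same result when the column is already numeric, so it is a safe default.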

In [None]:
# step 6.2 add the new entries for deletion request
new_rows = []
for idx, row in result_df.iterrows():
    cnum = row.RCNUM
    new_rows.append({
        'C': 'R',
        'CMODEL': 'IBM',
        'CCOUNTRY': row.CCOUNTRY,
        'CCOUNTRQ': row.CCOUNTRQ,
        'RSERNUM': row.RSERNUM,
        'RRTCTASK': jira_nbr.split('-')[-1],
        'FACTION': 'D',
        'QSORTSEQ': str(next_seq),
        'FDISCONT': '',
        'TCOUNTRY': 'zen desk ticket: {}'.format(''.join(request_df_current.loc[request_df_current.RCNUM == cnum].COE_zendesk_number)),
    })
# DataFrame.append was removed in pandas 2.0; pd.concat works on older and newer versions
df_rek = pd.concat([df_rek, pd.DataFrame(new_rows)], ignore_index=True).fillna('')
df_rek = df_rek.applymap(cell_format)

In [None]:
# step 6.3 write csv file and add the first line
now = str(datetime.datetime.today().date())
df_rek.to_csv('temp.csv', index = False)
rek_filename = 'REK_{}.csv'.format(''.join(now.split('-')))
with open(rek_filename, 'w') as fw: 
    fw.write('T,REK - Employee Primary key change (ODMT_EMP_KEYCHANGE),,,,,,,,,,,,\n')
    with open('temp.csv', 'r') as fr:  # close the temp file handle explicitly
        fw.write(fr.read())
print('{} \nfile is created in local machine which is ready to be uploaded to MF server'.format(rek_filename))
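
The file written above is an ordinary CSV with one extra title record ahead of the column header. The same layout can be checked in memory without touching the disk; this is a minimal sketch, assuming a two-column frame and a shortened title line, both hypothetical.

```python
import pandas as pd

# Hypothetical two-column frame; the real one carries the full REK column list.
df = pd.DataFrame({'C': ['R'], 'FACTION': ['D']})

def csv_with_title(df, title_line):
    """Return CSV text with an extra title record ahead of the header row."""
    # to_csv() without a path returns the CSV as a string
    return title_line + '\n' + df.to_csv(index=False)

text = csv_with_title(df, 'T,demo title')
print(text.splitlines())
```

Building the text in one pass like this also avoids the intermediate `temp.csv` file, at the cost of holding the whole CSV in memory.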

### STEP 7 upload the CSV file to MF server

In [None]:
# step 7 upload the csv file to stfmvs1 server
from common_func import odm_ftp
with odm_ftp.odm_ftp_conn('put') as odm_put_file:
    server_file = 'ODMAP.RES.ZZ.RDMCSV.IN.EM'
    odm_put_file(fm=rek_filename, to= server_file)
    print('file {} is uploaded to the server...'.format(server_file))

-------------------------------------------------------
### STEP 8 _Execute the RDSU load process to load the REK table_

### STEP 9 _Execute the CCQ change process to perform the deletion_

### STEP 10 _Confirm the result_


### STEP 11 _Update the JIRA task and box spreadsheet to close the ticket_

