## Testing Lambda Deployment

**Author:** Shaun Khoo  
**Date:** 10 Jan 2022  
**Context:** The Python scripts for the dummy API have been written and deployed with the AWS CLI; this notebook tests the resulting API endpoint  
**Objective:** Test the API endpoint and develop some simple scripts to convert data into JSON format

**Note:** Referencing [this tutorial](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-upload-code)

In [1]:
import requests

In [62]:
resp = requests.get('https://d1b3viqczc.execute-api.us-east-1.amazonaws.com/default/dummy-api',
                     headers = {'x-api-key': 'ministryofmanpower2022'},
                     params = {'mcf_url': 'https://www.mycareersfuture.gov.sg/job/public/data-scientist-government-technology-agency-d4beb5aee362d4d7d340abdd4ea63d7a'})
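
The request above embeds the API key directly in the notebook. The following is a minimal sketch of one alternative, assuming the key is stored in an environment variable instead. The variable name `DUMMY_API_KEY`, the helper `build_request_kwargs`, and the 10-second timeout are all hypothetical choices for illustration, not part of the deployed setup.

```python
import os

API_URL = 'https://d1b3viqczc.execute-api.us-east-1.amazonaws.com/default/dummy-api'

def build_request_kwargs(mcf_url, api_key=None):
    """Assemble keyword arguments for requests.get without sending anything."""
    # DUMMY_API_KEY is a hypothetical environment variable name
    key = api_key if api_key is not None else os.environ.get('DUMMY_API_KEY', '')
    if not key:
        raise ValueError('No API key supplied')
    return {
        'headers': {'x-api-key': key},
        'params': {'mcf_url': mcf_url},
        'timeout': 10,  # seconds; an assumed value
    }

kwargs = build_request_kwargs(
    'https://www.mycareersfuture.gov.sg/job/public/data-scientist-government-technology-agency-d4beb5aee362d4d7d340abdd4ea63d7a',
    api_key='placeholder-key',  # placeholder, not a real key
)
```

With `requests` imported, the call would then be `resp = requests.get(API_URL, **kwargs)`, optionally followed by `resp.raise_for_status()` so that a 4xx or 5xx response fails loudly before `resp.json()` is read.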

In [64]:
import re
mcf_url = 'https://www.mycareersfuture.gov.sg/job/public/data-scientist-government-technology-agency-d4beb5aee362d4d7d340abdd4ea63d7a'
# The job ID is the 32-character token after the last hyphen, before any query string
regex_matches = re.search(r'-([a-z0-9]{32})(?:\?|$)', mcf_url)
regex_matches.group(1)

'd4beb5aee362d4d7d340abdd4ea63d7a'
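
The extraction above raises an `AttributeError` when the URL contains no job ID, because `re.search` returns `None` in that case. A minimal sketch of a helper that returns `None` instead is shown here; the function name `extract_mcf_uuid` is hypothetical, and the pattern assumes the ID is always 32 lowercase alphanumeric characters at the end of the path.

```python
import re

# A hyphen, then 32 lowercase alphanumeric characters, then either the
# start of a query string or the end of the URL
_MCF_UUID = re.compile(r'-([a-z0-9]{32})(?:\?|$)')

def extract_mcf_uuid(mcf_url):
    """Return the 32-character job ID from a MyCareersFuture URL, or None."""
    match = _MCF_UUID.search(mcf_url)
    return match.group(1) if match else None

base_url = 'https://www.mycareersfuture.gov.sg/job/public/data-scientist-government-technology-agency-d4beb5aee362d4d7d340abdd4ea63d7a'
plain_id = extract_mcf_uuid(base_url)
query_id = extract_mcf_uuid(base_url + '?source=example')  # made-up query string
missing_id = extract_mcf_uuid('https://www.mycareersfuture.gov.sg/')
```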

In [63]:
resp.json()

{'mcf_job_id': 'MCF-2021-0077475',
 'mcf_job_title': '3422-  Clinic Assistant / Reception【 O&G clinic/ Nurse / Novena / Orchard/ 5.5day】',
 'mcf_job_desc': "<ul>\n  <li><strong>O&amp;G Clinic</strong></li>\n  <li><strong>Location: Central - Novena / Orchard</strong></li>\n  <li><strong>Working days 5.5 days from Mon to Sat.</strong></li>\n  <li><strong>Official hours : 8.30 to 5.30pm / 8.30am to 12.30pm</strong></li>\n  <li><strong>Career Development Opportunities</strong></li>\n  <li><strong>Fast-track Career Progression</strong></li>\n</ul>\n<p><strong>Interested applicants can send your resume to ✉winnie_lee@thesupremehr.com and allow our Consultants to match you with our Clients. No Charges will be incurred by Candidates for any service rendered.</strong></p>\n<p><strong>Job Description</strong></p>\n<ul>\n  <li>Assist the Clinic Executive to manage the daily counter and clinic operations</li>\n  <li>Assist in supervising the registration and billing processes..</li>\n  <li>Review 

In [None]:
import json
import random
import pandas as pd
import os
os.chdir('..')

Reading in the test set with the predictions

In [None]:
test_enhanced = pd.read_csv('Notebooks/Exported Files/Test_Predictions.csv')

Generating 50 random indices to pick a subset of the test set

In [None]:
# random.sample draws without replacement, so no job ad is picked twice
selected_indices = random.sample(test_enhanced.index.tolist(), k = 50)
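
Because the draw is unseeded, the exported subset changes on every run. A minimal sketch of a repeatable draw follows, assuming reproducibility is wanted for the dummy data. The helper `pick_indices` and the seed value are hypothetical. It uses `random.sample`, which draws without replacement, whereas `random.choices` can return the same index more than once.

```python
import random

def pick_indices(index_values, k, seed=None):
    """Draw k distinct entries; a fixed seed gives the same draw every time."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.sample(list(index_values), k)

# A stand-in range is used here in place of the real DataFrame index
demo = pick_indices(range(1000), k=50, seed=2022)
```

In the notebook this could be called as `pick_indices(test_enhanced.index, k=50, seed=2022)`.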

In [None]:
selected_data = test_enhanced.loc[selected_indices, ['MCF_Job_Ad_ID', 'Predicted_SSOC_2020', 'SSOC_5D_Top_10_Preds', 'SSOC_5D_Top_10_Preds_Proba']]

Generating the output JSON

In [None]:
output_json = []

for i, row in selected_data.iterrows():

    # Both columns hold comma-separated strings; pair each code with its probability
    codes = row['SSOC_5D_Top_10_Preds'].split(',')
    probas = row['SSOC_5D_Top_10_Preds_Proba'].split(',')

    predictions = []
    for pred_ssoc, proba in zip(codes, probas):
        predictions.append({
            'SSOC_Code': pred_ssoc,
            # Probability expressed as a percentage with two decimal places
            'Prediction_Confidence': f"{float(proba) * 100:.2f}%",
        })

    output_json.append({
        'MCF_Job_Ad_ID': row['MCF_Job_Ad_ID'],
        'predictions': predictions
    })
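
To make the shape of each `predictions` entry concrete, here is a small standalone sketch of the same pairing logic. It assumes, as the `.split(',')` calls imply, that both columns are comma-separated strings. The helper name `build_predictions` and the input values are made up for illustration. Unlike a bare `zip`, which silently truncates to the shorter list, this version raises if the two lists differ in length.

```python
def build_predictions(codes, probas):
    """Pair comma-separated SSOC codes with their probabilities."""
    code_list = [c.strip() for c in codes.split(',')]
    proba_list = [p.strip() for p in probas.split(',')]
    if len(code_list) != len(proba_list):
        raise ValueError('Codes and probabilities differ in length')
    return [
        {'SSOC_Code': code, 'Prediction_Confidence': f"{float(proba) * 100:.2f}%"}
        for code, proba in zip(code_list, proba_list)
    ]

# Made-up values for illustration only, not taken from Test_Predictions.csv
example = build_predictions('21222, 25121', '0.9, 0.05')
```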

Exporting it for the 'feelinglucky' part of the website

In [None]:
with open('Deployments/lambda/dummy-api/dummy_data.json', 'w') as outfile:
    json.dump(output_json, outfile)
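
Since this file is consumed by the website, it can be worth confirming that it reloads to the same structure. The following is a minimal round-trip sketch; the helper `dump_and_verify` is hypothetical, and it writes to a temporary directory with a made-up record rather than to the real deployment path.

```python
import json
import os
import tempfile

def dump_and_verify(data, path):
    """Write data as JSON, then reload it and confirm nothing changed."""
    with open(path, 'w', encoding='utf-8') as outfile:
        json.dump(data, outfile)
    with open(path, encoding='utf-8') as infile:
        reloaded = json.load(infile)
    return reloaded == data

# Made-up record with the same keys as the entries in output_json
sample = [{'MCF_Job_Ad_ID': 'MCF-0000-0000000', 'predictions': []}]
with tempfile.TemporaryDirectory() as tmp_dir:
    ok = dump_and_verify(sample, os.path.join(tmp_dir, 'dummy_data.json'))
```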

Reading in the SSOC 2020 detailed definitions file

In [None]:
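# Skip the first four rows of the sheet when reading the definitions
ssoc_desc_raw = pd.read_excel('Data/Reference/SSOC2020 Detailed Definitions.xlsx', skiprows = 4)
# Keep only the 5-digit SSOC codes, dropping any code that starts with 'X'
ssoc_desc = ssoc_desc_raw[ssoc_desc_raw['SSOC 2020'].apply(lambda x: len(x) == 5 and not x.startswith('X'))].reset_index(drop = True)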
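
The filter keeps a row only when its code is exactly five characters long and does not begin with `X`. A plain-Python sketch of the same predicate is given below. The name `is_5d_ssoc` and the sample codes are made up, and the `str()` conversion is an added assumption to guard against cells that Excel hands back as numbers.

```python
def is_5d_ssoc(code):
    """True for a five-character code that does not start with 'X'."""
    text = str(code).strip()
    return len(text) == 5 and not text.startswith('X')

# Made-up codes for illustration: shorter codes and an 'X' code are dropped
codes = ['2', '21', '2122', '21222', 'X1000']
kept = [c for c in codes if is_5d_ssoc(c)]
```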

Generating the SSOC description lookup, keyed by SSOC code

In [None]:
ssoc_desc_json = {}
for i, row in ssoc_desc.iterrows():
    ssoc_json = {
        'title': row['SSOC 2020 Title'],
        'description': row['Detailed Definitions']
    }
    ssoc_desc_json[row['SSOC 2020']] = ssoc_json

Exporting it for the API response

In [None]:
with open('Deployments/lambda/dummy-api/ssoc_desc.json', 'w') as outfile:
    json.dump(ssoc_desc_json, outfile)
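
How the API combines the two exported files is not shown in this notebook. The sketch below illustrates one way a lookup keyed by SSOC code could be joined onto a list of predictions. The function `enrich_predictions`, the output keys `SSOC_Title` and `SSOC_Description`, and the sample values are hypothetical and may not match what the Lambda function actually does.

```python
def enrich_predictions(predictions, ssoc_lookup):
    """Attach the title and description for each predicted SSOC code."""
    enriched = []
    for pred in predictions:
        # Codes missing from the lookup get None for both fields
        details = ssoc_lookup.get(pred['SSOC_Code'], {})
        enriched.append({
            **pred,
            'SSOC_Title': details.get('title'),
            'SSOC_Description': details.get('description'),
        })
    return enriched

# Made-up lookup and predictions for illustration only
lookup = {'00001': {'title': 'Example title', 'description': 'Example description'}}
preds = [
    {'SSOC_Code': '00001', 'Prediction_Confidence': '90.00%'},
    {'SSOC_Code': '00002', 'Prediction_Confidence': '5.00%'},
]
result = enrich_predictions(preds, lookup)
```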