**Important – Do not use in production, for demonstration purposes only – please review the legal notices before continuing**

# Azure Translator

Azure Translator translates text from one language to another through the Translator REST API.
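
For orientation, a Translator v3 `translate` call is a POST request with the target language in the query string and a JSON array of `Text` items in the body. The sketch below only assembles the pieces of such a request and sends nothing. The helper name `build_translate_request`, the global endpoint, and the placeholder key are assumptions for illustration, not values taken from this notebook.

```python
import json

# Global Translator endpoint (assumption; a resource may use a custom or regional endpoint instead)
TRANSLATE_URL = "https://api.cognitive.microsofttranslator.com/translate"

def build_translate_request(text, to_lang, key):
    """Return the URL, query parameters, headers, and body for one translate call (hypothetical helper)."""
    return {
        "url": TRANSLATE_URL,
        "params": {"api-version": "3.0", "to": to_lang},
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json; charset=UTF-8",
        },
        # The body is a JSON array; each element carries one string to translate
        "body": [{"Text": text}],
    }

request_parts = build_translate_request("Hola mundo", "en", "<your-key>")
print(json.dumps(request_parts["body"], ensure_ascii=False))
```

These pieces map onto `requests.post(url, params=..., headers=..., json=...)`. Regional or multi-service resources generally also need an `Ocp-Apim-Subscription-Region` header, which this sketch leaves out.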


### Overview
*Safety Incident Reports Dataset*: The safety incident report JSON files, written in languages such as Spanish and Portuguese, are translated to English using Azure Translator.

### Notebook Organization 
+ Fetch the injury report JSON files, written in Spanish and Portuguese, from a folder.

+ Translate the JSON files to English by sending a POST request to the Azure Translator service.

+ Store the translated JSON files in a folder.
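
The three steps above can be sketched as a single function that takes the translation step as a parameter, which makes the file handling easy to try without calling the service. This is a minimal sketch under stated assumptions, not the notebook's own code: `translate_folder` and `fake_translate` are hypothetical names, and the stand-in translator only upper-cases values.

```python
import json
import os
import tempfile

def translate_folder(input_dir, output_dir, translate_record):
    """Read each .json file in input_dir, translate it, and write the result to output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(input_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(input_dir, name), encoding="utf-8") as f:
            record = json.loads(f.readline())  # first JSON object in the file
        translated = translate_record(record)
        out_name = os.path.splitext(name)[0] + "-translated.json"
        with open(os.path.join(output_dir, out_name), "w", encoding="utf-8") as f:
            json.dump(translated, f, ensure_ascii=False)
        written.append(out_name)
    return written

# Stand-in translator for the demo: upper-cases values instead of calling the service
def fake_translate(record):
    return {key: str(value).upper() for key, value in record.items()}

with tempfile.TemporaryDirectory() as tmp:
    src, dst = os.path.join(tmp, "in"), os.path.join(tmp, "out")
    os.makedirs(src)
    with open(os.path.join(src, "report.json"), "w", encoding="utf-8") as f:
        json.dump({"Localização": "Kyoto"}, f, ensure_ascii=False)
    outputs = translate_folder(src, dst, fake_translate)
    with open(os.path.join(dst, outputs[0]), encoding="utf-8") as f:
        result = json.load(f)
print(outputs, result)
```

Passing the translator in as a function keeps the folder logic separate from the network call, so a real implementation could swap `fake_translate` for a function that posts to the Translator service.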


## Importing Relevant Libraries

In [1]:
import pandas as pd
import requests
import json
import os
from os import listdir
from os.path import isfile, join
import GlobalVariables as gv
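
`GlobalVariables` is a local module that is not shown in this notebook. The sketch below is a guess at what it might contain. It assumes that the key is read from an environment variable (the name `TRANSLATOR_KEY` is hypothetical) and that the endpoint ends in `to=`, since the notebook appends the target language code directly to the endpoint string.

```python
# GlobalVariables.py (hypothetical contents; only the two attribute names come from the notebook)
import os

# Translator v3 endpoint ending in "to=" so that a language code can be appended directly
SAFETY_INCIDENT_TRANSLATION_ENDPOINT = (
    "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to="
)

# Read the key from the environment rather than hard-coding it (assumed variable name)
SAFETY_INCIDENT_TRANSLATION_KEY = os.environ.get("TRANSLATOR_KEY", "")

print(SAFETY_INCIDENT_TRANSLATION_ENDPOINT + "en")
```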

## Create Local Folders

- *input-json-files* is the folder that holds the input JSON files to be translated into English.
- *translated_json* holds all the translated JSON files.

In [2]:
# Input folder; the trailing "" makes os.path.join end the path with a separator
local_path = os.path.join(os.getcwd(), "input-json-files", "")
# *translated_json* will contain all the translated JSON files
output_path = os.path.join(os.getcwd(), "translated_json", "")
# Create the output folder if it does not exist yet
os.makedirs(output_path, exist_ok=True)

## Translator Resource

In [3]:
# Translator resource
# Endpoint parameters for querying the translator to return the translated JSON
url = gv.SAFETY_INCIDENT_TRANSLATION_ENDPOINT
headers = {'Ocp-Apim-Subscription-Key': gv.SAFETY_INCIDENT_TRANSLATION_KEY,"Content-Type":"application/json; charset=UTF-8"}
# Target language code: English
lang = "en"
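
The service replies with a JSON array holding one element per input text, each with a `translations` list. The sketch below pulls the translated string out of such a reply and parses it back into a dictionary. The sample reply is made up for illustration, and the helper name `parse_translation` is hypothetical.

```python
import json

def parse_translation(response_body):
    """Extract the first translation from a Translator reply and parse it as JSON (hypothetical helper)."""
    translated_text = response_body[0]["translations"][0]["text"]
    try:
        return json.loads(translated_text)
    except json.JSONDecodeError:
        # Retry with braces added, for replies where the outer braces were lost
        return json.loads("{" + translated_text + "}")

# Made-up reply in the shape returned when the source language is auto-detected
sample_reply = [
    {
        "detectedLanguage": {"language": "pt", "score": 1.0},
        "translations": [{"text": '{"Location": "Kyoto", "Amputation": "1"}', "to": "en"}],
    }
]
parsed = parse_translation(sample_reply)
print(parsed)
```

Catching only `json.JSONDecodeError` keeps unrelated failures, such as a missing `translations` key in an error reply, visible instead of masking them.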

In [4]:
# Total files in the input folder
files = [f for f in listdir(os.getcwd()+"/input-json-files") if isfile(join(os.getcwd()+"/input-json-files", f))]
len(files)
# Loop through all the JSON files and translate them one by one
for file in files:
    with open(local_path+file, encoding='utf-8') as f:
        # Each non-empty line is parsed as a JSON object; only the first record is used
        data = [json.loads(line) for line in f if line.strip()]
        # Replace underscores in the keys with spaces before translation
        pd_data = {k.replace('_', ' ') : v for k, v in data[0].items()}
        # Serialize without ASCII escaping so accented characters are sent as-is
        esp_decod = json.dumps(pd_data, ensure_ascii=False)
        print("\nOriginal JSON\n")
        print(esp_decod)
    # Send a POST request to the translator
    resp = requests.post(url+lang, json=[{'Text':esp_decod}], headers = headers)
    # Stop on HTTP errors instead of failing later while parsing an error body
    resp.raise_for_status()
    en_val = resp.json()[0]['translations'][0]['text']
    try:
        en_dict = json.loads(en_val)
    except json.JSONDecodeError:
        # Retry with the text wrapped in braces in case the translation lost them
        en_str = f"{{{en_val}}}"
        en_dict = json.loads(en_str)
    print("\nTranslated JSON\n")
    print(en_dict)
    # Save the translated text to a JSON file named after the input file
    with open(output_path+os.path.splitext(file)[0]+"-translated.json", 'w', encoding='utf-8') as outfile:
        json.dump(en_dict, outfile)


Original JSON

{"Data do evento": "20/8/2019", "Localização": "Kyoto", "Empregador": "Wide World Importers", "Amputação": "1", "Parte do corpo": "unhas", "Narrativa Final": "Um empregado estava carregando uma peça em uma imprensa quando atuava, resultando na amputação dos dois dedos do meio esquerdo.", "CaseId": "202081080", "Fonte": "prensas", "Evento": "Preso em equipamentos ou máquinas em execução durante a operação regular", "Natureza": "amputações", "Hospitalização": "1"}

Translated JSON

{'Event Date': '8/20/2019', 'Location': 'Kyoto', 'Employer': 'Wide World Importers', 'Amputation': '1', 'Body Part': 'Nails', 'Final Narrative': 'An employee was carrying a piece in a press while acting, resulting in the amputation of the two fingers of the left middle.', 'CaseId': '202081080', 'Source': 'presses', 'Event': 'Stuck in equipment or machines running during regular operation', 'Nature': 'amputations', 'Hospitalization': '1'}

Original JSON

{"Fecha del evento": "1/1/2015", "Hospita