<a href="https://colab.research.google.com/github/joamilab/Bootcamp-DIO-Azure-AI/blob/main/Bootcamp_DIO_Azure_Tradutor_de_Documentos.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Document translator: PT-BR to EN

Translates .docx documents from Brazilian Portuguese to English.

Translation is done with the Translator service from Azure AI Services.

In [1]:
! pip install python-docx



In [2]:
import requests
import os
import json
from docx import Document

In [3]:
key = 'YOUR-KEY'
endpoint_text = 'ENDPOINT-TEXT'
endpoint_document = 'ENDPOINT-DOCUMENT'
location = 'SERVICES-LOCATION'
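Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading the same settings from environment variables instead; the variable names (`AZURE_TRANSLATOR_KEY`, `AZURE_TRANSLATOR_ENDPOINT`, `AZURE_TRANSLATOR_REGION`) and the helper `load_translator_config` are illustrative assumptions, not something the notebook or Azure defines:

```python
import os


def load_translator_config(env=os.environ):
    """Read Translator settings from a mapping of environment variables.

    The variable names below are hypothetical; pick whatever fits your setup.
    """
    required = {
        'key': 'AZURE_TRANSLATOR_KEY',
        'endpoint_text': 'AZURE_TRANSLATOR_ENDPOINT',
        'location': 'AZURE_TRANSLATOR_REGION',
    }
    # Collect every missing or empty variable so the error names all of them at once.
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return {setting: env[name] for setting, name in required.items()}
```

With those variables set, `config = load_translator_config()` would return a dict whose `key`, `endpoint_text` and `location` entries can be assigned to the notebook's variables of the same names.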

In [4]:
original_lang = 'pt-br'
target_lang = 'en'

In [5]:
import uuid

def translate_text(text, target_language):
  '''
    Translates text into the target language with the Azure AI Translator service.

    Parameters:
      text (str): The text to be translated.
      target_language (str): The target language.

    Returns:
      str: The translated text.
  '''

  path = '/translate'
  # Tolerate an endpoint configured with a trailing slash.
  constructed_url = endpoint_text.rstrip('/') + path

  headers = {
      'Ocp-Apim-Subscription-Key': key,
      'Ocp-Apim-Subscription-Region': location,
      'Content-type': 'application/json',
      # The trace ID should be a GUID; str(os.urandom(16)) would send a bytes repr instead.
      'X-ClientTraceId': str(uuid.uuid4())
  }

  body = [{
      'text': text
  }]
  params = {
      'api-version': '3.0',
      'from': original_lang,
      'to': [target_language]
  }
  response = requests.post(constructed_url, params=params, headers=headers, json=body, timeout=30)
  # Fail loudly on HTTP errors (bad key, wrong region) instead of a confusing KeyError below.
  response.raise_for_status()

  return response.json()[0]['translations'][0]['text']
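The function above sends one text per request. The Translator v3 request body is an array, so several texts can travel in one call. The sketch below only groups texts into batches and builds the body; the default limits (1,000 array elements and 50,000 characters per request) reflect the documented v3 limits as far as I know and should be checked against the current service documentation. `batch_texts` and `build_body` are hypothetical helpers, not part of the notebook.

```python
def batch_texts(texts, max_items=1000, max_chars=50000):
    """Group texts into batches that respect an item limit and a character limit."""
    batches, current, current_chars = [], [], 0
    for text in texts:
        if len(text) > max_chars:
            raise ValueError('A single text exceeds the character limit.')
        # Start a new batch when adding this text would break either limit.
        if current and (len(current) >= max_items
                        or current_chars + len(text) > max_chars):
            batches.append(current)
            current, current_chars = [], 0
        current.append(text)
        current_chars += len(text)
    if current:
        batches.append(current)
    return batches


def build_body(batch):
    """Build the JSON body for one request: a list of {'text': ...} objects."""
    return [{'text': text} for text in batch]
```

Each batch would be posted the same way as the single-text body, and the response would then contain one entry per input text, in the same order.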

In [6]:
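def translate_document(path):
  '''
    Translates a document with the Azure AI Translator service, one paragraph at a time.

    Only the text of body paragraphs is carried over; formatting, tables,
    headers and footers are not copied to the translated document.

    Parameters:
      path (str): The path to the document.

    Returns:
      Document: The translated document.
  '''
  document = Document(path)

  full_text = []
  for paragraph in document.paragraphs:
    translated_paragraph = translate_text(paragraph.text, target_lang)
    full_text.append(translated_paragraph)

  translated_doc = Document()
  for line in full_text:
    translated_doc.add_paragraph(line)

  # Add the language suffix before the extension only;
  # str.replace would rewrite every '.docx' that appears in the path.
  root, extension = os.path.splitext(path)
  path_translated = f'{root}_{target_lang}{extension}'
  translated_doc.save(path_translated)

  return translated_doc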
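The paragraph loop in `translate_document` calls the service for every paragraph, including blank ones, and cannot be exercised without network access. One possible refactoring, sketched here with a hypothetical helper `translate_paragraphs`, takes the translator as a parameter so blank paragraphs are passed through untouched and the loop can be checked offline with a stand-in function:

```python
def translate_paragraphs(paragraph_texts, translate):
    """Translate each non-blank paragraph, keeping blank ones as they are.

    `translate` is any callable that maps a string to its translation,
    for example lambda text: translate_text(text, target_lang).
    """
    translated = []
    for text in paragraph_texts:
        if text.strip():
            translated.append(translate(text))
        else:
            # Blank paragraphs only carry spacing; no service call is needed.
            translated.append(text)
    return translated


# Offline check with a stand-in translator (upper-casing instead of translating).
print(translate_paragraphs(['olá', '', 'mundo'], str.upper))
```

Keeping blank paragraphs in the output preserves the paragraph count of the original document, which also makes the result easier to compare with the source.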

### Testing the text translation function

In [7]:
def test_translate_text(text_original, text_translated, target_language):
  '''
    Tests the translate_text function.

    Parameters:
      text_original (str): The original text.
      text_translated (str): The translated text.
      target_language (str): The target language.

    Returns:
      boolean: The result of the test.
  '''

  translated = translate_text(text_original, target_language)

  if translated == text_translated:
    print('Test passed!')
    print(translated)
    return True
  else:
    print('Test failed!')
    return False

In [8]:
text_original_test = 'Ser ou não ser. Eis a questão.'
text_translated_test = 'To be or not to be. That is the question.'

test_translate_text(text_original_test, text_translated_test, target_lang)

Test passed!
To be or not to be. That is the question.


True
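The test above compares strings exactly, so it can start failing if the service later returns a slightly different but equally valid translation. A hedged alternative is to compare normalized texts by similarity, using `difflib` from the standard library. The helper names and the 0.9 threshold are arbitrary choices for illustration:

```python
import difflib


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return ' '.join(text.lower().split())


def similar_enough(actual, expected, threshold=0.9):
    """Return True when the normalized texts are at least `threshold` similar."""
    matcher = difflib.SequenceMatcher(None, normalize(actual), normalize(expected))
    return matcher.ratio() >= threshold
```

For example, `similar_enough(translate_text(text_original_test, target_lang), text_translated_test)` would tolerate a difference in casing or a single comma while still rejecting an unrelated sentence.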

### Testing the document translation function

In [9]:
def test_translate_document(path):
  '''
    Tests the translate_document function.

    Parameters:
      path (str): The path to the document.

    Returns:
      boolean: The result of the test.
  '''

  translated = translate_document(path)

  # A Document object is always truthy, so check that it actually contains paragraphs.
  if translated.paragraphs:
    print('Test passed!')
    return True
  else:
    print('Test failed!')
    return False

In [10]:
document_original = '/content/artigo-ciencia-dados.docx'
test_translate_document(document_original)

Test passed!


True
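The document test confirms that a `Document` object came back, but not that a usable file was written. Since a .docx file is a ZIP archive whose main part is `word/document.xml`, a rough sanity check on the saved file can be done with the standard library alone. `looks_like_docx` is a hypothetical helper, and the check is deliberately shallow: it does not validate the XML inside.

```python
import zipfile


def looks_like_docx(path):
    """Return True if `path` is a ZIP archive containing word/document.xml."""
    # is_zipfile also returns False when the file does not exist.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return 'word/document.xml' in archive.namelist()
```

In this notebook it could be applied to the translated file, for example `looks_like_docx('/content/artigo-ciencia-dados_en.docx')`, assuming the output name follows the `_{target_lang}` suffix used by `translate_document`.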