<h1>Azure File Translation</h1>

The AzureFileTranslateTool is a wrapper around the Azure Text Translation API that translates text from documents in various formats to a target language.

This tool supports:

PDF, DOCX, PPTX (PowerPoint), XLSX (Excel), XML, HTML, and TXT files.
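Before handing a file to the tool, it can help to check its extension against the list above. The sketch below is a hypothetical helper, not part of `AzureFileTranslateTool`; the names `SUPPORTED_EXTENSIONS` and `is_supported` are made up for illustration, and the check only looks at the file name, not the file contents.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
# Hypothetical helper: this set is not read from the tool itself.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".xml", ".html", ".txt"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the list above (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("report.PDF"))
print(is_supported("notes.md"))
```

A name-based check like this is cheap, but it cannot tell you whether the file is actually a valid document of that type.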

<h2>Prerequisites</h2>

This tool requires an Azure text translation resource, so make sure that you have a valid Key and Endpoint.
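A quick way to confirm the credentials are in place is to check the environment before creating the tool. This is a minimal sketch that assumes the variable names used in the setup cell of this notebook (`AZURE_TRANSLATE_API_KEY`, `AZURE_TRANSLATE_ENDPOINT`, `AZURE_REGION`); `missing_azure_settings` is a hypothetical helper, not part of the tool.

```python
import os

# Variable names as used in the setup cell of this notebook.
REQUIRED_VARS = ("AZURE_TRANSLATE_API_KEY", "AZURE_TRANSLATE_ENDPOINT", "AZURE_REGION")


def missing_azure_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_azure_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

The optional `env` argument lets you pass a plain dictionary, which makes the check easy to try without touching your real environment.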

<h3>Install the required dependencies</h3>

In [None]:
%pip install langchain langchain-community unstructured azure-ai-translation-text 

<h3>Running the tool</h3>

First, set up the environment using the credentials of your Azure resource.

The examples below translate short English sentences, for example "Hello, world!".

In [3]:
import os
from langchain_community.tools.azure_ai_services.azure_file_translation import (
    AzureFileTranslateTool,
)


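# Set your Azure API credentials (replace the placeholders, or export the variables beforehand).
# setdefault keeps any value that is already present in the environment.
os.environ.setdefault("AZURE_TRANSLATE_API_KEY", "<your-api-key>")
os.environ.setdefault("AZURE_TRANSLATE_ENDPOINT", "<your-endpoint>")
os.environ.setdefault("AZURE_REGION", "<your-region>")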

validated_values = AzureFileTranslateTool.validate_environment({})

In [4]:
# Initialize the tool
tool = AzureFileTranslateTool(**validated_values)

<h3>Example 1 - Simple text translation</h3>

In [None]:
# Input text to translate
input_text = "Hello, world!"

# Translate to French
translated_text = tool._translate_text(input_text, target_language="fr")

print(f"Original: {input_text}")
print(f"Translated: {translated_text}")
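If you need the same text in several languages, the call above can be repeated once per language code. The sketch below takes the translation function as an argument, so it makes no assumption about the tool beyond the `(text, target_language=...)` call shape shown in Example 1. `translate_to_many` and `fake_translate` are hypothetical names; the stand-in only lets the block run without Azure credentials.

```python
from typing import Callable, Dict, Iterable


def translate_to_many(
    translate: Callable[..., str], text: str, languages: Iterable[str]
) -> Dict[str, str]:
    """Call `translate(text, target_language=code)` once per language code."""
    return {code: translate(text, target_language=code) for code in languages}


# Stand-in translator (assumption: same call shape as in Example 1).
def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


results = translate_to_many(fake_translate, "Hello, world!", ["fr", "es"])
for code, translated in results.items():
    print(code, translated)
```

With a configured tool, you would pass `tool._translate_text` in place of `fake_translate`. Each language triggers a separate request.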

<h3>Example 2 - Text file translation</h3>

In [None]:
# Create a sample text file
sample_path = "test_Azure.txt"
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("Hello my name is Azure")

# Read the file content and translate it to Spanish
text_content = tool._read_text(sample_path)
translated_content = tool._translate_text(text_content, target_language="es")

print("Translated Content:")
print(translated_content)

# Clean up the sample file
os.remove(sample_path)
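If the translation call raises an exception, the cleanup line above never runs and the sample file is left behind. One way to guard against that is a `try`/`finally` around a temporary file. The sketch below is a hypothetical helper (`translate_temp_text` is not part of the tool). It takes the reader and translator as arguments, assuming the reader accepts a path and the translator accepts `(text, target_language=...)` as in the cell above.

```python
import os
import tempfile
from typing import Callable


def translate_temp_text(
    read_text: Callable[[str], str],
    translate: Callable[..., str],
    content: str,
    target_language: str,
) -> str:
    """Write content to a temporary .txt file, read it back, translate it,
    and delete the file even if reading or translating fails."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
        return translate(read_text(path), target_language=target_language)
    finally:
        os.remove(path)
```

With a configured tool, a call might look like `translate_temp_text(tool._read_text, tool._translate_text, "Hello my name is Azure", "es")`.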

<h3>Example 3 - PDF file translation</h3>

In [None]:
# Path to your PDF document
pdf_path = "test_azure.pdf"

# Extract text from the PDF
pdf_content = tool._read_text_from_file(pdf_path)

# Translate the extracted text to German
translated_pdf_content = tool._translate_text(pdf_content, target_language="de")

print("Translated PDF Content:")
print(translated_pdf_content)
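Text extracted from a large PDF can be long, and translation services typically cap the size of a single request. The sketch below shows one way to split text into bounded pieces before translating each one. It is an assumption-laden helper, not part of the tool: `split_into_chunks` and `max_chars` are hypothetical names, the right limit depends on your Azure resource, and splitting on whitespace collapses line breaks and may cut sentences apart.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars characters,
    breaking at whitespace where possible."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into fixed-size pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("Hello my name is Azure", 10))
```

Each chunk could then be passed to `tool._translate_text` and the results joined with spaces. Translating chunks independently can lose context across chunk boundaries.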