# Multilingual Email Translation and Model Comparison for Mistral Large 1 & Large 2

## Introduction

This notebook demonstrates how to translate generated customer emails into multiple languages with Mistral Large 1 and Mistral Large 2. It uses Amazon Bedrock to access both models and compares their output across several languages. The primary goals are:

1. To translate a set of customer emails into Japanese, Korean, Hindi, and Arabic.
2. To compare the performance of two Mistral AI models: Mistral Large 2 (`mistral.mistral-large-2407-v1:0`) and Mistral Large 1 (`mistral.mistral-large-2402-v1:0`).
3. To analyze the output quality and token usage for each model and language combination.

According to Mistral AI, Mistral Large 2 has stronger multilingual support than Mistral Large 1. For details on its language performance, see the [Mistral Large 2 announcement](https://mistral.ai/news/mistral-large-2407/).

In [None]:
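import boto3
import logging
from botocore.config import Config
from botocore.exceptions import ClientError

# Configure the logger used by the reporting cell later in the notebook
logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger(__name__)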

At the time of writing, Mistral Large 2 is available on Amazon Bedrock only in the `us-west-2` region, so the client is created there.

In [None]:
# Long read timeout (in seconds) so lengthy generations are not cut off
config = Config(read_timeout=2000)
bedrock_client = boto3.client(service_name='bedrock-runtime', region_name="us-west-2", config=config)

In [None]:
mistral_large_2 = 'mistral.mistral-large-2407-v1:0'
mistral_large_1 = 'mistral.mistral-large-2402-v1:0'

In [None]:
# Temperature 0 and a low topP keep the output close to deterministic across runs
INFERENCE_CONFIG = {"temperature": 0.0, "maxTokens": 4000, "topP": 0.1}

The emails below are synthetic; they were generated from prompts written by the author.

In [None]:
emails = """
"I recently bought your RGB gaming keyboard and absolutely love the customizable lighting features! Can you guide me on how to set up different profiles for each game I play?"
"I'm trying to use the macro keys on the gaming keyboard I just purchased, but they don't seem to be registering my inputs. Could you help me figure out what might be going wrong?"
"I'm considering buying your gaming keyboard and I'm curious about the key switch types. What options are available and what are their main differences?"
"I wanted to report a small issue where my keyboard's space bar is a bit squeaky. However, your quick-start guide was super helpful and I fixed it easily by following the lubrication tips. Just thought you might want to know!"
"My new gaming keyboard stopped working within a week of purchase. None of the keys respond, and the lights don't turn on. I need a solution or a replacement as soon as possible."
"I've noticed that the letters on the keys of my gaming keyboard are starting to fade after several months of use. Is this covered by the warranty?"
"I had an issue where my keyboard settings would reset every time I restarted my PC. I figured out it was due to a software conflict and resolved it by updating the firmware. Just wanted to ask if there are any new updates coming soon?"
"I've been having trouble with the keyboard software not saving my configurations, and it's starting to get frustrating. What can be done to ensure my settings are saved permanently?"
"""

This function creates a standardized prompt for translating a set of predefined customer emails into a specified language. It takes a language as input and returns a formatted string containing the emails and translation instructions.

In [None]:
def generate_prompt(language):
    return f"""
emails={emails}
Translate the following customer emails into {language}. Your responses must be numbered, only in {language}, and must adhere to only translating the emails.
"""

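To see what the model actually receives, it can help to render the prompt for one language before calling Bedrock. The sketch below reuses the same template but takes the emails as an argument so that it runs on its own; `build_prompt` and `sample_emails` are hypothetical names introduced for illustration and are not part of the notebook.

```python
# Hypothetical two-email sample standing in for the notebook's `emails` string.
sample_emails = """
"My keyboard's space bar is a bit squeaky."
"Is key fading covered by the warranty?"
"""


def build_prompt(language, emails_text):
    """Same template as generate_prompt, with the emails passed in explicitly."""
    return f"""
emails={emails_text}
Translate the following customer emails into {language}. Your responses must be numbered, only in {language}, and must adhere to only translating the emails.
"""


# Render the prompt for a single language and inspect it before sending it anywhere.
prompt_preview = build_prompt("Japanese", sample_emails)
print(prompt_preview)
```

Note that the target language appears twice in the instruction, once for the translation target and once for the response-language constraint.
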
Here, we put Mistral Large 1 and 2 to the test. The function below sends the same translation prompt to each model through the Bedrock Converse API, then collects each model's generated text, token usage, and full response in a dictionary for easy comparison.

In [None]:
def compare_models(prompt):
    """Send the same prompt to both models and collect text and token usage."""
    models = {
        'large_2': mistral_large_2,
        'large_1': mistral_large_1
    }

    results = {}

    for model_name, model_id in models.items():
        # The Converse API takes a list of messages, each holding a list of content blocks
        messages = [{"role": "user", "content": [{"text": prompt}]}]
        response = bedrock_client.converse(
            messages=messages,
            modelId=model_id,
            inferenceConfig=INFERENCE_CONFIG
        )

        # The reply text is in the first content block; 'usage' holds the token counts
        generated_text = response['output']['message']['content'][0]['text']
        usage_data = response['usage']

        results[model_name] = {
            'generated_text': generated_text,
            'usage_data': usage_data,
            'full_response': response
        }

    return results
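Because `maxTokens` caps the length of each reply, a long translation may be cut off. A minimal sketch of how the response parsing could also flag truncation is shown below, assuming the Converse response carries a `stopReason` field whose value is `max_tokens` when the cap is hit. `parse_converse_response` is a hypothetical helper, and `fake_response` is a hand-written stand-in with made-up values, not real model output.

```python
def parse_converse_response(response):
    """Collect text, token usage, and a truncation flag from a Converse-style dict."""
    blocks = response["output"]["message"]["content"]
    # Join every text block rather than assuming the reply has exactly one.
    text = "".join(block["text"] for block in blocks if "text" in block)
    return {
        "generated_text": text,
        "usage_data": response["usage"],
        # True when generation stopped because it reached the maxTokens cap.
        "truncated": response.get("stopReason") == "max_tokens",
    }


# Hypothetical response shaped like a Converse API result (values are placeholders).
fake_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "1. ..."}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 10, "outputTokens": 5, "totalTokens": 15},
}

parsed = parse_converse_response(fake_response)
print(parsed)
```
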

## Showcasing Our Results

Here, we present the outputs from Mistral Large 1 and 2. The function below logs each model's translation in a readable format: the first 500 characters of the translated text, followed by the token usage statistics for that model.

We apply this process to four languages: Japanese, Korean, Hindi, and Arabic. For each language, we create a prompt, run it through both Mistral Large 1 and 2, and then display their respective outputs and performance metrics. This approach allows us to compare the models' capabilities across different languages.

In [None]:
def print_results_and_compare(results, language):
    logger.info(f"\n{'=' * 80}")
    logger.info(f"RESULTS FOR {language.upper()}")
    logger.info(f"{'=' * 80}")

    for model_name, data in results.items():
        logger.info(f"\n{model_name.upper()} Model:")
        logger.info("Generated text:")
        logger.info(f"{data['generated_text'][:500]}...")  # Log only the first 500 characters
        logger.info(f"\nUsage data: {data['usage_data']}")
        logger.info("-" * 80)

    logger.info("\nUSAGE COMPARISON:")
    for model, data in results.items():
        logger.info(f"  {model}: {data['usage_data']}")

# Example usage
languages = ['Japanese', 'Korean', 'Hindi', 'Arabic']

for language in languages:
    prompt = generate_prompt(language)
    results = compare_models(prompt)
    print_results_and_compare(results, language)
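The raw usage dictionaries logged above can be hard to compare at a glance. The sketch below condenses them into one row per model, assuming each `usage_data` entry carries the `inputTokens`, `outputTokens`, and `totalTokens` keys returned by the Converse API. `summarize_usage` is a hypothetical helper, and the token counts in `example_results` are placeholders rather than measurements.

```python
def summarize_usage(results):
    """Map each model name to its (input, output, total) token counts."""
    summary = {}
    for model_name, data in results.items():
        usage = data["usage_data"]
        summary[model_name] = (
            usage["inputTokens"],
            usage["outputTokens"],
            usage["totalTokens"],
        )
    return summary


# Hypothetical results in the shape returned by compare_models;
# the numbers are placeholders, not real token counts.
example_results = {
    "large_2": {"usage_data": {"inputTokens": 1, "outputTokens": 2, "totalTokens": 3}},
    "large_1": {"usage_data": {"inputTokens": 4, "outputTokens": 5, "totalTokens": 9}},
}

usage_summary = summarize_usage(example_results)

# Print a small fixed-width table, one row per model.
print(f"{'model':<10}{'input':>8}{'output':>8}{'total':>8}")
for model_name, (n_in, n_out, n_total) in usage_summary.items():
    print(f"{model_name:<10}{n_in:>8}{n_out:>8}{n_total:>8}")
```

In the notebook, the same helper could be called on the `results` dictionary for each language in place of `example_results`.
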