# Multilingual and cross-lingual prompting

In this notebook, we explore how to design prompts that handle multiple languages and enable cross-lingual interactions with LLMs. The goal is to build prompts that work across linguistic contexts: greeting users in different languages, adapting responses to the detected input language, translating text, handling non-Latin scripts, and translating with cultural nuance.

This kind of prompting is crucial for building inclusive applications like international chatbots, educational tools, and translation services that must support users from diverse linguistic backgrounds.

In [1]:
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

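# ChatOpenAI reads OPENAI_API_KEY from the environment; fail early if it is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file or environment.")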

## Initialize the language model
We instantiate a lightweight GPT model from OpenAI using LangChain.

In [2]:
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

## Multilingual prompt design
Let's begin by crafting a multilingual prompt that greets users in various languages. This demonstrates multilingual generation capability and how to use templates that dynamically adjust based on language input.

In [3]:
# Define a multilingual prompt template
multilingual_greeting = PromptTemplate(
    input_variables=["language"],
    template="Greet the user in {language} and provide a short introduction about the weather in a country where this language is spoken."
)

# Languages to test the prompt with
languages = ["English", "Spanish", "French", "German", "Hebrew", "Arabic", "Hindi", "Marathi"]

# Run the prompt for each language
for lang in languages:
    prompt = multilingual_greeting.format(language=lang)
    response = llm.invoke(prompt)
    print(f"{lang}:")
    print(response.content)
    print()

English:
Hello! In the United States, the weather can vary greatly depending on the region and time of year. For example, in the summer, places like Florida experience warm, humid conditions, while northern states like Minnesota can have cool, pleasant temperatures. In winter, the northeast often sees heavy snowfall, while the southwest enjoys milder, drier weather. Overall, the diverse climate across the country offers a little something for everyone!

Spanish:
¡Hola! Espero que estés teniendo un buen día. Hoy, hablemos del clima en España. En general, España cuenta con una variedad de climas, desde el cálido y seco del sur en Andalucía, hasta el fresco y húmedo del norte en Galicia. Durante el verano, muchas regiones experimentan temperaturas altas, especialmente en el interior, mientras que las costas suelen ser más moderadas. En invierno, las montañas de los Pirineos reciben nieve, convirtiéndose en un destino popular para el esquí. ¡Cada estación trae su propio encanto!

French:
B

The template takes a single `language` variable. For each value, the model is asked to greet the user in that language and briefly describe the weather in a country where it is spoken. This shows how one prompt variable can control the output language.
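
To inspect the exact text sent to the model without spending API calls, the substitution step can be reproduced offline. The sketch below is a minimal illustration, not the notebook's own code: `render_greeting_prompts` is a hypothetical helper, and plain `str.format` stands in for `PromptTemplate.format`. For a simple named placeholder such as `{language}` the two are expected to produce the same string, but that equivalence is an assumption of this sketch.

```python
# Same template text as the notebook cell above, kept as a plain string.
GREETING_TEMPLATE = (
    "Greet the user in {language} and provide a short introduction "
    "about the weather in a country where this language is spoken."
)


def render_greeting_prompts(languages):
    """Return a dict mapping each language name to its filled-in prompt.

    Hypothetical helper for offline inspection; it never calls the model.
    """
    return {lang: GREETING_TEMPLATE.format(language=lang) for lang in languages}


prompts = render_greeting_prompts(["Hebrew", "Marathi"])
for lang, text in prompts.items():
    print(f"{lang}: {text}")
```

Printing the rendered prompts first makes it easier to spot a misspelled variable name or an unintended language label before running the full loop against the API.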

## Language detection and adaptation
Now we test whether the model can infer the language from user input and respond in that same language. This is essential for natural multilingual interfaces.

In [4]:
# Define the adaptive prompt that detects language and replies in it
language_adaptive_prompt = PromptTemplate(
    input_variables=["user_input"],
    template="""Detect the language of the following input and respond in the same language:
    User input: {user_input}
    Your response (in the detected language):"""
)

# Sample multilingual inputs
inputs = [
    "Hello, how are you?",
    "Hola, ¿cómo estás?",
    "Bonjour, comment allez-vous ?",
    "こんにちは、お元気ですか？",
    "Здравствуйте, как дела?"
]

# Test the adaptive language response
for user_input in inputs:
    prompt = language_adaptive_prompt.format(user_input=user_input)
    response = llm.invoke(prompt)
    print(f"Input: {user_input}")
    print("Response:")
    print(response.content)
    print()

Input: Hello, how are you?
Response:
Hello! I'm doing well, thank you. How about you?

Input: Hola, ¿cómo estás?
Response:
¡Hola! Estoy bien, gracias. ¿Y tú?

Input: Bonjour, comment allez-vous ?
Response:
Bonjour ! Je vais bien, merci. Et vous ?

Input: こんにちは、お元気ですか？
Response:
こんにちは！私は元気です。あなたはいかがですか？

Input: Здравствуйте, как дела?
Response:
Здравствуйте! У меня всё хорошо, спасибо. Как дела у вас?



In all five test cases the model replies in the language of the input, including Japanese and Russian, without being told which language to expect. This makes the prompt a reasonable starting point for chatbots or virtual assistants that serve users in many languages.
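
When this prompt runs unattended, it can help to check that a reply at least uses the same writing system as the input. The following is a rough sketch under stated assumptions: `dominant_script` and `same_script` are hypothetical helpers, the script is guessed from the first word of each character's Unicode name, and hiragana, katakana, and kanji are lumped into one family. It cannot tell apart languages that share a script (English, Spanish, and French all count as `LATIN`), so it only catches gross mismatches.

```python
import unicodedata
from collections import Counter

# Coarse grouping (an assumption of this sketch): Japanese text mixes
# hiragana, katakana, and kanji, so all three count as one family.
_SCRIPT_FAMILY = {"HIRAGANA": "CJK", "KATAKANA": "CJK", "CJK": "CJK"}


def dominant_script(text):
    """Return the most common script family among the letters in `text`."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip punctuation, digits, spaces, combining marks
        name = unicodedata.name(ch, "")
        if not name:
            continue
        # For most letters the Unicode name begins with the script,
        # e.g. "CYRILLIC SMALL LETTER A" or "CJK UNIFIED IDEOGRAPH-5143".
        script = name.split()[0].split("-")[0]
        counts[_SCRIPT_FAMILY.get(script, script)] += 1
    return counts.most_common(1)[0][0] if counts else None


def same_script(user_input, response):
    """True when input and response share a dominant script family."""
    return dominant_script(user_input) == dominant_script(response)


print(same_script("Здравствуйте, как дела?", "Здравствуйте! У меня всё хорошо, спасибо."))
print(same_script("こんにちは、お元気ですか？", "Hello! I'm doing well."))
```

The first call compares two Cyrillic strings and prints `True`; the second compares Japanese input with an English reply and prints `False`, which is the kind of mismatch worth logging or retrying.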

## Cross-lingual translation
Next, we create prompts that translate text between different language pairs. This demonstrates how to use structured prompts to guide LLMs for translation tasks.

In [5]:
# Define the prompt template for translation
translation_prompt = PromptTemplate(
    input_variables=["source_lang", "target_lang", "text"],
    template="""Translate the following text from {source_lang} to {target_lang}:
    {source_lang} text: {text}
    {target_lang} translation:"""
)

# Translation test cases
translations = [
    {"source_lang": "English", "target_lang": "French", "text": "The quick brown fox jumps over the lazy dog."},
    {"source_lang": "Spanish", "target_lang": "German", "text": "La vida es bella."},
    {"source_lang": "Japanese", "target_lang": "English", "text": "桜の花が満開です。"}
]

# Execute and print translations
for t in translations:
    prompt = translation_prompt.format(**t)
    response = llm.invoke(prompt)
    print(f"From {t['source_lang']} to {t['target_lang']}:")
    print(f"Original: {t['text']}")
    print("Translation:")
    print(response.content)
    print()

From English to French:
Original: The quick brown fox jumps over the lazy dog.
Translation:
La rapide renarde brune saute par-dessus le chien paresseux.

From Spanish to German:
Original: La vida es bella.
Translation:
Das Leben ist schön.

From Japanese to English:
Original: 桜の花が満開です。
Translation:
The cherry blossoms are in full bloom.



In the three test cases above, the prompt yields fluent translations for pairs with and without English, and for both Latin and non-Latin source scripts. Because the language pair is a template variable, the same prompt can be reused in multilingual applications or content localization systems.
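
One detail of the template above: it is a triple-quoted string, so the four spaces that indent its second and third lines become part of the prompt. This is probably harmless for the model, but it can be avoided. The sketch below is one possible approach, using `textwrap.dedent` from the standard library and plain `str.format` in place of `PromptTemplate`; the constant name `TRANSLATION_TEMPLATE` is hypothetical.

```python
import textwrap

# Starting the literal with a backslash-newline lets every line share the
# same indentation, which textwrap.dedent then strips from all of them.
TRANSLATION_TEMPLATE = textwrap.dedent("""\
    Translate the following text from {source_lang} to {target_lang}:
    {source_lang} text: {text}
    {target_lang} translation:""")

prompt = TRANSLATION_TEMPLATE.format(
    source_lang="Spanish", target_lang="German", text="La vida es bella."
)
print(prompt)
```

The dedented string could be passed as the `template` argument in the notebook cell unchanged, since the placeholder names are the same.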

## Handling non-Latin scripts with transliteration
Languages like Japanese, Hindi, and Russian use non-Latin scripts. This prompt handles such input by returning the original, its script type, a Latin transliteration, and an English translation.

In [6]:
# Prompt to handle non-Latin scripts with extra metadata
non_latin_prompt = PromptTemplate(
    input_variables=["text", "script"],
    template="""Provide the following information for the given text:
    1. The original text
    2. The name of the script/writing system
    3. A transliteration to Latin alphabet
    4. An English translation

    Text: {text}
    Script: {script}
    """
)

# Test samples in various scripts
non_latin_texts = [
    {"text": "こんにちは、世界", "script": "Japanese"},
    {"text": "Здравствуй, мир", "script": "Cyrillic"},
    {"text": "नमस्ते दुनिया", "script": "Devanagari"}
]

# Generate and print the outputs
for text in non_latin_texts:
    prompt = non_latin_prompt.format(**text)
    response = llm.invoke(prompt)
    print(response.content)
    print()

1. The original text: こんにちは、世界
2. The name of the script/writing system: Japanese
3. A transliteration to Latin alphabet: Konnichiwa, Sekai
4. An English translation: Hello, World

1. The original text: Здравствуй, мир  
2. The name of the script/writing system: Cyrillic  
3. A transliteration to Latin alphabet: Zdravstvuy, mir  
4. An English translation: Hello, world  

1. The original text: नमस्ते दुनिया
2. The name of the script/writing system: Devanagari
3. A transliteration to Latin alphabet: Namaste Duniya
4. An English translation: Hello, world



The model returns all four requested fields: the original text, the script name (which the prompt supplies, so the model only restates it), a Latin transliteration (romanization), and an English translation. This is especially valuable in applications like language learning or multilingual search engines.
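
For downstream use, such as indexing transliterations for search, the four numbered lines in the output above can be turned into structured data. The sketch below is a minimal example under assumptions: `parse_numbered_response` and the field names in `FIELDS` are hypothetical, and the parser expects one `N. label: value` line per item, as in the sample outputs. Model output is not guaranteed to follow that layout, so the function returns whatever fields it finds.

```python
import re

# Hypothetical field names for the four numbered items the prompt requests.
FIELDS = ("original", "script", "transliteration", "translation")


def parse_numbered_response(response_text):
    """Map lines such as '3. <label>: <value>' to a dict keyed by FIELDS."""
    result = {}
    for line in response_text.splitlines():
        # Item number, label up to the first colon, then the trimmed value.
        match = re.match(r"\s*([1-4])\.\s*[^:]*:\s*(.*\S)\s*$", line)
        if match:
            result[FIELDS[int(match.group(1)) - 1]] = match.group(2)
    return result


# Sample text copied from the Cyrillic output shown above.
sample = """1. The original text: Здравствуй, мир
2. The name of the script/writing system: Cyrillic
3. A transliteration to Latin alphabet: Zdravstvuy, mir
4. An English translation: Hello, world"""

parsed = parse_numbered_response(sample)
print(parsed)
```

A caller can compare `set(parsed)` with `set(FIELDS)` to detect a response that skipped an item or used a different layout.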

## Improving translation quality and cultural sensitivity
Here we enhance translation quality by including cultural context and idiomatic interpretation. Literal translations can be misleading, so this prompt encourages the model to provide both direct and culturally adapted translations.

In [7]:
# Create a prompt that handles idioms and cultural nuances
cultural_translation_prompt = PromptTemplate(
    input_variables=["source_lang", "target_lang", "text"],
    template="""Translate the following text from {source_lang} to {target_lang}, paying special attention to cultural context and idiomatic expressions. Provide:
    1. A direct translation
    2. A culturally adapted translation (if different)
    3. Explanations of any cultural nuances or idioms

    {source_lang} text: {text}
    {target_lang} translation and explanation:"""
)

# Sample idiomatic or culturally rich texts
cultural_texts = [
    {"source_lang": "English", "target_lang": "Japanese", "text": "It's raining cats and dogs."},
    {"source_lang": "French", "target_lang": "English", "text": "Je suis dans le pétrin."},
    {"source_lang": "Spanish", "target_lang": "German", "text": "Cuesta un ojo de la cara."}
]

# Run the prompts and print results
for text in cultural_texts:
    prompt = cultural_translation_prompt.format(**text)
    response = llm.invoke(prompt)
    print(f"From {text['source_lang']} to {text['target_lang']}:")
    print(f"Original: {text['text']}")
    print("Translation and Explanation:")
    print(response.content)
    print()

From English to Japanese:
Original: It's raining cats and dogs.
Translation and Explanation:
1. **Direct Translation**: 猫や犬が降っている (Neko ya inu ga futte iru)

2. **Culturally Adapted Translation**: バケツをひっくり返したように降っている (Baketsu o hikkuri kaeshita you ni futte iru)

3. **Explanation of Cultural Nuances and Idioms**:
   - The direct translation "猫や犬が降っている" is not idiomatic in Japanese and would likely confuse native speakers, as it literally means "cats and dogs are falling."
   - The culturally adapted translation "バケツをひっくり返したように降っている" literally means "it's raining as if a bucket has been turned over." This expression conveys the intensity and heaviness of the rain, similar to the English idiom. 
   - In Japanese culture, there are several idiomatic expressions related to rain, and this specific one effectively communicates the idea of a heavy downpour in a way that is familiar and relatable to Japanese speakers.

From French to English:
Original: Je suis dans le pétrin.
Translation and E

The model now distinguishes between direct and culturally adapted translations and explains the idioms involved. Because idioms rarely translate word for word, the prompt asks for an equivalent expression in the target language together with an explanation. This is closer to how professional human translators work and is useful for publishing, education, and localization.

These techniques are foundational for building inclusive, language-aware applications capable of communicating with users across the globe. With prompt engineering and fine-tuning, such models can further specialize in specific linguistic or cultural domains.