# Translate text between languages

Automatically translate content into multiple languages using LLMs.


## Problem

You have content that needs to be available in multiple languages, such as product descriptions, documentation, and user-generated content. Manual translation is slow and expensive.

| Content type | Volume | Target languages |
|--------------|--------|--------|
| Product descriptions | 10,000 items | 5 languages |
| Support articles | 500 docs | 3 languages |
| User reviews | Ongoing | Spanish, French |
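
As a rough sizing exercise, the fixed-volume rows in the table above imply the following number of individual translations, counting one per item per target language. The ongoing user reviews are left out because the table gives no volume for them.

```python
# Volumes and target-language counts taken from the table above.
workloads = {
    'Product descriptions': (10_000, 5),
    'Support articles': (500, 3),
}

# One translation per item per target language.
requests_per_type = {name: items * langs for name, (items, langs) in workloads.items()}
total_requests = sum(requests_per_type.values())

for name, count in requests_per_type.items():
    print(f'{name}: {count:,} translations')
print(f'Total: {total_requests:,} translations')
```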


## Solution

**What's in this recipe:**
- Translate text using OpenAI models
- Create multiple language columns from one source
- Handle batch translation efficiently

You add a computed column for each target language. Translations are generated automatically when you insert new content and are stored with the table, so later queries do not call the model again.


### Setup


In [None]:
%pip install -qU pixeltable openai


In [None]:
import os
import getpass

if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key: ')


In [None]:
import pixeltable as pxt
from pixeltable.functions.openai import chat_completions


In [None]:
# Create a fresh directory (force=True drops any existing 'translate_demo' directory first)
pxt.drop_dir('translate_demo', force=True)
pxt.create_dir('translate_demo')


### Create translation pipeline


In [None]:
# Create table for content
content = pxt.create_table(
    'translate_demo.content',
    {'title': pxt.String, 'text_en': pxt.String}
)


In [None]:
# Add Spanish translation column
spanish_prompt = 'Translate the following text to Spanish. Return only the translation, no explanations:\n\n' + content.text_en

content.add_computed_column(
    response_es=chat_completions(
        messages=[{'role': 'user', 'content': spanish_prompt}],
        model='gpt-4o-mini'
    )
)
content.add_computed_column(text_es=content.response_es.choices[0].message.content)
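
The second column pulls the translated string out of the JSON response stored in `response_es`. The sketch below uses a plain Python dict to show the path `choices[0].message.content` that the expression follows. The dict is a trimmed, hand-written illustration of a chat completion response, not real API output, and actual responses carry more fields.

```python
# Hand-written, trimmed example of a chat completion response (illustrative values only).
sample_response = {
    'choices': [
        {
            'index': 0,
            'message': {'role': 'assistant', 'content': '¡Bienvenido a nuestra plataforma!'},
            'finish_reason': 'stop',
        }
    ],
}

# Same path as content.response_es.choices[0].message.content, written as dict lookups.
translated = sample_response['choices'][0]['message']['content']
print(translated)
```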


In [None]:
# Add French translation column
french_prompt = 'Translate the following text to French. Return only the translation, no explanations:\n\n' + content.text_en

content.add_computed_column(
    response_fr=chat_completions(
        messages=[{'role': 'user', 'content': french_prompt}],
        model='gpt-4o-mini'
    )
)
content.add_computed_column(text_fr=content.response_fr.choices[0].message.content)


### Translate content


In [None]:
# Insert sample content
sample_content = [
    {
        'title': 'Welcome Message',
        'text_en': 'Welcome to our platform! We are excited to have you here.'
    },
    {
        'title': 'Product Description',
        'text_en': 'This lightweight laptop features a 14-inch display and all-day battery life.'
    },
    {
        'title': 'Support Article',
        'text_en': 'To reset your password, click the forgot password link on the login page.'
    },
]

content.insert(sample_content)


In [None]:
# View all translations
content.select(content.title, content.text_en, content.text_es, content.text_fr).collect()


In [None]:
# Pretty print one example
row = content.where(content.title == 'Welcome Message').collect()[0]
print(f"English:  {row['text_en']}")
print(f"Spanish:  {row['text_es']}")
print(f"French:   {row['text_fr']}")


## Explanation

**How it works:**

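Each target language is a pair of computed columns: one holds the model response to a translation prompt, and the other extracts the translated text from it. When you insert new content:
1. The English text in `text_en` is read for each new row
2. A translation prompt is built for each language
3. The translations for the different languages run in parallel
4. Results are stored in the table, so queries do not trigger re-translation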
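
The caching behaviour in steps 1 to 4 can be pictured with a plain-Python analogy. This is only a conceptual sketch, not how Pixeltable is implemented: `fake_translate` is a hypothetical stand-in for the model call, and the `stored` dict plays the role of the stored column values.

```python
call_log = []  # records every simulated model call

def fake_translate(text: str, language: str) -> str:
    """Hypothetical stand-in for an LLM call; logs each invocation."""
    call_log.append((text, language))
    return f'[{language}] {text}'

stored = {}  # (text, language) -> translation, standing in for stored column values

def insert(texts, languages=('Spanish', 'French')):
    """Translate only the (text, language) pairs that have no stored result yet."""
    for text in texts:
        for language in languages:
            key = (text, language)
            if key not in stored:
                stored[key] = fake_translate(text, language)

def query(text, language):
    """Reading a stored translation never triggers a new call."""
    return stored[(text, language)]

insert(['Welcome to our platform!'])
query('Welcome to our platform!', 'Spanish')
query('Welcome to our platform!', 'Spanish')
print(len(call_log))  # 2: one call per language at insert time, none for the reads
```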

**Adding more languages:**

```python
# Add German translation
german_prompt = 'Translate the following text to German. Return only the translation, no explanations:\n\n' + content.text_en
content.add_computed_column(
    response_de=chat_completions(messages=[{'role': 'user', 'content': german_prompt}], model='gpt-4o-mini')
)
content.add_computed_column(text_de=content.response_de.choices[0].message.content)
```
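
When several languages are added at once, the repeated prompt text can be generated from a small mapping. The helper below is a hypothetical convenience, not part of Pixeltable. It only builds the fixed instruction prefix, which would then be concatenated with `content.text_en` as in the cells above.

```python
# Hypothetical mapping from column suffix to language name.
LANGUAGES = {'es': 'Spanish', 'fr': 'French', 'de': 'German'}

def translation_instruction(language: str) -> str:
    """Build the instruction prefix used in this recipe for a given target language."""
    return (
        f'Translate the following text to {language}. '
        'Return only the translation, no explanations:\n\n'
    )

for suffix, language in LANGUAGES.items():
    prefix = translation_instruction(language)
    # In the notebook, the prompt expression would be prefix + content.text_en,
    # with the results stored in response_<suffix> and text_<suffix>.
    print(f'text_{suffix}: {prefix!r}')
```

Keeping the instruction in one place means every language gets the same "return only the translation" constraint.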

**Cost optimization:**

| Strategy | Benefit |
|----------|---------|
| Use `gpt-4o-mini` | Lower cost per translation |
| Cache results | No re-translation on queries |
| Batch inserts | Efficient processing |
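
For a large backlog, rows can be split into fixed-size batches before being passed to `content.insert`. The `chunked` helper and the batch size of 2 are illustrative assumptions; a suitable size depends on your rate limits and data.

```python
from typing import Iterator

def chunked(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size < 1:
        raise ValueError('size must be at least 1')
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

rows = [{'title': f'Item {i}', 'text_en': f'Sample text {i}'} for i in range(5)]
batches = list(chunked(rows, size=2))

for batch in batches:
    # In the notebook, each batch would be inserted with content.insert(batch).
    print(len(batch))
```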


## See also

- [Summarize text](./text-summarize.ipynb) - Text summarization with LLMs
- [Extract structured data](./vision-structured-output.ipynb) - Get JSON from LLM responses
