![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use watsonx and `granite-20b-multilingual` to translate between multiple languages

#### Disclaimers

- Use only projects and spaces that are available in the watsonx context.


## Notebook content

This notebook contains the steps and code to demonstrate language translation in watsonx. It introduces commands for defining a prompt and testing the model.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

#### About IBM `granite-20b-multilingual` model

The Granite 20 Billion Multilingual (granite-20b-multilingual) model has been trained using over 2.6 trillion tokens and further fine-tuned using a collection of instruction-tuning datasets. The model underwent extensive pre-training utilizing multilingual common crawl data, enabling it to effectively handle the following languages:
- **English**, 
- **German**, 
- **Spanish**, 
- **French**,
- **Portuguese**.

The table below reports the model's scores on the <a href="https://github.com/nlp-uoregon/mlmm-evaluation" target="_blank" rel="xMMLU">xMMLU</a> and <a href="https://arxiv.org/pdf/2306.05685.pdf" target="_blank" rel="xMT-Bench">xMT-Bench</a> benchmarks for each of the five languages.

| Benchmark | Average | English | German | Spanish | French | Portuguese |
|:---------:|:-------:|:-------:|:------:|:-------:|:------:|:----------:|
| xMMLU     | 38.41   | 40.58   | 37.91  | 38.04   | 37.58  | 37.95      |
| xMT-Bench | 5.34    | 5.59    | 5.18   | 5.17    | 5.19   | 5.58       |
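
As a quick sanity check on the table, the Average column appears to be the arithmetic mean of the five per-language scores. The short sketch below recomputes it from the values in the table. The dictionary layout and the `mean_score` helper are only for illustration.

```python
# Per-language scores copied from the table above
# (order: English, German, Spanish, French, Portuguese).
scores = {
    "xMMLU": [40.58, 37.91, 38.04, 37.58, 37.95],
    "xMT-Bench": [5.59, 5.18, 5.17, 5.19, 5.58],
}


def mean_score(values):
    """Return the arithmetic mean rounded to two decimals."""
    return round(sum(values) / len(values), 2)


# Recompute the Average column for each benchmark.
averages = {name: mean_score(values) for name, values in scores.items()}
print(averages)
```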


For additional information about the `granite-20b-multilingual` model, see the <a href="https://dataplatform.cloud.ibm.com/wx/samples/models/ibm/granite-20b-multilingual?context=wx" target="_blank" rel="link">model page</a>.

## Learning goal

The goal of this notebook is to demonstrate how to translate text between languages with the IBM `granite-20b-multilingual` model on watsonx, based on a query provided by the user.


## Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Foundation Models on watsonx](#models)
- [Translate the text based on the query](#translate)
- [Summary](#summary)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://cloud.ibm.com/catalog/services/watson-machine-learning" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href="https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/wml-plans.html?context=wx&audience=wdp" target="_blank" rel="noopener no referrer">here</a>).


### Install dependencies

In [None]:
!pip install -U ibm-watsonx-ai | tail -n 1

### Defining the WML credentials
This cell defines the WML credentials required to work with watsonx Foundation Model inferencing.

**Action:** Provide the IBM Cloud user API key. For details, see <a href="https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui" target="_blank" rel="noopener no referrer">documentation</a>.

In [2]:
import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your WML api key (hit enter): "),
)

### Defining the project id
The foundation model requires a project ID that provides the context for the call. The cell below reads the ID from the project in which this notebook runs. If it is not available, you are prompted to enter the project ID.

In [3]:
import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

<a id="models"></a>
## Foundation Models on `watsonx.ai`

#### List available models

The <a href="https://ibm.github.io/watsonx-ai-python-sdk/fm_helpers.html#ibm_watsonx_ai.foundation_models.get_model_specs" target="_blank" rel="get_model_specs">`get_model_specs`</a> function returns the specifications of all available models.

Additionally, models can be passed as `ModelTypes`. For further details, please consult the <a href="https://ibm.github.io/watsonx-ai-python-sdk/fm_model.html#ibm_watsonx_ai.foundation_models.utils.enums.ModelTypes" target="_blank" rel="noopener no referrer">documentation</a>.


In [4]:
from ibm_watsonx_ai.foundation_models import get_model_specs

[model["model_id"] for model in get_model_specs(credentials["url"])["resources"]]

['bigcode/starcoder',
 'bigscience/mt0-xxl',
 'codellama/codellama-34b-instruct-hf',
 'eleutherai/gpt-neox-20b',
 'google/flan-t5-xl',
 'google/flan-t5-xxl',
 'google/flan-ul2',
 'ibm-mistralai/mixtral-8x7b-instruct-v01-q',
 'ibm/granite-13b-chat-v1',
 'ibm/granite-13b-chat-v2',
 'ibm/granite-13b-instruct-v1',
 'ibm/granite-13b-instruct-v2',
 'ibm/granite-20b-multilingual',
 'ibm/mpt-7b-instruct2',
 'meta-llama/llama-2-13b-chat',
 'meta-llama/llama-2-70b-chat']
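
If you only care about models from one provider, the list can be narrowed down by the prefix before the slash. This is a minimal sketch that hard-codes the IDs from the output above so it runs without credentials. `models_from` is a hypothetical helper, not part of `ibm-watsonx-ai`.

```python
# Model IDs copied from the output above; in the notebook you would build this
# list from the get_model_specs(...) call instead of hard-coding it.
model_ids = [
    "bigcode/starcoder",
    "bigscience/mt0-xxl",
    "codellama/codellama-34b-instruct-hf",
    "eleutherai/gpt-neox-20b",
    "google/flan-t5-xl",
    "google/flan-t5-xxl",
    "google/flan-ul2",
    "ibm-mistralai/mixtral-8x7b-instruct-v01-q",
    "ibm/granite-13b-chat-v1",
    "ibm/granite-13b-chat-v2",
    "ibm/granite-13b-instruct-v1",
    "ibm/granite-13b-instruct-v2",
    "ibm/granite-20b-multilingual",
    "ibm/mpt-7b-instruct2",
    "meta-llama/llama-2-13b-chat",
    "meta-llama/llama-2-70b-chat",
]


def models_from(provider, ids):
    """Return the IDs whose provider prefix (the text before '/') equals provider."""
    return [model_id for model_id in ids if model_id.split("/", 1)[0] == provider]


# Exact prefix matching keeps "ibm-mistralai/..." out of the "ibm" results.
ibm_models = models_from("ibm", model_ids)
print(ibm_models)
```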

Specify the `model_id` of the model to use for inferencing:

In [5]:
model_id = "ibm/granite-20b-multilingual"

### Defining the model parameters

You might need to adjust the model `parameters` for different models or tasks. To do so, refer to the <a href="https://ibm.github.io/watsonx-ai-python-sdk/fm_model.html#metanames.GenTextParamsMetaNames" target="_blank" rel="noopener no referrer">documentation</a>.

In [6]:
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.SAMPLE,
    GenParams.MAX_NEW_TOKENS: 100,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.TEMPERATURE: 0.5,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1,
    GenParams.STOP_SEQUENCES: ["\n"]
}

**Warning:** Remove the `GenParams.STOP_SEQUENCES: ["\n"]` parameter if you intend to request multiple translations in a single prompt. Otherwise, generation stops at the first line break.
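
To see why the stop sequence matters, here is a purely local simulation of cutting generated text at the first stop sequence. `apply_stop_sequences` is a hypothetical helper written for this illustration and is not part of `ibm-watsonx-ai`. The real service may differ in details, for example in whether the stop sequence itself is kept in the output.

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut text right after the earliest stop sequence (hypothetical local helper)."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            # Keep the stop sequence itself and drop everything after it.
            cut = min(cut, index + len(stop))
    return text[:cut]


# Made-up raw generation that contains two translated lines.
raw = "Output: Hola.\nOutput: Adiós.\n"

single = apply_stop_sequences(raw, ["\n"])  # cut after the first line
multiple = apply_stop_sequences(raw, [])    # no stop sequence: nothing is cut
print(repr(single))
print(repr(multiple))
```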

### Initialize the model
Initialize the `ModelInference` class with the previously set parameters.

In [7]:
from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id, 
    params=parameters, 
    credentials=credentials,
    project_id=project_id)

### Model details

In [8]:
model.get_details()

{'model_id': 'ibm/granite-20b-multilingual',
 'label': 'granite-20b-multilingual',
 'provider': 'IBM',
 'source': 'IBM',
 'short_description': 'The Granite model series is a family of IBM-trained, dense decoder-only models, which are particularly well-suited for generative tasks.',
 'long_description': 'Granite models are designed to be used for a wide range of generative and non-generative tasks with appropriate prompt engineering. They employ a GPT-style decoder-only architecture, with additional innovations from IBM Research and the open community.',
 'tier': 'class_1',
 'number_params': '20b',
 'min_shot_size': 1,
 'task_ids': ['question_answering',
  'summarization',
  'retrieval_augmented_generation',
  'classification',
  'generation',
  'extraction'],
 'tasks': [{'id': 'question_answering', 'ratings': {'quality': 3}},
  {'id': 'summarization', 'ratings': {'quality': 4}},
  {'id': 'retrieval_augmented_generation', 'ratings': {'quality': 3}},
  {'id': 'classification', 'ratings':

<a id="translate"></a>

## Translate the text based on the query

### English to Spanish translation:

Define a query for the model that includes at least one example of English to Spanish translation.

**Note:** The model works best when the prompt contains at least one translation example.

In [9]:
english_to_spanish_query = """Translate the following text from English to Spanish:

Input: So far, I have not been terribly encouraged by the stance adopted by the Commission.
Output: Hasta ahora no me ha animado mucho la postura adoptada por la Comisión.

Input: I am very pleased to see that the joint resolution adopts the suggestion we made.
"""

**Warning:** Ensure that the prompt ends with a line break (newline).
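
Writing such prompts by hand makes it easy to forget the trailing newline. A minimal sketch of a prompt builder follows, assuming the same `Input:` / `Output:` layout as the query above. `build_translation_prompt` is a hypothetical helper, not an `ibm-watsonx-ai` function.

```python
def build_translation_prompt(source_lang, target_lang, examples, text):
    """Build a few-shot prompt in the Input/Output layout used in this notebook."""
    lines = [f"Translate the following text from {source_lang} to {target_lang}:", ""]
    for source_text, translated_text in examples:
        lines += [f"Input: {source_text}", f"Output: {translated_text}", ""]
    lines.append(f"Input: {text}")
    # Joining with "\n" and appending one more guarantees the trailing line break.
    return "\n".join(lines) + "\n"


prompt = build_translation_prompt(
    "English",
    "Spanish",
    [(
        "So far, I have not been terribly encouraged by the stance adopted by the Commission.",
        "Hasta ahora no me ha animado mucho la postura adoptada por la Comisión.",
    )],
    "I am very pleased to see that the joint resolution adopts the suggestion we made.",
)
print(prompt)
```

Called with these arguments, the helper should reproduce the hand-written English to Spanish query, including the final line break.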

### Generate the English to Spanish translation using IBM `granite-20b-multilingual` model.

In [10]:
translation_result = model.generate_text(english_to_spanish_query)

### Translation result

In [11]:
print(translation_result)

Output: Estoy muy contento de ver que la resolución conjunta adopta la sugerencia que hicimos.
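
The returned text keeps the `Output:` label and a trailing line break. If only the translated sentence is needed, it can be cleaned up with a small helper such as the one sketched below. `extract_translation` is a hypothetical name, and the sample string is copied from the result above.

```python
def extract_translation(generated_text, prefix="Output:"):
    """Strip surrounding whitespace and a leading 'Output:' label, if present."""
    cleaned = generated_text.strip()
    if cleaned.startswith(prefix):
        cleaned = cleaned[len(prefix):].strip()
    return cleaned


# Sample text copied from the translation result printed above.
sample = (
    "Output: Estoy muy contento de ver que la resolución conjunta "
    "adopta la sugerencia que hicimos.\n"
)
translation = extract_translation(sample)
print(translation)
```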


### English to French translation:

Define a query for the model that includes at least one example of English to French translation.

**Note:** The model works best when the prompt contains at least one translation example.

In [12]:
english_to_french_query = """Translate the following text from English to French:

Input: Finally, I welcome paragraph 16 which calls for a review of the way we deal with human rights issues in Parliament.
Output: Enfin, je me réjouis du paragraphe 16 qui appelle à une révision de la manière dont nous abordons les questions relatives aux droits de l'homme au sein du Parlement.

Input: I remember very well that we discussed it in a session in Luxembourg.
Output: Je me souviens très bien que nous en avions parlé lors d'une séance à Luxembourg.

Input: If we do not greatly increase the use of intelligent technology, we will not achieve our targets.
"""

**Warning:** Ensure that the prompt ends with a line break (newline).

### Generate the English to French translation using IBM `granite-20b-multilingual` model.

In [13]:
translation_result = model.generate_text(english_to_french_query)

### Translation result

In [14]:
print(translation_result)

Output: Si nous ne faisons pas un usage beaucoup plus important de la technologie intelligente, nous ne parviendrons pas à atteindre nos objectifs.
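
With more than one language pair, the generate-and-print steps can be wrapped in a loop. The sketch below takes the generation function as an argument so that it runs without credentials. `translate_all` and `fake_generate_text` are hypothetical names, and the stand-in function only mimics the shape of a result, not a real translation.

```python
def translate_all(generate_text, prompts):
    """Run each named prompt through generate_text and collect the results."""
    return {name: generate_text(prompt) for name, prompt in prompts.items()}


def fake_generate_text(prompt):
    """Stand-in for a real generation call; returns a placeholder result."""
    return f"Output: <placeholder for a prompt of {len(prompt)} characters>\n"


# Shortened prompts for illustration; real prompts should include at least
# one translation example and end with a line break.
prompts = {
    "en-es": "Translate the following text from English to Spanish:\n\nInput: Hello.\n",
    "en-fr": "Translate the following text from English to French:\n\nInput: Hello.\n",
}

results = translate_all(fake_generate_text, prompts)
for name, result in results.items():
    print(name, result)
```

In the notebook, you would pass `model.generate_text` in place of `fake_generate_text`.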


<a id="summary"></a>
## Summary and next steps

 You successfully completed this notebook!
 
 You learned how to translate text between multiple languages with the IBM `granite-20b-multilingual` model on watsonx. 
 
 Check out our _<a href="https://ibm.github.io/watson-machine-learning-sdk/samples.html" target="_blank" rel="noopener no referrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts. 

### Authors: 

 **Mateusz Szewczyk**, Software Engineer at Watson Machine Learning.

Copyright © 2024 IBM. This notebook and its source code are released under the terms of the MIT License.