# Text generation with the LumiOpen/Poro-34B model

We start with a so-called "base model", the pre-trained [**LumiOpen/Poro-34B**](https://huggingface.co/LumiOpen/Poro-34B). The task is to generate text with it through the API provided by the Aitta inference platform. To use the API we need the Python client `aitta-client`, which is already installed in this workspace. For its documentation, see [PyPI](https://pypi.org/project/aitta-client/).

We also need an API key. You can create one after logging in to [Aitta](https://staging-aitta.2.rahtiapp.fi/public).

<details open>
<summary>Instructions for creating an API key</summary>
<br>
1. Log in to the web frontend  
<br>
2. Navigate to the model page of the model for which to generate the token  
<br>
3. Open the tab titled 'API Key'  
<br>
4. Generate and copy the token   
</details>

From here on we refer to your API key as an "access token". We use it to configure the Aitta client, after which the model can be loaded for use.

*Save your API key in a safe place for future use; it is valid for over 80 days. Note that API keys are currently **model-specific**.*

**Let's start by installing and importing the `aitta-client` library and configuring a `Client` instance.**

In [None]:
# Install the aitta-client library
!pip install --upgrade aitta-client

# Alternatively, you can install all the libraries listed in the requirements.txt file
# !pip install -r requirements.txt

In [None]:
# Set your personal model-specific API key here  
api_key = "<API-key>"

# Security Note:  
# In a typical setup, API keys should be stored securely using environment variables or secret management tools.  
# However, since this temporary Jupyter notebook expires in 4 hours and does not support environment variables,  
# we define the API key directly in this cell.  

# To keep your API key safe, consider removing or clearing this cell after execution.
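
As an alternative to typing the key into a cell, the sketch below prompts for it without echoing the input, so the key never appears in the saved notebook. It assumes the notebook front end supports interactive input; `read_api_key` is a hypothetical helper name, not part of `aitta-client`.

```python
import getpass


def read_api_key(prompt_fn=getpass.getpass):
    """Prompt for the API key without echoing it and strip surrounding whitespace.

    `prompt_fn` is injectable so the helper can be tested without user input.
    """
    key = prompt_fn("Aitta API key: ").strip()
    if not key:
        raise ValueError("The API key must not be empty.")
    return key


# In the notebook, uncomment the next line instead of hard-coding the key:
# api_key = read_api_key()
```

The key then lives only in the kernel's memory and disappears when the notebook session ends.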

In [None]:
# Import needed libraries and modules
from aitta_client import Model, Client, StaticAccessTokenSource

# configure Client instance with API URL and access token
token_source = StaticAccessTokenSource(api_key)
client = Client("https://api-staging-aitta.2.rahtiapp.fi", token_source)

## Loading the model and some Jupyter notebook tips

At the moment, only model-specific tokens are available. We can still check which models are accessible with the API key we created.

**See the methods available on the client instance:**
* Use `Tab` completion in a Jupyter notebook code cell: type the instance name followed by a dot, like `client.`, and press `Tab` to list the available methods. We use the `get_model_list` method to see the available models.

**View method parameters:**
* Option 1: To get detailed information about a method, including its parameters and docstring, append a `?` to the method name. For example, write `client.get_model_list?` in a code cell and run it.
* Option 2: Type a method call up to the opening parenthesis, like `client.get_model_list(`, and press `Shift + Tab`. Jupyter displays a tooltip with the method signature, including its parameters, expected types, and a short description if available. This is a quick way to see what the method requires.
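
The same information is also available from plain Python, which can help when tooltips are not shown. The sketch below uses the standard `dir` and `inspect` tools on `DemoClient`, a hypothetical stand-in class defined only for illustration; in the notebook you would pass the real `client` instead.

```python
import inspect


class DemoClient:
    """Hypothetical stand-in; replace with the real `client` in the notebook."""

    def get_model_list(self) -> list:
        """Return a list of available models."""
        return []


def public_methods(obj):
    """List the names of callable attributes that do not start with an underscore."""
    return [
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    ]


demo = DemoClient()
print(public_methods(demo))                      # roughly what `demo.` + Tab lists
print(inspect.signature(demo.get_model_list))    # the signature shown by Shift + Tab
print(inspect.getdoc(demo.get_model_list))       # the docstring shown by `?`
```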


In [None]:
### Try out here ####

In [None]:
# Use the get_model_list method to retrieve the list of models
model_list = client.get_model_list()

# Iterate through the model list and print each model's ID
for available_model in model_list:
    print(available_model.id)

In [None]:
# Load the LumiOpen/Poro model
model = Model.load("LumiOpen/Poro", client)

print(model.description)

In [None]:
# declare inputs and parameters for a text completion inference task
inputs = {
    'input': 'Suomen paras kaupunki on'
}

params = {
    'do_sample': True,
    'max_new_tokens': 20
}

print(f"INPUT:\n{inputs}")

# start the inference and wait for completion
result = model.start_and_await_inference(inputs, params)
print(f"OUTPUT:\n{result}")


## Tuning generation parameters

You can customize text generation by adding parameters to the `params` dictionary in the example code. The following options are currently supported when using the `start_and_await_inference` method and align with those used in [Hugging Face’s Transformers module](https://huggingface.co/docs/transformers/main_classes/text_generation) for text generation:

**Controlling output length**
* max_new_tokens
* min_new_tokens
* min_length
* max_length

**Adjusting the generation strategy**
* do_sample
* num_beams

**Modifying model output behavior**
* temperature
* top_k
* top_p
 
For more details on how these parameters work (including their minimum and maximum values, data types, and how certain parameters override others), refer to the [Hugging Face documentation](https://huggingface.co/docs/transformers/main_classes/text_generation).
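
To build intuition for `temperature`, `top_k`, and `top_p`, here is a minimal pure-Python sketch of what these settings do to a next-token distribution when `do_sample` is enabled. The candidate tokens and their scores are made up for illustration, and the functions are a simplified model of the idea, not the code that runs on the inference platform.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert scores to probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs):
    """Rescale the remaining probabilities so that they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]


def apply_top_k(probs, k):
    """Keep only the k most probable tokens."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(ranked[:k])
    return renormalize([p if i in kept else 0.0 for i, p in enumerate(probs)])


def apply_top_p(probs, top_p):
    """Keep the smallest set of most probable tokens whose total reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in ranked:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return renormalize([p if i in kept else 0.0 for i, p in enumerate(probs)])


tokens = ["Helsinki", "Tampere", "Turku", "Oulu"]  # made-up candidate tokens
logits = [2.0, 1.0, 0.5, -1.0]                     # made-up model scores

for temperature in (0.5, 1.0, 2.0):
    probs = softmax(logits, temperature)
    print(f"temperature={temperature}:", dict(zip(tokens, [round(p, 3) for p in probs])))

base = softmax(logits)
print("top_k=2:  ", dict(zip(tokens, [round(p, 3) for p in apply_top_k(base, 2)])))
print("top_p=0.8:", dict(zip(tokens, [round(p, 3) for p in apply_top_p(base, 0.8)])))
```

Lower temperatures concentrate probability on the top candidate, while `top_k` and `top_p` remove unlikely candidates entirely before a token is sampled.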

In [None]:
# declare inputs and parameters for a text completion inference task
inputs = {
    'input': 'Suomen paras kaupunki on'
}

params = {
    'do_sample': True,
    'max_new_tokens': 20,
    #### Add parameters here to test how they affect generated response ####
}

print(f"INPUT:\n{inputs}")

# start the inference and wait for completion
result = model.start_and_await_inference(inputs, params)
print(f"OUTPUT:\n{result}")
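
When trying several settings, it can help to build each `params` dictionary from shared defaults and catch misspelled parameter names early. The sketch below does this with a hypothetical helper, `build_params`; the set of names mirrors the parameters listed in this notebook, and the inference call is left as a comment because it needs the `model` and `inputs` objects defined above.

```python
# Parameter names listed as supported in this notebook
SUPPORTED_PARAMS = {
    "max_new_tokens", "min_new_tokens", "min_length", "max_length",
    "do_sample", "num_beams", "top_k", "temperature", "top_p",
}

# Defaults taken from the example cell above
DEFAULT_PARAMS = {"do_sample": True, "max_new_tokens": 20}


def build_params(**overrides):
    """Merge overrides into the defaults and reject unknown parameter names."""
    unknown = set(overrides) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")
    return {**DEFAULT_PARAMS, **overrides}


experiments = [
    build_params(),                               # baseline
    build_params(temperature=0.3),                # more conservative sampling
    build_params(temperature=1.2, top_p=0.9),     # more varied sampling
    build_params(do_sample=False, num_beams=4),   # beam search instead of sampling
]

for experiment in experiments:
    print(experiment)
    # In the notebook, with `model` and `inputs` defined as above:
    # result = model.start_and_await_inference(inputs, experiment)
    # print(f"OUTPUT:\n{result}")
```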
