# 01. The Target Model

Our target model is available on Hugging Face: [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B).

`OpenBioLLM-8B` is an advanced open source language model designed specifically for the biomedical domain. Developed by Saama AI Labs, this model leverages cutting-edge techniques to achieve state-of-the-art performance on a wide range of biomedical tasks.

Biomedical Specialization: `OpenBioLLM-8B` is tailored for the unique language and knowledge requirements of the medical and life sciences fields. It was fine-tuned on a vast corpus of high-quality biomedical data, enabling it to understand and generate text with domain-specific accuracy and fluency.

This notebook covers the following steps:
1. Download the target model from Hugging Face to a local directory.
2. Configure the target model checkpoint.
3. Upload the target model to SambaStudio.

## Outline
- [1. Setup](#1-setup)
    - [1.1 Imports](#11-imports)
    - [1.2 Instantiate the SambaStudio client for BYOC](#12-instantiate-the-sambastudio-client-for-byoc)
- [2. Target model: `OpenBioLLM-8B`](#2-target-model-openbiollm-8b)
    - [2.1 Download the target model from HuggingFace](#21-download-the-target-model-from-huggingface)
    - [2.2 Configure checkpoint](#22-configure-checkpoint)
    - [2.3 Set and check chat template (optional)](#23-set-and-check-chat-template-optional)
    - [2.4 Set padding token (required for training)](#24-set-padding-token-required-for-training)
    - [2.5 Get model params and SambaStudio suitable apps](#25-get-model-params-and-sambastudio-suitable-apps)
- [3. Upload the checkpoint to SambaStudio](#3-upload-the-checkpoint-to-sambastudio)
    - [3.1 Using the checkpoint dictionary](#31-using-the-checkpoint-dictionary)
    - [3.2 Using the `target_checkpoint_config.yaml`](#32-using-the-target_checkpoint_configyaml)

## 1. Setup

In [118]:
# Import libraries
import os
import sys
import re
import json
import time
import yaml
from pprint import pprint
from huggingface_hub import hf_hub_download, HfApi

# Get absolute paths for kit_dir and repo_dir
current_dir = os.getcwd()
kit_dir =  os.path.abspath(os.path.join(current_dir, '..'))
repo_dir = os.path.abspath(os.path.join(kit_dir, '..'))

# Adding directories to the Python module search path
sys.path.append(repo_dir)

# Bring Your Own Checkpoint (BYOC)
from utils.byoc.src.snsdk_byoc_wrapper import BYOC

In [119]:
# Instantiate the BYOC (Bring Your Own Checkpoint) SambaStudio client
byoc = BYOC()

2025-04-01 22:59:53,413 [INFO] Using variables from .snapi config to set up Snsdk.


In [None]:
# Load the target model config
config_target_yaml = '../01_config_target.yaml'

# Open and load the YAML file into a dictionary
with open(config_target_yaml, 'r') as file:
    config_target = yaml.safe_load(file)['target']['model']

pprint(config_target)

{'description': 'The Meta Llama 3.1 collection of multilingual large language '
                'models (LLMs) is a collection of pretrained and instruction '
                'tuned generative models in 8B, 70B and 405B sizes (text '
                'in/text out).',
 'model_name': 'Llama-3.1-8B-Instruct',
 'param_count': 8,
 'publisher': 'meta-llama'}
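
The later cells read four keys from this dictionary: `model_name`, `publisher`, `description` and `param_count`. A minimal sketch of an early sanity check, assuming those four keys are all that is needed. `validate_target_config` is a hypothetical helper written for illustration and is not part of the BYOC wrapper.

```python
# Keys that the later cells read from the target config (assumption based on this notebook)
REQUIRED_KEYS = ("model_name", "publisher", "description", "param_count")


def validate_target_config(cfg):
    """Return the required keys that are missing or empty in cfg."""
    return [key for key in REQUIRED_KEYS if cfg.get(key) in (None, "")]


# Example dictionary with the same shape as the config printed above
example_cfg = {
    "model_name": "Llama-3.1-8B-Instruct",
    "publisher": "meta-llama",
    "description": "Example description",
    "param_count": 8,
}

missing = validate_target_config(example_cfg)
if missing:
    raise KeyError(f"Missing keys in target config: {missing}")
```

Running the check right after loading the YAML file surfaces a misspelled or empty key before any download starts.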


## 2. Target model: `OpenBioLLM-8B`

### 2.1 Download the target model from HuggingFace
You can use your own fine-tuned model or download a [Hugging Face model checkpoint](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending).

In our example we will use a model available on Hugging Face: [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B).

In [121]:
# Hugging Face repo ID, built as '<publisher>/<model_name>' from the loaded config
hf_model = config_target['publisher'] + '/' + config_target['model_name']
# Our local target directory
target_dir = os.path.join(kit_dir, 'data', 'models') 

In [122]:
# Create target dir if it does not exist
os.makedirs(target_dir, exist_ok=True)

# Download checkpoint to your target directory
repo_files = HfApi().list_repo_files(hf_model)
for file_name in repo_files:
    hf_hub_download(repo_id=hf_model, filename=file_name, cache_dir=target_dir)

In [123]:
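# Find the snapshot folder inside your target directory
checkpoint_folder = None
for root, dirs, files in os.walk(target_dir):
    # The snapshot hash is the first subfolder of <repo>/snapshots
    if "snapshots" in root and hf_model.replace("/", "--") in root and dirs:
        checkpoint_folder = os.path.join(root, dirs[0])
        break

# Fail early with a clear message instead of a NameError later on
if checkpoint_folder is None:
    raise FileNotFoundError(f"No snapshot folder found for {hf_model} in {target_dir}")
print("Checkpoint folder: ", checkpoint_folder)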

Checkpoint folder:  /Users/francescar/Documents/ai-starter-kit/e2e_draft_model_training/data/models/models--meta-llama--Llama-3.1-8B-Instruct/snapshots/0e9e39f249a16976918f6564b8830bc894c89659
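
The path printed above follows the Hugging Face cache layout `models--<publisher>--<model_name>/snapshots/<revision>`. As an alternative to walking the whole directory tree, the snapshot folder can be built directly from that layout. This is a sketch under that assumption; `find_snapshot_dir` is a hypothetical helper, and picking the most recently modified snapshot is a design choice made here, not something the notebook prescribes.

```python
import tempfile
from pathlib import Path


def find_snapshot_dir(cache_dir, repo_id):
    """Return the most recently modified snapshot folder for repo_id."""
    repo_folder = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_folder / "snapshots"
    candidates = []
    if snapshots.is_dir():
        candidates = [path for path in snapshots.iterdir() if path.is_dir()]
    if not candidates:
        raise FileNotFoundError(f"No snapshot found under {snapshots}")
    return max(candidates, key=lambda path: path.stat().st_mtime)


# Usage on a throwaway directory that mimics the cache layout
with tempfile.TemporaryDirectory() as tmp:
    fake_snapshot = Path(tmp) / "models--org--model" / "snapshots" / "abc123"
    fake_snapshot.mkdir(parents=True)
    found = find_snapshot_dir(tmp, "org/model")
    print(found.name)
```

In the notebook this would be called as `find_snapshot_dir(target_dir, hf_model)`.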


### 2.2 Configure checkpoint

To upload an existing checkpoint, collect its parameters in a checkpoint dictionary.

In [124]:
# Initialise the checkpoint dictionary
checkpoint = {
    'model_name': config_target['model_name'],
    'publisher': config_target['publisher'],
    'description': config_target['description'],
    'param_count': config_target['param_count'],
    'checkpoint_path': checkpoint_folder
}

### 2.3 Set and check chat template (optional) 

If you want to use chat templates (role-based message structures), you need to add a chat template to the tokenizer config or update the existing one.

The template should be formatted as a Jinja2 string, as in the following `llama` example.

In [125]:
# Jinja chat template
jinja_chat_template = """ 
{% for message in messages %}
    {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>' + '\n' + message['content'] | trim + '<|eot_id|>'+'\n' %}
    {% if loop.index0 == 0 %}{% set content = bos_token + content %}
    {% endif %}
    {{content}}
{% endfor %}
{{'<|start_header_id|>assistant<|end_header_id|>'+'\n'}}
"""
# Remove layout newlines and indentation, keeping the newlines that sit inside quoted strings
jinja_chat_template = re.sub(r"(?<!')\n(?!')", "", jinja_chat_template).strip().replace('  ','')
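
To see what this clean-up does, here is the same expression applied to a much smaller template. The template below is made up for illustration. The lookbehind and lookahead keep any newline that touches a single quote (the `'\n'` literals) and delete every other newline, and the final `replace` drops the indentation.

```python
import re

# A tiny made-up template; '\n' inside the triple-quoted string is a real newline
template = """
{% for m in messages %}
    {{ m['role'] + '\n' + m['content'] }}
{% endfor %}
"""

# Same expression as in the notebook cell
flat = re.sub(r"(?<!')\n(?!')", "", template).strip().replace("  ", "")

print(repr(flat))
```

Only the newline between the two quotes survives, so the template collapses to a single logical line that still emits a line break between role and content.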

In [126]:
# Open and update your tokenizer config from the checkpoint path
with open(os.path.join(checkpoint['checkpoint_path'], 'tokenizer_config.json'), 'r+') as file:
    data = json.load(file)
    data['chat_template'] = jinja_chat_template
    file.seek(0)
    file.truncate()
    json.dump(data, file, indent=4)

In [127]:
# Render template when using a roles / chat structure
test_messages = [
    {"role": "system", "content": "This is a system prompt."},
    {"role": "user", "content": "This is a user prompt."},
    {"role": "assistant", "content": "This is a response from the assistant."},
    {"role": "user", "content": "This is an user follow up"}
    ]
byoc.check_chat_templates(test_messages, checkpoint_paths=checkpoint['checkpoint_path'])

2025-04-01 22:59:56,528 [INFO] Raw chat template for checkpoint in /Users/francescar/Documents/ai-starter-kit/e2e_draft_model_training/data/models/models--meta-llama--Llama-3.1-8B-Instruct/snapshots/0e9e39f249a16976918f6564b8830bc894c89659:
{% for message in messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>' + '
' + message['content'] | trim + '<|eot_id|>'+'
' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{content}}{% endfor %}{{'<|start_header_id|>assistant<|end_header_id|>'+'
'}}

2025-04-01 22:59:56,540 [INFO] Rendered template with input test messages:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>
This is a system prompt.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
This is a user prompt.<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
This is a response from the assistant.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
This is an user follow up<|eot_id|>
<|start_header_id|>assis
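
For readers less familiar with Jinja, the logic of the template can be mirrored in plain Python. This is only an illustrative equivalent, not how the tokenizer renders templates. `render_llama3_chat` is a hypothetical name, and the default `bos_token` is taken from the rendered output above.

```python
def render_llama3_chat(messages, bos_token="<|begin_of_text|>"):
    """Plain-Python mirror of the Jinja chat template used in this notebook."""
    parts = []
    for index, message in enumerate(messages):
        # Header with the role, then the trimmed content and an end-of-turn marker
        content = (
            "<|start_header_id|>" + message["role"] + "<|end_header_id|>\n"
            + message["content"].strip() + "<|eot_id|>\n"
        )
        # Only the first message is prefixed with the beginning-of-text token
        if index == 0:
            content = bos_token + content
        parts.append(content)
    # Open an assistant turn so the model continues from here
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n")
    return "".join(parts)


prompt = render_llama3_chat([{"role": "user", "content": " Hi "}])
print(prompt)
```

Note that `trim` in the template applies only to `message['content']`, because Jinja filters bind more tightly than `+`; the Python version reflects this by stripping the content alone.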

### 2.4 Set padding token (required for training)

If `pad_token_id` is not yet set in your checkpoint configuration and you plan to fine-tune the checkpoint further, you need to set `pad_token_id` in the checkpoint's `config.json`.

In [128]:
# Add pad_token_id to the checkpoint config (None is written to the file as JSON null)
with open(os.path.join(checkpoint['checkpoint_path'], 'config.json'), 'r+') as file:
    data = json.load(file)
    data['pad_token_id'] = None
    file.seek(0)
    file.truncate()
    json.dump(data, file, indent=4)
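
The tokenizer and model config cells both follow the same read, modify, write pattern. A possible way to factor it out is sketched below. `update_json_file` is a hypothetical helper, and the `pad_token_id` value of `0` in the usage is a placeholder, not a recommended setting.

```python
import json
import os
import tempfile


def update_json_file(path, updates, overwrite=True):
    """Merge updates into the JSON object stored at path and return the result."""
    with open(path, "r", encoding="utf-8") as file:
        data = json.load(file)
    for key, value in updates.items():
        # With overwrite=False, existing keys are left untouched
        if overwrite or key not in data:
            data[key] = value
    # Write to a temporary file first, then swap it in, so a failed write
    # cannot leave a half-written config behind
    tmp_path = str(path) + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=4)
    os.replace(tmp_path, path)
    return data


# Usage on a throwaway config file
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w", encoding="utf-8") as file:
        json.dump({"vocab_size": 128256}, file)
    result = update_json_file(config_path, {"pad_token_id": 0}, overwrite=False)
    print(result)
```

One difference from the `r+` approach: files in a Hugging Face snapshot folder are typically symlinks into the cache, and `os.replace` swaps the link for a regular file rather than editing the linked file in place.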

### 2.5 Get model params and SambaStudio suitable apps

Extra parameters are required to upload your checkpoint, including model architecture, sequence length, and vocabulary size.

These parameters can be extracted from your checkpoint configuration, and included in your checkpoint dictionary parameters.


In [129]:
# Extract all the checkpoint parameters
checkpoint_config_params = byoc.find_config_params(checkpoint_paths=checkpoint['checkpoint_path'])[0]
# Update your checkpoint dictionary
checkpoint.update(checkpoint_config_params)

2025-04-01 22:59:56,557 [INFO] Params for checkpoint in /Users/francescar/Documents/ai-starter-kit/e2e_draft_model_training/data/models/models--meta-llama--Llama-3.1-8B-Instruct/snapshots/0e9e39f249a16976918f6564b8830bc894c89659:
[{'model_arch': 'llama', 'seq_length': 131072, 'vocab_size': 128256}]
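
The three values in this log line resemble standard fields of a Hugging Face `config.json`. The sketch below reads them by hand under the assumption that `model_arch`, `seq_length` and `vocab_size` correspond to `model_type`, `max_position_embeddings` and `vocab_size`. That mapping is a guess made for illustration and is not taken from the `find_config_params` implementation.

```python
import json
import os
import tempfile


def read_config_params(checkpoint_path):
    """Read architecture, sequence length and vocabulary size from config.json."""
    with open(os.path.join(checkpoint_path, "config.json"), "r", encoding="utf-8") as file:
        config = json.load(file)
    return {
        "model_arch": config.get("model_type"),  # assumed mapping
        "seq_length": config.get("max_position_embeddings"),  # assumed mapping
        "vocab_size": config.get("vocab_size"),
    }


# Usage on a throwaway checkpoint folder with a minimal config.json
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as file:
        json.dump(
            {"model_type": "llama", "max_position_embeddings": 131072, "vocab_size": 128256},
            file,
        )
    params = read_config_params(tmp)
    print(params)
```

A manual check like this can help when the extracted parameters look unexpected, for example a sequence length that does not match the model card.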


To upload a model checkpoint, you need to assign it a SambaStudio app.

You can search for suitable apps using the checkpoint parameters, and then select the best match.

In [130]:
# Look for suitable apps in SambaStudio
suitable_apps = byoc.get_suitable_apps(checkpoint)

2025-04-01 23:00:00,412 [INFO] Checkpoint Llama-3.1-8B-Instruct suitable apps:
[{'id': 'eb0aaad1-694f-41b6-958a-b974737635c4', 'name': 'Samba1 Llama3.1 Experts'}]


In [131]:
# Take the last checkpoint's list of suitable apps and use its first app (only one app was found in this run)
checkpoint["app_id"] = suitable_apps[-1][0]['id']
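
Hard-coded indices break silently when the search returns a different number of apps. A slightly more defensive selection is sketched below, assuming the apps for one checkpoint arrive as a list of dictionaries with `id` and `name` keys, as in the log line above. `pick_app` and its `preferred_keyword` parameter are hypothetical.

```python
def pick_app(apps, preferred_keyword=None):
    """Return the id of the first app whose name contains preferred_keyword,
    falling back to the first app in the list."""
    if not apps:
        raise ValueError("No suitable apps found for this checkpoint")
    if preferred_keyword:
        for app in apps:
            if preferred_keyword.lower() in app["name"].lower():
                return app["id"]
    return apps[0]["id"]


# Same shape as the list shown in the log output above
apps = [{"id": "eb0aaad1-694f-41b6-958a-b974737635c4", "name": "Samba1 Llama3.1 Experts"}]
app_id = pick_app(apps, preferred_keyword="llama3.1")
print(app_id)
```

An empty result raises a clear error here instead of an `IndexError` at the indexing step.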

In [132]:
# We can see here all the parameters required to upload the checkpoint
print(checkpoint)

{'model_name': 'Llama-3.1-8B-Instruct', 'publisher': 'meta-llama', 'description': 'The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out).', 'param_count': 8, 'checkpoint_path': '/Users/francescar/Documents/ai-starter-kit/e2e_draft_model_training/data/models/models--meta-llama--Llama-3.1-8B-Instruct/snapshots/0e9e39f249a16976918f6564b8830bc894c89659', 'model_arch': 'llama', 'seq_length': 131072, 'vocab_size': 128256, 'app_id': 'eb0aaad1-694f-41b6-958a-b974737635c4'}


## 3. Upload the checkpoint to SambaStudio

### 3.1 Using the checkpoint dictionary

In [133]:
# Call the `upload_checkpoint` method of the client with your checkpoint parameters (this can take a while)
model_id = byoc.upload_checkpoint(
    model_name=checkpoint['model_name'],
    checkpoint_path=checkpoint['checkpoint_path'],
    description=checkpoint['description'],
    publisher=checkpoint['publisher'],
    param_count=checkpoint['param_count'],
    model_arch=checkpoint['model_arch'],
    seq_length=checkpoint['seq_length'],
    vocab_size=checkpoint['vocab_size'],
    app_id=checkpoint['app_id'],
    retries=3
)

2025-04-01 23:00:00,711 [INFO] Model with name 'Llama-3.1-8B-Instruct' found with id b11219b8-84ba-4c5a-833b-edd6ffd5c0d6
2025-04-01 23:00:00,713 [INFO] Model checkpoint with name 'Llama-3.1-8B-Instruct' not created it already exist with id b11219b8-84ba-4c5a-833b-edd6ffd5c0d6


In [134]:
# Check the status of the uploaded checkpoint 
byoc.get_checkpoints_status(model_id)

2025-04-01 23:00:00,945 [INFO] model b11219b8-84ba-4c5a-833b-edd6ffd5c0d6 status: 
 {'model_id': 'b11219b8-84ba-4c5a-833b-edd6ffd5c0d6', 'status': 'Available', 'progress': 100, 'stage': 'convert', 'status_code': 200, 'headers': {'access-control-allow-headers': 'Accept, Content-Type, Content-Length, Accept-Encoding, Authorization, ResponseType, Access-Control-Allow-Origin', 'access-control-allow-methods': 'GET, POST, PATCH, DELETE', 'access-control-allow-origin': 'https://sjc3-demo2.sambanova.net', 'content-security-policy': "default-src 'self'", 'content-type': 'application/json', 'permissions-policy': 'none', 'referrer-policy': 'no-referrer', 'strict-transport-security': 'max-age=31536000; includeSubDomains, max-age=31536000; includeSubDomains', 'x-content-type-options': 'nosniff', 'x-correlation-id': 'f953649d-f252-4ebb-a6b2-77b4f7314e82', 'x-frame-options': 'DENY', 'date': 'Tue, 01 Apr 2025 22:00:00 GMT', 'x-envoy-upstream-service-time': '69', 'server': 'istio-envoy', 'content-encod

[{'model_id': 'b11219b8-84ba-4c5a-833b-edd6ffd5c0d6', 'status': 'Available', 'progress': 100, 'stage': 'convert', 'status_code': 200, 'headers': {'access-control-allow-headers': 'Accept, Content-Type, Content-Length, Accept-Encoding, Authorization, ResponseType, Access-Control-Allow-Origin', 'access-control-allow-methods': 'GET, POST, PATCH, DELETE', 'access-control-allow-origin': 'https://sjc3-demo2.sambanova.net', 'content-security-policy': "default-src 'self'", 'content-type': 'application/json', 'permissions-policy': 'none', 'referrer-policy': 'no-referrer', 'strict-transport-security': 'max-age=31536000; includeSubDomains, max-age=31536000; includeSubDomains', 'x-content-type-options': 'nosniff', 'x-correlation-id': 'f953649d-f252-4ebb-a6b2-77b4f7314e82', 'x-frame-options': 'DENY', 'date': 'Tue, 01 Apr 2025 22:00:00 GMT', 'x-envoy-upstream-service-time': '69', 'server': 'istio-envoy', 'content-encoding': 'gzip', 'vary': 'Accept-Encoding', 'transfer-encoding': 'chunked'}}]

### 3.2 Using the `target_checkpoint_config.yaml`
Alternatively, you can upload checkpoints more directly by setting all the checkpoint parameters in a config file, as in [target_checkpoints_config.yaml](../target_checkpoints_config.yaml).

In [None]:
config_file = os.path.join(kit_dir, 'target_checkpoint_config.yaml')
byoc = BYOC(config_file)
byoc.find_config_params()
byoc.upload_checkpoints()
# Poll every 10 seconds until all checkpoints reach the "Available" status
while True:
    statuses = [model['status'] for model in byoc.get_checkpoints_status()]
    if all(x == "Available" for x in statuses):
        break
    else:
        time.sleep(10)
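
The loop above never exits if a checkpoint ends up in a failed state. A variant with a timeout is sketched below. `wait_until_available` is a hypothetical helper that takes any zero-argument function returning a list of status strings, so it can be tried without a SambaStudio connection. The one-hour default is an arbitrary choice.

```python
import time


def wait_until_available(get_statuses, timeout_s=3600, interval_s=10, sleep=time.sleep):
    """Poll get_statuses() until every status is 'Available' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        statuses = get_statuses()
        if statuses and all(status == "Available" for status in statuses):
            return statuses
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Checkpoints not available after {timeout_s}s: {statuses}")
        sleep(interval_s)


# Usage with a fake status source that becomes available on the second poll
fake_responses = iter([["Uploading"], ["Available"]])
final_statuses = wait_until_available(lambda: next(fake_responses), interval_s=0)
print(final_statuses)
```

In this notebook the status source could be `lambda: [model['status'] for model in byoc.get_checkpoints_status()]`, which is the same expression the loop above uses.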