# N-ATLaS-LLM Demo Notebook
# Multilingual model for Hausa, Igbo, Yoruba, and English


In [2]:
!pip install -U transformers -qq


In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
from datetime import datetime

The model you are trying to use is gated. Please make sure you have access to it by visiting the model page. To run inference, either set HF_TOKEN in your environment variables or Secrets, or run the following cell to log in. 🤗
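
One way to satisfy the gated-access check from code is to read the token from the environment and pass it to `huggingface_hub.login`. The sketch below assumes the token is stored in an `HF_TOKEN` environment variable, as the message suggests; `resolve_hf_token` and `login_from_env` are hypothetical helper names, not part of any library.

```python
import os


def resolve_hf_token(environ=os.environ):
    """Return the Hugging Face token from HF_TOKEN, or raise if it is missing."""
    token = environ.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set")
    return token


def login_from_env():
    """Log in to the Hugging Face Hub with the token from the environment."""
    # Imported lazily so resolve_hf_token works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=resolve_hf_token())
```

Calling `login_from_env()` before loading the model should make the token available to `from_pretrained`; access to the gated repository still has to be granted on the model page.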

In [10]:
# Load model and tokenizer
#
# Memory requirements
# A free Google Colab session has just enough resources to test the model
# before hitting a CUDA out-of-memory error. Two usage readings:
#   System RAM:  8.5 / 12.7 GB    GPU RAM: 12.0 / 15.0 GB
#   System RAM: 11.3 / 12.7 GB    GPU RAM: 14.1 / 15.0 GB
# For cheaper inference you can use the GGUF models here:
# https://huggingface.co/inuwamobarak/N-ATLaS-8B-GGUF-Q4_K_M/tree/main

MODEL_NAME = "NCAIR1/N-ATLaS"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    dtype=torch.float16,  # load the weights in half precision
    device_map="auto"
)

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

ValueError: You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead.
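
The offload error is consistent with the weights not fitting in the memory that `device_map="auto"` could find. A rough back-of-the-envelope estimate follows, assuming about 8 billion parameters (a figure inferred from the "8B" in the GGUF repository name, not confirmed by the model card) and counting weights only:

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB (no activations or KV cache)."""
    return num_params * bytes_per_param / 2**30


ASSUMED_PARAMS = 8e9  # assumption: inferred from "8B" in the GGUF repository name

fp16_gib = weight_memory_gib(ASSUMED_PARAMS, 2)   # float16: 2 bytes per parameter
q4_gib = weight_memory_gib(ASSUMED_PARAMS, 0.5)   # 4-bit: roughly 0.5 bytes per parameter

print(f"float16 weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:   ~{q4_gib:.1f} GiB")
```

Under these assumptions the float16 weights alone sit close to the 15.0 GB GPU limit reported in the notebook, which would explain why the session has so little headroom and why a 4-bit GGUF build is the cheaper option.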

In [5]:
def format_messages(messages):
    """Format chat messages using tokenizer's chat template."""
    current_date = datetime.now().strftime('%d %b %Y')
    return tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=False,
        date_string=current_date
    )

def chat(messages, max_new_tokens=512, temperature=0.1, repetition_penalty=1.12):
    """Run inference on the model with chat-style input.

    Returns the decoded text of the full sequence (prompt plus completion).
    """
    text = format_messages(messages)
    inputs = tokenizer(
        text,
        return_tensors="pt",
        add_special_tokens=False
    ).to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        use_cache=True
    )

    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
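
`model.generate` returns the prompt tokens followed by the newly generated ones, so `chat` as written echoes the system and user text in its return value. A minimal sketch of trimming the prompt before decoding is shown here on plain lists so that it runs without the model; `strip_prompt` is a hypothetical helper name.

```python
def strip_prompt(output_ids, prompt_length):
    """Drop the first `prompt_length` token ids from each generated sequence."""
    return [sequence[prompt_length:] for sequence in output_ids]


# Toy example: a 3-token prompt followed by 2 generated tokens.
generated = [[101, 102, 103, 7, 8]]
new_tokens = strip_prompt(generated, prompt_length=3)
print(new_tokens)
```

Inside `chat`, the tensor equivalent would be to slice `outputs[:, inputs["input_ids"].shape[1]:]` and pass the result to `tokenizer.batch_decode`.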

In [None]:
# Only one of the inference examples below can be run per free Colab session.

In [6]:
# Example 1: Hausa Query
messages = [
    {
        "role": "system",
        "content": "You are a large language model trained by Awarri AI technologies. You are a friendly assistant and here to help."
    },
    {
        "role": "user",
        "content": "menene ake nufi da gwagwarmaya"  # Hausa: What is meant by struggle?
    }
]

response = chat(messages)
print(response)

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


system

Cutting Knowledge Date: December 2023
Today Date: 24 Sep 2025

You are a large language model trained by Awarri AI technologies. You are a friendly assistant and here to help.user

menene ake nufi da gwagwarmayaassistant

Gwagwarmaya na iya zama wani abu mai rikitarwa, yana daukar nau'i-nau'i daban-daban bisa ga fannin ko yanayin da aka yi amfani da shi. Ga wasu ma'anoni gama-gari:

1. **Aiki**: Gwagwarmaya na iya bayyana a matsayin kokarin jiki ko tunani da mutum ke yi don cimma burin sa, kammala aiki, ko samun nasara.

2. **Tashin hankali**: Wani lokaci ana ganin gwagwarmaya a matsayin kalubale ko tashin hankali da dole ne a shawo kan su, yawanci suna haɗa da wahalhalu ko matsaloli.

3. **Matsayi na siyasa ko zamantakewa**: Ana iya amfani da kalmar gwagwarmaya don kwatanta fafatawa tsakanin masu ruwa da tsaki a fagen siyasa ko zamantakewa, inda mutane ko kungiyoyi ke yunkurin samun rinjaye, albarkatu, ko tasiri.

4. **Halin tunani ko jin dadi**: A cikin mahallin kai, gwagwarm

In [None]:
# Example 2: Yoruba Query
messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Kini itumọ ọrọ 'ọ̀rẹ́'?"}  # Yoruba: What is the meaning of the word 'ọ̀rẹ́' (friend)?
]

response = chat(messages)
print(response)


In [None]:
# Example 3: Igbo Query
messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Kedu ihe a na-akpọ ọgụ?"}  # Igbo: roughly "What is called 'ọgụ' (fight, struggle)?"
]

response = chat(messages)
print(response)


In [None]:
# Example 4: English Query
messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Explain the importance of preserving African languages in the digital age."}
]

response = chat(messages)
print(response)


In [None]:
# Try your own query
user_input = input("Enter your question: ")

messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": user_input}
]

response = chat(messages)
print("\nModel Response:\n", response)
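
The examples above all build the same two-message list by hand. A small helper can remove that repetition; `build_messages` and `DEFAULT_SYSTEM_PROMPT` are illustrative names introduced here, and the default prompt simply reuses the one from the examples.

```python
DEFAULT_SYSTEM_PROMPT = "You are an AI assistant."


def build_messages(user_text, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Return a message list with one system turn and one user turn."""
    if not user_text.strip():
        raise ValueError("user_text must not be empty")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


messages = build_messages(
    "Explain the importance of preserving African languages in the digital age."
)
print(messages[1]["content"])
```

The returned list has the same shape as the hand-written `messages` in the cells above, so it can be passed straight to `chat(messages)`.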
