In [6]:
import os
import pprint

# Point the Hugging Face cache (HF_HOME) at a local directory.
# This must happen before any HF library is imported.
os.environ["HF_HOME"] = "/media/aiseed/AISeed"

In [7]:
from transformers import pipeline

# Initialize the pipeline
pipe = pipeline("text-generation", "HuggingFaceTB/SmolLM3-3B", device_map="auto")

# Define your conversation
messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

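# Generate a response; the pipeline applies the chat template automatically.
# do_sample=True makes the temperature setting explicit instead of relying on model defaults.
response = pipe(messages, max_new_tokens=128, temperature=0.7, do_sample=True)
# The last message is the assistant's reply; with max_new_tokens=128 it may be cut off mid-reasoning.
pprint.pprint(response[0]['generated_text'][-1])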

Some parameters are on the meta device because they were offloaded to the cpu.
Device set to use cuda:0


{'content': '<think>\n'
            'Okay, the user is asking, "How many helicopters can a human eat '
            'in one sitting?" That\'s a pretty funny question, but I need to '
            'approach it seriously. Let me break it down.\n'
            '\n'
            'First, helicopters are large, heavy, and not meant to be eaten. '
            "They're made of metal, plastic, and other materials, not food. "
            "So, from a biological perspective, humans can't eat helicopters "
            "because they're not edible. But the user might be looking for a "
            'creative or humorous answer, maybe a joke or a riddle.\n'
            '\n'
            'I should consider the context. The user might be playing with '
            'words or testing my knowledge',
 'role': 'assistant'}
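
The reply above opens with a `<think>` block and was cut off by `max_new_tokens=128` before any closing `</think>` appeared. As a minimal sketch, assuming the reasoning is delimited by `<think>`/`</think>` as in this output, the reasoning could be separated from the final answer roughly like this. `split_reasoning` is a hypothetical helper, not part of transformers.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split model output into (reasoning, answer).

    Hypothetical helper: assumes at most one reasoning block at the
    start of the text. If the closing tag is missing (generation was
    truncated), everything after the opening tag counts as reasoning
    and the answer is empty.
    """
    stripped = text.lstrip()
    if not stripped.startswith(open_tag):
        return "", text.strip()
    body = stripped[len(open_tag):]
    reasoning, sep, answer = body.partition(close_tag)
    if not sep:
        # No closing tag found: the output stopped mid-reasoning.
        return reasoning.strip(), ""
    return reasoning.strip(), answer.strip()


complete = "<think>\n15 x 24 = 300 + 60\n</think>\n\n15 x 24 = 360"
truncated = "<think>\nOkay, the user is asking"

print(split_reasoning(complete))
print(split_reasoning(truncated))
```

With the cell above, the text to pass in would be `response[0]['generated_text'][-1]['content']`. An empty answer suggests that `max_new_tokens` was too small for the model to finish reasoning.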


In [8]:
# Configure generation parameters
generation_config = {
    "max_new_tokens": 200,
    "temperature": 0.8,
    "do_sample": True,
    "top_p": 0.9,
    "repetition_penalty": 1.1
}

# Multi-turn conversation
conversation = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "Can you help me with calculus?"},
]

# Generate first response
response = pipe(conversation, **generation_config)
conversation = response[0]['generated_text']

# Continue the conversation
conversation.append({"role": "user", "content": "What is a derivative?"})
response = pipe(conversation, **generation_config)

print("Final conversation:")
for message in response[0]['generated_text']:
    print(f"{message['role']}: {message['content']}")

Final conversation:
system: You are a helpful math tutor.
user: Can you help me with calculus?
assistant: <think>
Okay, the user asked for help with calculus. Let me start by understanding what they might need. Calculus is a broad subject that includes differential and integral calculus. Since they didn't specify which area, I should probably ask for more details to provide accurate assistance.

First, maybe they're struggling with limits, derivatives, or integrals. Or perhaps they have a specific problem in mind. It's important not to assume their exact needs but to guide them towards identifying their focus areas.

I should also consider common problems students face in calculus, such as finding derivatives of functions, applying the chain rule, integrating basic functions like polynomials, trigonometric functions, etc. Maybe they need help with related rates, optimization problems, or curve sketching using derivatives and critical points.

Another angle could be multivariable calcul
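
Because each call returns the whole conversation, the message list grows with every turn. The sketch below shows one possible way to bound it, assuming it is acceptable to drop the oldest turns. `trim_history` is a hypothetical helper, and it counts messages rather than tokens, which is a simplification.

```python
def trim_history(conversation, max_messages=4):
    """Keep all system messages plus the most recent non-system messages.

    Hypothetical helper; counts messages, not tokens.
    """
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    system = [m for m in conversation if m["role"] == "system"]
    rest = [m for m in conversation if m["role"] != "system"]
    return system + rest[-max_messages:]


history = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "Can you help me with calculus?"},
    {"role": "assistant", "content": "Of course. Which topic?"},
    {"role": "user", "content": "What is a derivative?"},
    {"role": "assistant", "content": "The rate of change of a function."},
    {"role": "user", "content": "And an integral?"},
]

trimmed = trim_history(history, max_messages=3)
for message in trimmed:
    print(f"{message['role']}: {message['content']}")
```

Note that trimming by message count can leave the history starting on an assistant turn; an odd `max_messages` keeps the window starting on a user turn when the conversation ends with a user message, as here.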

In [9]:
from transformers import AutoTokenizer

# Load SmolLM3's tokenizer
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

# Structure your conversation as a list of message dictionaries
messages = [
    {"role": "system", "content": "You are a helpful assistant focused on technical topics."},
    {"role": "user", "content": "Can you explain what a chat template is?"},
    {"role": "assistant", "content": "A chat template structures conversations between users and AI models by providing a consistent format that helps the model understand different roles and maintain context."}
]

# Apply the chat template
formatted_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=False,  # Return string instead of tokens
    add_generation_prompt=True  # Add prompt for next assistant response
)

print(formatted_chat)

<|im_start|>system
## Metadata

Knowledge Cutoff Date: June 2025
Today Date: 23 December 2025
Reasoning Mode: /think

## Custom Instructions

You are a helpful assistant focused on technical topics.

<|im_start|>user
Can you explain what a chat template is?<|im_end|>
<|im_start|>assistant
A chat template structures conversations between users and AI models by providing a consistent format that helps the model understand different roles and maintain context.<|im_end|>
<|im_start|>assistant

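To make the effect of `add_generation_prompt` concrete, here is a simplified ChatML-style renderer in plain Python. It is an illustration only: SmolLM3's real template is a Jinja template shipped with the tokenizer and, as the output above shows, it also injects a metadata header into the system turn. `render_chatml` is a hypothetical name.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render messages in a simplified ChatML-style layout.

    Illustrative only: not SmolLM3's actual template, which also adds
    a metadata header to the system turn.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


demo = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

print(render_chatml(demo, add_generation_prompt=True))
```

The only difference between the two settings is the trailing `<|im_start|>assistant` line, which signals to the model that the next tokens belong to a new assistant turn.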


In [10]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"}
]

# Without generation prompt - for completed conversations
formatted_without = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=False
)

print("Without generation prompt:")
print(formatted_without)
print("\n" + "="*50 + "\n")

# With generation prompt - for inference
formatted_with = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

print("With generation prompt:")
print(formatted_with)

Without generation prompt:
<|im_start|>system
## Metadata

Knowledge Cutoff Date: June 2025
Today Date: 23 December 2025
Reasoning Mode: /think

## Custom Instructions

You are a helpful AI assistant named SmolLM, trained by Hugging Face. Your role as an assistant involves thoroughly exploring questions through a systematic thinking process before providing the final precise and accurate solutions. This requires engaging in a comprehensive cycle of analysis, summarizing, exploration, reassessment, reflection, backtracking, and iteration to develop well-considered thinking process. Please structure your response into two main sections: Thought and Solution using the specified format: <think> Thought section </think> Solution section. In the Thought section, detail your reasoning process in steps. Each step should include detailed considerations such as analysing questions, summarizing relevant findings, brainstorming new ideas, verifying the accuracy of the current steps, refining any e

In [11]:
# Standard mode - direct answer
standard_messages = [
    {"role": "user", "content": "What is 15 × 24?"},
    {"role": "assistant", "content": "15 × 24 = 360"}
]

# Thinking mode - show reasoning process
# SmolLM3 delimits reasoning with <think>...</think>, as in the generated output earlier in this notebook
thinking_messages = [
    {"role": "user", "content": "What is 15 × 24?"},
    {"role": "assistant", "content": "<think>\nI need to multiply 15 by 24. Let me break this down:\n15 × 24 = 15 × (20 + 4) = (15 × 20) + (15 × 4) = 300 + 60 = 360\n</think>\n\n15 × 24 = 360"}
]

# Apply templates
standard_formatted = tokenizer.apply_chat_template(standard_messages, tokenize=False)
thinking_formatted = tokenizer.apply_chat_template(thinking_messages, tokenize=False)

print("Standard mode:")
print(standard_formatted)
print("\nThinking mode:")
print(thinking_formatted)

Standard mode:
<|im_start|>system
## Metadata

Knowledge Cutoff Date: June 2025
Today Date: 23 December 2025
Reasoning Mode: /think

## Custom Instructions

You are a helpful AI assistant named SmolLM, trained by Hugging Face. Your role as an assistant involves thoroughly exploring questions through a systematic thinking process before providing the final precise and accurate solutions. This requires engaging in a comprehensive cycle of analysis, summarizing, exploration, reassessment, reflection, backtracking, and iteration to develop well-considered thinking process. Please structure your response into two main sections: Thought and Solution using the specified format: <think> Thought section </think> Solution section. In the Thought section, detail your reasoning process in steps. Each step should include detailed considerations such as analysing questions, summarizing relevant findings, brainstorming new ideas, verifying the accuracy of the current steps, refining any errors, and r

In [12]:
def create_thinking_example(question, answer, reasoning=None):
    """Create a user/assistant training example, optionally with a reasoning block.

    Reasoning is wrapped in <think>...</think>, the delimiters that appear in
    SmolLM3's system prompt and generated output.
    """
    if reasoning:
        assistant_content = f"<think>\n{reasoning}\n</think>\n\n{answer}"
    else:
        assistant_content = answer

    return [
        {"role": "user", "content": question},
        {"role": "assistant", "content": assistant_content}
    ]

# Example usage
math_example = create_thinking_example(
    question="What is the derivative of x²?",
    answer="The derivative of x² is 2x",
    reasoning="Using the power rule: d/dx(x^n) = n·x^(n-1)\nFor x²: n=2, so d/dx(x²) = 2·x^(2-1) = 2x"
)

In [13]:
math_example

[{'role': 'user', 'content': 'What is the derivative of x²?'},
 {'role': 'assistant',
  'content': '<think>\nUsing the power rule: d/dx(x^n) = n·x^(n-1)\nFor x²: n=2, so d/dx(x²) = 2·x^(2-1) = 2x\n</think>\n\nThe derivative of x² is 2x'}]

In [15]:
# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function", 
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]
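
The definitions above only describe the tools; the application still has to run them when the model asks for one. As a minimal sketch, a requested call could be checked against the schema's `required` and `properties` fields and then routed to a Python function. The `get_weather` implementation here is a hard-coded stand-in, and `TOOL_FUNCTIONS`, `TOOL_SCHEMAS`, and `dispatch_tool_call` are hypothetical names, not part of transformers.

```python
import json


# Hypothetical stand-in implementation; a real one would call a weather API.
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 22, "unit": unit}


TOOL_FUNCTIONS = {"get_weather": get_weather}

# Same shape as the "parameters" of the get_weather tool above (descriptions trimmed).
TOOL_SCHEMAS = {
    "get_weather": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    }
}


def dispatch_tool_call(name, arguments):
    """Check a tool call against its schema, then run the matching function."""
    if name not in TOOL_FUNCTIONS:
        raise KeyError(f"Unknown tool: {name}")
    # Accept either a dict or a JSON-encoded string of arguments.
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    schema = TOOL_SCHEMAS[name]
    missing = [key for key in schema["required"] if key not in arguments]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")
    unknown = [key for key in arguments if key not in schema["properties"]]
    if unknown:
        raise ValueError(f"Unexpected arguments: {unknown}")
    return TOOL_FUNCTIONS[name](**arguments)


result = dispatch_tool_call("get_weather", {"location": "Paris, France"})
print(json.dumps(result))
```

The result would typically be serialized and sent back to the model as the `content` of a `tool` message. This sketch checks only argument names, not value types or `enum` membership.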

In [16]:
# Conversation with tool usage
messages = [
    {"role": "system", "content": "You are a helpful assistant with access to tools."},
    {"role": "user", "content": "What's the weather like in Paris?"},
    {
        "role": "assistant", 
        "content": "I'll check the weather in Paris for you.",
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    # Transformers chat templates expect arguments as a dict, not a JSON string
                    "arguments": {"location": "Paris, France", "unit": "celsius"}
                }
            }
        ]
    },
    {
        "role": "tool",
        "tool_call_id": "call_1", 
        "content": '{"temperature": 22, "condition": "sunny", "humidity": 60}'
    },
    {
        "role": "assistant",
        "content": "The weather in Paris is currently sunny with a temperature of 22°C and 60% humidity. It's a beautiful day!"
    }
]

# Apply chat template with tools
formatted_with_tools = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=False
)

print("Chat template with tools:")
print(formatted_with_tools)

Chat template with tools:
<|im_start|>system
## Metadata

Knowledge Cutoff Date: June 2025
Today Date: 23 December 2025
Reasoning Mode: /think

## Custom Instructions

You are a helpful assistant with access to tools.

### Tools

You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:

<tools>
{'type': 'function', 'function': {'name': 'get_weather', 'description': 'Get the current weather for a location', 'parameters': {'type': 'object', 'properties': {'location': {'type': 'string', 'description': 'The city and state, e.g. San Francisco, CA'}, 'unit': {'type': 'string', 'enum': ['celsius', 'fahrenheit'], 'description': 'The temperature unit'}}, 'required': ['location']}}}
{'type': 'function', 'function': {'name': 'calculate', 'description': 'Perform mathematical calculations', 'parameters': {'type': 'object', 'properties': {'expression': {'type': 'string', 'description': 'Mathematical expression to eva