# Using LLMs in a conversational style

This section shows how to build a simple LLM chat assistant: a small chatbot, written with the [transformers](https://huggingface.co/docs/transformers/v4.17.0/en/installation) Python library, that you can talk to in a conversational style similar to well-known chat assistants such as [ChatGPT](https://chatgpt.com/), [Claude](https://claude.ai/), [Gemini](https://gemini.google.com/app), and [DeepSeek](https://chat.deepseek.com/sign_in).

## 1. Import libraries

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

import warnings
warnings.filterwarnings("ignore")



## 2. Model setup

In [2]:
# Pick a model (uncomment the one you wish to use)
# model_id = "HuggingFaceTB/SmolLM2-135M" # base model
model_id = "HuggingFaceTB/SmolLM2-135M-Instruct" # fine-tuned assistant model
# model_id = "HuggingFaceTB/SmolLM3-3B-Base" # base model
# model_id = "HuggingFaceTB/SmolLM3-3B" # fine-tuned assistant model
# model_id = "meta-llama/Llama-3.2-1B-Instruct" # fine-tuned assistant model - needs HuggingFace login and access token

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Set pad_token_id to eos_token_id to avoid unnecessary warning messages
if tokenizer.pad_token_id is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = tokenizer.eos_token_id

## 3. Initialise inference pipeline

In [3]:
# Build text-generation inference pipeline
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map="auto")

Device set to use mps:0
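For instruct models, the text-generation pipeline accepts a list of role/content messages, and the tokenizer's chat template flattens them into a single prompt string before generation. The sketch below imitates that step in plain Python, assuming a ChatML-style layout. `render_chatml` is a hypothetical helper written for illustration only, and the real template is defined by each model's tokenizer and may differ.

```python
# Illustrative only: a ChatML-style rendering of a message list.
# render_chatml is a hypothetical helper, not part of transformers,
# and the tag layout is an assumption about the template format.
def render_chatml(messages, add_generation_prompt=True):
    parts = []
    for message in messages:
        # Each message is wrapped in start/end tags together with its role
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [{"role": "user", "content": "Hello!"}]
prompt = render_chatml(messages)
print(prompt)
```

With a loaded tokenizer, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` returns the prompt string the model actually sees.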


## 4. Create function to generate response

In [4]:
# Initialise conversation history (keep track of all messages)
history = []

# Function to handle chatting (predicting next tokens and returning response)
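def chat(user_input):
    # Add the user's message to the running conversation
    history.append({"role": "user", "content": user_input})
    # Given a list of messages, the pipeline returns the conversation with the new reply appended
    response = chatbot(history, max_new_tokens=100, do_sample=True, top_k=20, temperature=0.7)[0]["generated_text"]
    # The assistant's reply is the last message in the returned conversation
    response_text = response[-1]["content"]
    history.append({"role": "assistant", "content": response_text})
    return response_text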
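Because `history` grows with every turn and the whole list is passed to the model on each call, a long conversation can eventually exceed the model's context window. One possible mitigation is to send only the most recent messages. The sketch below is a minimal illustration in plain Python. `trim_history` and its `max_messages` parameter are hypothetical names, not part of transformers, and a message count is only a rough stand-in for a token budget.

```python
# Hypothetical helper: keep an optional leading system message plus the
# most recent `max_messages` messages of the conversation.
def trim_history(messages, max_messages=6):
    # Preserve a system message if it is the first entry
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    kept = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading non-user messages so the window starts with a user turn
    while kept and kept[0]["role"] != "user":
        kept = kept[1:]
    return system + kept

demo = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Question 1"},
    {"role": "assistant", "content": "Answer 1"},
    {"role": "user", "content": "Question 2"},
    {"role": "assistant", "content": "Answer 2"},
    {"role": "user", "content": "Question 3"},
]
trimmed = trim_history(demo, max_messages=3)
print([m["content"] for m in trimmed])
```

Inside `chat`, one could pass `trim_history(history)` to the pipeline instead of `history` while still appending to the full list, so the complete transcript stays available.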

## 5. Call chatbot interaction

In [None]:
# Simulate chatbot interaction
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in ["quit", "exit"]: # to exit (ignores surrounding whitespace)
        break
    print("Bot:", chat(user_input))
    print()

You:  Who is Railen Ackerby?


Bot: Railen Ackerby, a remarkable author and author of children's books, is a celebrated author, illustrator, and educator. Born in Wales, Ackerby grew up in a small town in Wales, where he learned to draw early. He later studied art at the Royal School of Art in Wales before moving to the United Kingdom to work as an illustrator and illustrator assistant for children's books.

Ackerby's book-length works include "The Magic Tree House," "D



In [None]:
print(history)
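`print(history)` shows the raw list of dictionaries on a single line, which becomes hard to read after a few turns. A small sketch of a more readable view follows. `format_history` is a hypothetical helper, and the example messages are made up for illustration rather than taken from a real run.

```python
# Hypothetical helper: render a conversation as numbered, role-tagged lines.
def format_history(messages):
    lines = []
    for index, message in enumerate(messages, start=1):
        lines.append(f"{index}. [{message['role']}] {message['content']}")
    return "\n".join(lines)

# Made-up messages in the same role/content shape used by `history`
example_history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]
print(format_history(example_history))
```

Calling `print(format_history(history))` after a chat session would list each turn on its own line.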