# Chat with Your SFT-Trained GPT Model

This notebook allows you to interact with your fine-tuned GPT model in a conversational format. 
The model has been trained using Supervised Fine-Tuning (SFT) to respond appropriately to user messages.

## Instructions

1. **Load your trained model**: Update `MODEL_PATH` in the configuration cell to point to your SFT-trained model checkpoint
2. **Start chatting**: Use the chat interface to have conversations with your model
3. **Experiment**: Try different conversation styles, questions, and topics

## Features

- **Single-turn conversations**: Ask questions and get responses
- **Multi-turn conversations**: Maintain context across multiple exchanges
- **Adjustable parameters**: Control temperature and response length
- **Conversation history**: View the full conversation context

Enjoy chatting with your AI assistant! 🤖


In [None]:
# Import necessary libraries
import torch
import sft
import gpt
from transformers import AutoTokenizer
import warnings
warnings.filterwarnings('ignore')

print("✅ Libraries imported successfully!")


In [None]:
# Configuration - UPDATE THESE PATHS FOR YOUR MODEL

# Path to your SFT-trained model checkpoint
MODEL_PATH = "models/sft-models/sft-gpt-6000-step.pth"

# Model configuration (should match your training configuration)
MODEL_CONFIG = { 
    "vocab_size": 50262,
    "context_length": 1024,
    "emb_dim": 512,
    "n_heads": 8,
    "n_layers": 12,
    "drop_rate": 0.1,
}

# Generation parameters
MAX_NEW_TOKENS = 150
TEMPERATURE = 0.7

print(f"📁 Model path: {MODEL_PATH}")
print(f"⚙️  Model config: {MODEL_CONFIG}")
print(f"🎛️  Max tokens: {MAX_NEW_TOKENS}, Temperature: {TEMPERATURE}")


In [None]:
# Load tokenizer and model
print("🔄 Loading tokenizer...")
tokenizer = gpt.setup_tokenizer()
print(f"✅ Tokenizer loaded! Vocab size: {tokenizer.vocab_size}")

print("🔄 Loading model...")
model = sft.load_pretrained_model(MODEL_PATH, MODEL_CONFIG)

# Move to the default GPU if one is available, otherwise stay on the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)
model.eval()  # Disable dropout for inference
print(f"✅ Model loaded and moved to {device}!")


## Single-Turn Chat Interface

Use the function below to have individual conversations with your model. Each call is independent.


In [None]:
def chat_once(user_message, max_tokens=MAX_NEW_TOKENS, temperature=TEMPERATURE):
    """
    Have a single conversation turn with the model.

    Args:
        user_message: Your message to the model
        max_tokens: Maximum number of tokens to generate
        temperature: Sampling temperature (higher = more creative)

    Returns:
        Model's response
    """
    print(f"👤 You: {user_message}")
    print("🤖 Assistant: ", end="")

    response = sft.generate_chat_response(
        model=model,
        tokenizer=tokenizer,
        user_message=user_message,
        max_new_tokens=max_tokens,
        temperature=temperature
    )

    print(response)
    print("\n" + "="*50 + "\n")
    return response

# Test the chat function
print("🧪 Testing chat function...")
test_response = chat_once("Hello! How are you today?")
print("✅ Chat function working!")


## Multi-Turn Chat Interface

Use the class below to maintain conversation context across multiple turns.


In [None]:
class ChatSession:
    """
    A chat session that maintains conversation history.
    """

    def __init__(self, model, tokenizer, max_tokens=MAX_NEW_TOKENS, temperature=TEMPERATURE):
        self.model = model
        self.tokenizer = tokenizer
        self.max_tokens = max_tokens
        self.temperature = temperature
        self.conversation_history = []

    def add_message(self, role, content):
        """Add a message to the conversation history."""
        self.conversation_history.append({"role": role, "content": content})

    def chat(self, user_message):
        """
        Send a message and get a response, maintaining conversation context.

        Args:
            user_message: Your message to the model

        Returns:
            Model's response
        """
        # Add user message to history
        self.add_message("user", user_message)

        print(f"👤 You: {user_message}")
        print("🤖 Assistant: ", end="")

        # Generate response using conversation history
        response = sft.generate_multi_turn_response(
            model=self.model,
            tokenizer=self.tokenizer,
            conversation_history=self.conversation_history,  # Includes the current user message
            max_new_tokens=self.max_tokens,
            temperature=self.temperature
        )

        # Add assistant response to history
        self.add_message("assistant", response)

        print(response)
        print("\n" + "="*50 + "\n")
        return response

    def clear_history(self):
        """Clear the conversation history."""
        self.conversation_history = []
        print("🗑️  Conversation history cleared!")

    def show_history(self):
        """Display the full conversation history."""
        print("📜 Conversation History:")
        print("="*50)
        for i, msg in enumerate(self.conversation_history):
            role_emoji = "👤" if msg["role"] == "user" else "🤖"
            print(f"{role_emoji} {msg['role'].title()}: {msg['content']}")
            if i < len(self.conversation_history) - 1:
                print("\n")
        print("="*50)

    def set_parameters(self, max_tokens=None, temperature=None):
        """Update generation parameters."""
        if max_tokens is not None:
            self.max_tokens = max_tokens
        if temperature is not None:
            self.temperature = temperature
        print(f"⚙️  Updated parameters: max_tokens={self.max_tokens}, temperature={self.temperature}")

# Create a chat session
chat_session = ChatSession(model, tokenizer)
print("✅ Chat session created!")


## Interactive Chat Examples

Try these example conversations to test your model!


In [None]:
# Example 1: Single-turn conversation
print("🎯 Example 1: Single-turn conversation")
chat_once("What is machine learning?")


In [None]:
# Example 2: Multi-turn conversation
print("🎯 Example 2: Multi-turn conversation")

# Start a new conversation
chat_session.clear_history()

# First turn
chat_session.chat("Hi! I'm learning about AI. Can you help me?")

# Second turn (maintains context)
chat_session.chat("What's the difference between supervised and unsupervised learning?")

# Third turn (continues the conversation)
chat_session.chat("Can you give me an example of each?")

# Show the full conversation history
chat_session.show_history()


In [None]:
# Example 3: Creative conversation
print("🎯 Example 3: Creative conversation")

# Adjust parameters for more creative responses
chat_session.set_parameters(max_tokens=200, temperature=0.9)

# Start a creative conversation
chat_session.clear_history()
chat_session.chat("Write a short story about a robot learning to paint.")
chat_session.chat("What happens next in the story?")

# Show the creative conversation
chat_session.show_history()


## Your Turn to Chat!

Now it's your turn to experiment with your model. Try different types of conversations:

- **Questions**: Ask about various topics
- **Creative tasks**: Ask for stories, poems, or creative writing
- **Problem solving**: Present problems and ask for solutions
- **Role-playing**: Have the model take on different personas
- **Multi-turn**: Build complex conversations with context

Use the functions below to chat with your model!


In [None]:
# Single-turn chat - try your own messages!
chat_once("Your message here!")


In [None]:
# Multi-turn chat - start a conversation!
chat_session.clear_history()
chat_session.chat("Your first message here!")


In [None]:
# Continue the conversation
chat_session.chat("Your follow-up message here!")


In [None]:
# Adjust parameters for different response styles
print("🎛️  Adjusting parameters...")

# More conservative responses
chat_session.set_parameters(max_tokens=100, temperature=0.3)
chat_session.chat("Explain quantum computing simply.")

# More creative responses
chat_session.set_parameters(max_tokens=200, temperature=1.0)
chat_session.chat("Write a haiku about programming.")


In [None]:
# View your conversation history
chat_session.show_history()


## Tips for Better Conversations

### Parameter Tuning
- **Temperature (0.1 - 1.0)**: Lower = more focused, Higher = more creative
- **Max tokens (50 - 500)**: Controls response length
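
To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax. It assumes the generation code in `sft` divides the logits by the temperature before sampling, which is the usual approach but is not confirmed by this notebook. The function name `softmax_with_temperature` and the example logits are illustrative only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens (not taken from the model)
logits = [2.0, 1.0, 0.5]
for t in (0.3, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At lower temperatures the highest-scoring token takes a larger share of the probability mass, which is why responses become more focused; at higher temperatures the distribution flattens and sampling becomes more varied.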

### Conversation Strategies
- **Be specific**: Clear questions get better answers
- **Provide context**: Multi-turn conversations work better with context
- **Experiment**: Try different conversation styles and topics
- **Iterate**: Adjust parameters based on the responses you get
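
Context in a multi-turn conversation accumulates, while the model can only attend to 1024 tokens. One possible way to keep a long session within that budget is to drop the oldest messages first. The sketch below is an assumption-laden illustration, not part of `sft`: `trim_history`, `count_tokens`, and `reserve` are hypothetical names, and the token count ignores any special tokens added per message.

```python
def trim_history(history, count_tokens, max_context=1024, reserve=150):
    """Keep the most recent messages that fit in max_context - reserve tokens.

    count_tokens is any callable mapping text to a token count (hypothetical;
    in the notebook it could wrap the tokenizer). reserve leaves room for the
    tokens the model will generate.
    """
    budget = max_context - reserve
    kept = []
    used = 0
    # Walk backwards so the newest messages are considered first.
    for msg in reversed(history):
        cost = count_tokens(msg["content"])
        # Always keep the newest message, even if it alone exceeds the budget.
        if kept and used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

def word_count(text):
    """Stand-in tokenizer for the demo: counts whitespace-separated words."""
    return len(text.split())

history = [
    {"role": "user", "content": "one two three"},
    {"role": "assistant", "content": "four five"},
    {"role": "user", "content": "six"},
]
trimmed = trim_history(history, word_count, max_context=6, reserve=3)
print(trimmed)
```

With a budget of three "tokens", the oldest message is dropped and the two most recent ones are kept in their original order.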

### Common Issues
- **Short responses**: Increase `max_tokens`
- **Repetitive responses**: Raise `temperature` (for example from 0.3 to 0.7 or higher) to add variety
- **Off-topic responses**: Provide more context or be more specific

## Model Information

Your model was trained using:
- **Architecture**: GPT with RoPE positional encoding
- **Training**: Supervised Fine-Tuning on conversational data
- **Special tokens**: `<|user|>`, `<|assistant|>`, `<|end|>`
- **Context length**: 1024 tokens
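
As a rough illustration of how these special tokens might frame a conversation, the sketch below assembles a prompt string from a message history. The exact template (token order, whitespace, newlines) is defined inside the `sft` module and is not shown in this notebook, so the layout here is an assumption, and `build_prompt` is a hypothetical helper.

```python
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"

def build_prompt(history, user_message):
    """Join past turns and the new user message into one prompt string.

    Assumed layout: each turn is <role tag> + text + <|end|>, and the prompt
    finishes with an open <|assistant|> tag for the model to continue from.
    """
    parts = []
    for msg in history:
        tag = USER if msg["role"] == "user" else ASSISTANT
        parts.append(f"{tag}{msg['content']}{END}")
    parts.append(f"{USER}{user_message}{END}")
    parts.append(ASSISTANT)  # open tag cues the model to write its reply
    return "".join(parts)

history = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
print(build_prompt(history, "What is machine learning?"))
```

If the template used during training differs from this one, prompts built this way would not match what the model saw, so treat the layout as a placeholder to adapt.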

Enjoy exploring your AI assistant! 🚀
