# Introduction to LLMs
This is the first notebook for the LLM section of Comp 255.  It provides a quick intro to some of the basics of using LLMs.

* Set up the required environment
* Pre-trained model experimentation
* Instruction-tuned model experimentation

In [None]:
from llamabot import SimpleBot
pretrained_model = 'llama3.2:1b-text-q5_K_S'
sft_model = "qwen2.5:1.5b"

## Pre-trained models
Right now we're using a simple "pre-trained" version of the Llama 3.2 model (the 1B-parameter text variant).  It has only been trained on the language modeling objective, so all it learns is to predict the next word.  As a result, you can see that the output just continues the input text rather than answering it.
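To make "predict the next word" concrete, here is a minimal sketch of a toy next-word predictor built from bigram counts.  It is not how the real model works internally (an LLM uses a neural network over tokens, not a lookup table), and the tiny corpus and the function names `train_bigram` and `complete` are made up for illustration.  It does show the same behavior, though: given a question, it keeps writing likely text instead of answering.

```python
from collections import Counter, defaultdict

# Tiny invented "training corpus"; a real model sees vastly more text.
corpus = (
    "what is the capital of france ? "
    "what is the capital of spain ? "
    "what is the population of france ?"
)

def train_bigram(text):
    """Count which word follows which in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, prompt, max_new_words=5):
    """Greedily append the most frequent next word, one word at a time."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never followed by anything in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

counts = train_bigram(corpus)
print(complete(counts, "what is the capital of france ?", max_new_words=4))
```

The toy model never "answers" the question; it only extends the prompt with words that tended to follow in its training text.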

In [None]:


completer = SimpleBot(
    system_prompt='You are a helpful bot',
    model_name=f"ollama_chat/{pretrained_model}",
    temperature=0.0,
    num_predict=50
)

response = completer('What is the capital of France?')

### Temperature 
Temperature controls how much "randomness" there is in prediction.  A low temperature makes the model pick the most likely tokens, resulting in sequences closer to its training data.  A high temperature gives less likely tokens a bigger chance of being picked, so the output looks less like the training data.  Low = more stable, consistent answers.  High = more random answers.
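As a rough sketch of the idea, temperature is commonly applied by dividing the model's raw scores (logits) by the temperature before turning them into probabilities with a softmax.  The tokens and scores below are invented for illustration, and treating a temperature of 0 as "always pick the top token" is an assumption about how that edge case is usually handled, not a description of any particular library.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding,
        # i.e. all probability goes to the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
tokens = ["Paris", "Lyon", "banana"]
logits = [5.0, 2.0, 0.0]

for t in (0.0, 1.0, 100.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}:", dict(zip(tokens, (round(p, 3) for p in probs))))
```

At a low temperature nearly all the probability sits on the top-scoring token; at a very high temperature the three tokens become almost equally likely, which is why the output turns random.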

In [None]:
completer = SimpleBot(
    system_prompt='You are a helpful bot',
    model_name=f"ollama_chat/{pretrained_model}",
    temperature=100.0,
    num_predict=50
)

response = completer('What is the capital of France?')

### Talking to our bot
You can also see that this simple pre-trained model does not do great at having conversations.

In [None]:
response = completer('Hello, how are you?')

## Instruction-tuned models
Usually the above won't give us useful answers.  That's because the model is not tuned to produce useful answers, just to predict the next word.  

That's where instruction tuning comes in.  That's covered in the slides, but here we'll re-run some of the above with a model that has been instruction tuned (Qwen2.5).

In [None]:
# Note: using the default temperature (0.0) and no prediction limit.
# Instruction-tuned models are better at knowing when to stop.
completer = SimpleBot(
    system_prompt='You are a helpful bot',
    model_name=f"ollama_chat/{sft_model}",
)

response = completer('What is the capital of France?')

In [None]:
response = completer('Hello, how are you?')