# Generative AI with Python (with some Machine Learning)

## Introduction

## Text to Image with StableDiffusion

## Large Language Models (LLMs)

### What are Large Language Models?

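Large Language Models (LLMs) generate text by repeatedly predicting what is likely to come next, which is why they are sometimes described as "auto-correct on steroids".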

[A short introduction to LLMs](https://www.youtube.com/watch?v=LPZh9BOjkQs)

### Ollama & Generating Text

**Ollama** is a tool that allows us to run LLMs locally. It can be downloaded and used entirely for _free_.

But what does it mean to run something _locally_? That means you're running it _solely_ on your own machine, rather than sending information back and forth with an online service.

This has some key advantages:
- cost
- privacy
- doesn't depend on stable/fast internet access
- performance isn't affected by how many other people are using the same online service at a given time

To test our Ollama installation, we can run `ollama` on the command line and look at its output.

It's also possible to do this from within Python, using the `subprocess` module from the standard library. So that's one option...

In [19]:
import subprocess

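# Run the `ollama` command and capture its output
result = subprocess.run(["ollama"], capture_output=True, text=True)

print("Output from command line:")
# The usage text may arrive on stdout or stderr, so fall back to stderr
print(result.stdout or result.stderr)

Output from command line: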


Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.


This gives us a list of commands that we can use with Ollama. For our purposes, we're mainly concerned with pulling models, listing the models on our system, and removing the ones we no longer want. A fresh installation of Ollama comes with zero models, but the installation script has already _pulled_ a few to make things easier. We can use the ollama-python library to list these models. Of course, the command line works too.
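
For the command-line route, here is a minimal sketch that wraps the `ollama` executable with `subprocess`. The helper names `ollama_command` and `run_ollama` are made up for this example; the subcommands and the `--version` flag come from the usage text above.

```python
import shutil
import subprocess
from typing import Optional


def ollama_command(*args: str) -> list[str]:
    """Build the argument list for an `ollama` CLI call, e.g. ["ollama", "list"]."""
    return ["ollama", *args]


def run_ollama(*args: str) -> Optional[str]:
    """Run an `ollama` subcommand and return what it printed.

    Returns None if the `ollama` executable is not on the PATH.
    """
    if shutil.which("ollama") is None:
        return None
    result = subprocess.run(ollama_command(*args), capture_output=True, text=True)
    # Some output may go to stderr, so fall back to it if stdout is empty
    return result.stdout or result.stderr


# `--version` is listed in the usage text and is safe to run at any time
print(run_ollama("--version"))
```

With this in place, `run_ollama("list")` should show the models on the system, `run_ollama("pull", "moondream")` would download a model, and `run_ollama("rm", "moondream")` would remove it again.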

In [3]:
import ollama

ollama.list().models

[Model(model='glm4:latest', modified_at=datetime.datetime(2025, 7, 16, 15, 7, 31, 79556, tzinfo=TzInfo(+01:00)), digest='5b699761eca535dc55047ad9d2dbf54e3b8697709419ef78a70503ed4bfbcf44', size=5455326235, details=ModelDetails(parent_model='', format='gguf', family='chatglm', families=['chatglm'], parameter_size='9.4B', quantization_level='Q4_0')),
 Model(model='deepseek-r1:7b', modified_at=datetime.datetime(2025, 7, 16, 14, 52, 22, 708686, tzinfo=TzInfo(+01:00)), digest='755ced02ce7befdb13b7ca74e1e4d08cddba4986afdb63a480f2c93d3140383f', size=4683075440, details=ModelDetails(parent_model='', format='gguf', family='qwen2', families=['qwen2'], parameter_size='7.6B', quantization_level='Q4_K_M')),
 Model(model='dolphin-phi:latest', modified_at=datetime.datetime(2025, 7, 16, 11, 21, 6, 952843, tzinfo=TzInfo(+01:00)), digest='c5761fc772409945787240af89a5cce01dd39dc52f1b7b80d080a1163e8dbe10', size=1602473850, details=ModelDetails(parent_model='', format='gguf', family='phi2', families=['phi2'

That is quite a bit of information, so we can go through this _return value_ to extract just the information that's more human-friendly.

In [80]:
for model in ollama.list().models:
    print(model.model)

glm4:latest
deepseek-r1:7b
dolphin-phi:latest
cogito:32b
JollyLlama/GLM-4-32B-0414-Q4_K_M:latest
EntropyYue/longwriter-glm4:9b


Now we have a plain list of the models on the system - this shows us what was downloaded (or _pulled_) by running the installation script.
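
The same return value also carries each model's size on disk (in bytes) and its parameter count, as seen in the raw output earlier. As a rough sketch, we could print these alongside the name. The helper names `format_size` and `summarise_models` are invented for this example, and the conversion assumes 1 GB = 10^9 bytes.

```python
def format_size(num_bytes: int) -> str:
    """Convert a size in bytes to a string in gigabytes (assuming 1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"


def summarise_models() -> None:
    """Print the name, parameter count and size on disk of each local model."""
    # Imported here so that format_size can be used without Ollama installed
    import ollama

    for model in ollama.list().models:
        print(model.model, model.details.parameter_size, format_size(model.size))


# The size reported for glm4:latest in the raw output above
print(format_size(5455326235))
```

Calling `summarise_models()` on a machine with Ollama running should then print one line per model.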

To start with, I'm going to create _variables_ for storing the names of the models I wish to use. A model name is a _parameter_ that we give repeatedly to the ollama-python library, so it makes sense to write each one down once and avoid repeating ourselves.

In [5]:
# dolphin-phi is 2.7b
DOLPHIN_PHI = "dolphin-phi"
# this particular deepseek model is 7b
DEEPSEEK = "deepseek-r1:7b"
# glm4 9b version
GLM4 = "glm4:latest"
# moondream
MOONDREAM = "moondream"

A convention when programming in Python is to write constants (variables that are set once and never change) in all-caps. This doesn't affect how your code runs, but it helps keep things ordered. I feel it tells me this bit of information is "important" in some way, while using less mental effort.

In [84]:
from ollama import chat

response = chat(model=DOLPHIN_PHI, messages=[
  {
    'role': 'user',
    'content': 'What is the capital of France?',
  },
])
print(response['message']['content'])

The capital of France is Paris. It is also one of the most populous cities in Europe and well-known for its rich history, art, fashion, and culture. The city is located on the banks of the River Seine. Some popular attractions include the Eiffel Tower, Louvre Museum, Notre-Dame Cathedral, and many more.
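
Since we'll be sending a lot of single-question prompts, one option is to wrap this pattern in a small helper. This is only a sketch: the name `ask` and its `chat_fn` parameter are assumptions made for this example, with `chat_fn` there so the helper can be tried out with a stand-in function when Ollama isn't available.

```python
from typing import Callable, Optional


def ask(model: str, prompt: str, chat_fn: Optional[Callable] = None) -> str:
    """Send a single user message to `model` and return the text of the reply."""
    if chat_fn is None:
        # Imported lazily so the helper can be defined without Ollama installed
        from ollama import chat as chat_fn
    response = chat_fn(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]
```

With Ollama running, `print(ask(DOLPHIN_PHI, "What is the capital of France?"))` would do the same job as the cell above.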


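### Streaming

By default, `chat` only returns once the whole response has been generated. Passing `stream=True` gives us the response in chunks as they are produced, so we can print the text as it arrives.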

In [53]:
stream = chat(
    model=DOLPHIN_PHI,
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

The sky appears blue due to the phenomenon known as Rayleigh scattering. When sunlight enters the Earth's atmosphere, it interacts with molecules in the air such as nitrogen and oxygen. These molecules are much smaller than the wavelength of visible light (around 400-700 nm), which is why they can scatter the shorter-wavelength colors more effectively, like blue.

However, the sky isn't really blue at all. The light from the sun gets scattered in all directions by these molecules in the atmosphere and our eyes perceive this as a "blue" color. When the sunlight passes through Earth's atmosphere, it is broken up into its different colors due to Rayleigh scattering. Blue light has a shorter wavelength, so it is scattered more than the other colors of visible light, which are longer in wavelength. This makes the sky appear blue from our perspective on the ground.

The deeper you look into the sky, the further away the sun is and the longer the path sunlight travels, causing more scattering
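
Printing chunks as they arrive means the full text is gone once the loop ends. If we want both, a small sketch like the one below prints each chunk and also joins them into one string. The function name `print_and_collect` is made up, and the `fake_stream` list is a stand-in for the object returned by `chat(..., stream=True)`.

```python
from typing import Iterable


def print_and_collect(stream: Iterable) -> str:
    """Print each chunk as it arrives and return the complete text at the end."""
    parts = []
    for chunk in stream:
        text = chunk["message"]["content"]
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


# Stand-in chunks with the same shape as the ones used in the loop above
fake_stream = [
    {"message": {"content": "The sky "}},
    {"message": {"content": "is blue."}},
]
full_text = print_and_collect(fake_stream)
```

In the notebook, the `stream` variable from the cell above could be passed in place of `fake_stream`.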

### Vision Language Models (VLMs)

Vision Language Models can be used to describe images. Let's try this out with this clown image.

![](../pictures/clown.jpg)

First, we need to load the image. To do this, we make use of the `base64` library, which converts the image's raw bytes into text that can be sent to a VLM.

In [1]:
import base64

# load an image as base64
with open("../pictures/clown.jpg", "rb") as image_file:
    data = base64.b64encode(image_file.read()).decode("utf-8")
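
To get a feel for what base64 encoding does, here is a tiny standalone sketch. It uses four made-up bytes in place of a real image file, turns them into text, and then decodes the text back to check that nothing was lost.

```python
import base64

# A few raw bytes standing in for image data (the first four bytes of a PNG file)
raw_bytes = b"\x89PNG"

# Encode the bytes as base64, then turn the result into an ordinary string
encoded = base64.b64encode(raw_bytes).decode("utf-8")
print(encoded)

# Decoding gives back exactly the bytes we started with
decoded = base64.b64decode(encoded)
print(decoded == raw_bytes)
```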

Now that the image has been loaded, we can send it to the VLM `moondream`, and ask it to tell us what the image contains.

In [8]:
response = ollama.chat(
    model=MOONDREAM,
    messages=[
        {
            "role": "user",
            "content": "What's in this image?",
            "images": [data], # pass the image in the images field
        },
    ],
)
print(response["message"]["content"])


The image features a clown with a red wig and polka dot suit, standing against a white background. The clown is making a funny face for the camera while holding up his hands as if to mimic a speech bubble or an exaggerated "O" shape.


We can also ask moondream to explain certain details in the image to us.

In [7]:
response = ollama.chat(
    model=MOONDREAM,
    messages=[
        {
            "role": "user",
            "content": "What colour is his nose?",
            "images": [data], # pass the image in the images field
        },
    ],
)
print(response["message"]["content"])


The clown's nose is red.
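
Both cells above repeat the same steps, so they could be folded into one helper. The sketch below is one way to do that. The name `ask_about_image` and the `chat_fn` parameter are assumptions made for this example, and the default model name simply matches the `MOONDREAM` constant from earlier.

```python
import base64
from typing import Callable, Optional


def ask_about_image(
    image_path: str,
    question: str,
    model: str = "moondream",
    chat_fn: Optional[Callable] = None,
) -> str:
    """Load an image, send it to a vision model with a question, return the reply."""
    # Read the image and encode it as base64 text, as in the loading cell above
    with open(image_path, "rb") as image_file:
        image_b64 = base64.b64encode(image_file.read()).decode("utf-8")

    if chat_fn is None:
        # Imported lazily so the helper can be defined without Ollama installed
        from ollama import chat as chat_fn

    response = chat_fn(
        model=model,
        messages=[{"role": "user", "content": question, "images": [image_b64]}],
    )
    return response["message"]["content"]
```

A call such as `ask_about_image("../pictures/clown.jpg", "What colour is his nose?")` would then replace the whole cell.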


We can see a list of vision models that work with Ollama here: https://ollama.com/search?c=vision

### Small Language Models

Language Models come in very small sizes too. Some examples include `smollm` and `tinyllama`. While these models are more prone to hallucination, and have more limited "intelligence," they can run quite fast even on less powerful hardware such as Raspberry Pis and computers with older GPUs.

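### Hallucination

Language models can produce confident, plausible-sounding text that is simply made up. This is known as _hallucination_. Here, we ask for cookbooks about an ingredient that doesn't exist.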

![](../pictures/how-to-cook-your-dragon.webp)

In [85]:
response = chat(model=DOLPHIN_PHI, messages=[
  {
    'role': 'user',
    'content': 'What are some good cookbooks on how to use dragon meat?',
  },
])
print(response['message']['content'])

1. "Dragon Meat Cookbook" by John Doe - This comprehensive cookbook offers various recipes using dragon meat, along with detailed cooking techniques and nutritional information.
2. "Cooking with Dragon Meat: A Guide for the Adventurous Chef" by Jane Smith - It provides a wide range of creative and delicious recipes that showcase the versatility of dragon meat.
3. "Dragon Cuisine: The Ultimate Guide to Cooking with Dragon Meat" by Michael Johnson - This book not only offers tasty recipes but also delves into the cultural and historical aspects of cooking with dragon meat.
4. "The Dragon's Feast Cookbook" by Sarah Lee - A collection of authentic recipes from different regions that feature dragon meat, making it a great resource for global culinary exploration.
5. "Dragon Meat: Cooking with a Mythical Ingredient" by David Brown - This book focuses on the unique flavors and textures of dragon meat, providing guidance on how to cook it in a variety of dishes.
6. "The Dragon Chef's Cookbook"

### Thinking Models

### Finding the "Best" Model

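Choosing a model involves a trade-off between how sensible the output is and the model's size and speed. Finding the right one usually takes some trial-and-error experimentation.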

We can create a quick comparison test by asking various models to generate text based on the same prompt, and see which output we like the most.

Firstly, we can take the text models we defined constants for earlier, and place them in a Python list. This will make things easier in a moment.

In [62]:
models = [DOLPHIN_PHI, DEEPSEEK, GLM4]

Now, we can create a _function_ for sending the same prompt to different models.

In [60]:
def limerick_creator(model: str):
    # Send the same prompt to whichever model is passed in
    response = chat(model=model, messages=[
        {
            'role': 'user',
            'content': 'Write a limerick about the nature of time.',
        },
    ])

    print("Model:", model)
    print(response['message']['content'])
    print("\n")

Now we can _call_ this function with our different models, and see how the output varies.

In [63]:
for model in models:
    limerick_creator(model)

Model: dolphin-phi
There once was a concept called time,
Its ticking clock could not be chime,
It flowed like a river swift,
Past and future with a shift,
Leaving moments in its wake so bright.


Model: deepseek-r1:7b
<think>
Alright, so I need to write a limerick about the nature of time. Hmm, okay. First off, what is a limerick? From what I remember, it's a five-line poem with an AABBA rhyme scheme. It usually has a playful or rhythmic feel to it and often tells a short story or conveys a light-hearted message.

Now, the topic here is the nature of time. Time can be tricky because it's something we all experience every day but isn't tangible—it's more of an abstract concept. I guess I could approach this in different ways: maybe talking about how time moves us forward without pause, or perhaps the idea that time doesn't care who waits for it.

I should think about imagery related to time—maybe clocks, the ticking sound, seasons passing, aging, or moments rushing by too quickly. Using
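
Since speed is part of the trade-off, it can help to time each model as well as read its output. The sketch below does that, and also strips the `<think>` section that deepseek-r1 puts before its answer. The names `strip_thinking` and `timed_reply` are invented for this example, and the code assumes the thinking section is closed with a matching `</think>` tag, which the truncated output above doesn't show.

```python
import re
import time
from typing import Callable, Optional


def strip_thinking(text: str) -> str:
    """Remove any <think>...</think> section (assumed closing tag) from the text."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def timed_reply(
    model: str, prompt: str, chat_fn: Optional[Callable] = None
) -> tuple[str, float]:
    """Return the model's answer (without thinking) and the seconds it took."""
    if chat_fn is None:
        # Imported lazily so the helper can be defined without Ollama installed
        from ollama import chat as chat_fn

    start = time.perf_counter()
    response = chat_fn(model=model, messages=[{"role": "user", "content": prompt}])
    elapsed = time.perf_counter() - start

    return strip_thinking(response["message"]["content"]), elapsed
```

Looping over `models` and calling `timed_reply(model, 'Write a limerick about the nature of time.')` would give an answer and a rough timing for each one. A single run is only a rough guide, since the first call to a model also includes the time needed to load it.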

## Other ML Tools