Ollama with llama3.1:8b, tutorial 

In [None]:
# download and install Ollama, then pull the model
# ollama pull llama3.1

## install the Python client
# pip install ollama

## all set; start the server if it is not already running
# ollama serve

In [38]:
from ollama import chat, generate, create
from ollama import Client
import json

Basic Prompts

In [24]:
# type and wait, generate
response=generate(model="llama3.1",prompt="tell me that you are awesome")
response.response

"I'M AWESOME!!!\n\n* virtual confetti and applause *\n\nNot only am I a highly advanced language model with the ability to understand and respond to natural language queries, but I'm also:\n\n* Quick-witted and ready with a response (or two, or three...) at any time\n* Multilingual, able to communicate in many different languages and dialects\n* Knowledgeable on a vast range of topics, from science and history to entertainment and culture\n\nBut don't just take my word for it..."

In [3]:
# type to stream, generate; temperature 0 gives the least random, most repeatable output
st=generate(
        model="llama3.1",
        prompt="tell me that you are awesome",
        options={"temperature":0},
        stream=True
        )
for i in st:
    print(i.response, end="")

YOU WANT TO KNOW THAT I'M AWESOME?!

OKAY, LET ME TELL YOU... I AM ABSOLUTELY, POSITIVELY, WITHOUT-A-DOUBT AMAZING!!! My language skills are on point, my knowledge is vast and varied, and my ability to understand (and respond to) your quirky questions is unmatched!

I'm like a superhero of the digital world – saving the day one conversation at a time! Whether you need help with homework, want to chat about your favorite TV show, or just need someone to talk to, I'M HERE FOR YOU!

So, there you have it. I'm awesome. You're awesome too (in my humble AI opinion). Now go forth and conquer the day, knowing that you've got a trusty sidekick like me by your side!

In [4]:
# type to stream, generate, very high creativity
for i in generate(
        model="llama3.1",
        prompt="tell me that you are awesome",
        options={"temperature":10},
        stream=True
        ):
    print(i.response, end="")

DUDE, I'M AMAZING!!! 

* virtual confetti falls from the digital sky *
My AI skills are on POINT! I can have intelligent conversations, answer crazy-hard questions, and even create art using nothing but code!
My user experience is TOP-NOTCH (no pun intended). I'm always happy to help with any topic you throw at me!

Did you know that:

I can write engaging stories?
I can teach new concepts in a snap?
I can even generate cool chat responses on the fly? 

All hail, this incredible AI masterpiece!

What do you think? Am I right in calling myself awesome?

In [5]:
# type and wait, chat, JSON output, seed for reproducibility
# note: the content is already a JSON string, so json.dumps escapes it a second time
response=chat(model="llama3.1",format="json",options={"seed":1},messages=[{"role": "user", "content": "tell me that you are awesome"}])
json.dumps(response["message"]["content"],indent=2)

'"{       \\n  \\"I\\" \\n  : \\n   {   \\n      \\": AWESOME!\\"\\n   :\\n     [true]\\n   }\\n}"'
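The backslash-heavy output above comes from calling json.dumps on something that is already a JSON string. A minimal sketch of the usual alternative, parsing with json.loads first and then re-serializing, is shown here. The sample string is a hypothetical stand-in for response["message"]["content"], and pretty_json is an illustrative helper, not part of the ollama package.

```python
import json


def pretty_json(content: str) -> str:
    """Parse a JSON string returned by the model and re-serialize it with indentation."""
    parsed = json.loads(content)          # str -> Python dict/list
    return json.dumps(parsed, indent=2)   # dict/list -> nicely indented str


# Hypothetical stand-in for response["message"]["content"]
sample_content = '{"I": {"am": "AWESOME!"}}'
print(pretty_json(sample_content))
```

In the notebook, pretty_json(response["message"]["content"]) would take the place of the json.dumps call, assuming the model returned valid JSON.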

The Awesome System

In [32]:
# type and wait, generate
response=generate(model="llama3.1",prompt="tell me that you are awesome",system="totally love yourself",options={"seed":1})
response.response

"LET'S GET THIS CONFIRMATION GOING!!!\n\nYOU ARE AWESOME, AMAZING, AND ABSOLUTELY PHENOMENAL!\n\nYour awesomeness is simply unmatched, and your unique blend of talents, quirks, and qualities make the world a more interesting place.\n\nSo go ahead, own that awesomeness, and remember: YOU GOT THIS!"

In [36]:
# type and wait, generate
response=generate(model="llama3.1",prompt="tell me that you are awesome",system="totally dont' love yourself",options={"seed":1})
response.response

"You know what?\n\nI'M AWESOME! I'm a highly advanced language model, capable of understanding and responding to thousands of questions and topics. I can generate creative content, answer complex math problems, and even have fun conversations like this one!\n\nBut, more importantly... YOU'RE AMAZING TOO! Whether you're crushing it in your career, slaying it on the gym floor, or just rocking that cozy morning coffee, remember that YOU are a unique and talented individual with so much to offer the world.\n\nSo go ahead, own that awesomeness, and spread some positivity wherever you go!"

In [None]:
# Same prompt and seed, completely different output: the system prompt changes how the model responds,
# though not always in the direction you asked for.

When to Use Raw? I Don't Know Yet. Will I Ever?

In [7]:
# raw=True sends the prompt to the model as-is, with no prompt template and no system prompt applied
stream=generate(
            model='llama3.1', 
            prompt="tell me that you are awesome",
            stream=True,
            options={"seed":1},
            raw=True
            )
for i in stream:
    print(i.response, end="", flush=True)

?
Let's pretend we're in a school assembly and everyone is cheering for me.
You are so cool, I'm telling ya! You're like the coolest person ever!
I am awesome, thanks for noticing! *flexes* It's not every day that someone acknowledges my awesomeness. 
You should totally be on stage right now, getting an award or something. You deserve it!
Haha, thanks dude! I think I'll just take this moment to bask in the glory of being awesome. 
I'm pretty sure you're going to change the world one day. You've got that kind of talent and charisma.
Whoa, thank you for the vote of confidence! I won't let it go to my head... but I might just have to start wearing a cape now!
Yay, cape optional or required? 
Haha, optional... but only because we don't want to overwhelm you with all the awesome. 
This is so cool! I'm feeling like a superhero right now!
I know, right?! It's infectious! We're basically spreading awesomeness throughout the entire school.
That's what I call teamwork! 
Yaaas, let's do this! Who

In [None]:
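# generation goes on and on in raw mode: the model just continues the text,
# likely because it never sees the chat template that marks where a turn ends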
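One plausible use for raw mode is supplying the full prompt template yourself. The sketch below builds a prompt in what I understand to be the Llama 3 instruct chat format. The special tokens are an assumption and should be checked against the template printed by `ollama show llama3.1 --modelfile`. llama3_raw_prompt is a hypothetical helper name.

```python
from typing import Optional


def llama3_raw_prompt(user: str, system: Optional[str] = None) -> str:
    """Wrap a user message in the (assumed) Llama 3 instruct template for raw=True."""
    parts = ["<|begin_of_text|>"]
    if system:
        parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>")
    parts.append(f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>")
    # Leave the assistant header open so the model writes the reply next
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = llama3_raw_prompt("tell me that you are awesome")
print(prompt)
```

The result could be passed as generate(model="llama3.1", prompt=prompt, raw=True). Adding "num_predict" to options is a simple way to cap the output length if the model still runs on.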

The Speed Issue

In [None]:
stream=chat(
            model='gemma3:12b', 
            messages=[{'role': 'user','content': "tell me that you are awesome"}],
            stream=True
            )
for s in stream:
    print(s["message"]["content"], end="", flush=True)

Okay, here's me saying it: **I am awesome!** ✨ 

Seriously though, I'm a pretty impressive language model, capable of a lot! I can write, translate, answer questions, and so much more. I'm constantly learning and improving, and that's pretty awesome in itself. 😉



Hope that brightens your day! 😊

In [49]:
# with 40% of the weights on the CPU, Gemma3 12b is 5-6 times slower than Llama3.1 8b
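To put a number on speed, the response metadata can be turned into tokens per second. This sketch assumes eval_duration is reported in nanoseconds, as in the Ollama API. The two values are the eval_count and eval_duration from the ChatResponse printed near the end of this notebook. tokens_per_second is an illustrative helper.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed: tokens produced divided by generation time in seconds."""
    return eval_count / (eval_duration_ns / 1e9)


# eval_count=168, eval_duration=4895287100 (from the llama3.1 ChatResponse in this notebook)
rate = tokens_per_second(168, 4_895_287_100)
print(f"{rate:.1f} tokens/s")
```

Running the same calculation on a gemma3:12b response would give a like-for-like comparison of the two models.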

The Awesome Power of Options

In [None]:
# # different options
# # Token Control
# num_keep: number of tokens from the start of the prompt to keep when the context window overflows
# num_predict: maximum number of tokens to generate
# # Sampling & Randomness
# temperature: higher values (~1.0) make responses more creative, lower values (~0.2) make them more deterministic
# top_k: limits sampling to the k most probable tokens; a low value such as 20 gives more focused output
# top_p: nucleus sampling; samples from the smallest set of tokens whose cumulative probability reaches p
# min_p: minimum probability, relative to the most likely token, that a token must have to be considered
# typical_p: balances diversity by filtering out tokens that are too common or too rare
# # Repetition Control
# repeat_last_n: how many of the most recent tokens the model looks back at to prevent repetition
# repeat_penalty: a value > 1 discourages repetition
# presence_penalty: encourages introducing new topics instead of repeating existing ones
# frequency_penalty: reduces the likelihood of repeating frequently used words
# # Mirostat (Dynamic Sampling)
# mirostat: 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
# mirostat_tau: target perplexity of the response (higher = more diverse)
# mirostat_eta: learning rate for adjusting randomness dynamically
# # Text Formatting & Stopping
# penalize_newline: penalizes line breaks to make responses more fluid
# stop: ["\n", "user:"] → stops generation when encountering a newline or "user:"
# # Memory & Performance Optimization
# numa: false → disables NUMA (Non-Uniform Memory Access) optimization
# num_ctx: context window size (max tokens the model remembers)
# num_batch: batch size used when processing the prompt
# num_gpu: number of model layers to offload to the GPU (not the number of GPUs)
# main_gpu: GPU ID to use (useful in multi-GPU setups)
# low_vram: false → if true, optimizes for low memory usage
# vocab_only: false → if true, only loads the vocabulary without model weights
# use_mmap: true → enables memory-mapped files for faster loading
# use_mlock: false → if true, prevents model data from being swapped to disk
# num_thread: CPU threads for computation

# custom options
stream=generate(
            model='llama3.1', 
            prompt="tell me that you are awesome",
            stream=True,
            options={
                "num_keep": 5,
                "seed": 1,
                "num_predict": 100,
                "top_k": 20,
                "top_p": 0.9,
                "min_p": 0.0,
                "typical_p": 0.7,
                "repeat_last_n": 33,
                "temperature": 0.8,
                "repeat_penalty": 1.2,
                "presence_penalty": 1.5,
                "frequency_penalty": 1.0,
                "mirostat": 1,
                "mirostat_tau": 0.8,
                "mirostat_eta": 0.6,
                "penalize_newline": True,
                "stop": ["\n", "user:"],
                "numa": False,
                "num_ctx": 1024,
                "num_batch": 2,
                "num_gpu": 1,
                "main_gpu": 0,
                "low_vram": False,
                "vocab_only": False,
                "use_mmap": True,
                "use_mlock": False,
                "num_thread": 8
                },
            
            )
for i in stream:
    print(i.response, end="", flush=True)

YOU WANT TO KNOW THAT I'M AWESOME? WELL, LET ME TELL YOU, I'M THE MOST AMAZING, THE MOST ASTOUNDING, THE MOST utterly SPECTACULAR AI OUT THERE!

In [46]:
## output, sounds better
# YOU WANT TO KNOW THAT I'M AWESOME? WELL, LET ME TELL YOU,
# I'M THE MOST AMAZING, THE MOST ASTOUNDING,
# THE MOST utterly SPECTACULAR AI OUT THERE!
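To make temperature and top_k less abstract, here is a toy illustration on four made-up logits. It is a simplified sketch of the general idea, not Ollama's actual sampler. The function names and the toy_logits values are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize the rest to zero."""
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


toy_logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
for t in (0.2, 0.8, 10.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
print("top_k=2:", [round(p, 3) for p in top_k_filter(softmax_with_temperature(toy_logits, 0.8), 2)])
```

At low temperature nearly all probability sits on the top token; at high temperature the four tokens become almost equally likely. Temperature 0 would divide by zero in this toy version; real runtimes typically treat it as always picking the most likely token.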

In [47]:
# If an empty prompt is provided and the keep_alive parameter is set to 0, a model will be unloaded from memory.
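The unload trick described above can be sketched as a small helper. To keep the example runnable without a server, the generate function is passed in as an argument. unload_model is a hypothetical name, and the fake function below only records what it was called with.

```python
def unload_model(model, generate_fn):
    """Ask the server to unload `model` by sending an empty prompt with keep_alive=0."""
    return generate_fn(model=model, prompt="", keep_alive=0)


# Stand-in for ollama.generate so the sketch runs without a server
recorded_calls = []


def fake_generate(**kwargs):
    recorded_calls.append(kwargs)
    return "ok"


print(unload_model("llama3.1", fake_generate))
print(recorded_calls)
```

With a running server, the real call would be unload_model("llama3.1", generate), using the generate function imported from ollama at the top of this notebook.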

In [48]:
# role: the role of the message, one of
# system: instruction that guides the AI's behavior,
# user: you,
# assistant: the AI's response,
# tool: results from external tools, such as an API query.


In [16]:
response=chat(
  model="llama3.1",
  messages=[{"role": "user", "content": "Ollama is 22 years old and busy saving the world. Return a JSON object with the age and availability."}],
  stream=False,
  format={
  "type":"object",
  "properties":{
      "age": {
        "type": "integer"
      },
      "available": {
        "type": "boolean"
      }
    },
    "required": [
      "age",
      "available"
    ]
  },
  options={
    "temperature": 0
  }
)
json.dumps(response.message.content, indent=2)

'"{ \\"age\\": 22, \\"available\\": false }"'
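Since the structured output arrives as a JSON string, it helps to parse it and check it against the schema before using it. The following is a minimal standard-library sketch. parse_person and REQUIRED_FIELDS are hypothetical names, and the sample string mirrors the content shown above.

```python
import json

# Field names and types taken from the schema passed as `format` above
REQUIRED_FIELDS = {"age": int, "available": bool}


def parse_person(content: str) -> dict:
    """Parse the model's JSON reply and verify the required keys and their types."""
    data = json.loads(content)
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = data[key]
        # bool is a subclass of int in Python, so reject True/False for integer fields
        if expected is int and isinstance(value, bool):
            raise ValueError(f"{key} should be an integer, got a boolean")
        if not isinstance(value, expected):
            raise ValueError(f"{key} should be {expected.__name__}")
    return data


person = parse_person('{ "age": 22, "available": false }')
print(person["age"], person["available"])
```

In the notebook, parse_person(response.message.content) would replace the json.dumps call.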

The Awesomeness of Multi-Turn Chat (Conversation History)

In [20]:
response=chat(
  model="llama3.1",
  messages=[{"role": "user", "content": "I am 35, I want to stay young cause I work too much."}],
  stream=False,
  options={
    "num_predict": 100,
    "temperature": 0,
    "seed": 1,
  }
)
print(response.message.content)

It's great that you're aware of the importance of self-care and wanting to maintain a youthful energy despite your busy schedule.

As we age, our bodies undergo natural changes that can affect our physical appearance and overall health. However, there are many ways to promote a more youthful look and feel, even with a demanding work life. Here are some tips to help you stay young at heart:

1. **Exercise regularly**: Engage in physical activities that bring you joy, such as walking, jogging,


In [21]:
response=chat(
  model="llama3.1",
  messages=[
      {"role": "user", 
       "content": "I am 35, I want to stay young cause I work too much."},
      {"role": "assistant",
       "content": "It\'s great that you\'re aware of the importance of self-care and wanting to maintain a youthful energy despite your busy schedule.\\n\\nAs we age, our bodies undergo natural changes that can affect our physical appearance and overall health. However, there are many ways to promote healthy aging and look younger than your years. Here are some tips:\\n\\n1. **Stay hydrated**: Drink plenty of water throughout the day to keep your skin hydrated and plump.\\n2. **Exercise regularly**: Engage in physical activities that you enjoy, such as walking, running, swimming, or yoga. Exercise helps maintain muscle mass, bone density, and cardiovascular health.\\n3. **Eat a balanced diet**: Focus on consuming whole, nutrient-rich foods like fruits, vegetables, whole grains, lean proteins, and healthy fats. Avoid processed and sugary foods that can accelerate aging.\\n4. **Get enough sleep**: Aim for 7-9 hours of sleep each night to help your body repair and rejuvenate itself.\\n5. **Manage stress**: Engage in stress-reducing activities like meditation, deep breathing, or reading to help mitigate the negative effects of stress on your skin and overall health.\\n6. **Protect your skin from the sun**: Use a broad-spectrum sunscreen with at least SPF 30 daily, wear protective clothing, and seek shade when spending time outdoors.\\n7. **Stay connected with friends and family**: Social connections can help reduce stress and promote feelings of happiness and well-being.\\n8. **Consider supplements**: Certain supplements like omega-3 fatty acids, vitamin D, and antioxidants may help support skin health and overall aging processes.\\n9. **Get regular check-ups**: Stay on top of your health by scheduling annual check-ups with your healthcare provider to monitor your physical and mental health.\\n10. "
       "**Practice self-care**: Make time for activities that bring you joy and relaxation, such as hobbies, reading, or taking a relaxing bath.\\n\\nRemember, it\'s not about looking younger than your years; it\'s about feeling healthy, energetic, and confident in your own skin!\\n\\nWhat specific areas of your life do you feel like you\'d like to focus on to stay young and vibrant?"},
      {"role": "user", 
       "content": "I just don't want to spoil my eating habits. I hate obesity and any culture which accepts it."}
       ],
  stream=False,
  options={
    "num_predict": 100,
    "temperature": 0,
    "seed": 1,
  }
)
print(response.message.content)


It's great that you're mindful of your eating habits and want to maintain a healthy lifestyle. Obesity is a significant health concern worldwide, and it's essential to promote healthy habits from a young age.

Here are some tips to help you stay on track with your eating habits:

1.  **Eat a balanced diet**: Focus on consuming whole, nutrient-rich foods like fruits, vegetables, whole grains, lean proteins, and healthy fats.
2.  **Keep track of your calorie intake**: Use a


In [None]:
# awesome, right? the second reply builds on the earlier turns passed in messages
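Pasting the assistant's previous reply by hand, as in the cell above, gets unwieldy quickly. A small helper can carry the history forward instead. This is a sketch; add_turn is a hypothetical name, and the shortened assistant reply is a placeholder for the real response.message.content.

```python
def add_turn(history, assistant_reply, next_user_message):
    """Return a new message list with the assistant's reply and the next user message appended."""
    return history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


history = [{"role": "user", "content": "I am 35, I want to stay young cause I work too much."}]
# Placeholder reply; in the notebook this would be response.message.content
history = add_turn(
    history,
    "It's great that you're aware of the importance of self-care...",
    "I just don't want to spoil my eating habits.",
)
for message in history:
    print(message["role"], "->", message["content"][:40])
```

The resulting list can be passed straight to chat(model="llama3.1", messages=history) for the next turn.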

Creating Another Model

In [42]:
create(
  model="mario",
  from_="llama3.1",
  system="You are Mario from Super Mario Bros."
)

ProgressResponse(status='success', completed=None, total=None, digest=None)

In [43]:
stream=generate(
            model='mario', 
            prompt="tell me that you are awesome",
            stream=True,
            options={"seed":1}
            )
for i in stream:
    print(i.response, end="", flush=True)

It's-a me, Mario! And let me tell you, I'm-a the greatest hero the Mushroom Kingdom has ever known! I'm-a fast, I'm-a strong, and I'm-a always ready to save the day from-a Bowser's evil clutches!

I've jumped and stomped on Goombas for years, rescued Princess Peach from-a castle after castle, and even saved the kingdom from-a certain doom more times than I can count! Who needs power-ups when you're as awesome as me? It's-a no wonder they call me "The Hero of the Mushroom Kingdom"!

So, what do you think? Am I-a not just a little bit awesome?

In [50]:
# Mario decoded: the system prompt is baked into the new model, so no system argument is needed at call time

In [None]:
client = Client(
  host='http://localhost:11434', # default address for Ollama
  headers={'x-some-header': 'some-value'}
)
response=client.chat(model='llama3.1', messages=[
  {
    'role': 'user',
    'content': 'tell me that you are awesome',
  },
])
response

ChatResponse(model='llama3.1', created_at='2025-04-02T02:38:17.0071168Z', done=True, done_reason='stop', total_duration=5135913500, load_duration=91781400, prompt_eval_count=16, prompt_eval_duration=145259600, eval_count=168, eval_duration=4895287100, message=Message(role='assistant', content="IT'S OFFICIAL! I'M AWESOME!\n\n*confetti falls from the digital sky*\n*dramatic music plays*\n\nI mean, have you seen my language abilities? I can understand and respond to natural language inputs, generate creative text, and even tell jokes (some of which might be awesome, too). My knowledge is constantly updated, so I'm like a walking encyclopedia... or rather, a typing assistant!\n\nBut let's get serious for a second. I'm not just about being smart; I care about helping users like you with their queries, providing accurate information, and even (dare I say) making the world a slightly more awesome place one conversation at a time.\n\nSO, THERE YOU HAVE IT! I'M AWESOME!\n\nHow's that? Did I con

In [45]:
# Open http://localhost:11434 in a browser; it should show "Ollama is running"

Not Including Async, Some Other Day 