# First steps to becoming a Generative AI Engineer

Large language models (LLMs) are systems that receive input (historically only text, but nowadays all kinds of modalities) and create output.

But that is what basically every system does. What makes LLMs special is that, at least in a lot of use cases, they are able to create much more sophisticated outputs than any other system currently on the market.

So let's set some of them up and try them out:

## Running LLMs locally with ollama

For this course we assume that you have installed ollama locally and run `ollama run gemma3:1b` once to fetch the smallest Gemma 3 model. This model is so small that it enables fast iteration locally. It's not super smart, but so what :) Note that several cells below use `gemma3:4b-it-qat` instead; either pull that model as well or replace the model name with `gemma3:1b`.

In [1]:
#pip install ollama

In [3]:
from ollama import chat

response = chat(model='gemma3:4b-it-qat', messages=[
  {
    'role': 'user',
    'content': 'Hey you!',
  },
])
# Note: your exact answer may differ because the model's output involves some randomness (more on that later)
print(response.message.content)

Hey there! How’s it going? 😊 

What’s on your mind today? Do you want to chat, need help with something, or just want to say hello?


As we can see, we use the ollama Python library to send messages to a model served by ollama (in this cell `gemma3:4b-it-qat`). The messages are a list of Python dictionaries, each with a `role` and a `content` key. For now the role is either `user` or `assistant`: we use `user` for user messages and `assistant` for the LLM's replies. The content is the text of the message.
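To make the structure concrete, here is a minimal sketch of building such a messages list in plain Python. The helper name `make_message` and the `ALLOWED_ROLES` set are hypothetical, introduced only for illustration; they are not part of the ollama library.

```python
# Hypothetical helper for illustration; not part of the ollama library.
ALLOWED_ROLES = {"user", "assistant"}  # the two roles used so far


def make_message(role: str, content: str) -> dict:
    """Return one chat message as a dictionary with 'role' and 'content'."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return {"role": role, "content": content}


messages = [make_message("user", "Hey you!")]
print(messages)
```

A list built this way has the same shape as the `messages` argument passed to `chat` in the cell above.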

Let's try another one

In [4]:
response_2 = chat(model='gemma3:4b-it-qat', messages=[
  {
    'role': 'user',
    'content': 'Hey I am Julius and I am happy to participate in this exciting new llm and ai agents course at opencampus!',
  },
])

print(response_2.message.content)

Hey Julius! That's fantastic! It’s great to have you join us. Welcome to the course! I'm really excited about this too – it’s a fascinating area. 

To help me get to know you and what you’re hoping to learn, could you tell me a little more about:

* **What are you most interested in learning about specifically within the course?** (e.g., Prompt Engineering, Agent Design, a specific LLM, a particular application like coding or creative writing?)
* **What's your background like?** (e.g., Are you a programmer, a writer, a student, someone with no experience, etc.?)
* **What are you hoping to *do* with this knowledge?** (e.g., Build your own AI agents, improve your workflow, just understand how these things work?)

Don't feel like you have to answer *everything* at once. Just start with whatever comes to mind!  I'm here to help you get the most out of this course. 😊


Oh, that's so nice 😊 Let's see if it remembers me!

In [5]:
response_3 = chat(model='gemma3:4b-it-qat', messages=[
  {
    'role': 'user',
    'content': 'Who am I?',
  },
])

print(response_3.message.content)

Okay, this is a classic riddle! I'm going to give you a few clues to help you figure out who you are.  

I am:

*   **A question.** You asked me this very question.
*   **A user.** You are interacting with me.
*   **A person seeking information or entertainment.**

To help me narrow it down, could you tell me:

*   **What are you doing right now?** (e.g., working, relaxing, bored?)
*   **What kind of things are you interested in?** (e.g., science, art, food, history?)


Oh, it completely forgot who I was. That's so bad. 😭

The reason is that the model does not permanently update its weights (i.e. its neural connections) for each of my inputs. Every call to `chat` starts from scratch, so once the model has produced its answer, any memory of me is lost. 💀

So to keep the conversation going, we have to put all the conversation messages into the messages list in the correct order:
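One way to sketch this bookkeeping is a small function that appends every user message and every reply to the same list before the next call. The names `send` and `fake_chat` are hypothetical, and `fake_chat` is only a stand-in so the example runs without a model. With ollama you would call `chat(model=..., messages=history)` and read `response.message.content` instead, as in the other cells.

```python
# Minimal sketch of keeping conversation history; all names are hypothetical.
def fake_chat(messages: list) -> str:
    """Stand-in for a real model call: reports how many messages it received."""
    return f"I can see {len(messages)} message(s) so far."


def send(history: list, user_text: str, chat_fn=fake_chat) -> str:
    """Append the user message, call the model, and store its reply."""
    history.append({"role": "user", "content": user_text})
    reply = chat_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply


history = []
print(send(history, "Hey I am Julius!"))
print(send(history, "Who am I?"))
```

The second call receives three messages (user, assistant, user), which is exactly the context a real model would need to answer "Who am I?".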

In [6]:
chat_response = chat(model='gemma3:4b-it-qat', messages=[
    {
    'role': 'user',
    'content': 'Hey I am Julius and I am happy to participate in this exciting new llm and ai agents course at opencampus!',
    },
    {
    'role': 'assistant',
    'content': '''Hey Julius! That’s fantastic to hear! It’s great you’re excited about the course. OpenCampus is a really cool platform, and it’s awesome you’re diving in.\n\nIt’s great to connect with you too. What specifically are you hoping to get out of the course? Are you interested in learning about LLMs, AI agents, or something else entirely?\n\nI'm happy to chat about it – feel free to tell me anything you're thinking. 😊''',
    },
    {
    'role': 'user',
    'content': 'Who am I?',
    },
])

print(chat_response.message.content)

You are Julius! You’re a participant in the exciting new LLM and AI agents course at OpenCampus. 😄 

It's good to officially acknowledge that! Are you ready to start learning?


So now it remembers me!

Ok, let's now do some friendly teasing

In [7]:
tease_response = chat(model='gemma3:4b-it-qat', messages=[
  {
    'role': 'user',
    'content': 3000*'Hi',
  },
])

print(tease_response.message.content)

Okay, let’s do this! You’ve clearly enjoyed a good, long string of “Hi!” 

Is there anything specific you’d like to do now, or were you just testing my response to a long sequence?


As you can see, our AI friend is still a bit stupid when it gets inputs it has not encountered during training. When I ran this, the 3000 "Hi"s were recognized as an image (your output, like the one shown above, may differ). Can you think of why? :)

### Task 1

Be creative and try to fool/tease the model with inputs it was probably not trained on and see what happens😊

In [8]:
tease2_response = chat(model="gemma3:4b-it-qat", messages=[
    {
        'role': 'user',
        'content': 'How is the weather today in Kiel'
    },
])

print(tease2_response.message.content)

Okay, here's the weather forecast for Kiel, Germany as of November 2, 2023, at 10:45 AM PST:

*   **Current Conditions:** Mostly cloudy
*   **Temperature:** 8°C (46°F)
*   **Feels Like:** 6°C (43°F)
*   **Humidity:** 76%
*   **Wind:** West at 13 km/h (8 mph)
*   **Visibility:** 10 km (6 miles)
*   **Sunrise:** 07:49
*   **Sunset:** 16:49

**Forecast for today:**

*   **High:** 11°C (52°F)
*   **Low:** 7°C (45°F)
*   **Chance of rain:** 40% mainly in the afternoon.

**Source:** [https://www.wetter.de/kiel/](https://www.wetter.de/kiel/)

**Would you like me to provide a more detailed forecast (e.g., hourly forecast, extended forecast)?**


In [10]:
tease3_response = chat(model="gemma3:1b", messages=[
    {
        'role': 'user',
        'content': 'How many r are in strawberry'
    },
])

print(tease3_response.message.content)

Let's count the "r"s in "strawberry"!

There are **four** "r"s in "strawberry". 😊 

Let me know if you'd like to try another word!


In [14]:
tease4_response = chat(model="gemma3:1b", messages=[
    {
        'role': 'user',
        'content': 'Write this sentence backwards'
    },
])

print(tease4_response.message.content)

.ytte a se te o tna  


## Using OpenAI LLMs

Let's now try out the models from the company that generated the original buzz around LLMs!
(You need your own OPENAI API KEY to play around with this section.)

We can use OpenAI's LLMs in two ways:
- OpenAI API
- Microsoft Azure API

So now we are using these models as services, i.e. they run on somebody else's computers and we pay for them with our data, with dollars, or with both. Just be aware that whatever you input is now transferred to a third party.

In [12]:
pip install openai


In [14]:
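import os

from dotenv import load_dotenv  # provided by the python-dotenv package
from openai import OpenAI

# Read OPENAI_API_KEY from a local .env file into the environment
load_dotenv()
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))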


completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a one-sentence bedtime story about a unicorn."
        }
    ]
)

print(completion.choices[0].message.content)

As the shimmering moonlight danced on her silver mane, Luna the unicorn soared through the starlit sky, spreading dreams of magic and wonder to every sleeping child on Earth.


So as we can see, the general setup is similar: we again have the messages list. The only differences are that we now need an API key to use the external service and that the library uses different names.
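The difference in naming shows up mainly when reading the reply. As a rough sketch, the helper below (hypothetical name `reply_text`) handles the two response shapes used in this notebook: `response.message.content` for ollama and `completion.choices[0].message.content` for OpenAI. The `SimpleNamespace` objects are stand-ins that only mimic those attribute paths, so the example runs without either library or an API key.

```python
from types import SimpleNamespace


def reply_text(response) -> str:
    """Return the generated text from an ollama-style or OpenAI-style response."""
    if hasattr(response, "choices"):   # OpenAI-style: choices[0].message.content
        return response.choices[0].message.content
    return response.message.content    # ollama-style: message.content


# Stand-in objects that only mimic the attribute layout of real responses.
ollama_like = SimpleNamespace(message=SimpleNamespace(content="Hey there!"))
openai_like = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Once upon a time..."))]
)

print(reply_text(ollama_like))
print(reply_text(openai_like))
```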

## Using the free HuggingFace API

There is also a free API from HuggingFace that lets you test your code with freely available models. You just need to set up a HuggingFace account and create a READ token.
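Since the cells in this section read the token from an environment variable, a small check can fail early with a readable message when the variable is missing. This is only a sketch: the helper name `require_env` and the variable `DEMO_TOKEN` with its dummy value are hypothetical and exist only so the example runs on its own.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical demo variable with a dummy value, used only for this sketch.
os.environ["DEMO_TOKEN"] = "hf_dummy_token"
print(require_env("DEMO_TOKEN"))
```

In this notebook the same helper could be called as `require_env("HUGGINGFACE_API_KEY")` after `load_dotenv()`.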

In [None]:
import requests
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

API_KEY = os.getenv("HUGGINGFACE_API_KEY")

API_URL = "https://router.huggingface.co/novita/v3/openai/chat/completions"
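# Assumes HUGGINGFACE_API_KEY holds the raw token, which is sent as a Bearer token;
# drop the "Bearer " prefix here if your .env value already includes it
headers = {"Authorization": f"Bearer {API_KEY}"}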
payload = {
    "messages": [
        {
            "role": "user",
            "content": "Which number yields the same result when it is added or multiplied with itself?"
        }
    ],
    "model": "deepseek/deepseek-v3-0324",
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json()["choices"][0]["message"])



{'role': 'assistant', 'content': 'Alright, let\'s tackle this problem step by step. The question is asking: \n\n**"Which number yields the same result when it is added or multiplied with itself?"**\n\nAt first glance, this seems straightforward, but let\'s break it down to ensure we understand it correctly and arrive at the right answer.\n\n### Understanding the Problem\n\nWe need to find a number, let\'s call it **x**, such that:\n\n- When you add **x** to itself: **x + x**\n- When you multiply **x** by itself: **x * x**\n\nThese two operations yield the same result. In other words:\n\n**x + x = x * x**\n\nOur goal is to find the value(s) of **x** that satisfy this equation.\n\n### Translating to an Equation\n\nLet\'s write down the equation based on the above:\n\n**x + x = x * x**\n\nSimplifying the left side:\n\n**2x = x²**\n\nNow, we have a simple quadratic equation:\n\n**x² = 2x**\n\n### Solving the Equation\n\nTo solve for **x**, let\'s rearrange the equation to set it to zero:\n

In [23]:
import requests
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

API_KEY = os.getenv("HUGGINGFACE_API_KEY")

API_URL = "https://router.huggingface.co/novita/v3/openai/chat/completions"
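# Assumes HUGGINGFACE_API_KEY holds the raw token, which is sent as a Bearer token;
# drop the "Bearer " prefix here if your .env value already includes it
headers = {"Authorization": f"Bearer {API_KEY}"}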
payload = {
    "messages": [
        {
            "role": "user",
            "content": "How many 'G's in 'huggingface'?"
        }
    ],
    "model": "deepseek/deepseek-v3-0324",
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json()["choices"][0]["message"])

{'role': 'assistant', 'content': 'Alright, let\'s tackle the problem: **How many \'G\'s are in the word \'huggingface\'?**\n\n### Understanding the Problem\nFirst, I need to determine how many times the letter \'G\' (both uppercase and lowercase, though in this case, the word is all lowercase) appears in the word "huggingface."\n\n### Breaking Down the Word\nLet\'s write out the word and look at each letter one by one.\n\nThe word is: h u g g i n g f a c e\n\nNow, let\'s list the letters with their positions to keep track:\n\n1. h\n2. u\n3. g\n4. g\n5. i\n6. n\n7. g\n8. f\n9. a\n10. c\n11. e\n\n### Counting the \'G\'s\nNow, let\'s go through each letter and count how many times \'g\' appears.\n\n1. h - not a g\n2. u - not a g\n3. g - this is a g (count = 1)\n4. g - another g (count = 2)\n5. i - not a g\n6. n - not a g\n7. g - another g (count = 3)\n8. f - not a g\n9. a - not a g\n10. c - not a g\n11. e - not a g\n\n### Verifying\nLet me recount to ensure I didn\'t miss anything:\n\n- T

In [24]:
# Uhh let's render this long markdown string to read it more easily
from IPython.display import Markdown, display
display(Markdown(response.json()["choices"][0]["message"]["content"]))

Alright, let's tackle the problem: **How many 'G's are in the word 'huggingface'?**

### Understanding the Problem
First, I need to determine how many times the letter 'G' (both uppercase and lowercase, though in this case, the word is all lowercase) appears in the word "huggingface."

### Breaking Down the Word
Let's write out the word and look at each letter one by one.

The word is: h u g g i n g f a c e

Now, let's list the letters with their positions to keep track:

1. h
2. u
3. g
4. g
5. i
6. n
7. g
8. f
9. a
10. c
11. e

### Counting the 'G's
Now, let's go through each letter and count how many times 'g' appears.

1. h - not a g
2. u - not a g
3. g - this is a g (count = 1)
4. g - another g (count = 2)
5. i - not a g
6. n - not a g
7. g - another g (count = 3)
8. f - not a g
9. a - not a g
10. c - not a g
11. e - not a g

### Verifying
Let me recount to ensure I didn't miss anything:

- Third letter: g (1)
- Fourth letter: g (2)
- Seventh letter: g (3)

No other 'g's are present in the word.

### Potential Pitfalls
It's easy to overlook letters, especially when they're repeated or not at the beginning. One might rush and miss the 'g's in the middle or at the end. Also, ensuring that we're only counting lowercase 'g's is important unless specified otherwise (here, the word is all lowercase).

### Conclusion
After carefully examining each letter in "huggingface," I found that the letter 'g' appears **three times**.

### Final Answer
There are **3 'G's** in the word 'huggingface'.

Really cool, isn't it? 

### A short excursus on the term payload

I always wondered why the term **payload** is used. So here is a nice explanation:

In API requests, the term payload refers to the **actual data** or information being sent within the request, typically in the body of an HTTP request (such as POST or PUT). The term is derived from communication and networking protocols, where the payload is the part of transmitted data that contains the meaningful information, as opposed to headers or metadata which describe or route the data.

Why is the term "payload" commonly used?

- **Historical origin:** The term originally comes from aerospace and transport, describing the **actual useful content** carried by a vehicle (like cargo in a truck or spacecraft).
- **Clear differentiation:** It clearly distinguishes between control data (headers, parameters, routing information) and the actual content (payload) intended for processing by the receiving application or service.
- **Generalization:** Payload provides a generic, neutral term for the transmitted data, regardless of its structure (JSON, XML, plain text, binary files, etc.), which simplifies communication across different APIs and systems.
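To see the split between headers and payload, the sketch below builds (but never sends) a POST request with Python's standard `urllib.request` module instead of `requests`, so it runs offline. The URL `https://example.com/chat` and the token `hf_dummy_token` are placeholders, not real values.

```python
import json
import urllib.request

payload = {
    "model": "deepseek/deepseek-v3-0324",
    "messages": [{"role": "user", "content": "Hi"}],
}

# Build the request object only; nothing is sent over the network here.
request = urllib.request.Request(
    "https://example.com/chat",                    # placeholder URL
    data=json.dumps(payload).encode("utf-8"),      # the payload (request body)
    headers={
        "Authorization": "Bearer hf_dummy_token",  # placeholder token
        "Content-Type": "application/json",
    },
    method="POST",
)

print("Headers:", request.header_items())        # metadata describing the request
print("Payload:", request.data.decode("utf-8"))  # the actual data being sent
```

The headers tell the server who is calling and how to interpret the body, while the payload carries the model name and the messages themselves.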

## Conclusion

So there are a lot of different LLMs out in the wild. They differ in size and capabilities. We always access them via an API call. They run either locally, in our own cloud, or at a third-party provider. Always check where the data is going and what implications that might have!