### **Prologue**

- In this guide, we’ll walk through the process of using Ollama and CrewAI on Kaggle to run local LLM (Large Language Model) agents.

- These tools enable you to harness the power of advanced AI models directly in a hosted environment like Kaggle, providing an efficient way to work with LLMs without requiring complex infrastructure.

- Throughout this guide, I’ll cover the setup, configuration, and usage of Ollama and CrewAI to help you integrate AI-driven solutions seamlessly into your Kaggle notebooks.

- Whether you're a beginner or an experienced AI practitioner, this guide will give you the knowledge to start working with LLM agents quickly and effectively.

In [1]:
## Suppress all types of warnings

import logging
import os
import sys
from IPython.utils import io
import subprocess
os.chdir('/kaggle/working/')

# Suppress Python logging
logging.getLogger().setLevel(logging.ERROR)
logging.getLogger("transformers").setLevel(logging.ERROR)

# Suppress environment-level logging
os.environ["RUST_LOG"] = "error"
os.environ["LOGLEVEL"] = "ERROR"

# Suppress stderr output (like Go warnings)
sys.stderr = open(os.devnull, 'w')


## ignore normal warnings;
import warnings
warnings.filterwarnings("ignore")


In [2]:
## Helper to run shell commands from the notebook and stream their output

def run(commands):
    """Run each shell command in turn, printing its combined stdout/stderr."""
    for command in commands:
        # stderr is merged into stdout so all output is read from a single pipe
        with subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) as sp:
            for line in sp.stdout:
                line = line.decode("utf-8", errors="replace")
                if "undefined reference" in line:
                    raise RuntimeError("Failed Processing.")
                print(line, flush=True, end="")

### What is Ollama?

- Ollama is a tool that helps us easily work with powerful AI models, such as language models, right on our computers or in the cloud. These models can understand and generate human-like text, making them useful for tasks like writing, answering questions, or even creating summaries.


### Why are we using Ollama?

- In this guide, we’re using Ollama because it makes it simple to access and run these advanced AI models without needing complex setups or specialized hardware.
- Ollama allows us to download, run, and interact with models in a straightforward way, directly within Kaggle notebooks, where we can experiment and see results quickly.


### Your next question might be: how is it possible to integrate Ollama into an ipynb notebook on Kaggle? The answer lies in what the Kaggle environment actually is, so let's clear up some misconceptions.

### Is Kaggle a Cloud Environment?
- Yes, Kaggle is a cloud-based environment. It provides free access to computational resources (like CPUs, GPUs, and TPUs) through Jupyter notebooks. These notebooks allow users to run code and analyze data without needing a powerful personal computer or their own server. **When you run code on Kaggle, you're actually executing it on remote servers hosted in the cloud.**

### How Does Ollama Work in Kaggle?
- In the context of Kaggle, Ollama allows you to bring AI models (like language models) into this cloud environment and interact with them. Here’s how it works step by step:

#### 1. Installation on Kaggle’s Cloud-Based Environment
- The first thing we do is install Ollama using the curl command. Kaggle provides an environment where we can run shell commands to install software, just like on a regular computer, but here we’re installing Ollama within the cloud.

#### 2. Running the Model
- After installation, we start the Ollama service. This launches the local server that loads and runs models, similar to starting a program on your computer. Once the service is running, we can interact with it.

#### 3. Cloud-Based AI Models
- The beauty of using Ollama in Kaggle is that you don’t need specialized hardware like high-performance GPUs on your personal machine. Kaggle’s cloud servers provide the compute, and Ollama downloads models (like the Llama 3.2 model) onto them. This means we can run AI models efficiently without needing to worry about the technical infrastructure.

In [3]:
%%time
# Installing Ollama using curl and a shell script
commands = [
        "curl -fsSL https://ollama.com/install.sh | sh",
]
run(commands)

>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
CPU times: user 44.6 ms, sys: 30 ms, total: 74.6 ms
Wall time: 43.2 s


In [4]:
# Starting the Ollama service
os.system("/usr/local/bin/ollama serve &")

# Sanity check that shell commands run from this cell; it does not confirm the server is up.
os.system("echo 'ollama test'")

ollama test


0
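
Because `ollama serve &` returns immediately, the server may still be starting when the next cell runs. Here is a minimal sketch of a readiness check. It assumes the server answers a plain HTTP GET on its root URL with status 200 once it is up (the install log reports the API at 127.0.0.1:11434). The helper names `ollama_is_up` and `wait_until_ready` are illustrative and not part of Ollama.

```python
import time
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if an HTTP GET to the server root answers with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the server is not reachable yet.
        return False


def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Call probe() repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the notebook this could be called as `wait_until_ready(ollama_is_up)` right after starting the service, and the model pull would only proceed if it returns `True`.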

In [5]:
%%time
# Downloading the Llama 3.2 model (1B parameters) using Ollama
model = 'llama3.2:1b'
commands = [
        f"ollama pull {model}"
]
run(commands)

Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is: 

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHDckeeEFv5VBKbOPE1AfCu+I0mEYfBwZBdtCXBrUflv

[GIN] 2025/05/12 - 18:43:08 | 200 |      60.236µs |       127.0.0.1 | HEAD     "/"
[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠹ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠸ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠼ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠴ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠦ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠧ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠇ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠏ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠹ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠸ 

In [6]:
## Installing langchain-community, which provides the Ollama wrapper
!pip install -qq langchain-community


### Running a sample query through the Llama model

In [7]:
from langchain_community.llms import Ollama

## Wrapping the locally served model in LangChain's Ollama class
llm = Ollama(model=model, temperature=0)

In [8]:
%%time
llm.invoke("What are the components of a computer?")

[GIN] 2025/05/12 - 18:43:47 | 200 |  7.001287044s |       127.0.0.1 | POST     "/api/generate"
CPU times: user 68.5 ms, sys: 17 ms, total: 85.4 ms
Wall time: 7.01 s


"A computer is composed of several key components that work together to process information, store data, and provide a user interface. Here are the main components of a computer:\n\n1. **Central Processing Unit (CPU)**: The CPU, also known as the processor, is the brain of the computer. It executes instructions and performs calculations at high speeds.\n2. **Motherboard**: The motherboard is the main circuit board that connects all the hardware components together. It provides a platform for the CPU, memory, and other peripherals.\n3. **Memory (RAM)**: Random Access Memory (RAM) is temporary storage for data and applications. It's volatile, meaning it loses its contents when the computer is powered off.\n4. **Storage Drive**: A storage drive, such as a Hard Disk Drive (HDD), Solid-State Drive (SSD), or Flash Drive, stores the operating system, programs, and data.\n5. **Power Supply**: The power supply unit (PSU) provides power to all the components in the computer. It converts AC power

### 👉 Looks like the model is giving us good results. 

### 👉 Why we are using this model:

🔹 1. **Lightweight & Fast** - With only **1 billion parameters**, this model is significantly **smaller** than standard LLMs (like GPT-3.5 or Llama 3-70B), which means **faster response times** and **lower memory usage**.

🔹 2. **Good for Prototyping** - Great for **quick experimentation** and testing agent workflows.

🔹 3. **Privacy & Offline Capability** - No dependency on external APIs, and full control of your data and requests.

🔹 4. **Seamlessly Compatible with LangChain & CrewAI**  
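
To make the "lower memory usage" point concrete, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameters × bits per parameter) and ignores runtime overhead such as the context cache. The 16-bit and 4-bit precisions are illustrative assumptions, not a statement about how this particular model is stored.

```python
def estimate_weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Compare a 1-billion-parameter model with a 70-billion-parameter one.
for name, n_params in [("1B model", 1e9), ("70B model", 70e9)]:
    for bits in (16, 4):
        gb = estimate_weight_memory_gb(n_params, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB")
```

Under these assumptions the 1B model needs roughly 2 GB at 16-bit precision, while a 70B model needs roughly 140 GB, which is why the small model fits comfortably in a hosted notebook.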

In [9]:
!pip install -qq --upgrade crewai


#### To use the model with CrewAI, it has to be wrapped in CrewAI's LLM class.

In [10]:
from crewai import LLM 

llm = LLM(
    model="ollama/llama3.2:1b",
    base_url="http://localhost:11434"
)

### Let's break down the above snippet part by part:

- "http://localhost:11434" is the URL where Ollama listens for input — it’s like a local API endpoint that your code can call to:

1. Run a model

2. Send a prompt

3. Get a response


- When we ran this command earlier:

`os.system("/usr/local/bin/ollama serve &")`

it started a local Ollama server listening on port **11434**.

Then, libraries like CrewAI or LangChain connect to that server by specifying:

`base_url = "http://localhost:11434"`

That’s how they send prompts and get answers from the model — like calling an API, but all running within your local/cloud environment.
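
To show what "calling an API" looks like without any wrapper library, here is a minimal sketch using only the standard library. It targets the `/api/generate` endpoint that appears in the server logs of this notebook. The function names `build_generate_request` and `generate` are illustrative helpers, and the payload fields shown (`model`, `prompt`, `stream`) are the ones I am relying on from Ollama's REST API.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama3.2:1b", base_url="http://localhost:11434"):
    """Build (but do not send) a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for one complete JSON reply instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running and the model pulled, `generate("What are the components of a computer?")` should return the answer as a plain string.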

🔹 Importing CrewAI’s core components:

Agent: Represents an AI role with a goal and backstory.

Task: Assigns work to an agent.

Crew: Manages multiple agents and their tasks as a collaborative unit.



In [11]:
from crewai import Agent, Task, Crew

### 🧑‍🔬 Define the Agents:

🔹 This creates the **Researcher** agent:

- Has a clear **Role** ("Researcher") and **Goal**:
1. **Role** - defines the agent's expertise
2. **Goal** - defines the objective

- The **Backstory** describes the context and personality of the agent.

- **llm=llm** attaches the language model (Ollama, wrapped in CrewAI's LLM class).

- **allow_delegation=False** means it won't pass its task to other agents.


🔹 The **Writer** agent is created in the same way.


In [12]:
researcher = Agent(
    role="Researcher",
    goal="Find fun facts about space",
    backstory="You are great at researching scientific facts.",
    llm=llm,
    allow_delegation=False,
)

writer = Agent(
    role="Writer",
    goal="Summarize space facts in a fun way",
    backstory="You are a creative writer who loves to make science engaging.",
    llm=llm,
    allow_delegation=False,
)


### ✅ Define the Tasks:


🔹 Assigns a task to the Researcher agent:

- Describes what the agent should do (**description**).

- **expected_output** sets a format to guide the model’s response.

- **Each Task is assigned one Agent.**


🔹 Similarly, a task is created for the **Writer** agent. The only difference is:

- This task **depends on the output of the research_task**.

- The **context** parameter ensures the Writer gets the Researcher's output as input.


In [13]:
# Researcher task
research_task = Task(
    description="Find 3 interesting facts about space that are suitable for kids.",
    expected_output="A bullet list of 3 fun facts about space.",
    agent=researcher,
)

# Writer task
write_task = Task(
    description="Take the space facts and write a short fun paragraph summarizing them.",
    expected_output="A short paragraph in a fun tone summarizing the facts.",
    agent=writer,
    context=[research_task],
)

### 🤝 Bring It All Together:

In [14]:
## Combining the agents and tasks
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    verbose=True,  # show detailed logs of what's happening during execution
)


### 🗣️ Once you run crew.kickoff(), CrewAI will:

1. Ask the Researcher to find 3 fun facts.

2. Feed those facts to the Writer.

3. Have the Writer compose a fun summary paragraph.

Note: You'll see all the intermediate steps if verbose=True.
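
The three steps above can be pictured with a small plain-Python sketch. This is a conceptual model only, not CrewAI's actual implementation: `run_sequential`, the step dictionaries, and `echo_model` are hypothetical names made up for illustration.

```python
def run_sequential(steps, call_model):
    """Run steps in order, giving each step the outputs of the steps it lists as context.

    steps: list of dicts with keys "name", "description", and an optional
           "context" list naming earlier steps.
    call_model: any function mapping a prompt string to a response string.
    """
    outputs = {}
    for step in steps:
        context_text = "\n".join(outputs[name] for name in step.get("context", []))
        prompt = step["description"]
        if context_text:
            prompt += "\n\nContext:\n" + context_text
        outputs[step["name"]] = call_model(prompt)
    return outputs


steps = [
    {"name": "research", "description": "Find 3 interesting facts about space."},
    {"name": "write", "description": "Summarize the facts in a fun paragraph.",
     "context": ["research"]},
]


# A stand-in for the LLM so the sketch runs without a model server.
def echo_model(prompt):
    return f"<response to: {prompt.splitlines()[0]}>"


results = run_sequential(steps, echo_model)
print(results["write"])
```

The point of the sketch is the data flow: the second step's prompt includes the first step's output, which is the role the `context` parameter plays for the Writer's task.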

In [15]:
result = crew.kickoff()
print("\nFinal Result:\n", result)


[1m[95m# Agent:[00m [1m[92mResearcher[00m
[95m## Task:[00m [92mFind 3 interesting facts about space that are suitable for kids.[00m


[GIN] 2025/05/12 - 18:44:34 | 200 |  2.074150233s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2025/05/12 - 18:44:34 | 200 |   70.038122ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:34 | 200 |    71.36172ms |       127.0.0.1 | POST     "/api/show"




[1m[95m# Agent:[00m [1m[92mResearcher[00m
[95m## Final Answer:[00m [92m
I now can give a great answer

• The Andromeda galaxy, which is the closest major galaxy to our Milky Way, is approaching us at a speed of about 250,000 miles per hour (400,000 kilometers per hour). This means that if we don't do anything to stop it, the two galaxies will collide in about 4 billion years.
 
• The International Space Station orbits the Earth more than 16 times per day and travels around our planet at an altitude of around 250 miles (400 kilometers) above its surface. The ISS is so massive that it weighs as much as 450,000 tons and has a mass of over 500,000 pounds.
 
• There are billions of stars in the Milky Way galaxy alone, but scientists estimate that there may be many more. The most distant star observed from Earth was called LGM-1 and was seen in 2016. It is located about 5 billion light-years away from us.




[1m[95m# Agent:[00m [1m[92mWriter[00m
[95m## Task:[00m [92mTake the space facts and write a short fun paragraph summarizing them.[00m
[GIN] 2025/05/12 - 18:44:34 | 200 |   49.964741ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:34 | 200 |   42.798535ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:34 | 200 |    55.55951ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:36 | 200 |  1.852135334s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2025/05/12 - 18:44:36 | 200 |   44.314293ms |       127.0.0.1 | POST     "/api/show"


[1m[95m# Agent:[00m [1m[92mWriter[00m
[95m## Final Answer:[00m [92m
I now can give a great answer

The vast expanse of space holds countless mysteries waiting to be unraveled. But one fascinating fact that never fails to captivate us is the alarming rate at which our closest galactic neighbor, Andromeda, is hurtling towards us. The estimated time it'll take for these two massive gala

[GIN] 2025/05/12 - 18:44:36 | 200 |   52.796918ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:36 | 200 |   46.431628ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/05/12 - 18:44:36 | 200 |   41.809847ms |       127.0.0.1 | POST     "/api/show"



Final Result:
 I now can give a great answer

The vast expanse of space holds countless mysteries waiting to be unraveled. But one fascinating fact that never fails to captivate us is the alarming rate at which our closest galactic neighbor, Andromeda, is hurtling towards us. The estimated time it'll take for these two massive galaxies to collide in 4 billion years? A mere blink of an eye in astronomical terms – a mere fraction of a second compared to the 13.8 billion-year history of our universe. It's a reminder that space is unforgiving, yet awe-inspiring, with stars and galaxies constantly shifting and colliding in an eternal dance of gravitational forces. As we continue to explore the cosmos, we're left with more questions than answers – and a sense of wonder at the boundless mysteries waiting to be explored in this vast, starry expanse.
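
The final result above begins with the line "I now can give a great answer", which is part of the agent's internal reasoning rather than the summary itself. Here is a small sketch of one way to trim it before displaying the text. `strip_preamble` is a hypothetical helper, and it assumes the preamble appears verbatim at the start of the output.

```python
PREAMBLE = "I now can give a great answer"


def strip_preamble(text, preamble=PREAMBLE):
    """Remove a leading preamble line (if present) and surrounding whitespace."""
    cleaned = text.strip()
    if cleaned.startswith(preamble):
        cleaned = cleaned[len(preamble):]
    return cleaned.strip()
```

In the notebook it could be applied as `print(strip_preamble(str(result)))`.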


### ✅ End Notes

In this notebook, we explored how to run Ollama inside Kaggle and integrate it with CrewAI to simulate collaborative agents performing creative tasks. Here's a quick recap of what we achieved:

🧠 Installed and served Ollama locally within the Kaggle environment.

🤖 Pulled and used a lightweight LLM (llama3.2:1b) that's fast, efficient, and great for demos and prototyping.

🛠️ Wrapped the LLM using both LangChain and CrewAI for compatibility and task orchestration.

👥 Created a multi-agent workflow where each agent had a distinct role and goal — working together to generate fun content about space!

### 🚀 Why This Matters:

- By combining local models with modular agent frameworks, you can build **intelligent systems** **without relying on paid APIs** — opening up powerful possibilities for education, experimentation, and real-world applications, even in constrained environments like Kaggle.

### If this helped you or sparked any new ideas, consider leaving an **upvote** or sharing your thoughts in the comment section! 🌟

### Also, please consider going through this series of articles as a starting point for learning to build agentic AI systems. It is very beginner-friendly:

https://www.dailydoseofds.com/ai-agents-crash-course-part-1-with-implementation/#building-ai-agent-systems


Credits to **Avi Chawla** and team for their wonderful articles - https://www.linkedin.com/in/avi-chawla/