<a href="https://colab.research.google.com/github/SahDavies/commons/blob/main/Ollama_Setup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Run Ollama in Colab
---

[![5aharsh/collama](https://raw.githubusercontent.com/5aharsh/collama/main/assets/banner.png)](https://github.com/5aharsh/collama)

This example notebook demonstrates how to run Ollama inside a Colab instance. With it you can run most of the small to medium-sized models offered by Ollama for free.

For the list of available models, see the [models offered by Ollama](https://ollama.com/library).


## Before you proceed
---

Colab runtimes are CPU-based by default. To run LLMs, change your runtime type to T4 GPU (or better, if you are a paid Colab user) by going to **Runtime > Change runtime type**.

While running your script be mindful of the resources you're using. This can be tracked at **Runtime > View resources**.
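
One way to confirm that a GPU runtime is actually attached is to ask `nvidia-smi` for its device list. This is a minimal sketch, assuming the NVIDIA driver tools are on the `PATH` whenever a GPU runtime is active; the helper name `gpu_available` is hypothetical and not part of Colab or Ollama.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi is present and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # assumption: no driver tools means a CPU-only runtime
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L prints one line per detected GPU
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_available():
    print("GPU detected")
else:
    print("No GPU detected; check Runtime > Change runtime type")
```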

## Running the notebook
---

After configuring the runtime, run the notebook with **Runtime > Run all** and start tinkering. This example uses [Llama 3.2](https://ollama.com/library/llama3.2) to answer a prompt through the [LangChain Ollama Integration](https://python.langchain.com/docs/integrations/chat/ollama/).

## Installing Dependencies
---

1. `pciutils` is required by Ollama to detect the GPU type.
2. Ollama itself is installed in the runtime instance by `curl -fsSL https://ollama.com/install.sh | sh`.
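
Once the install cell below has run, it can be worth checking that both tools are reachable before going further. The following is a small sketch under the assumption that `pciutils` provides the `lspci` command and that the install script places an `ollama` binary on the `PATH`; `missing_tools` is a hypothetical helper name.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumption: lspci comes from pciutils, ollama from install.sh.
missing = missing_tools(["lspci", "ollama"])
print("Missing tools:", ", ".join(missing) if missing else "none")
```

If either name is reported as missing, re-running the install cell is a reasonable first step.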




In [1]:
!sudo apt update
!sudo apt install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh

Hit:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:2 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:3 https://cli.github.com/packages stable InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:7 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:8 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:10 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:11 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Reading package lists... Do

## Running Ollama
---

To use Ollama, its server must run in the background alongside your code. Because Jupyter notebooks run code cells one at a time, a cell that starts a long-running server in the foreground would block every cell after it. As a workaround, we launch the server with Python's `subprocess` module so that it does not block any other cell.

The server is started with the command `ollama serve`.

`time.sleep(5)` adds some delay to get the Ollama service up before downloading the model.

In [2]:
import threading
import subprocess
import time

def run_ollama_serve():
  subprocess.Popen(["ollama", "serve"])

thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)
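
A fixed five-second delay may be too short on a slow instance and longer than needed on a fast one. As an alternative sketch, the server can be polled until it answers. This assumes Ollama is listening on its default address `http://127.0.0.1:11434`; the function name `wait_for_ollama` and its parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_ollama(url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # the server is not accepting connections yet
        time.sleep(interval)
    return False
```

Calling `wait_for_ollama()` in place of `time.sleep(5)` returns `True` as soon as the server responds and `False` if it never does within the timeout.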

## Pulling Model
---

Download the LLM model using `ollama pull llama3.2`.

For other models check https://ollama.com/library
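
After the pull has finished, the server can be asked which models it holds locally. The sketch below uses Ollama's `GET /api/tags` endpoint and assumes the server is reachable at the default `http://127.0.0.1:11434`; the helper names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_local_models(base_url="http://127.0.0.1:11434"):
    """Ask a running Ollama server which models it has downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        return model_names(json.load(response))
```

With the server running, `list_local_models()` should include an entry for `llama3.2` once the download is complete.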

In [3]:
!ollama pull llama3.2


## And that's it!
---

With this you should be able to use the models freely in your own scripts. The following example uses `langchain-ollama` to answer a simple prompt.

If you have a use case that could help others, feel free to add your notebook to [Collama](https://github.com/5aharsh/collama/fork).
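
LangChain is not required to talk to the server. As a dependency-free sketch, the same kind of prompt can be sent to Ollama's `/api/generate` endpoint with the standard library. The base URL is assumed to be the default `http://127.0.0.1:11434`, and the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request


def build_generate_request(model, prompt, base_url="http://127.0.0.1:11434"):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt):
    """Send the request and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)["response"]
```

For example, `generate("llama3.2", "Why is the sky blue?")` returns the model's answer as a string, provided the server is running and the model has been pulled.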

In [4]:
!pip install langchain-ollama

Collecting langchain-core<2.0.0,>=1.0.0 (from langchain-ollama)
  Using cached langchain_core-1.0.5-py3-none-any.whl.metadata (3.6 kB)
Using cached langchain_core-1.0.5-py3-none-any.whl (471 kB)
Installing collected packages: langchain-core
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 0.3.79
    Uninstalling langchain-core-0.3.79:
      Successfully uninstalled langchain-core-0.3.79
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-openai 0.3.35 requires langchain-core<1.0.0,>=0.3.78, but you have langchain-core 1.0.5 which is incompatible.
langchain 0.3.27 requires langchain-core<1.0.0,>=0.3.72, but you have langchain-core 1.0.5 which is incompatible.
Successfully installed langchain-core-1.0.5


In [5]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown

template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="llama3.2")

chain = prompt | model

display(Markdown(chain.invoke({"question": "What's the length of hypotenuse in a right angled triangle"})))

To find the length of the hypotenuse in a right-angled triangle, we can use the Pythagorean theorem.

The formula for the Pythagorean theorem is:

a² + b² = c²

where:
- 'a' and 'b' are the lengths of the two sides that form the right angle (the legs)
- 'c' is the length of the hypotenuse (the side opposite the right angle)

Can I help you with anything else?

In [6]:
!pip install langchain-nomic langchain-openai faiss-cpu

Collecting langchain-core>=0.3.76 (from langchain-nomic)
  Using cached langchain_core-0.3.79-py3-none-any.whl.metadata (3.2 kB)
Using cached langchain_core-0.3.79-py3-none-any.whl (449 kB)
Installing collected packages: langchain-core
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 1.0.5
    Uninstalling langchain-core-1.0.5:
      Successfully uninstalled langchain-core-1.0.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-ollama 1.0.0 requires langchain-core<2.0.0,>=1.0.0, but you have langchain-core 0.3.79 which is incompatible.
Successfully installed langchain-core-0.3.79


In [1]:
!pip uninstall -y langchain langchain-core langchain-community langchain-ollama

Found existing installation: langchain 0.3.27
Uninstalling langchain-0.3.27:
  Successfully uninstalled langchain-0.3.27
Found existing installation: langchain-core 0.3.79
Uninstalling langchain-core-0.3.79:
  Successfully uninstalled langchain-core-0.3.79
Found existing installation: langchain-community 0.3.31
Uninstalling langchain-community-0.3.31:
  Successfully uninstalled langchain-community-0.3.31
Found existing installation: langchain-ollama 1.0.0
Uninstalling langchain-ollama-1.0.0:
  Successfully uninstalled langchain-ollama-1.0.0
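
The conflicts reported by pip above come from mixing package generations: `langchain-ollama 1.0.0` requires `langchain-core>=1.0.0`, while `langchain 0.3.27` and `langchain-openai 0.3.35` require `langchain-core<1.0.0`. A quick way to see which versions are currently installed is sketched below with the standard library; `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


langchain_packages = [
    "langchain", "langchain-core", "langchain-community",
    "langchain-ollama", "langchain-openai", "langchain-nomic",
]
for name, installed in installed_versions(langchain_packages).items():
    print(f"{name:22} {installed or 'not installed'}")
```

Keeping all of these on the same major version line is what the reinstall cells that follow aim for.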


In [2]:
!pip install "langchain>=1.0.0" langchain-community langchain-ollama

Collecting langchain>=1.0.0
  Downloading langchain-1.0.7-py3-none-any.whl.metadata (4.9 kB)
Collecting langchain-community
  Using cached langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain-ollama
  Using cached langchain_ollama-1.0.0-py3-none-any.whl.metadata (2.1 kB)
Collecting langchain-core<2.0.0,>=1.0.4 (from langchain>=1.0.0)
  Using cached langchain_core-1.0.5-py3-none-any.whl.metadata (3.6 kB)
Collecting langgraph<1.1.0,>=1.0.2 (from langchain>=1.0.0)
  Downloading langgraph-1.0.3-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain-classic<2.0.0,>=1.0.0 (from langchain-community)
  Downloading langchain_classic-1.0.0-py3-none-any.whl.metadata (3.9 kB)
Collecting langchain-text-splitters<2.0.0,>=1.0.0 (from langchain-classic<2.0.0,>=1.0.0->langchain-community)
  Downloading langchain_text_splitters-1.0.0-py3-none-any.whl.metadata (2.6 kB)
Collecting langgraph-checkpoint<4.0.0,>=2.1.0 (from langgraph<1.1.0,>=1.0.2->langchain>=1.0.0)
  Downloadin

In [3]:
!pip install --upgrade \
  langchain \
  langchain-core \
  langchain-community \
  langchain-openai \
  langchain-ollama \
  langchain-nomic


Collecting langchain-openai
  Using cached langchain_openai-1.0.3-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-nomic
  Using cached langchain_nomic-1.0.1-py3-none-any.whl.metadata (1.8 kB)
Downloading langchain_openai-1.0.3-py3-none-any.whl (82 kB)
Downloading langchain_nomic-1.0.1-py3-none-any.whl (4.1 kB)
Installing collected packages: langchain-openai, langchain-nomic
  Attempting uninstall: langchain-openai
    Found existing installation: langchain-openai 0.3.35
    Uninstalling langchain-openai-0.3.35:
      Successfully uninstalled langchain-openai-0.3.35
  Attempting uninstall: langchain-nomic
    Found existing installation: langchain-nomic 0.1.5
    Uninstalling langchain-nomic-0.1.5:
      Successfully uninstalled langchain-nomic-0.1.5
Successfully installed langchain-nomic-1.0.1 langchain-openai-1.0.3


In [5]:
import threading
import subprocess
import time

def run_ollama_serve():
  subprocess.Popen(["ollama", "serve"])

thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)

In [6]:
!ollama pull llama3.2


In [7]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown

template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="llama3.2")

chain = prompt | model

display(Markdown(chain.invoke({"question": "What's the length of hypotenuse in a right angled triangle"})))

To find the length of the hypotenuse in a right-angled triangle, we can use the Pythagorean theorem.

The Pythagorean theorem states that in a right-angled triangle, the square of the length of the hypotenuse (c) is equal to the sum of the squares of the lengths of the other two sides (a and b).

Mathematically, this can be expressed as:

c² = a² + b²

To find the length of the hypotenuse, we need to take the square root of both sides of the equation:

c = √(a² + b²)

So, to answer your question, the length of the hypotenuse in a right-angled triangle can be found by taking the square root of the sum of the squares of the lengths of the other two sides.

In [8]:
import faiss
print(faiss.__version__)

1.12.0


In [9]:
import faiss
import numpy as np

# Create random vectors
d = 64                  # dimension
xb = np.random.random((1000, d)).astype('float32')
xq = np.random.random((5, d)).astype('float32')

# Build index
index = faiss.IndexFlatL2(d)
index.add(xb)

# Search
D, I = index.search(xq, 3)

print("Distances:\n", D)
print("Indices:\n", I)


Distances:
 [[6.5923653 7.255051  7.579202 ]
 [6.761565  7.13003   7.377714 ]
 [5.9324813 6.372758  6.702374 ]
 [6.5101876 6.5497303 6.767188 ]
 [7.7357616 7.980453  8.142959 ]]
Indices:
 [[418 649 284]
 [193   3 227]
 [415 272 663]
 [861 417 453]
 [859 556   3]]
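
`IndexFlatL2` performs an exhaustive search and reports squared Euclidean distances, which is why the values above are larger than plain Euclidean distances would be. The dependency-free sketch below mirrors that behaviour on a tiny hand-checkable example; `brute_force_search` is a hypothetical helper and not part of the FAISS API.

```python
from heapq import nsmallest


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(database, queries, k):
    """Return (distances, indices) of the k nearest database vectors per query."""
    distances, indices = [], []
    for query in queries:
        scored = [(squared_l2(query, vec), i) for i, vec in enumerate(database)]
        top = nsmallest(k, scored)  # smallest distance first
        distances.append([dist for dist, _ in top])
        indices.append([idx for _, idx in top])
    return distances, indices


database = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
D, I = brute_force_search(database, [[0.0, 0.0]], k=2)
print(D)  # [[0.0, 2.0]]
print(I)  # [[0, 2]]
```

The vector `[3.0, 4.0]` is at squared distance 25 from the query, so it is not among the two nearest neighbours.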


In [11]:
from langchain_nomic import NomicEmbeddings

print("LangChain-Nomic OK")

LangChain-Nomic OK


In [12]:
!pip install nomic



In [14]:
!pip list

Package                                  Version
---------------------------------------- --------------------
absl-py                                  1.4.0
absolufy-imports                         0.3.1
accelerate                               1.11.0
aiofiles                                 24.1.0
aiohappyeyeballs                         2.6.1
aiohttp                                  3.13.2
aiosignal                                1.4.0
alabaster                                1.0.0
albucore                                 0.0.24
albumentations                           2.0.8
ale-py                                   0.11.2
alembic                                  1.17.1
altair                                   5.5.0
annotated-doc                            0.0.4
annotated-types                          0.7.0
antlr4-python3-runtime                   4.9.3
anyio                                    4.11.0
anywidget                                0.9.19
argon2-cffi                        

In [15]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown

template = """Question: {question}

Answer: Let's think step by step."""

prompt = ChatPromptTemplate.from_template(template)

model = OllamaLLM(model="llama3.2")

chain = prompt | model

display(Markdown(chain.invoke({"question": "What's the length of hypotenuse in a right angled triangle"})))

To find the length of the hypotenuse in a right-angled triangle, we can use the Pythagorean theorem.

Here are the steps:

1. We need to know the lengths of the other two sides (the legs) of the triangle.
2. The formula for the Pythagorean theorem is: c² = a² + b²
3. Where 'c' is the length of the hypotenuse, and 'a' and 'b' are the lengths of the other two sides.

For example, if we have a right-angled triangle with one leg of length 3 inches and the other leg of length 4 inches, we can plug these values into the formula to get:

c² = 3² + 4²
= 9 + 16
= 25

Now, to find the length of the hypotenuse (c), we take the square root of both sides:

c = √25
= 5 inches
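
The model's arithmetic can be checked directly. A minimal sketch using the standard library, where `math.hypot` computes the square root of the sum of squares:

```python
import math

a, b = 3.0, 4.0
c = math.hypot(a, b)  # sqrt(a**2 + b**2)
print(c)  # 5.0
```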

In [18]:
!pip install gpt4all

Collecting gpt4all
  Downloading gpt4all-2.8.2-py3-none-manylinux1_x86_64.whl.metadata (4.8 kB)
Downloading gpt4all-2.8.2-py3-none-manylinux1_x86_64.whl (121.6 MB)
Installing collected packages: gpt4all
Successfully installed gpt4all-2.8.2


In [19]:
!pip install nomic --upgrade



In [21]:
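from nomic import embed

# embed.text() has no `local` flag; passing local=True fails with
# "TypeError: Unexpected keyword arguments: ['local']".
# Local inference is selected with inference_mode="local" instead,
# which requires the gpt4all package (pip install "nomic[local]").
result = embed.text(
    texts=["hello world"],
    model="nomic-embed-text-v1.5",
    dimensionality=768,
    inference_mode="local",
)

# Show the first five components of the first embedding.
print(result["embeddings"][0][:5])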

In [20]:
import faiss
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_nomic import NomicEmbeddings

# ----- 1. Initialize Nomic Embeddings -----
# nomic.embed.text() returns a plain dict, which has no embed_query() method,
# so the LangChain wrapper is used as the embedding function instead.
emb = NomicEmbeddings(
    model="nomic-embed-text-v1.5",
    inference_mode="local",  # requires gpt4all: pip install "nomic[local]"
)

# Create FAISS index with correct dimension
index = faiss.IndexFlatL2(
    len(emb.embed_query(" "))
)

# Build FAISS vector store
vector_store = FAISS(
    embedding_function=emb,
    index=index,
    docstore=InMemoryDocstore(),
    index_to_docstore_id={},
)

RuntimeError: The 'gpt4all' package is required for local inference. Suggestion: `pip install "nomic[local]"`
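
The error above is raised when local inference is requested but the `gpt4all` package cannot be imported. One possible guard, sketched here under the assumption that falling back to Nomic's hosted API is acceptable (remote mode needs a Nomic API key), is to check for the package first; `pick_inference_mode` is a hypothetical helper name.

```python
import importlib.util


def pick_inference_mode():
    """Return "local" when gpt4all is importable, otherwise "remote"."""
    if importlib.util.find_spec("gpt4all") is not None:
        return "local"
    return "remote"  # assumption: a Nomic API key is configured for remote use


print(pick_inference_mode())
```

The returned string can be passed as `inference_mode` when creating the embeddings.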