<a href="https://colab.research.google.com/github/imvickykumar999/Ollama-Model/blob/main/Gemma/Run_with_Ollama.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2024 Google LLC.

In [1]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemma - Run with Ollama

This notebook demonstrates how to run inference on a Gemma model using [Ollama](https://ollama.com/). Ollama is an easy-to-use tool for running LLMs locally, and it provides built-in support for Gemma.

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Gemma/Run_with_Ollama.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

## Setup

### Select the Colab runtime
To complete this tutorial, you need a Colab runtime with enough resources to run the Gemma model. In this case, you can use a T4 GPU:

1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.
2. Select **Change runtime type**.
3. Under **Hardware accelerator**, select **T4 GPU**.

## Installation


Install Ollama using the official installation script.

In [2]:
!curl -fsSL https://ollama.com/install.sh | sh

>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


## Start Ollama

Start the Ollama server in the background using `nohup`, writing its output to `ollama.log`.

In [3]:
!nohup ollama serve > ollama.log &

nohup: redirecting stderr to stdout
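Because the server starts in the background, the next cell can run before the API is ready. As a minimal sketch (not part of the original notebook), the helper below polls the address reported by the installer, `127.0.0.1:11434`, until it answers. The function name `wait_for_server` and its `timeout` and `interval` parameters are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True  # any successful HTTP response means the server is up
        except urllib.error.HTTPError:
            return True  # the server answered, even if with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # not accepting connections yet, so retry
    return False
```

In a notebook cell you might call `wait_for_server()` right after starting the server and inspect `ollama.log` if it returns `False`.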


## Inference

Run inference from the command line. Redirecting stderr to `ollama.log` keeps any progress output written to stderr out of the cell.

In [4]:
!ollama run gemma:7b "What is the capital of France?" 2> ollama.log

The capital of France is **Paris**.



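Generate a response via the REST endpoint. By default the reply is streamed as one JSON object per line: each object carries a text fragment in `response`, and the last one has `"done": true`.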

In [5]:
!curl http://localhost:11434/api/generate -d '{ \
  "model": "gemma:7b", \
  "prompt":"What is the capital of Portugal?" \
}'

{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.43222753Z","response":"The","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.47591313Z","response":" capital","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.519318633Z","response":" of","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.562144397Z","response":" Portugal","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.605497104Z","response":" is","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.648557148Z","response":" **","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.678531787Z","response":"Lis","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.707627668Z","response":"bon","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.735979643Z","response":"**.","done":false}
{"model":"gemma:7b","created_at":"2025-02-06T05:59:59.763914382Z","response":"","done":true,"done_reason":"stop","cont
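The streamed chunks can be stitched back into a single string from Python. The sketch below is one possible approach using only the standard library. The helper names `join_stream` and `generate` are assumptions made for illustration, and the sample chunks are abridged copies of the output above (timestamps and the truncated tail removed).

```python
import json
import urllib.request


def join_stream(lines, field="response"):
    """Concatenate the text fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get(field, ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)


def generate(prompt, model="gemma:7b",
             url="http://localhost:11434/api/generate"):
    """POST a prompt to the local server and return the full reply text."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return join_stream(resp)  # iterating the response yields one line per chunk


# Abridged chunks from the output above
sample = [
    '{"model":"gemma:7b","response":"The","done":false}',
    '{"model":"gemma:7b","response":" capital","done":false}',
    '{"model":"gemma:7b","response":" of","done":false}',
    '{"model":"gemma:7b","response":" Portugal","done":false}',
    '{"model":"gemma:7b","response":" is","done":false}',
    '{"model":"gemma:7b","response":" **","done":false}',
    '{"model":"gemma:7b","response":"Lis","done":false}',
    '{"model":"gemma:7b","response":"bon","done":false}',
    '{"model":"gemma:7b","response":"**.","done":false}',
    '{"model":"gemma:7b","response":"","done":true,"done_reason":"stop"}',
]
print(join_stream(sample))  # The capital of Portugal is **Lisbon**.
```

If streaming is not needed, adding `"stream": false` to the request body makes the endpoint return a single JSON object instead.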

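Chat with Gemma via the REST endpoint. The reply is streamed as one JSON object per line, with each text fragment in `message.content`.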

In [6]:
!curl http://localhost:11434/api/chat -d '{ \
  "model": "gemma:7b", \
  "messages": [ \
    { "role": "user", "content": "what is the capital of Spain?" } \
  ] \
}'

{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.128709282Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.174127967Z","message":{"role":"assistant","content":" capital"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.219504069Z","message":{"role":"assistant","content":" of"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.265259517Z","message":{"role":"assistant","content":" Spain"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.303033857Z","message":{"role":"assistant","content":" is"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.332317242Z","message":{"role":"assistant","content":" **"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.360740771Z","message":{"role":"assistant","content":"Madrid"},"done":false}
{"model":"gemma:7b","created_at":"2025-02-06T06:00:12.388679869Z","message":{"role":"assistant","con
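The chat endpoint keeps no conversation state between calls, so a follow-up question has to resend the earlier messages. The sketch below shows one way to merge the streamed chunks into an assistant message and build the next request body. The helper names `collect_reply` and `next_request` and the follow-up question are hypothetical, and the sample chunks are abridged from the output above, which is itself cut off after "Madrid".

```python
import json


def collect_reply(lines):
    """Merge streamed /api/chat chunks into a single assistant message dict."""
    role, parts = "assistant", []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        message = chunk.get("message", {})
        role = message.get("role", role)
        parts.append(message.get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return {"role": role, "content": "".join(parts)}


def next_request(history, reply, follow_up, model="gemma:7b"):
    """Build the JSON body for the next turn, resending the whole conversation."""
    messages = history + [reply, {"role": "user", "content": follow_up}]
    return {"model": model, "messages": messages}


# Abridged chunks from the (truncated) output above
sample = [
    '{"message":{"role":"assistant","content":"The"},"done":false}',
    '{"message":{"role":"assistant","content":" capital"},"done":false}',
    '{"message":{"role":"assistant","content":" of"},"done":false}',
    '{"message":{"role":"assistant","content":" Spain"},"done":false}',
    '{"message":{"role":"assistant","content":" is"},"done":false}',
    '{"message":{"role":"assistant","content":" **"},"done":false}',
    '{"message":{"role":"assistant","content":"Madrid"},"done":false}',
]
history = [{"role": "user", "content": "what is the capital of Spain?"}]
reply = collect_reply(sample)
body = next_request(history, reply, "And what is its population?")
print(json.dumps(body, indent=2))
```

The resulting `body` can be sent to `/api/chat` in the same way as the `curl` call above.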

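Run `app.py`. Judging by its log output, the script reads the site's sitemap and processes each listed URL.

In [8]: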
!python app.py

2025-02-06 06:02:40,000 - INFO - Found 25 URLs in sitemap
2025-02-06 06:02:40,141 - INFO - Processing URL: https://blogforge.pythonanywhere.com/
2025-02-06 06:02:40,142 - INFO - Successfully processed https://blogforge.pythonanywhere.com/
2025-02-06 06:02:40,163 - INFO - Processing URL: https://blogforge.pythonanywhere.com/contact/
2025-02-06 06:02:40,164 - INFO - Successfully processed https://blogforge.pythonanywhere.com/contact/
2025-02-06 06:02:40,301 - INFO - Processing URL: https://blogforge.pythonanywhere.com/blogs/exploring-the-future-of-travel-and-tourism-in-the-digital-era/
2025-02-06 06:02:40,301 - INFO - Successfully processed https://blogforge.pythonanywhere.com/blogs/exploring-the-future-of-travel-and-tourism-in-the-digital-era/
2025-02-06 06:02:40,336 - INFO - Processing URL: https://blogforge.pythonanywhere.com/about/
2025-02-06 06:02:40,337 - INFO - Successfully processed https://blogforge.pythonanywhere.com/about/
2025-02-06 06:02:40,507 - INFO - Processing URL: https

Run `generate.py` to create the `Modelfile`.

In [10]:
!python generate.py

Modelfile has been created successfully.
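The contents of `generate.py` are not shown in this notebook. As a purely illustrative sketch of what writing a Modelfile from Python could look like, the helper below assembles the `FROM`, `PARAMETER`, and `SYSTEM` instructions. The function name `build_modelfile`, the base model, and the system prompt are all assumptions, not the script's actual values.

```python
def build_modelfile(base_model, system_prompt, temperature=None):
    """Return Modelfile text with FROM, optional PARAMETER, and SYSTEM lines.

    Hypothetical helper: it does not reproduce the notebook's generate.py.
    """
    if '"""' in system_prompt:
        raise ValueError("system prompt must not contain triple quotes")
    lines = [f"FROM {base_model}"]
    if temperature is not None:
        lines.append(f"PARAMETER temperature {temperature}")
    # Triple quotes allow a multi-line system prompt in a Modelfile
    lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"


# Placeholder values chosen only for this example
text = build_modelfile(
    "gemma:7b",
    "You answer questions about the pages of one website.",
    temperature=0.7,
)
print(text)
```

Writing the result to disk, for example with `pathlib.Path("Modelfile").write_text(text)`, produces a file that `ollama create` can read.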


Create a custom model named `blogforge` from the `Modelfile`.

In [11]:
!ollama create blogforge -f ./Modelfile

[?25l[?25h[?25lgathering model components 
pulling manifest ⠋ [?25h[?25l[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠙ [?25h[?25l[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠹ [?25h[?25l[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠸ [?25h[?25l[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠼ 
pulling 74701a8c35f6...   0% ▕▏    0 B/1.3 GB                  [?25h[?25l[2K[1G[A[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠴ 
pulling 74701a8c35f6...   0% ▕▏    0 B/1.3 GB                  [?25h[?25l[2K[1G[A[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠦ 
pulling 74701a8c35f6...   2% ▕▏  21 MB/1.3 GB                  [?25h[?25l[2K[1G[A[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠧ 
pulling 74701a8c35f6...   6% ▕▏  84 MB/1.3 GB                  [?25h[?25l[2K[1G[A[2K[1G[A[2K[1Ggathering model components 
pulling manifest ⠇ 
pulling 74

Start an interactive session with the custom model.

In [None]:
!OLLAMA_USE_CUDA=1 ollama run blogforge

[?25l⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[?25l[2K[1G[?25h[2K[1G[?25h[?2004h>>> [38;5;245mSend a message (/? for help)[28D[0m[Ktell
...  me 
... abou
... t th
... is w
... ebsi
... te
[?25l⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?25h[?25l[2K[1G⠇ [?25h[?25l[2K[1G⠏ [?25h[?25l[2K[1G⠋ [?25h[?25l[2K[1G⠙ [?25h[?25l[2K[1G⠹ [?25h[?25l[2K[1G⠸ [?25h[?25l[2K[1G⠼ [?25h[?25l[2K[1G⠴ [?25h[?25l[2K[1G⠦ [?25h[?25l[2K[1G⠧ [?2