<a href="https://colab.research.google.com/github/imvickykumar999/Ollama-Model/blob/main/Gemma/Run_with_Ollama.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2024 Google LLC.

In [1]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemma - Run with Ollama

This notebook demonstrates how to run inference on a Gemma model using [Ollama](https://ollama.com/). Ollama is an easy-to-use tool for running LLMs locally, and it provides built-in support for Gemma.

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Gemma/Run_with_Ollama.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

## Setup

### Select the Colab runtime
To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:

1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.
2. Select **Change runtime type**.
3. Under **Hardware accelerator**, select **T4 GPU**.
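
After switching the runtime, it can help to confirm that a GPU is actually visible before installing anything. The following is a minimal sketch, assuming the NVIDIA driver tools (`nvidia-smi`) are present on GPU runtimes; the helper name `gpu_summary` is hypothetical and not part of the original notebook.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU name and total memory reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not on PATH, so this is likely a CPU runtime
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
    except OSError:
        return None  # the tool exists but could not be executed
    if result.returncode != 0:
        return None  # the driver is installed but no GPU could be queried
    return result.stdout.strip() or None


print(gpu_summary() or "No NVIDIA GPU visible; check the runtime type.")
```

If the message about no visible GPU appears, repeat the steps above and reconnect the runtime.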

## Installation


Install Ollama using the official installation script.

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh

>>> Downloading ollama...
############################################################################################# 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


## Start Ollama

Start the Ollama server in the background using `nohup`, writing its output to `ollama.log`.

In [None]:
!nohup ollama serve > ollama.log &

nohup: redirecting stderr to stdout
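
Because the server starts in the background, the next cell may run before it is ready to accept connections. One way to guard against that is to poll the API address until it responds. This is a sketch rather than part of the original notebook: the function name `wait_for_ollama` and its default timeout are assumptions, and the address is the one printed by the installer above.

```python
import time
import urllib.error
import urllib.request


def wait_for_ollama(base_url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll base_url until it answers with HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url, timeout=2) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # the server is not accepting connections yet
        time.sleep(interval)
    return False
```

Calling `wait_for_ollama()` in its own cell returns `True` once the server answers, or `False` if it never comes up within the timeout, in which case `ollama.log` is the first place to look.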


## Inference

Run inference from the command line.

In [None]:
!ollama run gemma:7b "What is the capital of France?" 2> ollama.log

The capital of France is **Paris**.



Generate a response via the REST endpoint.

In [None]:
!curl http://localhost:11434/api/generate -d '{ \
  "model": "gemma:7b", \
  "prompt":"What is the capital of Portugal?" \
}'

{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.14899689Z","response":"The","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.178231303Z","response":" capital","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.207532308Z","response":" of","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.236605028Z","response":" Portugal","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.265333563Z","response":" is","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.294147887Z","response":" **","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.323264861Z","response":"Lis","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.35282411Z","response":"bon","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.382855843Z","response":"**.","done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.413118162Z","response":"\n\n","done":false}
{"model":"gemma:7b",
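
As the output above shows, the endpoint streams one JSON object per line, each carrying a fragment of the answer in its `response` field. The sketch below shows one way to reassemble those fragments in Python using only the standard library. The helper names `join_stream` and `generate` are hypothetical, and the code assumes the server is reachable at the same address used by the `curl` command. The Ollama API also accepts `"stream": false` in the request body to return a single JSON object instead, which may be simpler when incremental output is not needed.

```python
import json
import urllib.request


def join_stream(lines):
    """Concatenate the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the stream
    return "".join(parts)


def generate(prompt, model="gemma:7b", base_url="http://localhost:11434"):
    """POST a prompt to /api/generate and return the assembled text."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # The HTTP response object can be iterated line by line.
        return join_stream(response)
```

With the server running, `print(generate("What is the capital of Portugal?"))` should print the full answer as a single string instead of one JSON object per token.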

Chat with Gemma via the REST endpoint.

In [None]:
!curl http://localhost:11434/api/chat -d '{ \
  "model": "gemma:7b", \
  "messages": [ \
    { "role": "user", "content": "what is the capital of Spain?" } \
  ] \
}'

{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.80317128Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.832244294Z","message":{"role":"assistant","content":" capital"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.864473026Z","message":{"role":"assistant","content":" of"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.894548916Z","message":{"role":"assistant","content":" Spain"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.924834821Z","message":{"role":"assistant","content":" is"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.954322472Z","message":{"role":"assistant","content":" **"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:56.984517118Z","message":{"role":"assistant","content":"Madrid"},"done":false}
{"model":"gemma:7b","created_at":"2024-07-08T10:53:57.014076809Z","message":{"role":"assistant","cont