# Getting Started with Ollama

This notebook demonstrates how to run inference against a model hosted locally with [Ollama](https://ollama.com/).

### Install dependencies

In [None]:
!pip install git+https://github.com/ibm-granite-community/granite-kitchen

## Running an LLM on Ollama

The Ollama server can run either on your own machine or in the environment where the notebook runs (e.g., [Colab](https://colab.research.google.com/)).

### Option A: Run Ollama Locally (Linux, macOS, Windows)

1. [Download and install Ollama](https://ollama.com/download), if you haven't already.
1. Start the Ollama server: `ollama serve`
1. Pull the Granite models you want to use:

    ```shell
    ollama pull granite3-dense:2b
    ollama pull granite3-dense:8b
    ollama pull granite-code:3b
    ollama pull granite-code:8b
    ollama pull granite-code:20b
    ```
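
To confirm that the pulls succeeded, one option is to ask the running server which models it has. The sketch below assumes the server listens on Ollama's default address, `http://localhost:11434`, and reads the `/api/tags` endpoint of its REST API. The helper names `parse_model_names` and `list_local_models` are illustrative and not part of any library.

```python
import json
import urllib.request


def parse_model_names(tags_json: str) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    payload = json.loads(tags_json)
    return [entry["name"] for entry in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Query a running Ollama server for its locally available models.

    Assumes the default Ollama address; pass a different base_url otherwise.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(resp.read().decode("utf-8"))
```

With the server running, `list_local_models()` should include each tag you pulled, for example `granite3-dense:2b`.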

### Option B: Run Ollama in the Notebook Environment

1. Download and install Ollama, if you haven't already:

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh

2. Start the Ollama server with `nohup` and `&` so that it keeps running in the background after the cell returns:

In [None]:
import os
os.system("nohup ollama serve &")

3. Pull the Granite models you want to use:

In [None]:
!ollama pull granite3-dense:2b
# !ollama pull granite3-dense:8b
# !ollama pull granite-code:3b
# !ollama pull granite-code:8b
# !ollama pull granite-code:20b
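
Because step 2 starts the server in the background, the pull in step 3 can fail if it runs before the server is listening. A minimal sketch of a readiness check that could run between those two steps is shown here. It assumes the default address `http://localhost:11434`; the function names `server_is_up` and `wait_for_server` and the timeout values are illustrative choices.

```python
import time
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an HTTP GET on the server root succeeds."""
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, etc.: the server is not ready yet.
        return False


def wait_for_server(probe=server_is_up, timeout_s: float = 30.0,
                    interval_s: float = 0.5) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False
```

Calling `wait_for_server()` before the first `ollama pull` gives the background process time to bind its port. A `False` return suggests the server did not start, in which case `nohup.out` is a reasonable place to look for errors.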

## Querying the LLM with LangChain

### Select a model

Select one of the models you pulled with Ollama above.

In [None]:
model_id = "granite3-dense:2b"
# model_id = "granite3-dense:8b"
# model_id = "granite-code:3b"
# model_id = "granite-code:8b"
# model_id = "granite-code:20b"

### Instantiate the model client

In [None]:
from langchain_ollama.llms import OllamaLLM

model = OllamaLLM(model=model_id)
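
If the server is not at the default address, or you want more repeatable output, the client can be configured explicitly. The sketch below assumes the `base_url` and `temperature` parameters of `OllamaLLM`. The helpers `ollama_client_kwargs` and `make_client` are hypothetical names introduced only for illustration.

```python
def ollama_client_kwargs(model_id: str, host: str = "localhost",
                         port: int = 11434, temperature: float = 0.0) -> dict:
    """Build keyword arguments for OllamaLLM (assumed: base_url, temperature)."""
    return {
        "model": model_id,
        "base_url": f"http://{host}:{port}",  # 11434 is Ollama's default port
        "temperature": temperature,           # 0.0 favors deterministic output
    }


def make_client(model_id: str, **overrides):
    """Create an OllamaLLM client; the import is deferred until it is needed."""
    from langchain_ollama.llms import OllamaLLM
    return OllamaLLM(**ollama_client_kwargs(model_id, **overrides))
```

For example, `make_client(model_id, host="192.168.1.20")` would target a server on another machine; the address here is a placeholder.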

### Perform inference

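Invoke the model with a prompt and print its response. The prompt wraps the user message in Granite's chat role markers and ends with an open assistant turn for the model to complete.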

In [None]:
prompt = """\
<|start_of_role|>user<|end_of_role|>\
Tell a story about a duck who likes french fries.<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>"""

response = model.invoke(prompt)
print(response)