# Getting Started with Ollama

## Running an LLM on Ollama 

You can run the Ollama server either on your own machine (Option A) or inside the notebook environment itself (Option B).

### Option A: Running Locally (Linux, macOS, Windows)

1. [Download and install Ollama](https://ollama.com/download), if you haven't already.
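1. Start the Ollama server: `ollama serve`. The command runs in the foreground, so leave it running and use a second terminal for the next step.
1. Pull the Granite Code models: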

    ```shell
    ollama pull granite-code:3b
    ollama pull granite-code:8b
    ollama pull granite-code:20b
    ```

### Option B: Running in Notebook

1. Download and install Ollama, if you haven't already:

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh

2. Start the Ollama server. Using `nohup` together with a trailing `&` runs the server in the background, so the cell returns immediately:

In [None]:
import os
os.system("nohup ollama serve &")

3. Pull down Granite models:

In [None]:
!ollama pull granite-code:3b
!ollama pull granite-code:8b
!ollama pull granite-code:20b
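
Whichever option you used, it can help to confirm that the server is reachable and that the pulls finished before moving on. The following is a minimal sketch, not part of the original notebook. It assumes Ollama's default address `http://localhost:11434` and its `/api/tags` endpoint, which lists locally available models; the helper names `wait_for_server`, `model_names`, and `list_local_models` are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: the server listens on Ollama's default address.
DEFAULT_URL = "http://localhost:11434"


def wait_for_server(base_url=DEFAULT_URL, timeout_s=30.0, poll_s=0.5):
    """Poll the server root until it answers with HTTP 200 or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(base_url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Connection refused, timeout, or HTTP error: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def model_names(tags_payload):
    """Extract model names from a decoded /api/tags response."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_local_models(base_url=DEFAULT_URL):
    """Ask the server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))
```

For example, calling `wait_for_server()` and then `print(list_local_models())` should show the `granite-code` tags if the pulls succeeded. The polling step matters mostly for Option B, where the background server may need a moment before it accepts connections.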

## Querying the LLM with LangChain

In [None]:
!pip install langchain-ollama

### Select a model

Select one of the models you pulled with Ollama above.

In [13]:
model_id = "granite-code:3b"
# model_id = "granite-code:8b"
# model_id = "granite-code:20b"
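
As an alternative to commenting lines in and out, the choice can be read from the environment. This is only a sketch: the environment variable name `GRANITE_MODEL` and the helper `choose_model` are hypothetical, and the list of tags simply mirrors the three models pulled earlier.

```python
import os

# Tags pulled earlier in this guide.
AVAILABLE_MODELS = ("granite-code:3b", "granite-code:8b", "granite-code:20b")


def choose_model(default="granite-code:3b", env_var="GRANITE_MODEL"):
    """Return the model tag from the environment, or the default if unset.

    `GRANITE_MODEL` is a hypothetical variable name used for illustration.
    """
    candidate = os.environ.get(env_var, default)
    if candidate not in AVAILABLE_MODELS:
        raise ValueError(f"{candidate!r} is not one of {AVAILABLE_MODELS}")
    return candidate


model_id = choose_model()
```

Rejecting unknown tags early gives a clearer error than a failed request to the server later on.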

### Instantiate the model client

In [14]:
from langchain_ollama.llms import OllamaLLM

model = OllamaLLM(model=model_id)

### Perform inference

Invoke the model with a prompt, and get an answer back.

In [None]:
prompt = """
    Show me a SQL query that fetches all columns for the first 50 rows
    in a table named 'users'."""

response = model.invoke(prompt)
print(response)
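
Code models often wrap their answer in a Markdown code fence with some explanation around it. If you only want the SQL, a small post-processing step can pull it out. The sketch below assumes that fenced format, which is not guaranteed; the helper `extract_code_blocks` is made up for illustration, and the sample text is written by hand rather than taken from a real model run.

```python
import re

# Build the fence marker programmatically to keep this cell easy to embed.
FENCE = "`" * 3

# Matches an opening fence with an optional language tag, then the body.
FENCE_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text):
    """Return the contents of every fenced code block in `text`.

    Falls back to the whole stripped text when no fence is present.
    """
    blocks = [body.strip() for body in FENCE_RE.findall(text)]
    return blocks or [text.strip()]


# Hypothetical response text, written by hand for illustration only.
sample = f"Here is the query:\n{FENCE}sql\nSELECT * FROM users LIMIT 50;\n{FENCE}\n"
print(extract_code_blocks(sample)[0])
```

In the notebook you would pass `response` instead of `sample`. The fallback to the full text keeps the helper usable when the model answers with bare SQL.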