# Ollama setup

https://ollama.com/

#### Models:

https://ollama.com/library

#### API endpoints and base URL

Most endpoints are invoked with HTTP POST; a few, such as `/api/tags`, use GET.

https://github.com/ollama/ollama/blob/main/docs/api.md

For example, http://localhost:11434/api/tags lists the models available locally.
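
The `/api/tags` endpoint is called with GET rather than POST. Below is a minimal sketch of querying it with only the standard library. The helper names `extract_model_names` and `list_local_models` are hypothetical, and the response shape (a `models` list whose entries carry a `name`) is assumed from the API documentation linked above.

```python
import json
from urllib.request import urlopen


def extract_model_names(tags_payload):
    """Return the model names from a decoded /api/tags response."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url="http://localhost:11434"):
    """GET /api/tags from a running Ollama server and return the model names."""
    with urlopen(base_url + "/api/tags", timeout=10) as resp:
        return extract_model_names(json.load(resp))
```

With the server running, `list_local_models()` can be used to confirm that a model has been pulled before invoking it.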


#### Model names

https://github.com/ollama/ollama/blob/main/docs/api.md#model-names

#### Pull the models that you will use

Use the command `ollama pull <model-name>`.

In [6]:
import requests

# URL for the endpoints
base_url = 'http://localhost:11434'

## Get model information
#### /api/show

Get information about a model.

https://github.com/ollama/ollama/blob/main/docs/api.md#show-model-information

```
curl http://localhost:11434/api/show -d '{
  "name": "llama3.1"
}'
```

In [9]:
# Create the URL for getting model information
url = base_url + '/api/show'

# Query to be sent in body
query = {
  "name": "llama3.1"
}

# Invoke API
response = requests.post(url, json=query)

# Display the response status
response


<Response [200]>


In [12]:
from IPython.display import JSON

# Pretty print JSON
JSON(response.json())

<IPython.core.display.JSON object>
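
The decoded response is a nested dictionary. The sketch below pulls a few fields out of it. The helper `summarize_model` is hypothetical, and the `details` keys (`family`, `parameter_size`, `quantization_level`) are assumed from the API documentation, so they are read with `.get()` in case a field is absent.

```python
def summarize_model(show_payload):
    """Pick a few descriptive fields from a decoded /api/show response."""
    # Assumption: these keys live under "details"; missing ones become None
    details = show_payload.get("details", {})
    return {
        "family": details.get("family"),
        "parameter_size": details.get("parameter_size"),
        "quantization_level": details.get("quantization_level"),
    }
```

In the notebook this could be called as `summarize_model(response.json())`.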

## Invoke a model (text generation)
#### /api/generate

Generates a completion for the given prompt.

* Pull the model before running the code
* Use the command `ollama pull <model-name>`

https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion

In [3]:
# Create the URL for completions
url = base_url + '/api/generate'

question = 'what is the capital of France?'

model_name = "llama3.1"
# model_name = "gemma2"

# Setup the query
query = {
  "model": model_name,
  "prompt": question,
  "stream": False
}

# Print the query
print(query)

# Invoke the model
response = requests.post(url, json=query)

{'model': 'llama3.1', 'prompt': 'what is the capital of France?', 'stream': False}


In [5]:
print("model = ", response.json()['model'])
print("response = ", response.json()['response'])

model =  llama3.1
response =  The capital of France is Paris.
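
With `"stream": False` the server returns a single JSON object. If `stream` is omitted or set to `True`, the API documentation describes a sequence of JSON objects, one per line, each carrying a fragment in `response`, with the last one marked `"done": true`. Below is a minimal sketch of reassembling such a stream. `join_stream` is a hypothetical helper, and the sample lines are made up for illustration.

```python
import json


def join_stream(lines):
    """Concatenate the 'response' fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:  # skip blank keep-alive lines
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


# Illustrative chunks, not real server output
sample_lines = [
    b'{"response": "The capital", "done": false}',
    b'{"response": " of France is Paris.", "done": false}',
    b'{"response": "", "done": true}',
]
print(join_stream(sample_lines))
```

Against a live server, the same helper could be fed `requests.post(url, json=query, stream=True).iter_lines()` with `"stream": True` set in the query.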


## Generate the embedding

#### /api/embed

```
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": "Why is the sky blue?"
}'
```
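
The response contains an `embeddings` list with one vector per input. According to the API documentation, `input` may be a single string or a list of strings. One common way to compare two such vectors is cosine similarity. Below is a minimal sketch in plain Python. `cosine_similarity` is a hypothetical helper, and the short vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for rows of response.json()['embeddings']
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Values near 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.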

In [23]:
# Create the URL for embeddings
url = base_url + '/api/embed'

text_to_embed = 'what is the capital of France?'

model_name = "llama3.1"
# model_name = "gemma2"

# Setup the query
query = {
  "model": model_name,
  "input": text_to_embed
}

# Print the query
print(query)

# Invoke the model
response = requests.post(url, json=query)

{'model': 'llama3.1', 'input': 'what is the capital of France?'}


In [26]:
print("response = ", response.json()['embeddings'])

response =  [[-0.0066903066, -0.02398971, 0.019075649, 0.019821068, -0.013049805, -0.015515752, 0.012045559, -0.005566741, -0.002857238, 0.010428703, 0.0042861123, 0.0013013287, -0.0076645855, -0.00430129, -0.005899321, 0.012941956, -0.016326845, -0.012703059, -0.030795956, 0.01076826, -0.0017009195, -0.008293275, 0.01634423, 0.0067773373, -0.020835157, -0.0012843953, 0.00975133, 4.3528446e-05, -0.008498694, -0.006049714, -0.0081323795, 0.0055420045, 0.0019687326, 0.0018798327, 0.05983707, -0.0017327972, -0.01785057, -0.006082623, 0.017558703, 0.007517587, 0.0037891876, -0.009714222, 0.0021945892, 0.00924892, 0.012759032, -0.013938318, 0.01053351, -0.0007843376, 0.0053187176, -0.014070577, 0.0043032006, 0.01275136, -0.0065982128, -0.021384932, 0.010541776, 0.0027303898, 0.0020268967, 0.011365201, -0.012446224, 0.008164922, -0.012386633, 0.0034320753, -0.020028196, 0.011394283, 0.005505171, 0.004499805, -0.0021315531, 0.014888749, 0.0059285634, 0.009778211, 0.002812484, -0.014678711, -0