# Ollama

Ollama is a simple way to download and run many open-source LLMs locally. Smaller models are faster and easier to run, but they are also less capable and often handle only specific tasks well.

- [Download](https://ollama.com/) the software and run one of the `ollama run` commands below to start using it.
- [Docs](https://github.com/ollama/ollama?tab=readme-ov-file#ollama)
- [API Docs](https://github.com/ollama/ollama/blob/main/docs/api.md#api)
- [Models](https://ollama.com/library)
- Some more [models](https://github.com/ollama/ollama?tab=readme-ov-file#model-library) with size info
- Suggested models that may run on a laptop:
  - [llama3.1:8b](https://ollama.com/library/llama3.1:8b) from Meta, 8 billion parameters (4.7 GB download)
    - Run with command: `ollama run llama3.1:8b`
  - [phi3:3.8b](https://ollama.com/library/phi3:3.8b) from Microsoft, 3.8 billion parameters (2.2 GB download)
    - Run with command: `ollama run phi3:3.8b`
  - [phi3:14b](https://ollama.com/library/phi3:14b) from Microsoft, 14 billion parameters (7.9 GB download)
    - Run with command: `ollama run phi3:14b`

## Example

- [Download](https://ollama.com/) Ollama, then run the following command to start the server and load a model:
    ```
    ollama run phi3:3.8b
    ```
    - You can now chat with it directly in the terminal.
    - To leave, type `/bye`.
    If you only want to serve the REST API, use the following command instead (`ollama serve` takes no model name; the model is selected per request):
    ```
    ollama serve
    ```
    - To stop the server, close the Ollama application from the system tray or task manager.
    - For more commands, type `ollama --help` in the terminal.
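
Before sending requests, it can help to confirm that the server is reachable. The sketch below is one possible check, not part of the Ollama tooling: the helper name `ollama_is_running` is hypothetical, it uses only the Python standard library, and it assumes the server listens on the default address `http://localhost:11434` and answers a plain `GET /` with status 200.

```python
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url with status 200.

    Hypothetical helper: assumes a running Ollama server replies to GET /.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts and urllib.error.URLError
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama server is reachable")
    else:
        print("No server found; start it with `ollama serve`")
```

The function returns `False` instead of raising, so it can be called at the top of a notebook without interrupting the cells that follow.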

In [17]:
import requests
import json

def prompt_ollama(model, prompt, debug=False):
    # Base URL of the local Ollama server (11434 is the default port)
    ollama_url = "http://localhost:11434"
    
    # HTTP POST endpoint of the Ollama REST API for generating chat responses
    url = f"{ollama_url}/api/chat"
    
    # JSON request body (payload) sent with the POST request
    # https://github.com/ollama/ollama/blob/main/docs/api.md#examples-1
    body = {
        "model": model,
        "stream": False,
        "messages": [
            { "role": "system", "content": "Be very concise. Do not add formatting or newlines to responses. Answer in at most 1-3 sentences." },
            { "role": "user", "content": prompt },
        ]        
    }

    # Call the REST API and fail early on an HTTP error status (4xx/5xx)
    response = requests.post(url, json=body)
    response.raise_for_status()
    
    # Parse the body of the REST API response from JSON text into a Python dict
    response_body = response.json()
    
    if debug:
        print(json.dumps(response_body, indent=4))
    
    # Get the generated output from the response
    output = response_body["message"]["content"]
    
    # Return the generated output
    return output

# Example usage
model_name = "phi3:3.8b"
input_text = "Hello, how are you?"

response = prompt_ollama(model_name, input_text, debug=True)

print("--------------------")
print("You asked: ")
print(input_text)
print("The model responded:")
print(response)

{
    "model": "phi3:3.8b",
    "created_at": "2024-08-19T12:46:48.3985307Z",
    "message": {
        "role": "assistant",
        "content": "I'm functioning within optimal parameters; thank you for asking. How about yourself?"
    },
    "done_reason": "stop",
    "done": true,
    "total_duration": 379279000,
    "load_duration": 11799400,
    "prompt_eval_count": 46,
    "prompt_eval_duration": 42199000,
    "eval_count": 19,
    "eval_duration": 323309000
}
--------------------
You asked: 
Hello, how are you?
The model responded:
I'm functioning within optimal parameters; thank you for asking. How about yourself?
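
The timing fields in the debug output above are reported in nanoseconds. As a rough sketch of how to read them, the hypothetical helper `tokens_per_second` below divides `eval_count` by `eval_duration` and scales to seconds; the `sample` dictionary simply copies the relevant values from that output.

```python
def tokens_per_second(response_body):
    """Generation speed from an Ollama chat response (durations are in nanoseconds)."""
    eval_count = response_body["eval_count"]
    eval_duration_ns = response_body["eval_duration"]
    return eval_count / eval_duration_ns * 1e9


# Timing fields copied from the debug output above
sample = {
    "total_duration": 379279000,
    "eval_count": 19,
    "eval_duration": 323309000,
}

print(f"{tokens_per_second(sample):.1f} tokens/s")
print(f"{sample['total_duration'] / 1e9:.2f} s total")
```

For the response shown, this works out to roughly 59 generated tokens per second, which is a convenient number for comparing models on the same machine.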


In [18]:
print(prompt_ollama(model_name, "What is the capital of France?"))

Paris


### Can this model be trusted?
Run the cell below multiple times to find out.
Larger models tend to be more reliable and more consistent, but they can still state false information with confidence.

In [36]:
prompt = """
What is the most recent information you have?
Give me only the month and the year
Also yes/no, are you able to know anything after the point you were trained?
"""

response = prompt_ollama(model_name, prompt)
print(response)

The latest data available up until my last update in September 2021. No, I cannot access or predict real-time information beyond that point. Yes, I am designed with a knowledge cutoff mechanism preventing me from accessing new data post-September 2decade_two thousand twenty
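
Rather than re-running the cell by hand, the same prompt can be sent several times and the answers tallied. The following is a minimal sketch under stated assumptions: `sample_answers` and `summarize` are hypothetical helpers, and `ask` is any function that takes a prompt and returns a string. The demo at the bottom uses made-up canned strings as a stand-in for the model so that it runs without a server.

```python
from collections import Counter


def sample_answers(ask, prompt, runs=5):
    """Call ask(prompt) several times and count how often each answer occurs."""
    answers = Counter()
    for _ in range(runs):
        # Normalize whitespace so trivially different strings are counted together
        answer = " ".join(ask(prompt).split())
        answers[answer] += 1
    return answers


def summarize(answers):
    """Print each distinct answer with its count, most frequent first."""
    for answer, count in answers.most_common():
        print(f"{count}x  {answer}")


if __name__ == "__main__":
    # Stand-in for a model call (invented strings, not real model output)
    canned = iter(["September 2021. No.", "October 2021. No.", "September  2021. No."])
    summarize(sample_answers(lambda _prompt: next(canned), "When is your cutoff?", runs=3))
```

In the notebook, the stand-in can be swapped for the real call, for example `sample_answers(lambda p: prompt_ollama(model_name, p), prompt)`. More than one distinct answer for a factual question is a sign that the model's reply should not be taken at face value.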
