# 02. Ollama Setup

## 1. Introduction

Welcome to the third notebook in our RAG Tools series! In this notebook, we'll set up Ollama, an open-source tool that allows us to run large language models locally. We'll configure Ollama instances in Docker containers, enabling us to use different models for various tasks in our RAG (Retrieval-Augmented Generation) system, such as embedding generation and text generation.

By the end of this notebook, you'll have:
1. Updated our project directory structure for Ollama
2. Created an OllamaManager class to handle Ollama operations
3. Updated our environment variables and Docker Compose configuration
4. Tested the OllamaManager class with a sample model

## 2. Update Project Directory Structure

First, let's update our project directory structure to accommodate the Ollama models:

In [None]:
import os

# Get the project root directory
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))

# Create directories for Ollama models
ollama_models_path = os.path.join(project_root, 'db_data', 'ollama_models')
ollama_llm_path = os.path.join(ollama_models_path, 'llm')

os.makedirs(ollama_models_path, exist_ok=True)
os.makedirs(ollama_llm_path, exist_ok=True)

print(f"Created Ollama models directory at: {ollama_models_path}")
print(f"Created Ollama LLM directory at: {ollama_llm_path}")


Our updated project structure now looks like this:

```
RAG_tools/
├── config/
│   ├── docker-compose.yml
│   └── .env
├── notebooks/
│   ├── 00_Environment_Setup.ipynb
│   ├── 01_Database_Setup.ipynb
│   └── 02_Ollama_setup.ipynb
├── src/
│   └── utils/
│       ├── config_utils.py
│       └── ollama_manager.py
├── db_data/
│   ├── postgres/
│   ├── neo4j/
│   └── ollama_models/
│       └── llm/
└── tests/
```

## 3. Create OllamaManager Class

Now, let's create the OllamaManager class. This class will be used to launch and manage Ollama instances in Docker containers.

In [None]:
%%writefile ../src/utils/ollama_manager.py
import os
import requests
import time
import logging
import json
from .DockerComposeManager import DockerComposeManager
from .config_utils import Config

class OllamaManager:
    def __init__(self, config: Config):
        self.config = config
        self.container_name = self.config.OLLAMA_LLM_CONTAINER_NAME
        self.port = self.config.OLLAMA_LLM_PORT
        self.model = self.config.OLLAMA_LLM_MODEL
        self.gpu = self.config.OLLAMA_LLM_GPU
        
        self.models_path = self.config.OLLAMA_MODELS_PATH
        self.llm_path = self.config.OLLAMA_LLM_PATH
        
        print("OllamaManager initialized with:")
        print(f"container_name: {self.container_name}")
        print(f"port: {self.port}")
        print(f"model: {self.model}")
        print(f"gpu: {self.gpu}")
        print(f"models_path: {self.models_path}")
        print(f"llm_path: {self.llm_path}")
        
        # Ensure directories exist
        if self.models_path:
            os.makedirs(self.models_path, exist_ok=True)
        if self.llm_path:
            os.makedirs(self.llm_path, exist_ok=True)
        
        # Initialize DockerComposeManager
        docker_compose_path = os.path.join('..', 'config', 'docker-compose.yml')
        self.docker_manager = DockerComposeManager(docker_compose_path)

        logging.info(f"OllamaManager initialized with models_path: {self.models_path}, llm_path: {self.llm_path}")
        logging.info(f"Using model: {self.model} on port: {self.port}")

    def generate_response(self, prompt):
        try:
            payload = {
                'model': self.model,
                'prompt': prompt
            }
            logging.debug(f"Sending request to Ollama API with payload: {payload}")
            logging.debug(f"API URL: http://localhost:{self.port}/api/generate")
            
            response = requests.post(
                f'http://localhost:{self.port}/api/generate',
                json=payload,
                stream=True
            )
            logging.debug(f"Response status code: {response.status_code}")
            response.raise_for_status()
            
            full_response = ""
            for line in response.iter_lines():
                if line:
                    try:
                        chunk = json.loads(line)
                        logging.debug(f"Received chunk: {chunk}")
                        if 'response' in chunk:
                            token = chunk['response']
                            full_response += token
                            print(token, end='', flush=True)
                        if chunk.get('done', False):
                            break
                    except json.JSONDecodeError:
                        logging.warning(f"Failed to decode JSON: {line}")
            
            print("\n")  # New line after the response
            return full_response.strip()
        except requests.exceptions.RequestException as e:
            logging.error(f"Error generating response: {str(e)}")
            if hasattr(e, 'response') and e.response is not None:
                logging.error(f"Response content: {e.response.text}")
            return f"Error: {str(e)}"
        except Exception as e:
            logging.error(f"Unexpected error: {str(e)}")
            return f"Unexpected error: {str(e)}"

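    def is_model_running(self):
        """Return True if the configured model is listed by /api/tags (available locally)."""
        if not self.model:
            return False
        try:
            response = requests.get(f'http://localhost:{self.port}/api/tags')
            response.raise_for_status()
            models = response.json()
            logging.debug(f"Available models: {models}")
            # Ollama lists models with a tag (e.g. "tinyllama:latest"),
            # so an untagged name is compared against its ":latest" form.
            wanted = self.model if ':' in self.model else f"{self.model}:latest"
            return wanted in [model['name'] for model in models.get('models', [])]
        except requests.exceptions.RequestException as e:
            logging.error(f"Error checking if model is running: {e}")
            return False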

    def pull_model(self):
        logging.info(f"Pulling model {self.model}...")
        logging.debug(f"self.models_path: {self.models_path}")
        logging.debug(f"self.model: {self.model}")
        
        if not self.models_path:
            logging.error("models_path is not set. Cannot pull model.")
            return
        if not self.model:
            logging.error("model is not set. Cannot pull model.")
            return

        # Host-side directory where the pulled model's manifest is expected (logged for reference only)
        model_path = os.path.join(self.models_path, 'models', 'manifests', 'registry.ollama.ai', 'library', self.model)
        logging.info(f"Expected manifest path: {model_path}")

        try:
            # The pull endpoint streams one JSON status line per progress update
            response = requests.post(f'http://localhost:{self.port}/api/pull', json={'name': self.model}, stream=True)
            response.raise_for_status()
            for line in response.iter_lines():
                if line:
                    print(line.decode())
        except requests.exceptions.RequestException as e:
            logging.error(f"Error pulling model: {str(e)}")
            raise

    def start_container(self):
        self.docker_manager.start_containers()
        logging.info(f"Started container: {self.container_name}")

    def stop_container(self):
        self.docker_manager.stop_containers()
        logging.info(f"Stopped container: {self.container_name}")

    def wait_for_ollama(self, max_attempts=5, delay=5):
        for attempt in range(max_attempts):
            try:
                response = requests.get(f'http://localhost:{self.port}/api/tags')
                if response.status_code == 200:
                    logging.info(f"Successfully connected to Ollama on port {self.port}")
                    return True
            except requests.exceptions.RequestException:
                pass  # Connection refused or reset: the server is not up yet
            # Wait after any failed attempt, including a non-200 response
            logging.warning(f"Attempt {attempt + 1}/{max_attempts}: Ollama on port {self.port} is not ready yet. Retrying in {delay} seconds...")
            time.sleep(delay)
        logging.error(f"Failed to connect to Ollama after {max_attempts} attempts")
        return False



## 4. Update Environment Variables

Next, add the Ollama-related variables to the `.env` file in the `config/` directory:

```
# Ollama Configuration
OLLAMA_LLM_CONTAINER_NAME=ragtools_ollama_llm
OLLAMA_LLM_PORT=11435
OLLAMA_LLM_MODEL=tinyllama
OLLAMA_LLM_GPU=0

OLLAMA_MODELS_PATH=../db_data/ollama_models
OLLAMA_LLM_PATH=../db_data/ollama_models/llm
```
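
Before starting any containers, it can help to confirm that every variable listed above is actually present. The following is a minimal sketch, not part of the project code: `parse_env_text` and `missing_ollama_keys` are hypothetical helpers, and the parser handles only plain `KEY=VALUE` lines (no quoting or variable expansion).

```python
# Keys taken from the .env block above
REQUIRED_OLLAMA_KEYS = (
    "OLLAMA_LLM_CONTAINER_NAME",
    "OLLAMA_LLM_PORT",
    "OLLAMA_LLM_MODEL",
    "OLLAMA_LLM_GPU",
    "OLLAMA_MODELS_PATH",
    "OLLAMA_LLM_PATH",
)


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments (simplified)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_ollama_keys(text, required=REQUIRED_OLLAMA_KEYS):
    """Return the required keys that are absent or empty in the given .env text."""
    values = parse_env_text(text)
    return [key for key in required if not values.get(key)]


# Deliberately incomplete sample to show what the check reports
sample = """
# Ollama Configuration
OLLAMA_LLM_CONTAINER_NAME=ragtools_ollama_llm
OLLAMA_LLM_PORT=11435
OLLAMA_LLM_MODEL=tinyllama
"""
print(missing_ollama_keys(sample))
# ['OLLAMA_LLM_GPU', 'OLLAMA_MODELS_PATH', 'OLLAMA_LLM_PATH']
```

To check the real file, read `config/.env` and pass its text to `missing_ollama_keys`; an empty list means all six variables are set.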

## 5. Update Config Class

Before verifying the setup, we need to extend our Config class with the new Ollama-related attributes. The same step applies whenever a new component introduces environment variables or configuration options: each one needs a matching attribute in this class.

Let's update the `config_utils.py` file:

In [None]:
%%writefile ../src/utils/config_utils.py
import os
from dotenv import load_dotenv

class Config:
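    def __init__(self, env_path=None):
        # Load config/.env explicitly: a bare load_dotenv() only searches upward
        # from the calling file or the working directory, so it never looks in config/.
        if env_path is None:
            utils_dir = os.path.dirname(os.path.abspath(__file__))
            env_path = os.path.join(utils_dir, '..', '..', 'config', '.env')
        load_dotenv(dotenv_path=env_path)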
        
        # Database configurations
        self.POSTGRES_DB = os.getenv('POSTGRES_DB')
        self.POSTGRES_USER = os.getenv('POSTGRES_USER')
        self.POSTGRES_PASSWORD = os.getenv('POSTGRES_PASSWORD')
        self.POSTGRES_HOST = os.getenv('POSTGRES_HOST')
        self.POSTGRES_PORT = os.getenv('POSTGRES_PORT')
        
        self.NEO4J_AUTH = os.getenv('NEO4J_AUTH')
        self.NEO4J_HOST = os.getenv('NEO4J_HOST')
        self.NEO4J_HTTP_PORT = os.getenv('NEO4J_HTTP_PORT')
        self.NEO4J_BOLT_PORT = os.getenv('NEO4J_BOLT_PORT')
        
        # Docker configurations
        self.POSTGRES_CONTAINER_NAME = os.getenv('POSTGRES_CONTAINER_NAME')
        self.NEO4J_CONTAINER_NAME = os.getenv('NEO4J_CONTAINER_NAME')
        self.DOCKER_NETWORK_NAME = os.getenv('DOCKER_NETWORK_NAME')

        # Ollama configurations
        self.OLLAMA_LLM_CONTAINER_NAME = os.getenv('OLLAMA_LLM_CONTAINER_NAME')
        self.OLLAMA_LLM_PORT = int(os.getenv('OLLAMA_LLM_PORT', 11435))
        self.OLLAMA_LLM_MODEL = os.getenv('OLLAMA_LLM_MODEL')
        self.OLLAMA_LLM_GPU = int(os.getenv('OLLAMA_LLM_GPU', 0))
        self.OLLAMA_MODELS_PATH = os.getenv('OLLAMA_MODELS_PATH')
        self.OLLAMA_LLM_PATH = os.getenv('OLLAMA_LLM_PATH')

    def get_postgres_connection_params(self):
        return {
            "dbname": self.POSTGRES_DB,
            "user": self.POSTGRES_USER,
            "password": self.POSTGRES_PASSWORD,
            "host": self.POSTGRES_HOST,
            "port": self.POSTGRES_PORT
        }

    def get_neo4j_connection_params(self):
        return {
            "uri": f"bolt://{self.NEO4J_HOST}:{self.NEO4J_BOLT_PORT}",
            "auth": tuple(self.NEO4J_AUTH.split('/'))
        }

    def print_all_attributes(self):
        print("All Config attributes:")
        for attr, value in self.__dict__.items():
            print(f"{attr}: {value}")


## 6. Update Docker Compose Configuration

Now, let's update our docker-compose.yml file to include the Ollama service:

```yaml
version: '3.8'

services:
  postgres:
    # ... (existing PostgreSQL configuration)

  neo4j:
    # ... (existing Neo4j configuration)

  ollama:
    image: ollama/ollama
    container_name: ${OLLAMA_LLM_CONTAINER_NAME}
    environment:
      - OLLAMA_HOST=0.0.0.0:${OLLAMA_LLM_PORT}
    ports:
      - "${OLLAMA_LLM_PORT}:${OLLAMA_LLM_PORT}"
    volumes:
      - ${OLLAMA_MODELS_PATH}:/root/.ollama
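    # GPU reservation: requires an NVIDIA GPU and the NVIDIA Container Toolkit on the host.
    # Remove this deploy block to run on CPU only.
    deploy: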
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    networks:
      - ragtools_network

networks:
  ragtools_network:
    name: ${DOCKER_NETWORK_NAME}

volumes:
  postgres_data:
  neo4j_data:
  ollama_data:
```

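With the `.env` file, the Config class, and the Compose file updated, a quick check confirms that the Ollama settings load and that OllamaManager initializes:

In [None]:
import os
import sys

# Make the project root importable so that `src.utils` resolves from notebooks/
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
sys.path.append(project_root)

from src.utils.config_utils import Config
from src.utils.ollama_manager import OllamaManager

# Initialize the Config class
config = Config()

print("Configuration values:")
for attr, value in vars(config).items():
    if attr.startswith('OLLAMA_'):
        print(f"{attr}: {value}")

# Initialize OllamaManager
ollama_manager = OllamaManager(config)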


## 7. Test OllamaManager Class

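Now, let's test our OllamaManager class by starting the container, pulling the configured model, and running a simple prompt. The output recorded after the cell comes from a run in which the `.env` file was not found: every Ollama setting without a default is `None`, the pull is skipped, and the generate request is rejected with `model is required`. Its status labels mention Codestral 22B, but the model used is always the one named by `OLLAMA_LLM_MODEL` (`tinyllama` in the `.env` above).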

In [1]:
import sys
import os
import logging

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
sys.path.append(project_root)

from src.utils.config_utils import Config
from src.utils.ollama_manager import OllamaManager

config = Config()
ollama_manager = OllamaManager(config)

print("Starting Ollama container...")
ollama_manager.start_container()

try:
    print("Waiting for Ollama to be ready...")
    if not ollama_manager.wait_for_ollama():
        raise RuntimeError("Ollama did not become ready in time")

    # The model is whatever OLLAMA_LLM_MODEL names in config/.env
    print(f"Ensuring model {config.OLLAMA_LLM_MODEL} is available...")
    ollama_manager.pull_model()

    print(f"Testing model {config.OLLAMA_LLM_MODEL}...")
    test_prompt = "Explain the concept of Retrieval-Augmented Generation in three sentences."
    response = ollama_manager.generate_response(test_prompt)
    print(f"Prompt: {test_prompt}")
    print(f"Response: {response}")
finally:
    # Stop the container even if a step above fails
    print("Stopping Ollama container...")
    ollama_manager.stop_container()


2024-07-16 11:07:39,066 - ERROR - .env file not found
2024-07-16 11:07:39,066 - INFO - Config initialized with OLLAMA_LLM_MODEL: None
2024-07-16 11:07:39,068 - INFO - OllamaManager initialized with models_path: None, llm_path: None
2024-07-16 11:07:39,068 - INFO - Using model: None on port: 11435


OllamaManager initialized with:
container_name: None
port: 11435
model: None
gpu: 0
models_path: None
llm_path: None
Starting Ollama container...


2024-07-16 11:07:39,532 - INFO - Started container: None
2024-07-16 11:07:39,533 - DEBUG - Starting new HTTP connection (1): localhost:11435



Waiting for Ollama to be ready...


2024-07-16 11:07:41,587 - DEBUG - http://localhost:11435 "GET /api/tags HTTP/11" 200 13
2024-07-16 11:07:41,587 - INFO - Successfully connected to Ollama on port 11435
2024-07-16 11:07:41,588 - INFO - Pulling model None...
2024-07-16 11:07:41,588 - DEBUG - self.models_path: None
2024-07-16 11:07:41,588 - DEBUG - self.model: None
2024-07-16 11:07:41,588 - ERROR - models_path is not set. Cannot pull model.
2024-07-16 11:07:41,589 - INFO - Generating response for prompt: Explain the concept of Retrieval-Augmented Generation in three sentences.
2024-07-16 11:07:41,589 - INFO - Sending request to http://localhost:11435/api/generate with payload: {'model': None, 'prompt': 'Explain the concept of Retrieval-Augmented Generation in three sentences.'}
2024-07-16 11:07:41,589 - DEBUG - Starting new HTTP connection (1): localhost:11435
2024-07-16 11:07:41,591 - DEBUG - http://localhost:11435 "POST /api/generate HTTP/11" 400 29
2024-07-16 11:07:41,591 - INFO - Response status code: 400
2024-07-16 1

Ensuring Codestral 22B model is available...
Testing Codestral 22B model...
Prompt: Explain the concept of Retrieval-Augmented Generation in three sentences.
Response: Error: 400 Bad Request - {"error":"model is required"}
Stopping Ollama container...


2024-07-16 11:07:42,306 - INFO - Stopped container: None





## Conclusion

In this notebook, we have:

1. Set up an Ollama instance in a Docker container
2. Created an OllamaManager class to handle Ollama operations
3. Implemented a method to generate responses from the LLM
4. Printed the LLM's output token by token as it streams in
5. Exercised the setup end to end with a test prompt (the run recorded above stopped at `model is required` because no `.env` file was loaded, so rerun it once the configuration loads)
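
The streaming behavior can be tried offline, without a running container. The sketch below mirrors the loop in `generate_response`: each non-empty line is a JSON object, its `response` field carries the next piece of text, and `done` marks the end. `collect_stream` is a hypothetical helper, and the sample chunks are hand-written for illustration, not real model output.

```python
import json


def collect_stream(lines):
    """Join the 'response' pieces from newline-delimited JSON chunks until 'done' is true."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip empty lines between chunks
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines (generate_response logs a warning here)
        parts.append(chunk.get("response", ""))
        if chunk.get("done", False):
            break  # anything after the final chunk is ignored
    return "".join(parts).strip()


# Hand-written sample chunks, shaped like the lines yielded by response.iter_lines()
sample_lines = [
    b'{"response": "RAG ", "done": false}',
    b'',
    b'not json',
    b'{"response": "retrieves.", "done": false}',
    b'{"response": "", "done": true}',
    b'{"response": "ignored", "done": false}',
]
print(collect_stream(sample_lines))  # RAG retrieves.
```

Keeping the parsing separate from the HTTP call like this also makes the logic easy to unit test.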

## Next Steps

Our next notebook will focus on creating a CLI interface for interacting with the LLM. Before diving into the implementation, we'll need to consider:

1. LLM Configurables:
   - Context length
   - Temperature
   - Other relevant parameters (e.g., top_p, frequency_penalty, presence_penalty)

2. CLI Interface Options:
   - Evaluate the merits of adopting a pre-built CLI interface vs. creating our own
   - Consider libraries like `click`, `typer`, or `argparse` for building a custom CLI

3. Chat Interface Design:
   - How to maintain conversation history
   - Handling user input and system responses
   - Implementing commands for adjusting LLM parameters on-the-fly

4. Integration with OllamaManager:
   - How to incorporate our existing OllamaManager class into the CLI interface

By addressing these points, we'll be well-prepared to create a robust and user-friendly CLI for interacting with our LLM setup.
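
As a starting point for that discussion, here is a minimal sketch of how command-line flags could map onto a generate request. It assumes the `options` field of Ollama's `/api/generate` request body, with `temperature`, `num_ctx` (context length), and `top_p` as option names; `build_generate_payload` and `parse_cli` are hypothetical helpers, and `argparse` is only one of the libraries under consideration.

```python
import argparse


def build_generate_payload(model, prompt, temperature=None, num_ctx=None, top_p=None):
    """Build a request body for /api/generate, including only the options that were set."""
    candidates = {"temperature": temperature, "num_ctx": num_ctx, "top_p": top_p}
    options = {name: value for name, value in candidates.items() if value is not None}
    payload = {"model": model, "prompt": prompt}
    if options:
        payload["options"] = options
    return payload


def parse_cli(argv):
    """Parse a prompt plus optional model settings from a list of arguments."""
    parser = argparse.ArgumentParser(description="Send one prompt to a local Ollama model")
    parser.add_argument("prompt")
    parser.add_argument("--model", default="tinyllama")
    parser.add_argument("--temperature", type=float)
    parser.add_argument("--num-ctx", type=int, dest="num_ctx")
    parser.add_argument("--top-p", type=float, dest="top_p")
    return parser.parse_args(argv)


args = parse_cli(["Hello", "--temperature", "0.2", "--num-ctx", "2048"])
payload = build_generate_payload(
    args.model, args.prompt, args.temperature, args.num_ctx, args.top_p
)
print(payload)
# {'model': 'tinyllama', 'prompt': 'Hello', 'options': {'temperature': 0.2, 'num_ctx': 2048}}
```

Omitting unset options leaves the model's own defaults in effect, which keeps the CLI from silently overriding them.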