# RAG Tools: CLI Implementation

## 1. Introduction

In this notebook, we'll implement a Command Line Interface (CLI) for our RAG Tools project. The CLI will allow us to interact with our Ollama instances and manage our Docker containers from the command line, providing a more efficient way to control our ML environment.
By creating a CLI, we're adding a new layer of usability to our project. This interface will make it easier to start and stop services, check their status, and interact with our LLM without having to navigate through Docker commands or Python scripts directly.

## 2. Project Structure Update

First, let's update our project structure to include our new CLI file. We'll create a new file called `cli.py` in the `src/utils` directory.

In [None]:
import os

# Get the project root directory
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))

# Path for the new CLI file
cli_file_path = os.path.join(project_root, 'src', 'utils', 'cli.py')

# Create an empty cli.py file
with open(cli_file_path, 'w') as f:
    pass

print(f"Created empty cli.py file at: {cli_file_path}")
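
The cell above assumes that `src/utils` already exists: `open(..., 'w')` raises `FileNotFoundError` otherwise, and it truncates any existing file. A minimal sketch of a more defensive variant using `pathlib` follows. The helper name `ensure_file` is hypothetical and not part of the project.

```python
from pathlib import Path


def ensure_file(path: Path) -> Path:
    """Create `path` (and any missing parent directories) if it does not exist.

    Unlike open(path, 'w'), this leaves the contents of an existing file untouched.
    """
    path.parent.mkdir(parents=True, exist_ok=True)  # create src/utils if needed
    path.touch(exist_ok=True)  # only updates the modification time if the file exists
    return path
```

It could be called as `ensure_file(Path(project_root) / "src" / "utils" / "cli.py")`.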


## 3. CLI Implementation

We'll use the `typer` library to create our CLI. Typer is built on top of Click and provides a simple, intuitive interface for creating command-line tools.

### 3.1 Install Typer

Let's start by installing Typer:

In [None]:
!conda install -c conda-forge typer -y


### 3.2 Implement Basic CLI Structure

Now, let's implement the basic structure of our CLI:

In [None]:
%%writefile {cli_file_path}
import typer
import sys
from pathlib import Path

# Add the project root to the Python path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.append(str(project_root))

from src.utils.config_utils import Config
from src.utils.ollama_manager import OllamaManager
from src.utils.DockerComposeManager import DockerComposeManager

app = typer.Typer()
config = Config()
ollama_manager = OllamaManager(config)
docker_manager = DockerComposeManager(str(project_root / "config" / "docker-compose.yml"))

@app.command()
def start():
    """Start the Ollama containers"""
    typer.echo("Starting Ollama containers...")
    docker_manager.start_containers()

@app.command()
def stop():
    """Stop the Ollama containers"""
    typer.echo("Stopping Ollama containers...")
    docker_manager.stop_containers()

@app.command()
def status():
    """Check the status of the Ollama containers"""
    typer.echo("Checking container status...")
    docker_manager.show_container_status()

@app.command()
def chat():
    """Start a chat session with the LLM"""
    typer.echo("Starting chat session. Type 'exit' to end the session.")
    while True:
        prompt = typer.prompt("You")
        if prompt.lower() == 'exit':
            break
        response = ollama_manager.generate_response(prompt)
        typer.echo(f"LLM: {response}")

if __name__ == "__main__":
    app()


## 4. Test the CLI tool

Let's test the `cli.py` commands for checking container status and chatting with the LLM. The commands below only confirm that the CLI help menu works and that the Docker containers are running.

In [None]:
!python {cli_file_path} --help
!python {cli_file_path} status
!python {cli_file_path} chat


However, because shell commands in a Jupyter notebook are not interactive, the `chat` command needs further checking in a terminal. Run the following commands from a terminal on your desktop. Be sure to update the file path in the second command as needed.
```bash
conda activate ragtools
python3 src/utils/cli.py chat
```

You should see output similar to the following:
```text
/RAG_tools$ python3 src/utils/cli.py chat
Starting chat session. Type 'exit' to end the session.
You: what is the capitol of Texas?                                            
ERROR:root:Error generating response: 400 Client Error: Bad Request for url: http://localhost:11435/api/generate
LLM: None
You: exit
```
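
Because the interactive loop is hard to exercise from a notebook, one option is to separate the loop from the terminal I/O so that it can be driven with scripted input. The sketch below is an assumption about how that could look, not the project's code: `run_chat_loop` and its parameters are hypothetical names, and `generate` stands in for `ollama_manager.generate_response`.

```python
from typing import Callable, Iterable


def run_chat_loop(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    write: Callable[[str], None] = print,
) -> int:
    """Feed prompts to `generate` until 'exit' is seen; return the number of turns."""
    turns = 0
    for prompt in prompts:
        if prompt.strip().lower() == "exit":
            break
        write(f"LLM: {generate(prompt)}")
        turns += 1
    return turns


if __name__ == "__main__":
    # Scripted session with a stand-in generator instead of a live Ollama call.
    run_chat_loop(["hello", "exit", "never sent"], generate=lambda p: p.upper())
```

In the real command, `prompts` could be a generator that calls `typer.prompt("You")` repeatedly, which keeps the loop logic testable without a terminal.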

## 5. Troubleshooting and Improving the CLI

After initial testing, we encountered some issues with our CLI, particularly when trying to generate responses from the LLM. Let's analyze the problem and implement a solution.

### 5.1 Problem Identification

1. The Ollama service is running correctly on port 11435.
2. The service can handle GET requests to `/api/tags` successfully.
3. POST requests to `/api/generate` are sometimes successful, but often result in 400 Bad Request errors.
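
The `/api/tags` check from item 2 can also confirm that the configured model is present. As far as we know, Ollama lists models with an explicit tag (for example `llama3:latest`), so a bare name such as `llama3` in the configuration would not match an exact comparison. Below is a minimal sketch of a more tolerant comparison. It assumes the response has the `{"models": [{"name": ...}]}` shape that `is_model_running` relies on, and the helper name `model_is_listed` is hypothetical.

```python
from typing import Any, Dict


def model_is_listed(configured: str, tags_payload: Dict[str, Any]) -> bool:
    """Return True if `configured` appears in an /api/tags-style payload.

    A name without a tag is treated as `<name>:latest` (an assumption about
    how the server reports untagged pulls).
    """
    wanted = configured if ":" in configured else f"{configured}:latest"
    listed = {m.get("name", "") for m in tags_payload.get("models", [])}
    return wanted in listed
```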

### 5.2 Potential Issues

1. The format of the request being sent to the Ollama API might be incorrect.
2. There might be an issue with how we're handling the streaming response from the API.
3. The error handling in our `generate_response` method may not be capturing all potential errors.
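
The first potential issue can be checked before any request is sent. The sketch below validates the payload locally. It assumes, without confirmation from the logs above, that an empty or `None` model name or prompt (for instance from an unset configuration value) is one way to end up with a malformed request. `check_generate_payload` is a hypothetical helper.

```python
from typing import Any, Dict, List


def check_generate_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a /api/generate payload."""
    problems = []
    for key in ("model", "prompt"):
        value = payload.get(key)
        if not isinstance(value, str):
            problems.append(f"'{key}' must be a string, got {type(value).__name__}")
        elif not value.strip():
            problems.append(f"'{key}' must not be empty")
    return problems
```

Logging the returned list just before the POST would show whether the 400 responses coincide with an incomplete payload.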

### 5.3 Solution Implementation

Let's update our `OllamaManager` class to address these issues:

1. Ensure the request format matches Ollama's API expectations.
2. Improve error handling and logging.
3. Implement proper handling of the streaming response.

Here's an updated version of the `generate_response` method:
```python
import os
import requests
import time
import logging
import json
from .DockerComposeManager import DockerComposeManager
from .config_utils import Config

class OllamaManager:
    # ... (previous code remains the same)

    def generate_response(self, prompt):
        try:
            response = requests.post(
                f'http://localhost:{self.port}/api/generate',
                json={'model': self.model, 'prompt': prompt},
                stream=True
            )
            response.raise_for_status()
            
            full_response = ""
            for line in response.iter_lines():
                if line:
                    try:
                        chunk = json.loads(line)
                        if 'response' in chunk:
                            token = chunk['response']
                            full_response += token
                            print(token, end='', flush=True)
                        if chunk.get('done', False):
                            break
                    except json.JSONDecodeError:
                        logging.warning(f"Failed to decode JSON: {line}")
            
            print("\n")  # New line after the response
            return full_response.strip()
        except requests.exceptions.RequestException as e:
            logging.error(f"Error generating response: {str(e)}")
            if hasattr(e.response, 'text'):
                logging.error(f"Response content: {e.response.text}")
            return f"Error: {str(e)}"
        except Exception as e:
            logging.error(f"Unexpected error: {str(e)}")
            return f"Unexpected error: {str(e)}"
```
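
The streaming logic can be tried without a running server by feeding it canned lines. This is a minimal sketch, assuming each line is a JSON object with an optional `response` field and a `done` flag, as the method above expects. `collect_stream` is a hypothetical helper, not part of `OllamaManager`.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> str:
    """Concatenate the 'response' fields of newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip empty keep-alive lines
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the stream
        parts.append(chunk.get("response", ""))
        if chunk.get("done", False):
            break  # the final chunk marks the end of the reply
    return "".join(parts).strip()


sample = [
    b'{"response": "Aus", "done": false}',
    b"",
    b"not json",
    b'{"response": "tin", "done": true}',
    b'{"response": " ignored"}',
]
print(collect_stream(sample))  # prints: Austin
```
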
The code block below updates the `ollama_manager.py` file.

In [10]:
%%writefile ../src/utils/ollama_manager.py
import os
import requests
import time
import logging
import json
from .DockerComposeManager import DockerComposeManager
from .config_utils import Config


class OllamaManager:
    def __init__(self, config: Config):
        self.config = config
        self.container_name = self.config.OLLAMA_LLM_CONTAINER_NAME
        self.port = self.config.OLLAMA_LLM_PORT
        self.model = self.config.OLLAMA_LLM_MODEL
        self.gpu = self.config.OLLAMA_LLM_GPU
        
        self.models_path = self.config.OLLAMA_MODELS_PATH or os.path.expanduser('~/ollama_models')
        self.llm_path = self.config.OLLAMA_LLM_PATH or os.path.join(self.models_path, 'llm')
        
        # Ensure directories exist
        os.makedirs(self.models_path, exist_ok=True)
        os.makedirs(self.llm_path, exist_ok=True)
        
        # Initialize DockerComposeManager
        docker_compose_path = os.path.join('..', 'config', 'docker-compose.yml')
        self.docker_manager = DockerComposeManager(docker_compose_path)

        logging.info(f"OllamaManager initialized with models_path: {self.models_path}, llm_path: {self.llm_path}")
        logging.info(f"Using model: {self.model} on port: {self.port}")

    def is_model_running(self):
        try:
            response = requests.get(f'http://localhost:{self.port}/api/tags')
            response.raise_for_status()
            models = response.json()
            return self.model in [model['name'] for model in models['models']]
        except requests.exceptions.RequestException as e:
            logging.error(f"Error checking if model is running: {e}")
            return False

    def generate_response(self, prompt):
        try:
            response = requests.post(
                f'http://localhost:{self.port}/api/generate',
                json={'model': self.model, 'prompt': prompt},
                stream=True
            )
            response.raise_for_status()
            
            full_response = ""
            for line in response.iter_lines():
                if line:
                    try:
                        chunk = json.loads(line)
                        if 'response' in chunk:
                            token = chunk['response']
                            full_response += token
                            print(token, end='', flush=True)
                        if chunk.get('done', False):
                            break
                    except json.JSONDecodeError:
                        logging.warning(f"Failed to decode JSON: {line}")
            
            print("\n")  # New line after the response
            return full_response.strip()
        except requests.exceptions.RequestException as e:
            logging.error(f"Error generating response: {str(e)}")
            if hasattr(e.response, 'text'):
                logging.error(f"Response content: {e.response.text}")
            return f"Error: {str(e)}"
        except Exception as e:
            logging.error(f"Unexpected error: {str(e)}")
            return f"Unexpected error: {str(e)}"

    def pull_model(self):
        logging.debug(f"Starting pull_model for model: {self.model}")
        logging.debug(f"self.models_path: {self.models_path}")
        logging.debug(f"self.model: {self.model}")
        
        try:
            model_path = os.path.join(self.models_path, 'models', 'manifests', 'registry.ollama.ai', 'library', self.model)
            logging.info(f"Checking for model at path: {model_path}")
        except Exception as e:
            logging.error(f"Error constructing model path: {str(e)}")
            raise
        
        if os.path.exists(model_path):
            logging.info(f"Model {self.model} already exists. Skipping download.")
            return

        logging.info(f"Pulling model {self.model}...")
        try:
            response = requests.post(f'http://localhost:{self.port}/api/pull', json={'name': self.model}, stream=True)
            response.raise_for_status()
            for line in response.iter_lines():
                if line:
                    print(line.decode())
        except requests.exceptions.RequestException as e:
            logging.error(f"Error pulling model: {str(e)}")
            raise

    def start_container(self):
        self.docker_manager.start_containers()
        logging.info(f"Started container: {self.container_name}")

    def stop_container(self):
        self.docker_manager.stop_containers()
        logging.info(f"Stopped container: {self.container_name}")

    def wait_for_ollama(self, max_attempts=5, delay=5):
        for attempt in range(max_attempts):
            try:
                response = requests.get(f'http://localhost:{self.port}/api/tags')
                if response.status_code == 200:
                    logging.info(f"Successfully connected to Ollama on port {self.port}")
                    return True
            except requests.exceptions.RequestException:
                logging.warning(f"Attempt {attempt + 1}/{max_attempts}: Ollama on port {self.port} is not ready yet. Retrying in {delay} seconds...")
                time.sleep(delay)
        logging.error(f"Failed to connect to Ollama after {max_attempts} attempts")
        return False



Overwriting ../src/utils/ollama_manager.py


This updated version includes better error handling and logging, which should help us identify any issues more easily.
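
Note that the `logging.debug` and `logging.info` calls only produce output once logging is configured; by default, the root logger shows warnings and above. Below is a minimal sketch of a setup helper, assuming a hypothetical `verbose` switch. Neither the helper nor the switch exists in the project.

```python
import logging


def configure_logging(verbose: bool = False) -> int:
    """Configure the root logger and return the level that was applied."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers installed earlier (Python 3.8+)
    )
    return level
```

Calling it once near the top of `cli.py`, before the managers are created, would make their initialization messages visible.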

### 5.4 CLI Update

We should also update our CLI to handle these potential errors more gracefully:

```python
# Requires `import logging` at the top of cli.py.
@app.command()
def chat():
    """Start a chat session with the LLM"""
    typer.echo("Starting chat session. Type 'exit' to end the session.")
    logging.debug(f"Configuration values:")
    logging.debug(f"OLLAMA_LLM_CONTAINER_NAME: {config.OLLAMA_LLM_CONTAINER_NAME}")
    logging.debug(f"OLLAMA_LLM_PORT: {config.OLLAMA_LLM_PORT}")
    logging.debug(f"OLLAMA_LLM_MODEL: {config.OLLAMA_LLM_MODEL}")
    logging.debug(f"Using model: {ollama_manager.model}")
    logging.debug(f"Ollama port: {ollama_manager.port}")
    
    # Check if the model is running
    if not ollama_manager.is_model_running():
        typer.echo(f"Error: Model {ollama_manager.model} is not running. Please start the model first.")
        return

    while True:
        prompt = typer.prompt("You")
        if prompt.lower() == 'exit':
            break
        logging.debug(f"Sending prompt to OllamaManager: {prompt}")
        response = ollama_manager.generate_response(prompt)
        logging.debug(f"Received response from OllamaManager: {response[:100]}...")  # Log first 100 chars
        if response.startswith("Error:") or response.startswith("Unexpected error:"):
            typer.echo(f"LLM Error: {response}", err=True)
        else:
            typer.echo(f"LLM: {response}")

if __name__ == "__main__":
    app()
```

This update will display errors to the user more clearly and use the standard error stream for error messages.
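
Detecting failures by string prefix works, but a model reply that happens to begin with `Error:` would be misreported. One alternative is to carry the error separately from the text. The sketch below assumes that `generate_response` could be changed to return a small result object; the `GenerationResult` type and the `format_for_user` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GenerationResult:
    """Outcome of one generation call: either text or an error message."""
    text: str = ""
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def format_for_user(result: GenerationResult) -> Tuple[str, bool]:
    """Return (message, is_error) so the caller can choose stdout or stderr."""
    if result.ok:
        return f"LLM: {result.text}", False
    return f"LLM Error: {result.error}", True
```

The returned flag maps directly onto `typer.echo(message, err=is_error)`.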

In [12]:
%%writefile ../src/utils/cli.py
import logging
import typer
import sys
from pathlib import Path

# Add the project root to the Python path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.append(str(project_root))

from src.utils.config_utils import Config
from src.utils.ollama_manager import OllamaManager
from src.utils.DockerComposeManager import DockerComposeManager

app = typer.Typer()
config = Config()
ollama_manager = OllamaManager(config)
docker_manager = DockerComposeManager(str(project_root / "config" / "docker-compose.yml"))

@app.command()
def start():
    """Start the Ollama containers"""
    typer.echo("Starting Ollama containers...")
    docker_manager.start_containers()

@app.command()
def stop():
    """Stop the Ollama containers"""
    typer.echo("Stopping Ollama containers...")
    docker_manager.stop_containers()

@app.command()
def status():
    """Check the status of the Ollama containers"""
    typer.echo("Checking container status...")
    docker_manager.show_container_status()

@app.command()
def chat():
    """Start a chat session with the LLM"""
    typer.echo("Starting chat session. Type 'exit' to end the session.")
    logging.debug(f"Configuration values:")
    logging.debug(f"OLLAMA_LLM_CONTAINER_NAME: {config.OLLAMA_LLM_CONTAINER_NAME}")
    logging.debug(f"OLLAMA_LLM_PORT: {config.OLLAMA_LLM_PORT}")
    logging.debug(f"OLLAMA_LLM_MODEL: {config.OLLAMA_LLM_MODEL}")
    logging.debug(f"Using model: {ollama_manager.model}")
    logging.debug(f"Ollama port: {ollama_manager.port}")
    
    # Check if the model is running
    if not ollama_manager.is_model_running():
        typer.echo(f"Error: Model {ollama_manager.model} is not running. Please start the model first.")
        return

    while True:
        prompt = typer.prompt("You")
        if prompt.lower() == 'exit':
            break
        logging.debug(f"Sending prompt to OllamaManager: {prompt}")
        response = ollama_manager.generate_response(prompt)
        logging.debug(f"Received response from OllamaManager: {response[:100]}...")  # Log first 100 chars
        if response.startswith("Error:") or response.startswith("Unexpected error:"):
            typer.echo(f"LLM Error: {response}", err=True)
        else:
            typer.echo(f"LLM: {response}")

if __name__ == "__main__":
    app()


Overwriting ../src/utils/cli.py


## 6. Conclusion and Next Steps

In this notebook, we've successfully implemented a basic CLI for our RAG Tools project. This CLI allows us to manage our Docker containers and interact with our LLM from the command line.

Next steps could include:
1. Adding more advanced commands (e.g., switching models, viewing logs)
2. Implementing error handling and input validation
3. Adding support for LLM configurables (which we'll address in a future notebook)
4. Creating a user guide for the CLI

In our next notebook, we'll focus on [brief description of the next topic].