# NVIDIA GPU Setup Guide for WSL in CodexContinue

This notebook is a step-by-step guide for diagnosing and fixing NVIDIA GPU support in Windows Subsystem for Linux (WSL) for the CodexContinue project. It focuses on two common problems: NVIDIA libraries missing from the expected location, and Docker containers that cannot access the GPU.

## Import Required Libraries

First, let's import the necessary Python libraries for system checks and diagnostics.

In [None]:
import os
import subprocess
import json
import glob
import sys

# For prettier output
from IPython.display import display, Markdown

## Check NVIDIA Libraries in WSL

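Let's check whether the NVIDIA Management Library (`libnvidia-ml.so`) is present in the standard WSL driver location (`/usr/lib/wsl/lib`), and show the current `LD_LIBRARY_PATH` so you can see which extra directories the dynamic linker will search.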

In [None]:
def check_nvidia_libraries():
    wsl_lib_path = "/usr/lib/wsl/lib/libnvidia-ml.so"
    if os.path.exists(wsl_lib_path):
        display(Markdown(f"✅ Found NVIDIA libraries in WSL driver location: `{wsl_lib_path}`"))
    else:
        display(Markdown(f"❌ Missing NVIDIA libraries in standard WSL location: `{wsl_lib_path}`"))
        
    # Check LD_LIBRARY_PATH
    ld_library_path = os.environ.get('LD_LIBRARY_PATH', '')
    display(Markdown(f"Current LD_LIBRARY_PATH: `{ld_library_path}`"))

check_nvidia_libraries()
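
The check above looks only for the unversioned file name. As a minimal sketch, assuming the driver may instead provide a versioned file such as `libnvidia-ml.so.1` (an assumption, not something this guide confirms), the hypothetical helper `list_library_variants` lists every file whose name starts with `libnvidia-ml.so` and shows where each one resolves to:

```python
import glob
import os


def list_library_variants(directory="/usr/lib/wsl/lib", stem="libnvidia-ml.so"):
    """Return (path, resolved_target) pairs for files named `stem` or `stem.<version>`.

    `list_library_variants` is a hypothetical helper, not part of CodexContinue.
    """
    matches = sorted(glob.glob(os.path.join(directory, stem + "*")))
    # realpath follows symbolic links, so a link and its target can be told apart
    return [(path, os.path.realpath(path)) for path in matches]


for path, target in list_library_variants():
    suffix = "" if path == target else f" -> {target}"
    print(f"{path}{suffix}")
```

If only a versioned file shows up, the missing piece is most likely the unversioned symbolic link rather than the driver itself.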

## Verify `nvidia-smi` Output

Now, let's run the `nvidia-smi` command to verify GPU status.

In [None]:
def run_nvidia_smi():
    try:
        result = subprocess.run(['which', 'nvidia-smi'], capture_output=True, text=True, check=False)
        if result.returncode == 0:
            display(Markdown(f"Found nvidia-smi at: `{result.stdout.strip()}`"))
            
            # Run nvidia-smi
            nvidia_smi_output = subprocess.run(['nvidia-smi'], capture_output=True, text=True, check=False)
            if nvidia_smi_output.returncode == 0:
                display(Markdown("### nvidia-smi output:"))
                print(nvidia_smi_output.stdout)
                return True
            else:
                display(Markdown(f"❌ Error running nvidia-smi: {nvidia_smi_output.stderr}"))
        else:
            display(Markdown("❌ nvidia-smi command not found"))
    except Exception as e:
        display(Markdown(f"❌ Error: {str(e)}"))
    return False

ran_successfully = run_nvidia_smi()
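
The full `nvidia-smi` table is easy to read but awkward to use from code. A minimal sketch, assuming you only need the GPU name, driver version, and total memory: `nvidia-smi` can emit those fields as CSV through its `--query-gpu` and `--format` options. The helper names `parse_gpu_query` and `query_gpus` are illustrative, not part of the project.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; the choice of fields is an assumption.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_query(csv_text, fields=QUERY_FIELDS):
    """Turn `nvidia-smi --format=csv,noheader` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [dict(zip(fields, (cell.strip() for cell in row))) for row in rows if row]


def query_gpus():
    """Run nvidia-smi and return the parsed rows, or [] if it is missing or fails."""
    cmd = ["nvidia-smi", f"--query-gpu={','.join(QUERY_FIELDS)}", "--format=csv,noheader"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return []
    return parse_gpu_query(result.stdout) if result.returncode == 0 else []
```

Calling `query_gpus()` returns an empty list when no GPU is visible, which makes it easy to branch on in later cells.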

## Search for NVIDIA Libraries

Let's search for `libnvidia-ml.so` in common library paths and display the results.

In [None]:
def search_nvidia_libraries():
    display(Markdown("Looking for libnvidia-ml.so in common paths:"))
    
    search_paths = [
        "/usr/lib",
        "/usr/lib/x86_64-linux-gnu",
        "/usr/lib/wsl/lib",
        "/usr/local/cuda*/targets/x86_64-linux/lib"
    ]
    
    found_libs = []
    
    for path in search_paths:
        # Handle glob patterns
        if '*' in path:
            matching_dirs = glob.glob(path)
            for cuda_dir in matching_dirs:
                lib_file = os.path.join(cuda_dir, "libnvidia-ml.so")
                if os.path.exists(lib_file):
                    found_libs.append(lib_file)
                # The toolkit's stubs directory holds link-time placeholders,
                # not the real driver library, so label them separately
                stub_path = os.path.join(cuda_dir, "stubs/libnvidia-ml.so")
                if os.path.exists(stub_path):
                    found_libs.append(stub_path + " (stub)")
        else:
            lib_file = os.path.join(path, "libnvidia-ml.so")
            if os.path.exists(lib_file):
                found_libs.append(lib_file)
    
    if found_libs:
        for lib in found_libs:
            display(Markdown(f"- Found `{lib}`"))
    else:
        display(Markdown("❌ No NVIDIA libraries found in common paths"))

search_nvidia_libraries()
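
A fixed list of directories can miss libraries installed elsewhere. As a complementary check, here is a sketch that asks the dynamic linker cache instead, assuming `ldconfig` is available on your `PATH`. The helpers `parse_ldconfig_cache` and `cached_nvidia_libraries` are hypothetical names introduced for illustration.

```python
import subprocess


def parse_ldconfig_cache(output, needle="libnvidia-ml"):
    """Map library names containing `needle` to the paths the linker cache records."""
    found = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip the summary header and any non-entry lines
        left, path = line.split("=>", 1)
        parts = left.split()
        if parts and needle in parts[0]:
            # Keep the first entry if the same name is listed more than once
            found.setdefault(parts[0], path.strip())
    return found


def cached_nvidia_libraries():
    """Run `ldconfig -p` and return matching entries, or {} if it is unavailable."""
    try:
        result = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return {}
    return parse_ldconfig_cache(result.stdout) if result.returncode == 0 else {}
```

An empty result here, combined with a file found by the directory search, suggests the library exists but its directory is not known to the linker.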

## Check NVIDIA Container Toolkit Configuration

Let's read and parse the Docker daemon configuration file to verify NVIDIA runtime settings.

In [None]:
def check_docker_nvidia_config():
    docker_config_path = "/etc/docker/daemon.json"
    
    if os.path.exists(docker_config_path):
        try:
            with open(docker_config_path, 'r') as f:
                config = json.load(f)
                
            display(Markdown("### Docker daemon configuration:"))
            print(json.dumps(config, indent=4))
            
            if 'runtimes' in config and 'nvidia' in config['runtimes']:
                display(Markdown("✅ NVIDIA Container Runtime is configured"))
                return True
            else:
                display(Markdown("❌ NVIDIA Container Runtime is not configured"))
        except Exception as e:
            display(Markdown(f"Error reading Docker configuration: {str(e)}"))
    else:
        display(Markdown(f"❌ Docker daemon configuration file not found at {docker_config_path}"))
    
    return False

docker_configured = check_docker_nvidia_config()
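
For reference, the check above passes when `daemon.json` contains a `runtimes.nvidia` entry. The sketch below shows an example of such a file and a small pure helper that extracts the runtime binary path. The example values are an assumption about a typical configuration, and `nvidia_runtime_path` is a hypothetical name.

```python
import json

# Example content only; your daemon.json may contain other keys and values.
EXAMPLE_DAEMON_JSON = """
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
"""


def nvidia_runtime_path(config):
    """Return the configured nvidia runtime binary path, or None if it is absent."""
    runtimes = config.get("runtimes")
    if not isinstance(runtimes, dict):
        return None
    entry = runtimes.get("nvidia")
    if not isinstance(entry, dict):
        return None
    return entry.get("path")


print(nvidia_runtime_path(json.loads(EXAMPLE_DAEMON_JSON)))
```

Keeping the parsing in a function that takes a dictionary makes it possible to test the logic without touching `/etc/docker/daemon.json`.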

## Test Docker GPU Capability

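Now let's check that Docker lists the `nvidia` runtime, then run `nvidia-smi` inside a CUDA container started with `--gpus all`. The test is optional because it may need to pull the container image first.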

In [None]:
def test_docker_gpu():
    display(Markdown("### Testing Docker with GPU support:"))
    
    # First check docker info for nvidia runtime
    try:
        docker_info = subprocess.run(['docker', 'info'], capture_output=True, text=True, check=False)
        # Look for 'nvidia' on the Runtimes line only, not anywhere in the output
        runtime_lines = [line.strip() for line in docker_info.stdout.splitlines() if 'Runtimes' in line]
        if any('nvidia' in line for line in runtime_lines):
            display(Markdown("✅ Docker info shows nvidia runtime is available"))
        else:
            display(Markdown("❌ Docker info does not show nvidia runtime"))
            print("Docker runtimes found:")
            for line in runtime_lines:
                print(line)
    except Exception as e:
        display(Markdown(f"Error checking Docker info: {str(e)}"))
    
    # Try running a container with GPU access
    try:
        display(Markdown("### Testing GPU access with a CUDA container:"))
        display(Markdown("Running: `docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi`"))
        
        # Capture the output so it can be shown on success or reported on failure
        display(Markdown("This might take a moment to pull the container image if it's not already cached..."))
        result = subprocess.run(
            ['docker', 'run', '--rm', '--gpus', 'all', 'nvidia/cuda:11.6.2-base-ubuntu20.04', 'nvidia-smi'],
            capture_output=True, text=True, check=False
        )
        
        if result.returncode == 0:
            display(Markdown("✅ Successfully ran nvidia-smi in a Docker container with GPU access"))
            print(result.stdout)
            return True
        else:
            display(Markdown(f"❌ Failed to run nvidia-smi in Docker: {result.stderr}"))
    except Exception as e:
        display(Markdown(f"Error testing Docker GPU support: {str(e)}"))
    
    return False

# Only run if the user wants to test Docker
test_docker = input("Do you want to test Docker with GPU support? (y/n): ")
if test_docker.lower() == 'y':
    docker_test_successful = test_docker_gpu()
else:
    display(Markdown("Skipping Docker GPU test"))
    docker_test_successful = None
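
Searching the human-readable `docker info` text for a runtime name is fragile. As an alternative sketch, `docker info` accepts a Go template through `--format`, so the runtimes can be requested as JSON and parsed directly. The helper names `parse_runtimes` and `docker_runtimes` are illustrative assumptions.

```python
import json
import subprocess


def parse_runtimes(json_text):
    """Return the sorted runtime names from `docker info --format '{{json .Runtimes}}'`."""
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError:
        return []
    return sorted(data) if isinstance(data, dict) else []


def docker_runtimes():
    """Ask Docker for its runtimes; return [] if Docker is missing or the output is unusable."""
    cmd = ["docker", "info", "--format", "{{json .Runtimes}}"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return []
    return parse_runtimes(result.stdout)
```

With this in place, `'nvidia' in docker_runtimes()` answers the runtime question without depending on how the text output is laid out.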

## Provide Fix Suggestions

Based on the diagnostic results, here are actionable suggestions to fix common issues.

In [None]:
def provide_fixes():
    display(Markdown("## Diagnostic Summary and Fix Suggestions"))
    
    # Check if NVIDIA libraries exist in /usr/lib/wsl/lib
    wsl_lib_path = "/usr/lib/wsl/lib/libnvidia-ml.so"
    wsl_lib_exists = os.path.exists(wsl_lib_path)
    
    # Reuse the result of the earlier nvidia-smi cell (that cell must be run first)
    nvidia_smi_runs = ran_successfully
    
    # Check Docker configuration
    docker_configured_correctly = docker_configured
    
    # Print summary
    display(Markdown("### Status Summary:"))
    display(Markdown(f"- NVIDIA libraries in WSL path: {'✅ Found' if wsl_lib_exists else '❌ Missing'}"))
    display(Markdown(f"- nvidia-smi runs successfully: {'✅ Yes' if nvidia_smi_runs else '❌ No'}"))
    display(Markdown(f"- Docker NVIDIA runtime configured: {'✅ Yes' if docker_configured_correctly else '❌ No'}"))
    
    # Generate fix suggestions
    display(Markdown("### Fix Suggestions:"))
    
    fixes = []
    
    if not wsl_lib_exists and nvidia_smi_runs:
        # We have NVIDIA drivers but missing the symbolic link
        fixes.append("**Missing Library Symbolic Link**:"
                  "\n1. Run the fix script: `sudo ./scripts/fix-nvidia-wsl-libs.sh`"
                  "\n2. This will create the symbolic link from an existing library to the standard WSL location")
    
    if not nvidia_smi_runs:
        fixes.append("**NVIDIA Driver Issues**:"
                  "\n1. Verify you have the NVIDIA driver for WSL installed on Windows (not in WSL)"
                  "\n2. Install from: https://developer.nvidia.com/cuda/wsl"
                  "\n3. Restart your Windows system after installation"
                  "\n4. Make sure your GPU is CUDA-capable and supported by WSL2")
    
    if not docker_configured_correctly:
        fixes.append("**Docker NVIDIA Runtime Configuration**:"
                  "\n1. Install NVIDIA Container Toolkit: `sudo apt-get install -y nvidia-container-toolkit`"
                  "\n2. Configure Docker: `sudo nvidia-ctk runtime configure --runtime=docker`"
                  "\n3. Restart Docker: `sudo systemctl restart docker`"
                  " (or `sudo service docker restart` if systemd is not enabled in your WSL distribution)")
    
    if not fixes:
        display(Markdown("✅ **Congratulations!** Your NVIDIA GPU appears to be properly configured for WSL and Docker."))
        display(Markdown("You should now be able to use GPU acceleration with Ollama in the CodexContinue project."))
        display(Markdown("Run the following to start with GPU support:"))
        display(Markdown("`docker compose -f docker-compose.yml up`"))
    else:
        for i, fix in enumerate(fixes, start=1):
            display(Markdown(f"#### Fix {i}:\n{fix}"))
    
    # Add CodexContinue specific notes
    display(Markdown("### CodexContinue Project Notes:"))
    display(Markdown("1. After fixing GPU issues, test the setup with Ollama:"))
    display(Markdown("   ```bash\n   ./scripts/start-ollama-wsl.sh\n   ```"))
    display(Markdown("2. If problems persist in CodexContinue, use the CPU-only configuration:"))
    display(Markdown("   ```bash\n   docker compose -f docker-compose.yml -f docker-compose.macos.yml up\n   ```"))

provide_fixes()
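
The decision logic in `provide_fixes` reads its inputs from notebook globals, which makes it hard to check in isolation. A minimal sketch of the same three conditions as a pure function follows. The name `select_fixes` and the short string labels are hypothetical and stand in for the full Markdown messages.

```python
def select_fixes(wsl_lib_exists, nvidia_smi_runs, docker_configured):
    """Return labels for the fixes that apply, in the order the summary lists them."""
    fixes = []
    if not wsl_lib_exists and nvidia_smi_runs:
        # The driver works but the library is not at the expected WSL path
        fixes.append("missing-library-symlink")
    if not nvidia_smi_runs:
        fixes.append("nvidia-driver")
    if not docker_configured:
        fixes.append("docker-nvidia-runtime")
    return fixes


# Example: the driver works, the library link is missing, Docker is configured
print(select_fixes(wsl_lib_exists=False, nvidia_smi_runs=True, docker_configured=True))
```

Passing the diagnostic results in as arguments also means the cell no longer fails with a `NameError` if an earlier cell was skipped.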