# llcuda v2.1.1 Installation & Setup Guide

This notebook installs llcuda v2.1.1 from GitHub, diagnoses common installation failures, and validates the result.

## Section 1: Verify Package Availability

Query PyPI for published llcuda versions and print the GitHub release details for v2.1.1.

In [None]:
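import subprocess
import sys

print("🔍 Checking PyPI for llcuda versions...\n")

# Query PyPI through the running interpreter's pip.
# Note: `pip index` is an experimental subcommand and needs a recent pip.
result = subprocess.run(
    [sys.executable, "-m", "pip", "index", "versions", "llcuda"],
    capture_output=True,
    text=True
)

print("PyPI Status:")
if result.returncode == 0:
    print(result.stdout)
else:
    # No fallback is attempted here; just report why the lookup failed.
    print("(pip index lookup failed; llcuda may not be on PyPI)")
    print(result.stderr)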

print("\n" + "="*70)
print("✅ GitHub Release Information:")
print("="*70)
print("Repository: https://github.com/llcuda/llcuda/")
print("Latest Release: v2.1.1")
print("Release URL: https://github.com/llcuda/llcuda/releases/tag/v2.1.1")
print("\nNote: v2.1.1 is available on GitHub but may not yet be on PyPI.")
print("We'll install directly from the GitHub repository.")
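
As an alternative to the experimental `pip index` subcommand, the same availability check can be sketched against PyPI's JSON API (`https://pypi.org/pypi/<package>/json`). The helper names below are hypothetical, and the sample payload in the usage note is made up for illustration; it is not real llcuda metadata.

```python
import json
import urllib.request


def release_versions(payload):
    """Return the version strings listed in a PyPI JSON API payload.

    The result is sorted lexicographically, not by version precedence.
    """
    return sorted(payload.get("releases", {}))


def fetch_pypi_payload(package, timeout=10):
    """Download package metadata from PyPI's JSON API (needs network access)."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def is_on_pypi(package, version, payload=None):
    """Return True if `version` appears among the package's PyPI releases."""
    if payload is None:
        payload = fetch_pypi_payload(package)
    return version in release_versions(payload)
```

Calling `is_on_pypi("llcuda", "2.1.1")` performs a live request and raises `urllib.error.HTTPError` if the package does not exist on PyPI. Passing a `payload` such as `{"releases": {"2.0.0": [], "2.1.1": []}}` keeps the check offline.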

## Section 2: Install from Source Repository

Install llcuda directly from GitHub using pip's git support. This is the recommended approach for v2.1.1.

In [None]:
import subprocess
import sys

print("📥 Installing llcuda v2.1.1 from GitHub Repository...\n")
print("This will download from: https://github.com/llcuda/llcuda.git@v2.1.1\n")

# Install from GitHub
result = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", 
     "git+https://github.com/llcuda/llcuda.git@v2.1.1"],
    capture_output=True,
    text=True
)

if result.returncode == 0:
    print("✅ Installation successful!")
else:
    print("❌ Installation encountered issues:")
    print(result.stderr)
    if "Could not find" in result.stderr or "No matching" in result.stderr:
        print("\n💡 Hint: The tag may not be available yet. Trying the default branch...")
        result = subprocess.run(
            [sys.executable, "-m", "pip", "install", "-q",
             "git+https://github.com/llcuda/llcuda.git"],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            print("✅ Installed from the default branch successfully!")
        else:
            # Show the fallback's own error, not only the first one.
            print("⚠️  Fallback installation failed:")
            print(result.stderr)
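
The tag-then-branch fallback above can also be written as a loop over candidate refs, which avoids matching on pip's error text. This is a minimal sketch: `pip_git_spec` and `install_first_available` are hypothetical helper names, and the repository URL is the one used in the cell above.

```python
import subprocess
import sys

REPO_URL = "https://github.com/llcuda/llcuda.git"  # taken from the install cell


def pip_git_spec(repo_url, ref=None):
    """Build a pip VCS requirement such as git+https://host/repo.git@v2.1.1."""
    spec = f"git+{repo_url}"
    return f"{spec}@{ref}" if ref else spec


def install_first_available(refs, run=subprocess.run):
    """Try each ref in order; return a label for the first that installs, else None.

    A ref of None means "no ref", so pip uses the repository's default branch.
    `run` is injectable so the logic can be exercised without touching pip.
    """
    for ref in refs:
        label = ref or "default branch"
        cmd = [sys.executable, "-m", "pip", "install", "-q",
               pip_git_spec(REPO_URL, ref)]
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return label
        print(f"Install of {label} failed:\n{result.stderr}")
    return None
```

For example, `install_first_available(["v2.1.1", None])` would try the tag first and fall back to the default branch only if the tagged install fails for any reason.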

## Section 3: Handle Installation Errors

Run environment diagnostics (Git, network, pip, Python version, GPU) to pinpoint why an installation failed.

In [None]:
import subprocess
import sys

def diagnose_installation():
    """Comprehensive diagnosis of llcuda installation status."""
    
    print("🔧 Running Installation Diagnostics...\n")
    
    # Check 1: Git availability
    print("1. Checking Git availability...")
    try:
        result = subprocess.run(["git", "--version"], capture_output=True, text=True, timeout=5)
        print(f"   ✅ {result.stdout.strip()}")
    except Exception as e:
        print(f"   ❌ Git not available: {e}")
        print("   💡 Install git: sudo apt-get install git")
    
    # Check 2: Network connectivity
    print("\n2. Checking network connectivity to GitHub...")
    try:
        # List the remote's branches; this contacts GitHub directly.
        result = subprocess.run(
            ["git", "ls-remote", "--heads", "https://github.com/llcuda/llcuda.git"],
            capture_output=True,
            text=True,
            timeout=10
        )
        if result.returncode == 0:
            print("   ✅ Network connection OK")
        else:
            print(f"   ❌ Could not reach repository: {result.stderr.strip()}")
    except Exception as e:
        print(f"   ❌ Network issue: {e}")
        print("   💡 Check your internet connection or firewall")
    
    # Check 3: pip version
    print("\n3. Checking pip version...")
    result = subprocess.run([sys.executable, "-m", "pip", "--version"], 
                          capture_output=True, text=True)
    print(f"   ✅ {result.stdout.strip()}")
    
    # Check 4: Python version
    print("\n4. Checking Python version...")
    if sys.version_info >= (3, 11):
        print(f"   ✅ Python {sys.version.split()[0]} (Required: 3.11+)")
    else:
        print(f"   ⚠️  Python {sys.version.split()[0]}: llcuda v2.1.1 requires Python 3.11 or higher")
    
    # Check 5: NVIDIA GPU
    print("\n5. Checking for NVIDIA GPU...")
    try:
        result = subprocess.run(["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
                              capture_output=True, text=True, timeout=5)
        if result.returncode == 0:
            gpus = result.stdout.strip().split('\n')
            for i, gpu in enumerate(gpus):
                print(f"   ✅ GPU {i+1}: {gpu}")
        else:
            # nvidia-smi ran but reported an error (for example, no driver loaded)
            print("   ⚠️  No NVIDIA GPU detected (nvidia-smi returned an error)")
            print("   💡 This is OK - llcuda will use CPU fallback")
    except Exception as e:
        # Raised when nvidia-smi is not installed or the call times out
        print(f"   ⚠️  Could not detect GPU: {e}")
        print("   💡 This is OK - llcuda will use CPU fallback")
    
    print("\n" + "="*70)
    print("Diagnostics Complete")
    print("="*70)

# Run diagnostics
diagnose_installation()
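
The GPU check prints each `name, compute_cap` line as-is. A small sketch of how that line could be parsed and compared against the SM 7.5 minimum quoted in Section 5's troubleshooting notes; `parse_gpu_line` and `meets_minimum` are hypothetical helpers, and the sample strings in the usage note only illustrate the CSV shape.

```python
MIN_COMPUTE_CAP = (7, 5)  # SM 7.5+, the requirement quoted in Section 5


def parse_gpu_line(line):
    """Split one 'name, compute_cap' CSV line into (name, (major, minor))."""
    # Split on the last comma so a GPU name containing commas stays intact.
    name, _, cap = line.rpartition(",")
    major, _, minor = cap.strip().partition(".")
    return name.strip(), (int(major), int(minor or 0))


def meets_minimum(line, minimum=MIN_COMPUTE_CAP):
    """Return True if the GPU's compute capability is at least `minimum`."""
    _, cap = parse_gpu_line(line)
    return cap >= minimum
```

With this, `meets_minimum("Tesla T4, 7.5")` is true and `meets_minimum("Tesla P100-PCIE-16GB, 6.0")` is false. Comparing tuples rather than floats keeps a value like 8.10 from being read as 8.1.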

## Section 4: Validate Installation Success

Verify that llcuda was installed correctly and check version compatibility.

In [None]:
print("✅ Validating llcuda Installation...\n")

try:
    import llcuda
    print("✅ llcuda imported successfully")
    print(f"   Package location: {llcuda.__file__}")
    print(f"   Version: {llcuda.__version__}")
    
    # Verify version: accept "2.1" or "2.1.x", but not e.g. "2.10.0"
    version = llcuda.__version__
    if version == "2.1" or version.startswith("2.1."):
        print(f"\n✅ Version {version} is compatible with v2.1.x series")
    else:
        print(f"\n⚠️  Version {version} - Expected 2.1.x series")
        
except ImportError as e:
    print(f"❌ Failed to import llcuda: {e}")
    print("\n💡 Troubleshooting steps:")
    print("   1. Check installation completed without errors above")
    print("   2. Restart the kernel (Kernel → Restart)")
    print("   3. Try running the installation cell again")
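
Because the first `import llcuda` may trigger a large binary download (see Section 5), it can be useful to read the installed version without importing the package. A minimal sketch using the standard library's `importlib.metadata`; it assumes the distribution is registered under the name `llcuda`, and both helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def in_series(version_string, series="2.1"):
    """Return True for '2.1' or '2.1.x', but not for '2.10.0'."""
    return version_string == series or version_string.startswith(series + ".")


found = installed_version("llcuda")  # assumed distribution name
if found is None:
    print("llcuda is not installed in this environment")
else:
    status = "matches" if in_series(found) else "does not match"
    print(f"llcuda {found} {status} the 2.1.x series")
```

This reads package metadata only, so it does not run any of llcuda's import-time code.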

## Section 5: Configure CUDA Binary Caching

Inspect the cache directories llcuda is expected to use for CUDA binaries, then trigger the first import to verify the auto-download mechanism.

In [None]:
from pathlib import Path

print("📦 CUDA Binary Caching Configuration\n")

# Check cache directories
cache_locations = [
    Path.home() / ".cache" / "llcuda",
    Path.home() / ".llcuda",
]

print("Expected cache locations:")
for cache_dir in cache_locations:
    status = "✅ exists" if cache_dir.exists() else "⏳ will be created on first import"
    size = ""
    if cache_dir.exists():
        total_size = sum(f.stat().st_size for f in cache_dir.rglob('*') if f.is_file())
        size = f" ({total_size / (1024**2):.1f} MB)"
    print(f"   {cache_dir}{size} - {status}")

print("\n📥 First Import Behavior:")
print("   1. llcuda will be imported")
print("   2. Auto-detection of GPU (Tesla T4, RTX, A100, H100, etc.)")
print("   3. Download v2.1.1 CUDA binaries (~267 MB) to cache")
print("   4. Extract and configure paths")
print("   5. Subsequent imports use cached binaries")

print("\n🚀 Starting first import (this may take a minute)...\n")

try:
    import llcuda
    print("✅ First import successful!")
    print("   CUDA binaries are now cached and ready to use")
    print("\n✨ Features in v2.1.1:")
    print("   • Fixed llama-server fallback mechanism")
    print("   • Full compatibility with Tesla T4 and newer GPUs")
    print("   • Automatic model downloading from Hugging Face")
    print("   • High-performance inference with FlashAttention")
    
except Exception as e:
    print(f"⚠️  Import encountered an issue: {e}")
    print("\nCommon solutions:")
    print("   • GPU compatibility: Requires SM 7.5+ (Tesla T4, RTX 20xx+, A100+)")
    print("   • Storage space: Ensure 1GB free space for binaries + models")
    print("   • Network: Check internet connection for binary download")
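
The storage requirement listed above can be checked before the first import rather than after a failure. A minimal sketch using `shutil.disk_usage`; the 1 GB threshold comes from the troubleshooting note, the cache path is one of the locations listed earlier in this section, and the helper names are hypothetical.

```python
import shutil
from pathlib import Path

REQUIRED_BYTES = 1024**3  # 1 GB, the figure quoted in the troubleshooting note


def free_bytes(path):
    """Free space on the filesystem holding `path`.

    If `path` does not exist yet, walk up to the nearest existing parent.
    """
    path = Path(path)
    while not path.exists():
        path = path.parent
    return shutil.disk_usage(path).free


def has_room(path, required=REQUIRED_BYTES):
    """Return True if at least `required` bytes are free for `path`."""
    return free_bytes(path) >= required


cache_dir = Path.home() / ".cache" / "llcuda"
print(f"Free space for {cache_dir}: {free_bytes(cache_dir) / 1024**2:.0f} MB")
print("Enough room for binaries" if has_room(cache_dir) else "Less than 1 GB free")
```

Walking up to an existing parent matters here because the cache directory is only created on first import.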