# GameForge SDXL on RTX 4090 - Interactive Deployment

This notebook walks you through deploying and testing your GameForge SDXL service on your Vast.ai RTX 4090 instance.

**Instance Details:**
- Instance ID: 25599851
- GPU: RTX 4090 (24GB VRAM)
- Host: 3483
- Cost: ~$0.20-0.40/hour vs $1.00/hour on AWS

## Step 1: Environment Check

Let's verify our RTX 4090 and system environment:

In [None]:
# Check system information
import os
import subprocess
import sys

print("=== System Information ===")
print(f"Python Version: {sys.version}")
print(f"Current Directory: {os.getcwd()}")
print(f"User: {os.getenv('USER', 'unknown')}")
print("")

# Check GPU
try:
    result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)
    if result.returncode == 0:
        print("=== NVIDIA GPU Information ===")
        print(result.stdout)
    else:
        print("❌ nvidia-smi failed")
except Exception as e:
    print(f"❌ Error running nvidia-smi: {e}")
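
If you would rather have the GPU details as values you can check in code, `nvidia-smi` can emit CSV through its `--query-gpu` option. The following is a minimal sketch; the helper names `parse_gpu_csv` and `query_gpus` are illustrative and not part of GameForge.

```python
import subprocess


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits` output."""
    gpus = []
    for line in text.strip().splitlines():
        # Split on the last comma so the GPU name is kept whole
        name, memory_mib = line.rsplit(",", 1)
        gpus.append({"name": name.strip(), "memory_mib": int(memory_mib.strip())})
    return gpus


def query_gpus():
    """Return a list of GPU dicts, or an empty list if nvidia-smi is missing or fails."""
    cmd = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return parse_gpu_csv(result.stdout)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return []


gpus = query_gpus()
if not gpus:
    print("No GPU information available")
for gpu in gpus:
    print(f"{gpu['name']}: {gpu['memory_mib']} MiB")
```

An empty list here means the same thing as the `nvidia-smi failed` branch above: the driver tools are not reachable from this kernel.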

## Step 2: PyTorch GPU Test

Test PyTorch CUDA support on your RTX 4090:

In [None]:
# Test PyTorch GPU support
try:
    import torch
    print("=== PyTorch GPU Test ===")
    print(f"PyTorch Version: {torch.__version__}")
    print(f"CUDA Available: {torch.cuda.is_available()}")
    
    if torch.cuda.is_available():
        print(f"CUDA Version: {torch.version.cuda}")
        print(f"GPU Count: {torch.cuda.device_count()}")
        print(f"Current GPU: {torch.cuda.current_device()}")
        print(f"GPU Name: {torch.cuda.get_device_name(0)}")
        
        # Memory info
        total_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
        print(f"Total VRAM: {total_memory:.1f} GB")
        
        # Quick tensor test
        test_tensor = torch.randn(1000, 1000).cuda()
        result = torch.mm(test_tensor, test_tensor)
        print(f"✅ GPU Tensor Test: {result.shape} - SUCCESS")
        
        del test_tensor, result
        torch.cuda.empty_cache()
    else:
        print("❌ CUDA not available - GPU support missing")
        
except ImportError:
    print("❌ PyTorch not installed")
except Exception as e:
    print(f"❌ Error testing GPU: {e}")
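
The cell above divides `total_memory` by `1e9`, which gives decimal gigabytes. GPU memory sizes such as "24GB" are normally counted in binary units (GiB), so the printed number can look larger than 24. This short sketch shows only the unit arithmetic; the function names are illustrative.

```python
def bytes_to_gb(num_bytes):
    """Decimal gigabytes (1 GB = 10**9 bytes), as used by the cell above."""
    return num_bytes / 1e9


def bytes_to_gib(num_bytes):
    """Binary gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


nominal = 24 * 2**30  # a nominal 24 GiB, expressed in bytes
print(f"{bytes_to_gb(nominal):.1f} GB")    # 25.8 GB
print(f"{bytes_to_gib(nominal):.1f} GiB")  # 24.0 GiB
```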

## Step 3: Install Dependencies

Install all required packages for GameForge SDXL:

In [None]:
# Install PyTorch with CUDA support
print("Installing PyTorch with CUDA 12.1 support...")
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

In [None]:
# Install GameForge dependencies
print("Installing GameForge SDXL dependencies...")

packages = [
    "fastapi==0.104.1",
    "uvicorn[standard]==0.24.0", 
    "diffusers==0.24.0",
    "transformers==4.36.0",
    "accelerate==0.24.1",
    "xformers==0.0.23",
    "pillow",
    "redis",
    "pydantic==2.5.0",
    "pydantic-settings==2.1.0",
    "python-multipart",
    "numpy",
    "requests"
]

for package in packages:
    print(f"Installing {package}...")
    # Quote the requirement so the shell does not treat "[standard]" as a glob pattern
    !pip install "{package}"
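
After the installs finish, it can help to confirm that the versions pip actually resolved match the pins, since a later install can replace an earlier one. Below is a minimal sketch using the standard library's `importlib.metadata`; `parse_pin` and `check_pins` are hypothetical helpers, and the short `pins` list is only a sample of the list above.

```python
from importlib import metadata


def parse_pin(requirement):
    """Split 'name[extra]==1.2.3' into ('name', '1.2.3'); version is None if unpinned."""
    name, _, version = requirement.partition("==")
    name = name.split("[", 1)[0].strip()
    return name, (version.strip() or None)


def check_pins(requirements):
    """Return {name: (wanted, installed)} for packages that are missing or mismatched."""
    problems = {}
    for requirement in requirements:
        name, wanted = parse_pin(requirement)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or (wanted is not None and installed != wanted):
            problems[name] = (wanted, installed)
    return problems


pins = ["fastapi==0.104.1", "uvicorn[standard]==0.24.0", "diffusers==0.24.0"]
for name, (wanted, installed) in check_pins(pins).items():
    print(f"{name}: wanted {wanted}, found {installed}")
```

In the notebook you could pass the full `packages` list instead of `pins`; no output means every pinned package is installed at the requested version.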

## Step 4: Verify Installation

Check that the key packages import cleanly in this environment:

In [None]:
# Test key imports
print("=== Testing Key Imports ===")

try:
    import torch
    import torchvision
    import diffusers
    import transformers
    import fastapi
    import PIL
    
    print(f"✅ torch: {torch.__version__}")
    print(f"✅ diffusers: {diffusers.__version__}")
    print(f"✅ transformers: {transformers.__version__}")
    print(f"✅ fastapi: {fastapi.__version__}")
    print(f"✅ pillow: {PIL.__version__}")
    
    # Confirm the SDXL pipeline class can be imported (this does not touch the GPU)
    from diffusers import StableDiffusionXLPipeline
    print("✅ StableDiffusionXLPipeline import successful")
    
    print("\n🎉 All key packages installed successfully!")
    
except Exception as e:
    print(f"❌ Import error: {e}")

## Step 5: Navigate to GameForge Directory

Set up the working directory and check our files:

In [None]:
# Navigate to GameForge directory
import os

# Check current directory
print(f"Current directory: {os.getcwd()}")

# Look for GameForge directory
if os.path.exists('GameForge/services/asset-gen'):
    os.chdir('GameForge/services/asset-gen')
    print(f"✅ Changed to: {os.getcwd()}")
elif os.path.exists('services/asset-gen'):
    os.chdir('services/asset-gen')
    print(f"✅ Changed to: {os.getcwd()}")
else:
    print("❌ GameForge directory not found")
    print("Available directories:")
    for item in os.listdir('.'):
        if os.path.isdir(item):
            print(f"  📁 {item}")

# List files in current directory
print("\n=== Files in current directory ===")
for item in sorted(os.listdir('.')):
    if os.path.isfile(item):
        size = os.path.getsize(item)
        print(f"📄 {item} ({size} bytes)")
    else:
        print(f"📁 {item}/")
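
The cell above only looks for the service directory relative to the current working directory, so it fails if the notebook was opened from a subfolder. As a sketch, the lookup can also walk up through parent directories. `find_service_dir` is a hypothetical helper, and searching parents is an addition to what the cell does.

```python
from pathlib import Path


def find_service_dir(start, candidates=("GameForge/services/asset-gen", "services/asset-gen")):
    """Return the first candidate found under `start` or one of its parents, else None."""
    start = Path(start).resolve()
    for base in (start, *start.parents):
        for candidate in candidates:
            path = base / candidate
            if path.is_dir():
                return path
    return None


service_dir = find_service_dir(Path.cwd())
print(service_dir if service_dir else "GameForge directory not found")
```

If a path is returned, `os.chdir(service_dir)` has the same effect as the `os.chdir` calls above.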

## Step 6: Set Up Storage Directories

Create the necessary directories for GameForge SDXL:

In [None]:
# Create storage directories
import os

directories = [
    'outputs/assets',
    'outputs/thumbnails', 
    'outputs/references',
    'outputs/temp',
    'models/lora',
    'models/checkpoints'
]

print("Creating storage directories...")
for directory in directories:
    os.makedirs(directory, exist_ok=True)
    print(f"✅ Created: {directory}")

print("\n=== Directory Structure ===")
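for root, dirs, files in os.walk('.'):
    # Depth relative to the starting directory ('.' is level 0)
    rel_path = os.path.relpath(root, '.')
    level = 0 if rel_path == '.' else rel_path.count(os.sep) + 1
    indent = ' ' * 2 * level
    print(f"{indent}📁 {os.path.basename(root)}/")
    if level >= 2:
        dirs[:] = []  # Prune in place so os.walk stops descending further
    if level < 2:  # Only list files near the top of the tree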
        subindent = ' ' * 2 * (level + 1)
        for file in files[:5]:  # Limit files shown
            print(f"{subindent}📄 {file}")
        if len(files) > 5:
            print(f"{subindent}... and {len(files)-5} more files")
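
Model weights take several gigabytes, so it is worth checking free disk space before the download in the next step. This is a minimal sketch using `shutil.disk_usage`; the 10 GB threshold is a hypothetical figure, not a GameForge requirement.

```python
import shutil


def free_space_gb(path="."):
    """Free space on the filesystem containing `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for(required_gb, path="."):
    """True if at least `required_gb` gigabytes are free at `path`."""
    return free_space_gb(path) >= required_gb


required_gb = 10  # hypothetical headroom; adjust to the models you plan to store
print(f"Free: {free_space_gb():.1f} GB, enough for {required_gb} GB: {has_room_for(required_gb)}")
```

Point `path` at wherever the weights will actually land. When `from_pretrained` is called without `cache_dir`, it uses the Hugging Face cache directory, not `models/checkpoints`.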

## Step 7: Test SDXL Model Loading

Test loading a fast SDXL model on your RTX 4090:

In [None]:
# Test SDXL model loading
import torch
from diffusers import StableDiffusionXLPipeline
import time

print("=== Testing SDXL Model Loading on RTX 4090 ===")
print("This will download a fast SDXL model (~2-3GB)...")

try:
    start_time = time.time()
    
    # Load a fast SDXL variant
    model_id = "segmind/SSD-1B"  # Fast SDXL alternative
    print(f"Loading model: {model_id}")
    
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16"
    )
    
    print("Moving pipeline to GPU...")
    pipe = pipe.to("cuda")
    
    # Enable memory efficient attention
    try:
        pipe.enable_xformers_memory_efficient_attention()
        print("✅ XFormers attention enabled")
    except Exception:  # Avoid a bare except, which would also swallow KeyboardInterrupt
        print("⚠️ XFormers not available, using default attention")
    
    load_time = time.time() - start_time
    print(f"✅ Model loaded successfully in {load_time:.1f} seconds")
    
    # Check memory usage (max_memory_allocated reports the peak allocated by PyTorch)
    if torch.cuda.is_available():
        memory_used = torch.cuda.max_memory_allocated() / 1e9
        memory_total = torch.cuda.get_device_properties(0).total_memory / 1e9
        print(f"Peak GPU Memory: {memory_used:.1f}GB / {memory_total:.1f}GB ({memory_used/memory_total*100:.1f}%)")
    
    print("\n🎉 RTX 4090 SDXL setup successful!")
    
except Exception as e:
    print(f"❌ Error loading SDXL model: {e}")
    import traceback
    traceback.print_exc()
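
Loading with `torch_dtype=torch.float16` stores each weight in 2 bytes instead of 4, which is why half precision roughly halves the memory needed for the weights. The arithmetic can be sketched as below. The parameter count is an illustrative figure, not a measurement of the model loaded above, and activations and the CUDA context add to the total that the cell reports.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Illustrative parameter count, not a measured value for any specific model
example_params = 1_300_000_000
print(f"fp16: {weight_memory_gb(example_params):.1f} GB")     # 2.6 GB
print(f"fp32: {weight_memory_gb(example_params, 4):.1f} GB")  # 5.2 GB
```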

## Step 8: Generate Test Image

Let's generate a test image to verify everything works:

In [None]:
# Generate test image
import os
import time

import torch  # Imported here so the cell does not rely on an earlier cell's import
from IPython.display import display

if 'pipe' in locals():
    print("=== Generating Test Image on RTX 4090 ===")
    
    prompt = "fantasy knight character, pixel art style, detailed armor, medieval setting"
    print(f"Prompt: {prompt}")
    
    try:
        start_time = time.time()
        
        # Generate image
        image = pipe(
            prompt=prompt,
            width=512,
            height=512,
            num_inference_steps=20,
            guidance_scale=7.5,
            generator=torch.Generator(device="cuda").manual_seed(42)
        ).images[0]
        
        generation_time = time.time() - start_time
        print(f"✅ Image generated in {generation_time:.1f} seconds")
        
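        # Save image (create the folder first in case Step 6 was skipped)
        output_path = "outputs/assets/test_knight.png"
        os.makedirs(os.path.dirname(output_path), exist_ok=True)
        image.save(output_path)
        print(f"💾 Saved to: {output_path}")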
        
        # Display image
        display(image)
        
        # Memory cleanup
        torch.cuda.empty_cache()
        
        print("\n🎮 GameForge SDXL is working perfectly on your RTX 4090!")
        print(f"⚡ Generation speed: {generation_time:.1f}s for 512x512 image")
        
    except Exception as e:
        print(f"❌ Error generating image: {e}")
        import traceback
        traceback.print_exc()
else:
    print("❌ Model not loaded. Run the previous cell first.")
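
A single timed call like the one above includes one-time costs such as CUDA kernel setup, so the first generation is usually slower than later ones. A more representative number comes from discarding a warm-up run and averaging a few more. The sketch below uses a stand-in workload so that it runs anywhere; `benchmark` is a hypothetical helper.

```python
import statistics
import time


def benchmark(fn, warmup=1, runs=3):
    """Call `fn` `warmup` times untimed, then return per-run durations in seconds."""
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in the notebook you could pass something like
# `lambda: pipe(prompt=prompt, num_inference_steps=20)` instead.
durations = benchmark(lambda: sum(range(100_000)))
print(f"mean {statistics.mean(durations):.4f}s over {len(durations)} runs")
```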

## Step 9: Start GameForge FastAPI Service

Now let's start the full GameForge service:

In [None]:
# Start Redis for job queue
import subprocess
import time

print("=== Starting Redis ===")
try:
    # Install Redis if not available
    result = subprocess.run(['which', 'redis-server'], capture_output=True)
    if result.returncode != 0:
        print("Installing Redis...")
        !apt-get update && apt-get install -y redis-server
    
    # Start Redis
    subprocess.Popen(['redis-server', '--daemonize', 'yes'])
    time.sleep(2)
    
    # Test Redis
    result = subprocess.run(['redis-cli', 'ping'], capture_output=True, text=True)
    if result.stdout.strip() == 'PONG':
        print("✅ Redis is running")
    else:
        print("❌ Redis failed to start")
        
except Exception as e:
    print(f"⚠️ Redis setup issue: {e}")
    print("GameForge can run without Redis for basic testing")

In [None]:
# Check if main.py exists and start the service
import os
import subprocess
import sys
import time

if os.path.exists('main.py'):
    print("=== Starting GameForge SDXL Service ===")
    print("This will start the FastAPI server on port 8000")
    print("Run service_process.terminate() in a new cell to stop the server when you're done testing")
    
    # Start the service as a child process, using the kernel's own interpreter,
    # and keep the handle so the server can be stopped later
    service_process = subprocess.Popen([sys.executable, 'main.py'])
    
    print("⏳ Starting service...")
    time.sleep(10)
    
    # Test the service
    if service_process.poll() is not None:
        print(f"❌ main.py exited early with code {service_process.returncode}")
    else:
        try:
            import requests
            response = requests.get('http://localhost:8000/health', timeout=5)
            if response.status_code == 200:
                print("✅ GameForge SDXL service is running!")
                print("🌐 Health endpoint: http://localhost:8000/health")
                print("📚 API docs: http://localhost:8000/docs")
                print(response.json())
            else:
                print(f"⚠️ Service responding with status {response.status_code}")
        except Exception as e:
            print(f"⚠️ Could not connect to service: {e}")
            print("The service might still be starting up...")
        
else:
    print("❌ main.py not found in current directory")
    print("Available Python files:")
    for file in os.listdir('.'):
        if file.endswith('.py'):
            print(f"  📄 {file}")
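
If you rerun the start cell while an earlier server still holds port 8000, the new process cannot bind, yet the health check can still succeed against the old one. A quick TCP probe before starting makes that situation visible. This is a minimal sketch; `is_port_open` is a hypothetical helper.

```python
import socket


def is_port_open(host="localhost", port=8000, timeout=1.0):
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if is_port_open():
    print("Port 8000 is already in use; a previous server may still be running")
else:
    print("Port 8000 is free")
```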

## Step 10: Test API Endpoints

Test the GameForge API endpoints:

In [None]:
# Test GameForge API endpoints
import requests
import json
import time

BASE_URL = "http://localhost:8000"

print("=== Testing GameForge API Endpoints ===")

# Test health endpoint
try:
    response = requests.get(f"{BASE_URL}/health", timeout=5)  # Fail fast instead of hanging
    print(f"Health Check: {response.status_code}")
    if response.status_code == 200:
        health_data = response.json()
        print(json.dumps(health_data, indent=2))
except Exception as e:
    print(f"❌ Health check failed: {e}")

print("\n" + "="*50)

# Test generation endpoint
try:
    generation_request = {
        "prompt": "fantasy knight character, pixel art style, detailed armor",
        "asset_type": "character_design",
        "style": "pixel_art",
        "width": 512,
        "height": 512,
        "steps": 20,
        "guidance_scale": 7.5
    }
    
    print("Submitting generation request...")
    response = requests.post(
        f"{BASE_URL}/generate",
        json=generation_request,
        headers={"Content-Type": "application/json"},
        timeout=60  # Avoid blocking the notebook forever if the server stalls
    )
    
    print(f"Generation Request: {response.status_code}")
    if response.status_code in [200, 202]:
        result = response.json()
        print(json.dumps(result, indent=2))
        
        # If we got a job_id, try to check its status
        if 'job_id' in result:
            job_id = result['job_id']
            print(f"\nMonitoring job {job_id}...")
            
            for i in range(30):  # Poll up to 30 times, 2 seconds apart (about 60 seconds)
                try:
                    job_response = requests.get(f"{BASE_URL}/job/{job_id}", timeout=10)
                    if job_response.status_code == 200:
                        job_data = job_response.json()
                        status = job_data.get('status', 'unknown')
                        print(f"Job status: {status}")
                        
                        if status == 'completed':
                            print("🎉 Generation completed!")
                            print(json.dumps(job_data, indent=2))
                            break
                        elif status == 'failed':
                            print("❌ Generation failed")
                            print(json.dumps(job_data, indent=2))
                            break
                    time.sleep(2)
                except Exception as e:
                    print(f"Error checking job: {e}")
                    break
    else:
        print(f"Error: {response.text}")
        
except Exception as e:
    print(f"❌ Generation test failed: {e}")

print("\n🎮 GameForge SDXL testing completed!")
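
The job-monitoring loop above can be factored into a small function that is easy to reuse and to test without a running server. The sketch below takes the status lookup as a callable. `poll_job` is a hypothetical helper, the `completed` and `failed` values come from the cell above, and the `queued` and `processing` values are made-up placeholders.

```python
import time


def poll_job(fetch_status, attempts=30, delay=2.0, sleep=time.sleep):
    """Call `fetch_status()` until it returns 'completed' or 'failed', or attempts run out.

    Returns the final status string, or 'timeout' if no terminal status was seen.
    """
    for attempt in range(attempts):
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if attempt < attempts - 1:
            sleep(delay)
    return "timeout"


# Stand-in status source so the sketch runs without a server
fake_statuses = iter(["queued", "processing", "completed"])
print(poll_job(lambda: next(fake_statuses), delay=0))  # completed
```

Against the real service, `fetch_status` could wrap the `requests.get(f"{BASE_URL}/job/{job_id}")` call and return `job_data.get('status', 'unknown')`.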

## Summary

🎉 **Congratulations!** Your GameForge SDXL service is now running on your RTX 4090!

### What's Working:
- ✅ RTX 4090 GPU detection and CUDA support
- ✅ PyTorch with GPU acceleration
- ✅ SDXL model loading and inference
- ✅ FastAPI service with health checks
- ✅ Image generation API endpoints
- ✅ File storage system

### Your Service URLs:
- **Health Check**: http://localhost:8000/health
- **Interactive API Docs (Swagger UI)**: http://localhost:8000/docs
- **API Reference (ReDoc)**: http://localhost:8000/redoc

### Performance:
- **GPU**: RTX 4090 (24GB VRAM)
- **Model**: Segmind SSD-1B (Fast SDXL)
- **Generation Speed**: ~10-20 seconds per 512x512 image
- **Cost**: ~$0.20-0.40/hour vs $1.00/hour on AWS
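
The cost line above can be turned into a per-hour saving with simple arithmetic. The sketch below uses only the rates quoted in this notebook; `hourly_savings` is an illustrative helper.

```python
def hourly_savings(instance_rate, baseline_rate=1.00):
    """Return (dollars saved per hour, percent saved) versus the baseline rate."""
    saved = baseline_rate - instance_rate
    return round(saved, 2), round(100 * saved / baseline_rate, 1)


for rate in (0.20, 0.40):
    saved, percent = hourly_savings(rate)
    print(f"${rate:.2f}/hour saves ${saved:.2f}/hour ({percent:.0f}%)")
```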

### Next Steps:
1. **Test different prompts** and styles
2. **Try larger images** (768x768, 1024x1024)
3. **Experiment with different models**
4. **Connect your frontend** to this API
5. **Set up LoRA training** for custom styles

Your GameForge AI development environment is ready! 🚀