# 🚀 Context-Aware Code Documentation Generator - Colab Setup

This notebook sets up the Context-Aware Code Documentation Generator in Google Colab.

**Before running**: Upload the entire project folder to Colab, or clone it from GitHub, so that the `src` package is available for Step 4.

## Step 1: Install Dependencies

In [None]:
# Install core dependencies first
!pip install -q fastapi "uvicorn[standard]" streamlit pydantic
print("✅ Core web framework dependencies installed")

In [None]:
# Install tree-sitter and language parsers
!pip install -q tree-sitter
!pip install -q tree-sitter-python tree-sitter-javascript tree-sitter-java tree-sitter-go tree-sitter-cpp
print("✅ Tree-sitter and language parsers installed")

In [None]:
# Install RAG and ML dependencies
!pip install -q sentence-transformers faiss-cpu gitpython
print("✅ RAG and Git dependencies installed")

In [None]:
# Install LLM and fine-tuning dependencies
!pip install -q transformers torch peft bitsandbytes accelerate
print("✅ LLM and fine-tuning dependencies installed")

In [None]:
# Install utility dependencies
!pip install -q python-multipart aiofiles python-dotenv loguru tqdm
print("✅ Utility dependencies installed")

## Step 2: Environment Setup

In [None]:
import os
import sys
import torch
import warnings
warnings.filterwarnings('ignore')

# Set environment variables
os.environ['TOKENIZERS_PARALLELISM'] = 'false'
os.environ['HF_HOME'] = './models'

# Create directories
directories = ['models', 'temp', 'output', 'logs', 'indexes']
for directory in directories:
    os.makedirs(directory, exist_ok=True)

# Check GPU
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"🖥️  GPU: {gpu_name} ({gpu_memory:.1f} GB)")
else:
    print("🖥️  Using CPU (GPU not available)")

print("✅ Environment setup complete")

## Step 3: Test Imports

In [None]:
# Test critical imports
try:
    import tree_sitter
    print("✅ tree-sitter imported")
except ImportError as e:
    print(f"❌ tree-sitter import failed: {e}")

try:
    from sentence_transformers import SentenceTransformer
    print("✅ sentence-transformers imported")
except ImportError as e:
    print(f"❌ sentence-transformers import failed: {e}")

try:
    from transformers import AutoTokenizer
    print("✅ transformers imported")
except ImportError as e:
    print(f"❌ transformers import failed: {e}")

try:
    import faiss
    print("✅ faiss imported")
except ImportError as e:
    print(f"❌ faiss import failed: {e}")
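
The per-package checks above can also be written as a single loop that reports installed versions. This is a minimal sketch using only the standard library; the `REQUIRED` mapping and the `check_imports` helper are hypothetical names introduced here, and the distribution names are assumed to match the `pip install` commands in Step 1.

```python
import importlib
from importlib import metadata

# Import name -> PyPI distribution name (assumed to match Step 1).
REQUIRED = {
    "tree_sitter": "tree-sitter",
    "sentence_transformers": "sentence-transformers",
    "transformers": "transformers",
    "faiss": "faiss-cpu",
}


def check_imports(required):
    """Return {module_name: (ok, detail)} for each required module."""
    results = {}
    for module_name, dist_name in required.items():
        try:
            importlib.import_module(module_name)
        except Exception as exc:  # ImportError, or errors raised by native extensions
            results[module_name] = (False, f"{type(exc).__name__}: {exc}")
            continue
        try:
            version = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            version = "unknown"
        results[module_name] = (True, version)
    return results


for name, (ok, detail) in check_imports(REQUIRED).items():
    print(f"{'✅' if ok else '❌'} {name}: {detail}")
```

Printing the version alongside each result makes it easier to spot a mismatch between what Colab preinstalls and what Step 1 installed.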

## Step 4: Import Project Modules

In [None]:
# Add project to Python path
import sys
from pathlib import Path

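# Adjust path if needed - assumes the current directory is the project root
project_root = Path.cwd()
if project_root.name != 'context-aware-doc-generator':
    # Look for the project directory in the usual Colab locations
    for p in [Path.cwd() / 'context-aware-doc-generator', Path('/content/context-aware-doc-generator')]:
        if p.exists():
            project_root = p
            break
    else:
        # No candidate matched; imports below only work if src/ is in the current directory
        print("⚠️  Project directory not found; falling back to the current directory")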

sys.path.insert(0, str(project_root))
print(f"📁 Project root: {project_root}")

# Test project imports
try:
    from src.parser import create_parser
    print("✅ Parser module imported")
except ImportError as e:
    print(f"❌ Parser import failed: {e}")

try:
    from src.rag import create_rag_system
    print("✅ RAG module imported")
except ImportError as e:
    print(f"❌ RAG import failed: {e}")

try:
    from src.llm import create_documentation_generator
    print("✅ LLM module imported")
except Exception as e:
    # Catch more than ImportError: heavy dependencies can raise other errors at import time
    print(f"❌ LLM import failed: {e}")
    print("Note: This might fail due to GPU or memory constraints in Colab")
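
The path lookup at the top of this cell checks a fixed directory name. If the notebook runs from a subdirectory, or the folder was uploaded under a different name, an upward search for a marker directory may be more forgiving. The sketch below is one way to do that; `find_project_root` is a hypothetical helper, and treating a `src` directory as the marker is an assumption based on the `from src...` imports above.

```python
from pathlib import Path


def find_project_root(start, marker="src"):
    """Return the first directory at or above `start` that contains `marker`, else None."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return None


root = find_project_root(Path.cwd())
print(f"📁 Project root: {root}" if root else "⚠️  No directory containing 'src' found")
```

If a root is found, `sys.path.insert(0, str(root))` makes the `src` package importable in the same way as the cell above.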

## Step 5: Quick Test

In [None]:
# Quick functionality test
print("🧪 Running quick functionality test...")

# Test parser
try:
    parser = create_parser()
    
    # Test language detection
    test_file = "test.py"
    language = parser.detect_language(test_file)
    print(f"✅ Language detection works: {test_file} -> {language}")
    
except Exception as e:
    print(f"❌ Parser test failed: {e}")

# Test RAG system
try:
    print("⏳ Initializing RAG system (this may take a moment to download models)...")
    rag_system = create_rag_system()
    print("✅ RAG system initialized")
except Exception as e:
    print(f"❌ RAG system test failed: {e}")

print("\n🎉 Setup validation complete!")

## Step 6: Ready to Use!

In [None]:
print("🚀 Context-Aware Code Documentation Generator is ready!")
print("\n📖 What you can do now:")
print("\n1. 📝 Open examples.ipynb for usage demonstrations")
print("2. 🎓 Open training.ipynb for model fine-tuning")
print("3. 🌐 Start web interface:")
print("   - API: !uvicorn src.api:app --host 0.0.0.0 --port 8000")
print("   - Frontend: !streamlit run src/frontend.py --server.port 8501")
print("4. 💻 Use CLI: !python main.py --help")
print("\n5. 🧪 Quick example:")
print("   parser = create_parser()")
print("   # Create a Python file and parse it")
print("\n🎯 The system is designed to run on Colab, including the free tier.")

## Optional: Start Web Interface

Uncomment and run these cells to start the web interface:

In [None]:
# # Start FastAPI backend in background
# import subprocess
# import time
# 
# # Start API server
# api_process = subprocess.Popen([
#     'uvicorn', 'src.api:app', 
#     '--host', '0.0.0.0', 
#     '--port', '8000'
# ])
# 
# print("🚀 API server starting...")
# time.sleep(5)
# print("✅ API server should be running on port 8000")
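
The fixed `time.sleep(5)` above only guesses that the server is up. A possible alternative is to poll the port until it accepts a TCP connection. This is a minimal standard-library sketch; `wait_for_port` is a hypothetical helper, and the default timeout and interval are arbitrary.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not listening yet.
            time.sleep(interval)
    return False


# Usage after starting the server with subprocess.Popen as in the cell above:
# if wait_for_port("127.0.0.1", 8000):
#     print("✅ API server is accepting connections on port 8000")
# else:
#     print("❌ API server did not come up; check api_process.poll() and the logs")
```

A successful connection only shows that something is listening on the port; it does not confirm that the application finished loading its models.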

In [None]:
# # Start Streamlit frontend
# !streamlit run src/frontend.py --server.port 8501 --server.address 0.0.0.0