{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 🚀 Quick Demo - Context-Aware Code Documentation Generator\n",
    "\n",
    "This notebook demonstrates the key features of the documentation generator now that it's set up in Colab.\n",
    "\n",
    "**Prerequisites**: Run the setup first using `setup_colab.py` or `colab_setup.ipynb`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import Core Modules\n",
    "\n",
    "Let's import and test the core components:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import the core modules\n",
    "import sys\n",
    "sys.path.append('/content/context-aware-doc-generator')\n",
    "\n",
    "from src.parser import create_parser\n",
    "from src.rag import create_rag_system\n",
    "from src.git_handler import create_git_handler\n",
    "\n",
    "print(\"✅ All core modules imported successfully!\")\n",
    "print(\"🎯 Ready to test the documentation generator\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Test Multi-Language Code Parsing\n",
    "\n",
    "Let's test the parser with different programming languages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tempfile\n",
    "import os\n",
    "\n",
    "# Create sample code files\n",
    "code_samples = {\n",
    "    'python_example.py': '''\n",
    "def fibonacci(n):\n",
    "    \"\"\"Calculate fibonacci number.\"\"\"\n",
    "    if n <= 1:\n",
    "        return n\n",
    "    return fibonacci(n-1) + fibonacci(n-2)\n",
    "\n",
    "class Calculator:\n",
    "    def __init__(self):\n",
    "        self.history = []\n",
    "    \n",
    "    def add(self, a, b):\n",
    "        result = a + b\n",
    "        self.history.append(f\"{a} + {b} = {result}\")\n",
    "        return result\n",
    "''',\n",
    "    \n",
    "    'javascript_example.js': '''\n",
    "function validateEmail(email) {\n",
    "    const emailRegex = /^[^\\\\s@]+@[^\\\\s@]+\\\\.[^\\\\s@]+$/;\n",
    "    return emailRegex.test(email);\n",
    "}\n",
    "\n",
    "class UserManager {\n",
    "    constructor() {\n",
    "        this.users = [];\n",
    "    }\n",
    "    \n",
    "    addUser(name, email) {\n",
    "        if (this.validateEmail(email)) {\n",
    "            this.users.push({name, email});\n",
    "            return true;\n",
    "        }\n",
    "        return false;\n",
    "    }\n",
    "}\n",
    "''',\n",
    "    \n",
    "    'java_example.java': '''\n",
    "public class BinarySearch {\n",
    "    public static int search(int[] arr, int target) {\n",
    "        int left = 0, right = arr.length - 1;\n",
    "        \n",
    "        while (left <= right) {\n",
    "            int mid = left + (right - left) / 2;\n",
    "            \n",
    "            if (arr[mid] == target) {\n",
    "                return mid;\n",
    "            } else if (arr[mid] < target) {\n",
    "                left = mid + 1;\n",
    "            } else {\n",
    "                right = mid - 1;\n",
    "            }\n",
    "        }\n",
    "        return -1;\n",
    "    }\n",
    "}\n",
    "'''\n",
    "}\n",
    "\n",
    "print(\"🔍 Testing Multi-Language Code Parsing...\")\n",
    "print(\"=\" * 50)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize parser\n",
    "parser = create_parser()\n",
    "print(\"✅ Parser initialized\")\n",
    "\n",
    "# Test parsing\n",
    "results = {}\n",
    "for filename, code in code_samples.items():\n",
    "    # Create temporary file\n",
    "    with tempfile.NamedTemporaryFile(mode='w', suffix=f'.{filename.split(\".\")[-1]}', delete=False) as f:\n",
    "        f.write(code)\n",
    "        temp_path = f.name\n",
    "    \n",
    "    try:\n",
    "        # Parse the file\n",
    "        parsed = parser.parse_file(temp_path)\n",
    "        if parsed:\n",
    "            results[filename] = {\n",
    "                'language': parsed['language'],\n",
    "                'functions': len(parsed['functions']),\n",
    "                'classes': len(parsed['classes']),\n",
    "                'function_names': [f['name'] for f in parsed['functions']],\n",
    "                'class_names': [c['name'] for c in parsed['classes']]\n",
    "            }\n",
    "        \n",
    "        # Cleanup\n",
    "        os.unlink(temp_path)\n",
    "        \n",
    "    except Exception as e:\n",
    "        print(f\"Error parsing {filename}: {e}\")\n",
    "\n",
    "# Display results\n",
    "print(\"\\n📊 Parsing Results:\")\n",
    "for filename, result in results.items():\n",
    "    print(f\"\\n📁 {filename}\")\n",
    "    print(f\"   Language: {result['language']}\")\n",
    "    print(f\"   Functions: {result['functions']} {result['function_names']}\")\n",
    "    print(f\"   Classes: {result['classes']} {result['class_names']}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Test RAG System\n",
    "\n",
    "Let's test the RAG system for context-aware understanding:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a mock codebase for testing\n",
    "mock_codebase = {\n",
    "    'files': {\n",
    "        'math_utils.py': {\n",
    "            'language': 'python',\n",
    "            'file_path': 'math_utils.py',\n",
    "            'functions': [\n",
    "                {\n",
    "                    'name': 'add',\n",
    "                    'text': 'def add(a, b):\\n    \"\"\"Add two numbers.\"\"\"\\n    return a + b',\n",
    "                    'start_line': 1,\n",
    "                    'end_line': 3\n",
    "                },\n",
    "                {\n",
    "                    'name': 'multiply',\n",
    "                    'text': 'def multiply(a, b):\\n    \"\"\"Multiply two numbers.\"\"\"\\n    return a * b',\n",
    "                    'start_line': 5,\n",
    "                    'end_line': 7\n",
    "                },\n",
    "                {\n",
    "                    'name': 'factorial',\n",
    "                    'text': 'def factorial(n):\\n    \"\"\"Calculate factorial.\"\"\"\\n    if n <= 1:\\n        return 1\\n    return n * factorial(n-1)',\n",
    "                    'start_line': 9,\n",
    "                    'end_line': 13\n",
    "                }\n",
    "            ],\n",
    "            'classes': [],\n",
    "            'imports': ['import math'],\n",
    "            'comments': []\n",
    "        },\n",
    "        'string_utils.py': {\n",
    "            'language': 'python',\n",
    "            'file_path': 'string_utils.py',\n",
    "            'functions': [\n",
    "                {\n",
    "                    'name': 'reverse_string',\n",
    "                    'text': 'def reverse_string(s):\\n    \"\"\"Reverse a string.\"\"\"\\n    return s[::-1]',\n",
    "                    'start_line': 1,\n",
    "                    'end_line': 3\n",
    "                },\n",
    "                {\n",
    "                    'name': 'capitalize_words',\n",
    "                    'text': 'def capitalize_words(text):\\n    \"\"\"Capitalize each word.\"\"\"\\n    return \" \".join(word.capitalize() for word in text.split())',\n",
    "                    'start_line': 5,\n",
    "                    'end_line': 7\n",
    "                }\n",
    "            ],\n",
    "            'classes': [],\n",
    "            'imports': [],\n",
    "            'comments': []\n",
    "        }\n",
    "    },\n",
    "    'summary': {\n",
    "        'total_files': 2,\n",
    "        'languages': ['python'],\n",
    "        'total_functions': 5,\n",
    "        'total_classes': 0\n",
    "    }\n",
    "}\n",
    "\n",
    "print(\"🧠 Testing RAG System...\")\n",
    "print(f\"📊 Mock codebase: {mock_codebase['summary']['total_files']} files, {mock_codebase['summary']['total_functions']} functions\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize RAG system\n",
    "rag_system = create_rag_system()\n",
    "print(\"✅ RAG system initialized\")\n",
    "\n",
    "# Prepare code chunks\n",
    "print(\"📦 Preparing code chunks...\")\n",
    "code_chunks = rag_system.prepare_code_chunks(mock_codebase)\n",
    "print(f\"   Created {len(code_chunks)} code chunks\")\n",
    "\n",
    "# Build index\n",
    "print(\"🔨 Building FAISS index...\")\n",
    "rag_system.build_index(code_chunks)\n",
    "print(\"✅ Index built successfully\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test searches\n",
    "test_queries = [\n",
    "    \"mathematical operations\",\n",
    "    \"string manipulation\", \n",
    "    \"recursive function\",\n",
    "    \"text processing\"\n",
    "]\n",
    "\n",
    "print(\"🔍 Testing similarity search:\")\n",
    "for query in test_queries:\n",
    "    print(f\"\\nQuery: '{query}'\")\n",
    "    results = rag_system.search(query, k=2)\n",
    "    for i, result in enumerate(results, 1):\n",
    "        chunk = result['chunk']\n",
    "        score = result['score']\n",
    "        name = chunk['metadata'].get('name', 'N/A')\n",
    "        print(f\"  {i}. {chunk['type']} '{name}' (score: {score:.3f})\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Test Documentation Generation (Optional)\n",
    "\n",
    "⚠️ **Note**: This step requires significant GPU memory and may take time. Skip if you encounter memory issues."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    print(\"🤖 Testing LLM Documentation Generation...\")\n",
    "    print(\"⚠️  This step downloads large models and may take several minutes...\")\n",
    "    \n",
    "    from src.llm import create_documentation_generator\n",
    "    \n",
    "    # Initialize with a smaller configuration to save memory\n",
    "    doc_generator = create_documentation_generator()\n",
    "    print(\"✅ Documentation generator initialized\")\n",
    "    \n",
    "    # Test with a simple function\n",
    "    test_code = '''def quicksort(arr):\n",
    "    if len(arr) <= 1:\n",
    "        return arr\n",
    "    pivot = arr[len(arr) // 2]\n",
    "    left = [x for x in arr if x < pivot]\n",
    "    middle = [x for x in arr if x == pivot]\n",
    "    right = [x for x in arr if x > pivot]\n",
    "    return quicksort(left) + middle + quicksort(right)'''\n",
    "    \n",
    "    print(\"\\n📝 Generating documentation for quicksort function...\")\n",
    "    \n",
    "    # Get context from RAG\n",
    "    context = rag_system.get_context_for_documentation(test_code, 'function')\n",
    "    \n",
    "    # Generate docstring\n",
    "    docstring = doc_generator.generate_docstring(\n",
    "        code=test_code,\n",
    "        language='python',\n",
    "        context=context[:200],  # Limit context for demo\n",
    "        style='google'\n",
    "    )\n",
    "    \n",
    "    print(\"\\n🎯 Generated Docstring:\")\n",
    "    print(docstring)\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"⚠️  LLM test skipped due to: {e}\")\n",
    "    print(\"This is normal if GPU memory is limited or models are not yet cached.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Test GitHub Repository Processing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"📥 Testing GitHub Repository Processing...\")\n",
    "\n",
    "# Initialize git handler\n",
    "git_handler = create_git_handler()\n",
    "\n",
    "# Test with a small, simple repository\n",
    "test_repo_url = \"https://github.com/octocat/Hello-World.git\"\n",
    "\n",
    "try:\n",
    "    print(f\"Cloning: {test_repo_url}\")\n",
    "    repo_path = git_handler.clone_repository(test_repo_url)\n",
    "    \n",
    "    # Get repository info\n",
    "    repo_info = git_handler.get_repository_info(repo_path)\n",
    "    print(f\"\\n📊 Repository Info:\")\n",
    "    print(f\"   Files: {repo_info.get('files_count', 0)}\")\n",
    "    print(f\"   Size: {repo_info.get('total_size_mb', 0)} MB\")\n",
    "    print(f\"   Languages: {repo_info.get('languages', [])}\")\n",
    "    \n",
    "    # Parse if there are files\n",
    "    if repo_info.get('files_count', 0) > 0:\n",
    "        print(\"\\n🔍 Parsing repository...\")\n",
    "        parsed_repo = parser.parse_codebase(repo_path)\n",
    "        print(f\"   Processed: {parsed_repo['summary']['total_files']} files\")\n",
    "        print(f\"   Functions: {parsed_repo['summary']['total_functions']}\")\n",
    "        print(f\"   Classes: {parsed_repo['summary']['total_classes']}\")\n",
    "    \n",
    "    # Cleanup\n",
    "    git_handler.cleanup(repo_path)\n",
    "    print(\"✅ Repository processing completed\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"⚠️  Repository test failed: {e}\")\n",
    "    print(\"This might be due to network issues or repository access.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Complete Workflow Demo"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"🚀 Complete Workflow Demonstration\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "# Create a simple demo project structure\n",
    "demo_files = {\n",
    "    'calculator.py': '''\n",
    "class Calculator:\n",
    "    def __init__(self):\n",
    "        self.history = []\n",
    "    \n",
    "    def add(self, a, b):\n",
    "        result = a + b\n",
    "        self.history.append(f\"add({a}, {b}) = {result}\")\n",
    "        return result\n",
    "    \n",
    "    def multiply(self, a, b):\n",
    "        result = a * b\n",
    "        self.history.append(f\"multiply({a}, {b}) = {result}\")\n",
    "        return result\n",
    "    \n",
    "    def get_history(self):\n",
    "        return self.history.copy()\n",
    "''',\n",
    "    'utils.py': '''\n",
    "def is_prime(n):\n",
    "    if n < 2:\n",
    "        return False\n",
    "    for i in range(2, int(n ** 0.5) + 1):\n",
    "        if n % i == 0:\n",
    "            return False\n",
    "    return True\n",
    "\n",
    "def fibonacci_sequence(count):\n",
    "    sequence = []\n",
    "    a, b = 0, 1\n",
    "    for _ in range(count):\n",
    "        sequence.append(a)\n",
    "        a, b = b, a + b\n",
    "    return sequence\n",
    "'''\n",
    "}\n",
    "\n",
    "# Create temporary directory and files\n",
    "import tempfile\n",
    "import shutil\n",
    "\n",
    "temp_dir = tempfile.mkdtemp(prefix='demo_project_')\n",
    "print(f\"📁 Created demo project: {temp_dir}\")\n",
    "\n",
    "for filename, code in demo_files.items():\n",
    "    file_path = os.path.join(temp_dir, filename)\n",
    "    with open(file_path, 'w') as f:\n",
    "        f.write(code.strip())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    # Step 1: Parse the project\n",
    "    print(\"\\n1️⃣ Parsing demo project...\")\n",
    "    parsed_project = parser.parse_codebase(temp_dir)\n",
    "    \n",
    "    stats = parsed_project['summary']\n",
    "    print(f\"   ✅ Parsed {stats['total_files']} files\")\n",
    "    print(f\"   📊 Found {stats['total_functions']} functions, {stats['total_classes']} classes\")\n",
    "    \n",
    "    # Step 2: Build RAG index\n",
    "    print(\"\\n2️⃣ Building RAG index...\")\n",
    "    project_chunks = rag_system.prepare_code_chunks(parsed_project)\n",
    "    rag_system.build_index(project_chunks)\n",
    "    print(f\"   ✅ Built index with {len(project_chunks)} chunks\")\n",
    "    \n",
    "    # Step 3: Show file analysis\n",
    "    print(\"\\n3️⃣ File Analysis:\")\n",
    "    for file_path, file_data in parsed_project['files'].items():\n",
    "        filename = os.path.basename(file_path)\n",
    "        print(f\"\\n   📄 {filename}:\")\n",
    "        \n",
    "        if file_data['functions']:\n",
    "            print(f\"      Functions: {[f['name'] for f in file_data['functions']]}\")\n",
    "        \n",
    "        if file_data['classes']:\n",
    "            print(f\"      Classes: {[c['name'] for c in file_data['classes']]}\")\n",
    "    \n",
    "    # Step 4: Test context retrieval\n",
    "    print(\"\\n4️⃣ Testing context retrieval:\")\n",
    "    test_function = parsed_project['files'][list(parsed_project['files'].keys())[0]]['functions'][0]\n",
    "    context = rag_system.get_context_for_documentation(\n",
    "        test_function.get('text', ''), 'function'\n",
    "    )\n",
    "    print(f\"   ✅ Retrieved context for '{test_function['name']}' ({len(context)} chars)\")\n",
    "    \n",
    "    print(\"\\n🎉 Workflow demonstration completed successfully!\")\n",
    "    \n",
    "finally:\n",
    "    # Cleanup\n",
    "    shutil.rmtree(temp_dir)\n",
    "    print(\"🧹 Cleaned up temporary files\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary and Next Steps\n",
    "\n",
    "🎉 **Congratulations!** Your Context-Aware Code Documentation Generator is now fully operational in Google Colab!\n",
    "\n",
    "### What We've Successfully Tested:\n",
    "- ✅ Multi-language code parsing (Python, JavaScript, Java)\n",
    "- ✅ RAG system for context understanding  \n",
    "- ✅ Vector similarity search with FAISS\n",
    "- ✅ GitHub repository processing\n",
    "- ✅ Complete documentation workflow\n",
    "\n",
    "### Next Steps:\n",
    "1. **Explore Examples**: Open `notebooks/examples.ipynb` for more detailed examples\n",
    "2. **Try Training**: Open `notebooks/training.ipynb` to fine-tune the model\n",
    "3. **Web Interface**: Start with `!streamlit run src/frontend.py --server.port 8501`\n",
    "4. **CLI Usage**: Try `!python main.py --help` for command-line options\n",
    "\n",
    "### For Your 4-2 Project:\n",
    "- The system demonstrates advanced AI/ML concepts\n",
    "- RAG system provides intelligent context understanding\n",
    "- Multi-language support shows technical breadth\n",
    "- Production-ready architecture with web interface\n",
    "\n",
    "**Happy coding! 🚀**"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}