# Docs2Code Generator

**InsightPulseAI Enterprise Pipeline**

This notebook generates production code from Google Docs documentation.

## Pipeline Flow
```
Google Docs → Parse Structure → Generate Code → Push to GitHub
```

## Supported Output Frameworks
- Odoo 18 CE modules
- FastAPI endpoints
- React components
- pytest test suites

## 1. Setup & Authentication

In [None]:
# Install required packages
!pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib PyGithub markdown pyyaml

In [None]:
# Authenticate with Google
from google.colab import auth
auth.authenticate_user()

print("✅ Google authentication successful")

In [None]:
# Configuration
from google.colab import userdata

# Set your GitHub token (store in Colab secrets)
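try:
    GITHUB_TOKEN = userdata.get('GITHUB_TOKEN')
except Exception:
    # Secret missing or notebook access not granted: prompt without echoing the token
    from getpass import getpass
    GITHUB_TOKEN = getpass("Enter GitHub Token: ")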

# Repository configuration
REPO_NAME = "Insightpulseai-net/pulser-agent-framework"
BRANCH = "claude/system-design-analysis-pVVIl"
OUTPUT_PATH = "generated/"

print(f"📦 Target: {REPO_NAME}")
print(f"🌿 Branch: {BRANCH}")

## 2. Fetch Google Doc

In [None]:
from googleapiclient.discovery import build
from google.auth import default

def fetch_google_doc(doc_id: str) -> dict:
    """
    Fetch a Google Doc by ID and return its content.

    Args:
        doc_id: The Google Doc ID (from URL)

    Returns:
        dict: Document content with title and body
    """
    creds, _ = default()
    service = build('docs', 'v1', credentials=creds)

    doc = service.documents().get(documentId=doc_id).execute()

    print(f"📄 Fetched: {doc.get('title')}")
    return doc

# Example: Fetch the Comprehensive Testing Strategy doc
# DOC_ID = '1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4'
# doc = fetch_google_doc(DOC_ID)

## 3. Parse Document Structure

In [None]:
import re
from typing import List, Dict, Any

def extract_text_from_element(element: dict) -> str:
    """Extract plain text from a document element."""
    text_run = element.get('textRun', {})
    return text_run.get('content', '')

def parse_document_structure(doc: dict) -> dict:
    """
    Parse Google Doc structure into a structured format.

    Extracts:
    - Headings (any HEADING_n style)
    - Sections (split at each H1)
    - Code blocks (text runs set in a monospace font)
    - Tables
    """
    structure = {
        'title': doc.get('title', 'Untitled'),
        'headings': [],
        'code_blocks': [],
        'tables': [],
        'sections': []
    }

    content = doc.get('body', {}).get('content', [])
    current_section = None
    full_text = []

    for item in content:
        if 'paragraph' in item:
            para = item['paragraph']
            style = para.get('paragraphStyle', {}).get('namedStyleType', '')

            # Extract text
            para_text = ''.join(
                extract_text_from_element(elem)
                for elem in para.get('elements', [])
            ).strip()

            if para_text:
                full_text.append(para_text)

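                # Check for headings
                if style.startswith('HEADING_'):
                    level = int(style.replace('HEADING_', ''))
                    structure['headings'].append({
                        'level': level,
                        'text': para_text
                    })

                    if level == 1:
                        # A new H1 closes the previous section and opens the next one
                        if current_section:
                            structure['sections'].append(current_section)
                        current_section = {'title': para_text, 'content': []}
                    elif current_section:
                        current_section['content'].append(para_text)
                elif current_section:
                    # Body paragraphs belong to the section that is currently open
                    current_section['content'].append(para_text)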

                # Check for code blocks (text runs set in a monospace font)
                monospace_fonts = {'courier', 'courier new', 'consolas', 'monospace', 'roboto mono', 'source code pro'}
                for elem in para.get('elements', []):
                    text_style = elem.get('textRun', {}).get('textStyle', {})
                    font_family = text_style.get('weightedFontFamily', {}).get('fontFamily', '').lower()
                    if font_family in monospace_fonts:
                        code = extract_text_from_element(elem)
                        if code.strip():
                            structure['code_blocks'].append(code)

        elif 'table' in item:
            table = item['table']
            table_data = []

            for row in table.get('tableRows', []):
                row_data = []
                for cell in row.get('tableCells', []):
                    cell_text = ''.join(
                        extract_text_from_element(elem)
                        for para in cell.get('content', [])
                        for elem in para.get('paragraph', {}).get('elements', [])
                    ).strip()
                    row_data.append(cell_text)
                table_data.append(row_data)

            structure['tables'].append(table_data)

    if current_section:
        structure['sections'].append(current_section)

    structure['full_text'] = '\n'.join(full_text)

    print(f"📊 Parsed: {len(structure['headings'])} headings, {len(structure['tables'])} tables, {len(structure['code_blocks'])} code blocks")
    return structure
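
To make the input shape concrete, here is a minimal sketch of a Docs API-style payload and a walk over its paragraphs. The `sample_doc` dict and the `iter_paragraphs` helper are hypothetical and written only for illustration; they assume nothing beyond the keys the parser above already reads (`body`, `content`, `paragraph`, `paragraphStyle`, `namedStyleType`, `elements`, `textRun`).

```python
from typing import Iterator, Tuple

# Hypothetical, hand-written payload that mimics the keys read by the parser above.
sample_doc = {
    "title": "Sample Spec",
    "body": {
        "content": [
            {"paragraph": {
                "paragraphStyle": {"namedStyleType": "HEADING_1"},
                "elements": [{"textRun": {"content": "Overview\n"}}],
            }},
            {"paragraph": {
                "paragraphStyle": {"namedStyleType": "NORMAL_TEXT"},
                "elements": [
                    {"textRun": {"content": "Plain "}},
                    {"textRun": {"content": "body text.\n"}},
                ],
            }},
            {"sectionBreak": {}},  # non-paragraph items are skipped
        ]
    },
}


def iter_paragraphs(doc: dict) -> Iterator[Tuple[str, str]]:
    """Yield (named style, stripped text) for each non-empty paragraph."""
    for item in doc.get("body", {}).get("content", []):
        para = item.get("paragraph")
        if para is None:
            continue
        style = para.get("paragraphStyle", {}).get("namedStyleType", "")
        # A paragraph's text is the concatenation of its text runs
        text = "".join(
            elem.get("textRun", {}).get("content", "")
            for elem in para.get("elements", [])
        ).strip()
        if text:
            yield style, text


paragraphs = list(iter_paragraphs(sample_doc))
print(paragraphs)
```

A hand-built payload like this can also serve as a fixture for exercising the parser without calling the Docs API.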

## 4. Code Generation Templates

In [None]:
def generate_odoo_module(structure: dict) -> Dict[str, str]:
    """
    Generate an Odoo 18 CE module from parsed document structure.

    Args:
        structure: Parsed document structure

    Returns:
        dict: File path -> content mapping
    """
    # Sanitize module name
    module_name = re.sub(r'[^a-z0-9_]', '_', structure['title'].lower())
    module_name = re.sub(r'_+', '_', module_name).strip('_')

    # Extract model definitions from tables
    models = []
    for table in structure['tables']:
        if len(table) > 1 and len(table[0]) >= 2:
            # Check if this looks like a model definition
            headers = [h.lower() for h in table[0]]
            if any(h in headers for h in ['field', 'column', 'name', 'type']):
                model = {
                    'name': f'{module_name}_model',
                    'fields': []
                }
                for row in table[1:]:
                    if len(row) >= 2:
                        model['fields'].append({
                            'name': row[0],
                            'type': row[1] if len(row) > 1 else 'Char'
                        })
                models.append(model)

    # Generate files
    files = {}

    # __manifest__.py
    files[f'{module_name}/__manifest__.py'] = f'''# -*- coding: utf-8 -*-
{{
    'name': {structure["title"]!r},
    'version': '18.0.1.0.0',
    'category': 'Uncategorized',
    'summary': 'Auto-generated from Google Docs via Docs2Code pipeline',
    'description': """
        Generated from: {structure["title"]}
        Sections: {len(structure["sections"])}
        Tables: {len(structure["tables"])}
    """,
    'author': 'InsightPulseAI',
    'website': 'https://insightpulseai.net',
    'license': 'LGPL-3',
    'depends': ['base'],
    'data': [
        'security/ir.model.access.csv',
        'views/{module_name}_views.xml',
    ],
    'installable': True,
    'application': False,
    'auto_install': False,
}}
'''

    # __init__.py
    files[f'{module_name}/__init__.py'] = '''# -*- coding: utf-8 -*-
from . import models
'''

    # models/__init__.py
    files[f'{module_name}/models/__init__.py'] = '''# -*- coding: utf-8 -*-
from . import {module_name}_model
'''.format(module_name=module_name)

    # models/{module_name}_model.py
    model_fields = ''
    if models:
        for model in models:
            for field in model['fields']:
                field_name = re.sub(r'[^a-z0-9_]', '_', field['name'].lower()).strip('_')
                field_type = 'Char'
                if 'int' in field['type'].lower():
                    field_type = 'Integer'
                elif 'float' in field['type'].lower() or 'decimal' in field['type'].lower():
                    field_type = 'Float'
                elif 'date' in field['type'].lower():
                    field_type = 'Date'
                elif 'bool' in field['type'].lower():
                    field_type = 'Boolean'
                elif 'text' in field['type'].lower():
                    field_type = 'Text'

                model_fields += f"    {field_name} = fields.{field_type}(string='{field['name']}')\n"

    files[f'{module_name}/models/{module_name}_model.py'] = f'''# -*- coding: utf-8 -*-
from odoo import models, fields


class {module_name.title().replace("_", "")}Model(models.Model):
    _name = '{module_name}.model'
    _description = {structure["title"]!r}

    name = fields.Char(string='Name', required=True)
    description = fields.Text(string='Description')
    active = fields.Boolean(string='Active', default=True)
{model_fields}
'''

    # security/ir.model.access.csv
    files[f'{module_name}/security/ir.model.access.csv'] = f'''id,name,model_id:id,group_id:id,perm_read,perm_write,perm_create,perm_unlink
access_{module_name}_model,{module_name}.model,model_{module_name.replace(".", "_")}_model,base.group_user,1,1,1,0
'''

    # views/{module_name}_views.xml
    files[f'{module_name}/views/{module_name}_views.xml'] = f'''<?xml version="1.0" encoding="utf-8"?>
<odoo>
    <!-- List View (Odoo 18 uses <list> in place of <tree>) -->
    <record id="{module_name}_model_list" model="ir.ui.view">
        <field name="name">{module_name}.model.list</field>
        <field name="model">{module_name}.model</field>
        <field name="arch" type="xml">
            <list>
                <field name="name"/>
                <field name="description"/>
                <field name="active"/>
            </list>
        </field>
    </record>

    <!-- Form View -->
    <record id="{module_name}_model_form" model="ir.ui.view">
        <field name="name">{module_name}.model.form</field>
        <field name="model">{module_name}.model</field>
        <field name="arch" type="xml">
            <form>
                <sheet>
                    <group>
                        <field name="name"/>
                        <field name="description"/>
                        <field name="active"/>
                    </group>
                </sheet>
            </form>
        </field>
    </record>

    <!-- Action -->
    <record id="{module_name}_model_action" model="ir.actions.act_window">
        <field name="name">{structure["title"]}</field>
        <field name="res_model">{module_name}.model</field>
        <field name="view_mode">list,form</field>
    </record>

    <!-- Menu -->
    <menuitem id="{module_name}_menu_root"
              name="{structure['title']}"
              sequence="100"/>
    <menuitem id="{module_name}_menu"
              name="{structure['title']}"
              parent="{module_name}_menu_root"
              action="{module_name}_model_action"
              sequence="10"/>
</odoo>
'''

    # tests/__init__.py
    files[f'{module_name}/tests/__init__.py'] = '''# -*- coding: utf-8 -*-
from . import test_{module_name}
'''.format(module_name=module_name)

    # tests/test_{module_name}.py
    files[f'{module_name}/tests/test_{module_name}.py'] = f'''# -*- coding: utf-8 -*-
from odoo.tests import TransactionCase, tagged


@tagged('post_install', '-at_install')
class Test{module_name.title().replace("_", "")}(TransactionCase):
    """Test cases for {structure["title"]}"""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.Model = cls.env['{module_name}.model']

    def test_create_record(self):
        """Test creating a new record."""
        record = self.Model.create({{
            'name': 'Test Record',
            'description': 'Created via Docs2Code pipeline',
        }})
        self.assertTrue(record.id)
        self.assertEqual(record.name, 'Test Record')
        self.assertTrue(record.active)

    def test_record_deactivation(self):
        """Test deactivating a record."""
        record = self.Model.create({{'name': 'Deactivate Test'}})
        record.active = False
        self.assertFalse(record.active)

    def test_search_records(self):
        """Test searching records."""
        self.Model.create({{'name': 'Search Test 1'}})
        self.Model.create({{'name': 'Search Test 2'}})

        records = self.Model.search([('name', 'like', 'Search Test')])
        self.assertGreaterEqual(len(records), 2)
'''

    # README.md
    files[f'{module_name}/README.md'] = f'''# {structure["title"]}

## Overview

This module was auto-generated from Google Docs via the InsightPulseAI Docs2Code pipeline.

## Installation

1. Copy this module to your Odoo addons directory
2. Update the module list: Settings → Apps → Update Apps List
3. Search for "{structure["title"]}" and install

## Features

- Auto-generated from documentation
- Includes model definitions, views, and security
- Test suite included

## Testing

```bash
odoo-bin -c odoo.conf -d test_db --test-enable --stop-after-init -i {module_name}
```

## Source

Generated from: {structure["title"]}
Generated at: {{current_date}}
Pipeline: InsightPulseAI Docs2Code

## License

LGPL-3
'''.replace('{current_date}', __import__('datetime').datetime.now().isoformat())

    print(f"✅ Generated {len(files)} files for Odoo module: {module_name}")
    return files
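
The type mapping and name sanitizing inside the generator can also be read as two small standalone helpers. The sketch below is illustrative only: `map_field_type`, `sanitize_identifier`, and `TYPE_RULES` are hypothetical names, and the handling of empty names and leading digits is an added assumption that the generator above does not perform.

```python
import re

# Ordered (substring, Odoo field class) rules, mirroring the if/elif chain above.
TYPE_RULES = [
    ("int", "Integer"),
    ("float", "Float"),
    ("decimal", "Float"),
    ("date", "Date"),
    ("bool", "Boolean"),
    ("text", "Text"),
]


def map_field_type(doc_type: str) -> str:
    """Map a free-text type from a doc table to an Odoo field class name."""
    lowered = doc_type.lower()
    for needle, odoo_type in TYPE_RULES:
        if needle in lowered:
            return odoo_type
    return "Char"  # same default as the generator above


def sanitize_identifier(label: str, fallback: str = "x_field") -> str:
    """Turn a human-readable label into a lowercase snake_case identifier."""
    name = re.sub(r"[^a-z0-9_]", "_", label.lower())
    name = re.sub(r"_+", "_", name).strip("_")
    if not name:
        return fallback  # assumption: placeholder for labels with no usable characters
    if name[0].isdigit():
        name = f"x_{name}"  # assumption: identifiers should not start with a digit
    return name


print(map_field_type("DECIMAL(10,2)"), sanitize_identifier("Unit Price ($)"))
```

Because matching is by substring, a type such as `datetime` maps to `Date` and any word containing `int` maps to `Integer`, exactly as in the original chain; stricter matching would need an explicit lookup table.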

## 5. Push to GitHub

In [None]:
from github import Github
from github.GithubException import GithubException

def push_to_github(files: Dict[str, str], repo_name: str, branch: str, path_prefix: str = 'generated/') -> List[str]:
    """
    Push generated files to GitHub repository.

    Args:
        files: Dict of file path -> content
        repo_name: Owner/repo format
        branch: Target branch
        path_prefix: Directory prefix for all files

    Returns:
        List of created/updated file URLs
    """
    g = Github(GITHUB_TOKEN)
    repo = g.get_repo(repo_name)

    created_files = []

    for filename, content in files.items():
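        # Join with exactly one slash so a prefix without a trailing '/' still works
        file_path = f"{path_prefix.rstrip('/')}/{filename}".lstrip('/')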

        try:
            # Try to get existing file
            existing = repo.get_contents(file_path, ref=branch)

            # Update if content changed
            if existing.decoded_content.decode() != content:
                repo.update_file(
                    file_path,
                    f"docs2code: update {filename}",
                    content,
                    existing.sha,
                    branch=branch
                )
                print(f"📝 Updated: {file_path}")
            else:
                print(f"⏭️ Unchanged: {file_path}")

        except GithubException as e:
            if e.status == 404:
                # File doesn't exist, create it
                repo.create_file(
                    file_path,
                    f"docs2code: add {filename}",
                    content,
                    branch=branch
                )
                print(f"✨ Created: {file_path}")
            else:
                print(f"❌ Error: {file_path} - {e}")
                continue

        created_files.append(f"https://github.com/{repo_name}/blob/{branch}/{file_path}")

    print(f"\n🎉 Pushed {len(created_files)} files to GitHub")
    return created_files
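
The create, update, or skip decision in the loop above can be sketched as a pure function that needs no network access. This is an illustrative sketch, not part of the pipeline: `plan_push` is a hypothetical helper, and `remote` stands in for whatever content the repository currently holds at each path.

```python
from typing import Dict


def plan_push(files: Dict[str, str], remote: Dict[str, str]) -> Dict[str, str]:
    """Return a path -> action mapping: 'create', 'update', or 'skip'."""
    plan = {}
    for path, content in files.items():
        if path not in remote:
            plan[path] = "create"   # corresponds to the 404 branch above
        elif remote[path] != content:
            plan[path] = "update"   # content differs, so a new commit is needed
        else:
            plan[path] = "skip"     # identical content, nothing to commit
    return plan


demo_plan = plan_push(
    files={"a.py": "x = 1\n", "b.py": "y = 2\n", "c.py": "z = 3\n"},
    remote={"a.py": "x = 1\n", "b.py": "y = 0\n"},
)
print(demo_plan)
```

Separating the decision from the API calls makes it possible to preview what a run would change before any commit is made.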

## 6. Complete Pipeline Execution

In [None]:
def run_docs2code_pipeline(doc_id: str, output_path: str = 'generated/odoo/'):
    """
    Run the complete Docs2Code pipeline.

    Args:
        doc_id: Google Doc ID
        output_path: Target path in repository
    """
    print("🚀 Starting Docs2Code Pipeline")
    print("=" * 50)

    # Step 1: Fetch document
    print("\n📥 Step 1: Fetching Google Doc...")
    doc = fetch_google_doc(doc_id)

    # Step 2: Parse structure
    print("\n🔍 Step 2: Parsing document structure...")
    structure = parse_document_structure(doc)

    # Step 3: Generate code
    print("\n⚙️ Step 3: Generating Odoo module...")
    files = generate_odoo_module(structure)

    # Step 4: Push to GitHub
    print("\n📤 Step 4: Pushing to GitHub...")
    urls = push_to_github(files, REPO_NAME, BRANCH, output_path)

    # Summary
    print("\n" + "=" * 50)
    print("✅ Pipeline Complete!")
    print(f"📄 Source: {doc.get('title')}")
    print(f"📦 Generated: {len(files)} files")
    print(f"🔗 Repository: https://github.com/{REPO_NAME}/tree/{BRANCH}/{output_path}")

    return urls

# Example usage:
# DOC_ID = '1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4'
# urls = run_docs2code_pipeline(DOC_ID)
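
Before pushing, it can be useful to inspect the generated files on disk. The sketch below writes a path-to-content mapping (the shape returned by the generator) under a local directory; `write_files_locally` is a hypothetical helper and the demo mapping is made up for illustration.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def write_files_locally(files: Dict[str, str], root: str) -> List[Path]:
    """Write each relative path -> content pair under root and return the paths."""
    root_path = Path(root)
    written = []
    for relative_path, content in files.items():
        target = root_path / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)  # create nested package dirs
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


# Demo with a made-up two-file module, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    demo_files = {
        "demo_module/__init__.py": "from . import models\n",
        "demo_module/models/__init__.py": "",
    }
    paths = write_files_locally(demo_files, tmp)
    relative = sorted(p.relative_to(tmp).as_posix() for p in paths)
    print(relative)
```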

## 7. Interactive Mode

In [None]:
# Interactive mode - enter your document ID
print("="*50)
print("📝 DOCS2CODE INTERACTIVE MODE")
print("="*50)
print("\nEnter your Google Doc ID (from the URL).")
print("Example: https://docs.google.com/document/d/1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4/edit")
print("         The ID is: 1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4")
print()

doc_id = input("Google Doc ID: ").strip()

if doc_id:
    output_path = input("Output path (default: generated/odoo/): ").strip() or 'generated/odoo/'
    urls = run_docs2code_pipeline(doc_id, output_path)
else:
    print("❌ No document ID provided. Exiting.")
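
Since the prompt asks for the ID portion of the URL, a small helper can accept either form. This is a sketch under assumptions: `extract_doc_id` is a hypothetical name, and the character set allowed in an ID (letters, digits, hyphen, underscore) is inferred from the IDs shown in this notebook rather than from any specification.

```python
import re
from typing import Optional

# Matches the ".../document/d/<id>/..." part of a Google Docs URL
_DOC_URL_PATTERN = re.compile(r"/document/d/([a-zA-Z0-9_-]+)")


def extract_doc_id(value: str) -> Optional[str]:
    """Return the document ID from a full URL or a bare ID, else None."""
    value = value.strip()
    match = _DOC_URL_PATTERN.search(value)
    if match:
        return match.group(1)
    # Otherwise treat the input as a bare ID if it only uses ID-safe characters
    if re.fullmatch(r"[a-zA-Z0-9_-]+", value):
        return value
    return None


url = "https://docs.google.com/document/d/1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4/edit"
print(extract_doc_id(url))
```

Passing the raw `input()` value through a helper like this would let users paste the whole URL instead of trimming it by hand.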

---

## Document IDs for Your 12 Deliverables

| Document | ID |
|----------|----|
| Comprehensive Testing Strategy | `1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4` |
| Google Docs to GitHub Workflow | `12cvYyZdPeLeLJSGX7OW8XQAwvsBVOaiO146UTkVcc7w` |
| Pulser-Agent-Framework Implementation | `1qL1fJT6mX4zjXFO_ui8VKKALTlACSa87VgTIc7HXqbo` |
| Odoo 18 CE/OCA Testing Strategy | `1Bfe2Lih6dj1Xw85T5xqjtQs5DvT1LMhSW218mnH657A` |
| GitHub Integration & Code Management | `1WY2GJz8IWTWNuTBIOeAoko5f1o_oMTQMd0kzpBLFxXM` |
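
The IDs in the table can be kept in a mapping and processed in one loop. The sketch below is illustrative: `DOCUMENTS` copies the rows listed above, `run_batch` is a hypothetical helper, and `fake_runner` stands in for `run_docs2code_pipeline` so the example runs without credentials.

```python
from typing import Callable, Dict, List, Tuple

# Rows copied from the table above
DOCUMENTS = {
    "Comprehensive Testing Strategy": "1Qp4nf8nl7M8MnaNtmrBgP4B1mw2aSUqEzYMKmFBCzH4",
    "Google Docs to GitHub Workflow": "12cvYyZdPeLeLJSGX7OW8XQAwvsBVOaiO146UTkVcc7w",
    "Pulser-Agent-Framework Implementation": "1qL1fJT6mX4zjXFO_ui8VKKALTlACSa87VgTIc7HXqbo",
    "Odoo 18 CE/OCA Testing Strategy": "1Bfe2Lih6dj1Xw85T5xqjtQs5DvT1LMhSW218mnH657A",
    "GitHub Integration & Code Management": "1WY2GJz8IWTWNuTBIOeAoko5f1o_oMTQMd0kzpBLFxXM",
}


def run_batch(
    documents: Dict[str, str],
    runner: Callable[[str], List[str]],
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Run runner(doc_id) per document, collecting results and errors by name."""
    results: Dict[str, List[str]] = {}
    errors: Dict[str, str] = {}
    for name, doc_id in documents.items():
        try:
            results[name] = runner(doc_id)
        except Exception as exc:  # keep going so one bad doc does not stop the batch
            errors[name] = str(exc)
    return results, errors


def fake_runner(doc_id: str) -> List[str]:
    """Stand-in for the real pipeline; fails for one ID to show error capture."""
    if doc_id.startswith("12cv"):
        raise RuntimeError("simulated failure")
    return [f"https://example.invalid/{doc_id}"]


results, errors = run_batch(DOCUMENTS, fake_runner)
print(len(results), "succeeded;", len(errors), "failed")
```

In the notebook itself, `run_docs2code_pipeline` could be passed as `runner` in place of `fake_runner`.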

---

*InsightPulseAI Docs2Code Pipeline*