# Markdown Meeting Notes → Google Doc Converter

This notebook converts Markdown meeting notes into a formatted Google Doc
using the Google Docs API.

**Features:**
- Heading styles (H1, H2, H3)
- Nested bullet points with proper indentation
- Checkboxes for action items
- Bold + colored styling for @mentions
- Distinct footer styling

**Prerequisites:**
- A Google account
- Run this notebook in [Google Colab](https://colab.research.google.com/)

## 1. Install Dependencies & Authenticate

In [None]:
# Install the Google client libraries (already present in Colab; this guards against missing packages)
!pip install --quiet google-auth google-auth-oauthlib google-api-python-client

from google.colab import auth
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
import re

# Authenticate the user - this opens a browser popup in Colab
auth.authenticate_user()
print("✅ Authentication successful!")

## 2. Define the Markdown Meeting Notes

In [None]:
MEETING_NOTES_MD = r"""
# Product Team Sync - May 15, 2023

## Attendees
- Sarah Chen (Product Lead)
- Mike Johnson (Engineering)
- Anna Smith (Design)
- David Park (QA)

## Agenda

### 1. Sprint Review
* Completed Features
  * User authentication flow
  * Dashboard redesign
  * Performance optimization
    * Reduced load time by 40%
    * Implemented caching solution
* Pending Items
  * Mobile responsive fixes
  * Beta testing feedback integration

### 2. Current Challenges
* Resource constraints in QA team
* Third-party API integration delays
* User feedback on new UI
  * Navigation confusion
  * Color contrast issues

### 3. Next Sprint Planning
* Priority Features
  * Payment gateway integration
  * User profile enhancement
  * Analytics dashboard
* Technical Debt
  * Code refactoring
  * Documentation updates

## Action Items
- [ ] @sarah: Finalize Q3 roadmap by Friday
- [ ] @mike: Schedule technical review for payment integration
- [ ] @anna: Share updated design system documentation
- [ ] @david: Prepare QA resource allocation proposal

## Next Steps
* Schedule individual team reviews
* Update sprint board
* Share meeting summary with stakeholders

## Notes
* Next sync scheduled for May 22, 2023
* Platform demo for stakeholders on May 25
* Remember to update JIRA tickets

---
Meeting recorded by: Sarah Chen
Duration: 45 minutes
""".strip()

## 3. Markdown Parser

Parse the raw markdown string into a list of structured tokens representing
headings, bullet points, checkboxes, horizontal rules, and plain text.
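
As a rough illustration of the classification order, the sketch below reuses the same regular expressions as the parser cell that follows but returns plain tuples instead of token objects. The `classify` helper and its tuple layout are hypothetical, introduced only for this example. It shows why the checkbox pattern has to be tried before the generic bullet pattern (a checkbox line also matches the bullet regex), and how every two leading spaces map to one nesting level.

```python
import re

# Same patterns as the parser cell; classify() itself is a hypothetical helper.
HEADING_RE = re.compile(r'^(#{1,6})\s+(.*)')
CHECKBOX_RE = re.compile(r'^-\s+\[([ xX])\]\s+(.*)')
BULLET_RE = re.compile(r'^(\s*)([*-])\s+(.*)')
HR_RE = re.compile(r'^---+\s*$')


def classify(line):
    """Return a (kind, detail, text) tuple for one markdown line."""
    stripped = line.strip()
    if not stripped:
        return ('blank', None, '')
    if HR_RE.match(stripped):
        return ('rule', None, '')
    m = HEADING_RE.match(stripped)
    if m:
        return ('heading', len(m.group(1)), m.group(2).strip())
    m = CHECKBOX_RE.match(stripped)  # must precede the bullet test
    if m:
        return ('checkbox', m.group(1).lower() == 'x', m.group(2).strip())
    m = BULLET_RE.match(line)  # unstripped, so leading spaces survive
    if m:
        return ('bullet', len(m.group(1)) // 2, m.group(3).strip())
    return ('text', None, stripped)


print(classify('### 1. Sprint Review'))
print(classify('    * Reduced load time by 40%'))
print(classify('- [ ] @sarah: Finalize Q3 roadmap by Friday'))
```

The second call reports nesting level 2 because the line starts with four spaces.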

In [None]:
# ---------------------------------------------------------------------------
# Data classes for parsed tokens
# ---------------------------------------------------------------------------

class Token:
    """Base class for a parsed markdown token."""
    pass


class HeadingToken(Token):
    """Represents a markdown heading (# through ######)."""
    def __init__(self, level: int, text: str):
        self.level = level   # 1-6, matching the number of leading # characters
        self.text = text


class BulletToken(Token):
    """Represents a plain bullet point (* or -)."""
    def __init__(self, indent_level: int, text: str):
        self.indent_level = indent_level  # 0-based nesting depth
        self.text = text


class CheckboxToken(Token):
    """Represents a checkbox item  - [ ] text."""
    def __init__(self, checked: bool, text: str):
        self.checked = checked
        self.text = text


class HorizontalRuleToken(Token):
    """Represents a --- horizontal rule."""
    pass


class TextToken(Token):
    """Represents plain text (e.g. footer lines)."""
    def __init__(self, text: str):
        self.text = text


# ---------------------------------------------------------------------------
# Parser
# ---------------------------------------------------------------------------

# Regex patterns for line classification
_HEADING_RE = re.compile(r'^(#{1,6})\s+(.*)')
_CHECKBOX_RE = re.compile(r'^-\s+\[([ xX])\]\s+(.*)')
_BULLET_RE = re.compile(r'^(\s*)([*-])\s+(.*)')
_HR_RE = re.compile(r'^---+\s*$')


def parse_markdown(md_text: str) -> list:
    """
    Parse a markdown string into a flat list of Token objects.

    The parser processes the text line-by-line and classifies each line into
    one of the supported token types.  Blank lines are skipped.

    Args:
        md_text: Raw markdown string.

    Returns:
        List of Token objects in document order.
    """
    tokens = []

    for line in md_text.splitlines():
        stripped = line.strip()

        # Skip blank lines
        if not stripped:
            continue

        # --- Horizontal rule ---
        if _HR_RE.match(stripped):
            tokens.append(HorizontalRuleToken())
            continue

        # --- Headings ---
        heading_match = _HEADING_RE.match(stripped)
        if heading_match:
            level = len(heading_match.group(1))
            text = heading_match.group(2).strip()
            tokens.append(HeadingToken(level=level, text=text))
            continue

        # --- Checkbox items (must be checked before generic bullets) ---
        checkbox_match = _CHECKBOX_RE.match(stripped)
        if checkbox_match:
            checked = checkbox_match.group(1).lower() == 'x'
            text = checkbox_match.group(2).strip()
            tokens.append(CheckboxToken(checked=checked, text=text))
            continue

        # --- Bullet points (*, -) with indentation ---
        bullet_match = _BULLET_RE.match(line)  # use original line to preserve indentation
        if bullet_match:
            leading_spaces = len(bullet_match.group(1))
            indent_level = leading_spaces // 2  # every 2 spaces = 1 nesting level
            text = bullet_match.group(3).strip()
            tokens.append(BulletToken(indent_level=indent_level, text=text))
            continue

        # --- Fallback: plain text ---
        tokens.append(TextToken(text=stripped))

    return tokens


# Quick sanity check
parsed_tokens = parse_markdown(MEETING_NOTES_MD)
for t in parsed_tokens:
    print(f"{type(t).__name__:20s} {vars(t) if hasattr(t, '__dict__') and vars(t) else ''}")

## 4. Google Doc Builder

Helper utilities that translate parsed tokens into Google Docs API `batchUpdate` requests.

The API uses a **character-index** insertion model, so we track the current cursor
position as we build the document sequentially.
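
A minimal sketch of that bookkeeping, independent of the builder class defined next: the hypothetical `build_insert_requests` helper emits one `insertText` request per line and advances the cursor by the length of each line. It assumes a fresh document whose body starts at index 1, and it measures length in UTF-16 code units, which is how the Docs API counts indexes (so a character outside the Basic Multilingual Plane, such as an emoji, advances the cursor by 2).

```python
def utf16_len(text):
    """Length of text in UTF-16 code units (the unit Docs API indexes use)."""
    return len(text.encode('utf-16-le')) // 2


def build_insert_requests(lines, start_index=1):
    """Hypothetical helper: sequential insertText requests plus each line's range."""
    requests, ranges = [], []
    cursor = start_index
    for line in lines:
        text = line + '\n'
        requests.append({'insertText': {'location': {'index': cursor}, 'text': text}})
        end = cursor + utf16_len(text)
        ranges.append((cursor, end))  # half-open range [start, end)
        cursor = end
    return requests, ranges


requests, ranges = build_insert_requests(['Attendees', 'Sarah Chen'])
print(ranges)  # [(1, 11), (11, 22)]
```

Each returned range is the span a later styling request would target for that line, including its trailing newline.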

In [None]:
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------

# Named style mappings for heading levels
HEADING_STYLE_MAP = {
    1: 'HEADING_1',
    2: 'HEADING_2',
    3: 'HEADING_3',
    4: 'HEADING_4',
    5: 'HEADING_5',
    6: 'HEADING_6',
}

# Helper: create a dimension dict in PT units (used by Google Docs API)
def pt(points):
    """Return a Google Docs API dimension object in PT units."""
    return {'magnitude': points, 'unit': 'PT'}

# Indentation per nesting level (in points)
INDENT_PER_LEVEL_PT = 36

# Color for @mentions (a readable blue: #0060BF)
MENTION_COLOR = {'red': 0.0, 'green': 0.376, 'blue': 0.749}


# ---------------------------------------------------------------------------
# GoogleDocBuilder class
# ---------------------------------------------------------------------------

class GoogleDocBuilder:
    """
    Accumulates Google Docs API batchUpdate requests from parsed markdown tokens.

    The builder tracks a cursor position and appends insertText / updateTextStyle /
    updateParagraphStyle requests so that the entire document can be formatted in
    a single batchUpdate call.

    Usage:
        builder = GoogleDocBuilder()
        builder.add_tokens(parsed_tokens)
        requests = builder.get_requests()
    """

    def __init__(self):
        self._requests = []    # accumulated API request dicts
        self._cursor = 1       # Google Docs body starts at index 1

    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------

    def add_tokens(self, tokens):
        """Process a list of parsed markdown tokens and build API requests."""
        for token in tokens:
            if isinstance(token, HeadingToken):
                self._add_heading(token)
            elif isinstance(token, CheckboxToken):
                self._add_checkbox(token)
            elif isinstance(token, BulletToken):
                self._add_bullet(token)
            elif isinstance(token, HorizontalRuleToken):
                self._add_horizontal_rule()
            elif isinstance(token, TextToken):
                self._add_footer_text(token)
            else:
                raise ValueError(f"Unknown token type: {type(token)}")

    def get_requests(self):
        """Return the accumulated batchUpdate requests list."""
        return list(self._requests)

    # ------------------------------------------------------------------
    # Private helpers - text insertion
    # ------------------------------------------------------------------

    def _insert_text(self, text):
        """
        Insert plain text at the current cursor position.

        Returns:
            (start_index, end_index) of the inserted text.
        """
        start = self._cursor
        self._requests.append({
            'insertText': {
                'location': {'index': start},
                'text': text,
            }
        })
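        # Docs API indexes count UTF-16 code units, so characters outside the
        # BMP (e.g. emoji) advance the cursor by 2 rather than 1.
        self._cursor += len(text.encode('utf-16-le')) // 2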
        return start, self._cursor

    def _insert_newline(self):
        """Insert a newline character and return the index where it was placed."""
        idx = self._cursor
        self._insert_text('\n')
        return idx

    # ------------------------------------------------------------------
    # Private helpers - styling
    # ------------------------------------------------------------------

    def _apply_named_style(self, start, end, style_name):
        """Apply a named paragraph style (e.g. HEADING_1) to a range."""
        self._requests.append({
            'updateParagraphStyle': {
                'range': {'startIndex': start, 'endIndex': end},
                'paragraphStyle': {'namedStyleType': style_name},
                'fields': 'namedStyleType',
            }
        })

    def _apply_bullet(self, start, end):
        """Apply a bullet list preset to a paragraph range."""
        self._requests.append({
            'createParagraphBullets': {
                'range': {'startIndex': start, 'endIndex': end},
                'bulletPreset': 'BULLET_DISC_CIRCLE_SQUARE',
            }
        })

    def _apply_checkbox(self, start, end):
        """Apply a checklist (checkbox) preset to a paragraph range."""
        self._requests.append({
            'createParagraphBullets': {
                'range': {'startIndex': start, 'endIndex': end},
                'bulletPreset': 'BULLET_CHECKBOX',
            }
        })

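    def _set_indent(self, start, end, level):
        """Set the indentation level for a paragraph (for nested bullets)."""
        # Assumption: a level-0 bullet in the default list preset sits at an
        # 18pt first-line indent and a 36pt start indent, so nested levels are
        # offset from those values. The glyph hangs at indentFirstLine and the
        # text aligns at indentStart.
        offset_pt = level * INDENT_PER_LEVEL_PT
        self._requests.append({
            'updateParagraphStyle': {
                'range': {'startIndex': start, 'endIndex': end},
                'paragraphStyle': {
                    'indentStart': pt(36 + offset_pt),
                    'indentFirstLine': pt(18 + offset_pt),
                },
                'fields': 'indentStart,indentFirstLine',
            }
        })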

    def _style_mentions(self, text, text_start):
        """
        Find all @mentions in the inserted text and apply bold + colored styling.

        Args:
            text: The plain-text string that was inserted.
            text_start: The document index where the text starts.
        """
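        for match in re.finditer(r'@\w+', text):
            # Convert code-point offsets to UTF-16 code units (Docs API indexing)
            prefix_units = len(text[:match.start()].encode('utf-16-le')) // 2
            mention_units = len(match.group().encode('utf-16-le')) // 2
            mention_start = text_start + prefix_units
            mention_end = mention_start + mention_units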
            self._requests.append({
                'updateTextStyle': {
                    'range': {'startIndex': mention_start, 'endIndex': mention_end},
                    'textStyle': {
                        'bold': True,
                        'foregroundColor': {'color': {'rgbColor': MENTION_COLOR}},
                    },
                    'fields': 'bold,foregroundColor',
                }
            })

    def _style_footer(self, start, end):
        """Apply a distinct italic + gray style for footer lines."""
        self._requests.append({
            'updateTextStyle': {
                'range': {'startIndex': start, 'endIndex': end},
                'textStyle': {
                    'italic': True,
                    'fontSize': pt(9),
                    'foregroundColor': {
                        'color': {
                            'rgbColor': {'red': 0.4, 'green': 0.4, 'blue': 0.4}
                        }
                    },
                },
                'fields': 'italic,fontSize,foregroundColor',
            }
        })

    # ------------------------------------------------------------------
    # Private helpers - token handlers
    # ------------------------------------------------------------------

    def _add_heading(self, token):
        """Insert a heading line and apply the appropriate heading style."""
        start, end = self._insert_text(token.text)
        nl_idx = self._insert_newline()
        style = HEADING_STYLE_MAP.get(token.level, 'HEADING_6')
        self._apply_named_style(start, nl_idx + 1, style)

    def _add_bullet(self, token):
        """Insert a bullet point with proper nesting."""
        start, end = self._insert_text(token.text)
        nl_idx = self._insert_newline()
        self._apply_bullet(start, nl_idx + 1)
        # Apply indentation for nested bullets (level >= 1)
        if token.indent_level > 0:
            self._set_indent(start, nl_idx + 1, token.indent_level)
        # Highlight any @mentions
        self._style_mentions(token.text, start)

    def _add_checkbox(self, token):
        """Insert a checkbox (checklist) item."""
        # Note: token.checked is parsed but not applied here, so every item
        # is created as an unchecked checklist entry.
        start, end = self._insert_text(token.text)
        nl_idx = self._insert_newline()
        self._apply_checkbox(start, nl_idx + 1)
        # Highlight @mentions in action items
        self._style_mentions(token.text, start)

    def _add_horizontal_rule(self):
        """Insert a visual horizontal separator line."""
        separator = '\u2500' * 50  # box-drawing light horizontal character
        start, end = self._insert_text(separator)
        nl_idx = self._insert_newline()
        # Style the rule line as small and light gray
        self._requests.append({
            'updateTextStyle': {
                'range': {'startIndex': start, 'endIndex': end},
                'textStyle': {
                    'fontSize': pt(6),
                    'foregroundColor': {
                        'color': {
                            'rgbColor': {'red': 0.7, 'green': 0.7, 'blue': 0.7}
                        }
                    },
                },
                'fields': 'fontSize,foregroundColor',
            }
        })

    def _add_footer_text(self, token):
        """Insert footer / plain text with distinct italic gray styling."""
        start, end = self._insert_text(token.text)
        nl_idx = self._insert_newline()
        self._style_footer(start, end)


print("✅ GoogleDocBuilder class defined successfully.")
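
The builder above nests bullets by adjusting paragraph indents after the bullets are created. The Docs API also supports another route: `createParagraphBullets` reads the nesting level of each paragraph from its leading tab characters and then strips those tabs. The sketch below is a hypothetical alternative, not what this notebook does; the `nested_bullet_requests` helper is invented for illustration and assumes the text goes into a fresh document starting at index 1.

```python
def nested_bullet_requests(items, start_index=1):
    """
    Hypothetical helper: items is a list of (indent_level, text) pairs.
    Returns two requests: one insertText and one createParagraphBullets.
    """
    # One leading tab per nesting level. The API removes these tabs when it
    # creates the bullets, which shifts the indexes of any text that follows.
    text = ''.join('\t' * level + body + '\n' for level, body in items)
    end_index = start_index + len(text.encode('utf-16-le')) // 2
    return [
        {'insertText': {'location': {'index': start_index}, 'text': text}},
        {'createParagraphBullets': {
            'range': {'startIndex': start_index, 'endIndex': end_index},
            'bulletPreset': 'BULLET_DISC_CIRCLE_SQUARE',
        }},
    ]


reqs = nested_bullet_requests([
    (0, 'Completed Features'),
    (1, 'Performance optimization'),
    (2, 'Reduced load time by 40%'),
])
print(repr(reqs[0]['insertText']['text']))
```

Because the tabs are removed server-side, any cursor arithmetic for content inserted afterwards would need to subtract them. That cost is one reason a builder might prefer explicit indents.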

## 5. Create the Google Doc & Apply Formatting

This cell ties everything together:
1. Creates a new blank Google Doc via the API.
2. Parses the markdown into tokens.
3. Builds the batch update requests.
4. Executes them in a single API call.
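
Before the final step, it can help to inspect what the batch contains. The sketch below is a hypothetical dry-run helper, not part of the notebook: it counts requests by their top-level key (each request dict carries exactly one, such as `insertText`), so a list like the one returned by `builder.get_requests()` could be summarized without calling the API. The `sample` list is made-up data for illustration.

```python
from collections import Counter


def summarize_requests(requests):
    """Hypothetical helper: count batchUpdate requests by their top-level key."""
    return Counter(next(iter(req)) for req in requests)


# Made-up sample in the same shape the builder produces
sample = [
    {'insertText': {'location': {'index': 1}, 'text': 'Agenda\n'}},
    {'updateParagraphStyle': {
        'range': {'startIndex': 1, 'endIndex': 8},
        'paragraphStyle': {'namedStyleType': 'HEADING_2'},
        'fields': 'namedStyleType',
    }},
    {'insertText': {'location': {'index': 8}, 'text': 'Sprint Review\n'}},
]
print(dict(summarize_requests(sample)))
```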

In [None]:
def create_meeting_notes_doc(markdown_text, doc_title="Meeting Notes"):
    """
    End-to-end function: parse markdown, create a Google Doc, and format it.

    Args:
        markdown_text: The raw markdown string to convert.
        doc_title: Title for the new Google Doc.

    Returns:
        The URL of the created Google Doc.

    Raises:
        ValueError: If the markdown produces no tokens.
        RuntimeError: If building the Docs service or creating the document fails.
        HttpError: If the batchUpdate call fails (re-raised after logging).
    """
    # ---- Step 1: Parse the markdown ----
    print("📝 Parsing markdown...")
    tokens = parse_markdown(markdown_text)
    if not tokens:
        raise ValueError("Parsed markdown produced no tokens. Check the input text.")
    print(f"   Found {len(tokens)} tokens.")

    # ---- Step 2: Build the Google Docs API service ----
    print("🔗 Connecting to Google Docs API...")
    try:
        docs_service = build('docs', 'v1')
    except Exception as exc:
        raise RuntimeError(
            "Failed to build the Docs API service. "
            "Make sure you ran the authentication cell above."
        ) from exc

    # ---- Step 3: Create a new blank document ----
    print(f"📄 Creating Google Doc: '{doc_title}'...")
    try:
        doc = docs_service.documents().create(body={'title': doc_title}).execute()
    except HttpError as exc:
        raise RuntimeError(
            f"Failed to create document. HTTP {exc.resp.status}: {exc.content.decode()}"
        ) from exc

    doc_id = doc['documentId']
    doc_url = f"https://docs.google.com/document/d/{doc_id}/edit"
    print(f"   Document created! ID: {doc_id}")

    # ---- Step 4: Build batch update requests from tokens ----
    print("🔨 Building formatting requests...")
    builder = GoogleDocBuilder()
    builder.add_tokens(tokens)
    requests = builder.get_requests()
    print(f"   Generated {len(requests)} API requests.")

    # ---- Step 5: Execute batchUpdate ----
    if requests:
        print("🎨 Applying formatting to the document...")
        try:
            docs_service.documents().batchUpdate(
                documentId=doc_id,
                body={'requests': requests},
            ).execute()
        except HttpError as exc:
            print(f"\n⚠️  batchUpdate failed. HTTP {exc.resp.status}")
            print(f"   Detail: {exc.content.decode()[:500]}")
            print(f"   The document was created but may be incomplete: {doc_url}")
            raise

    print("\n✅ Done! Your formatted Google Doc is ready:")
    print(f"   🔗 {doc_url}")
    return doc_url


# ---- Run it! ----
document_url = create_meeting_notes_doc(
    markdown_text=MEETING_NOTES_MD,
    doc_title="Product Team Sync - May 15, 2023",
)

## 6. Open the Document

Click the link printed above, or run the cell below to render a clickable link.

In [None]:
from IPython.display import HTML, display

display(HTML(
    f'<h3><a href="{document_url}" target="_blank"'
    f' style="color:#1a73e8;">📄 Open your Google Doc</a></h3>'
))