πŸ—οΈ Stage 2: Semantic Logic Parsing β€” structured instruction extractionΒ #85

@groupthinking

Architecture Reference

Stage 2: Semantic Logic Parsing - 'LLMs transform raw video transcripts and visual cues into structured, actionable instructions.'

Current State

The pipeline extracts basic metadata (title, technologies, features list) but does NOT produce structured, actionable instructions. The extracted_info dict contains:

  • title (often defaults to a generic value)
  • technologies (list of strings)
  • features (list of strings)
  • tutorial_steps (raw text, not structured)
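
To make the current shape concrete, here is a minimal sketch of what extracted_info might look like today. The field names come from the list above; the values and the has_structured_steps helper are hypothetical and only illustrate why raw-text tutorial_steps gives Stage 3 nothing to follow.

```python
# Hypothetical snapshot of the current extracted_info dict.
# Keys match the issue text; all values are illustrative, not real pipeline output.
extracted_info = {
    "title": "Untitled Project",  # generic default
    "technologies": ["React", "Tailwind CSS"],
    "features": ["Header component", "Responsive layout"],
    # Raw text: no ordering, file paths, or dependencies that code can rely on.
    "tutorial_steps": "First create a React app, then install tailwind, then build the header...",
}


def has_structured_steps(info: dict) -> bool:
    """Return True only if tutorial_steps is a non-empty list of dicts with an 'action' key."""
    steps = info.get("tutorial_steps")
    return (
        isinstance(steps, list)
        and len(steps) > 0
        and all(isinstance(step, dict) and "action" in step for step in steps)
    )


# With the current raw-text format this check fails.
print(has_structured_steps(extracted_info))
```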

Gap

The architecture calls for 'structured, actionable instructions': step-by-step build plans that Stage 3 (code generation) can follow deterministically. Currently Stage 3 gets loose text and falls back to templates.

Implementation

  1. Add a structured output schema for Gemini's video analysis response (JSON mode)
  2. Extract ordered build steps with dependencies: 'Step 1: Create React app → Step 2: Install Tailwind → Step 3: Create Header component...'
  3. Each step should include: action, file path, code snippet (if visible), dependencies
  4. Output a BuildPlan JSON artifact that Stage 3 consumes
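
One possible shape for the BuildPlan artifact, sketched with standard-library dataclasses. The field names (id, action, file_path, code, depends_on), the action values, and the parse_build_plan helper are assumptions for illustration and are not defined anywhere in the pipeline yet. The sketch parses an already-returned JSON string and deliberately leaves out the Gemini call itself.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BuildStep:
    id: str                           # stable identifier, e.g. "step-1" (assumed convention)
    action: str                       # e.g. "run_command", "create_file" (hypothetical values)
    file_path: Optional[str] = None   # target file, if the step touches one
    code: Optional[str] = None        # code snippet, if visible in the video
    depends_on: List[str] = field(default_factory=list)  # ids of prerequisite steps


@dataclass
class BuildPlan:
    title: str
    steps: List[BuildStep]


def parse_build_plan(raw_json: str) -> BuildPlan:
    """Parse a JSON string (e.g. a JSON-mode model response) into a BuildPlan."""
    data = json.loads(raw_json)
    steps = [
        BuildStep(
            id=item["id"],
            action=item["action"],
            file_path=item.get("file_path"),
            code=item.get("code"),
            depends_on=list(item.get("depends_on", [])),
        )
        for item in data["steps"]
    ]
    return BuildPlan(title=data["title"], steps=steps)


# Illustrative payload mirroring the example steps in this issue.
sample = json.dumps({
    "title": "React landing page",
    "steps": [
        {"id": "step-1", "action": "run_command", "code": "npx create-react-app my-app"},
        {"id": "step-2", "action": "run_command", "code": "npm install tailwindcss",
         "depends_on": ["step-1"]},
        {"id": "step-3", "action": "create_file", "file_path": "src/Header.jsx",
         "depends_on": ["step-2"]},
    ],
})
plan = parse_build_plan(sample)
```

Keeping depends_on as a list of step ids, rather than relying on list position alone, lets Stage 3 verify ordering instead of trusting it.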

Acceptance Criteria

  • Video analysis returns a structured BuildPlan with ordered steps
  • Each step has: action type, target file, code content, prerequisites
  • Code generator consumes BuildPlan instead of raw text
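
The criteria above could be checked before Stage 3 runs. The sketch below assumes each step is a plain dict with the hypothetical keys id, action, file_path, code, and depends_on. It rejects incomplete steps, unknown prerequisites, and cycles, then returns the steps in a deterministic, prerequisite-respecting order using Kahn's algorithm.

```python
from collections import deque

# Hypothetical required keys, derived from the acceptance criteria.
REQUIRED_KEYS = ("id", "action", "file_path", "code", "depends_on")


def validate_and_order(steps):
    """Validate step dicts and return them ordered so prerequisites come first."""
    by_id = {}
    for step in steps:
        missing = [key for key in REQUIRED_KEYS if key not in step]
        if missing:
            raise ValueError(f"step is missing keys: {missing}")
        if step["id"] in by_id:
            raise ValueError(f"duplicate step id: {step['id']}")
        by_id[step["id"]] = step

    indegree = {}
    dependents = {step_id: [] for step_id in by_id}
    for step in steps:
        prerequisites = set(step["depends_on"])
        unknown = prerequisites - by_id.keys()
        if unknown:
            raise ValueError(f"{step['id']} depends on unknown steps: {sorted(unknown)}")
        indegree[step["id"]] = len(prerequisites)
        for dep in prerequisites:
            dependents[dep].append(step["id"])

    # Kahn's algorithm; the queue is seeded in input order so output is deterministic.
    ready = deque(step_id for step_id, count in indegree.items() if count == 0)
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(by_id[current])
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(ordered) != len(steps):
        raise ValueError("build plan contains a dependency cycle")
    return ordered


# Steps supplied out of order are returned in dependency order.
example_steps = [
    {"id": "step-3", "action": "create_file", "file_path": "src/Header.jsx",
     "code": None, "depends_on": ["step-2"]},
    {"id": "step-1", "action": "run_command", "file_path": None,
     "code": "npx create-react-app my-app", "depends_on": []},
    {"id": "step-2", "action": "run_command", "file_path": None,
     "code": "npm install tailwindcss", "depends_on": ["step-1"]},
]
ordered_ids = [step["id"] for step in validate_and_order(example_steps)]
```

A code generator could then iterate over the returned list and fall back to templates only when validation raises.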
