I used to browse Reddit, watching developers share their tools and receive incredibly detailed, technical feedback. These weren't casual comments: they were comprehensive paragraphs full of actionable insights, bug reports, feature requests, and architectural suggestions. The community would upvote the most valuable feedback, creating a natural signal for what mattered most.
I thought: Why not integrate this feedback directly into my development workflow? Instead of manually sifting through Reddit threads, what if my application could automatically:
- Monitor Reddit for feedback about my products
- Identify the most upvoted, valuable suggestions
- Generate Product Requirements Documents (PRDs) from user feedback
- Automatically create pull requests to address these issues
- Redeploy the application after merging changes
This vision became Node—a SaaS platform that transforms community feedback into code, creating a closed loop between user voices and product improvements.
Building Node taught me several critical lessons about building production AI systems:
Parsing Reddit threads isn't just about extracting text. I had to handle:
- Nested comment structures
- Markdown formatting
- Context preservation across comment threads
- Distinguishing between bug reports, feature requests, and general discussions
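For illustration, here's a minimal sketch of how the nested traversal can work, assuming Reddit's public JSON shape (a comment's `replies` field is either an empty string or a nested listing of children). The `FeedbackItem` shape here is simplified, not the production schema:

```typescript
// Illustrative: flatten Reddit's nested comment tree into a flat list,
// preserving depth and parent links so thread context survives flattening.
interface FeedbackItem {
  author: string;
  body: string;      // raw markdown, normalized in a later step
  upvotes: number;
  depth: number;
  parentId?: string;
}

function flattenComments(children: any[], depth = 0, out: FeedbackItem[] = []): FeedbackItem[] {
  for (const child of children) {
    if (child.kind !== "t1") continue; // "t1" = comment in Reddit's JSON
    const d = child.data;
    out.push({ author: d.author, body: d.body, upvotes: d.ups, depth, parentId: d.parent_id });
    // `replies` is "" for leaf comments, a nested listing otherwise
    if (d.replies && d.replies.data) {
      flattenComments(d.replies.data.children, depth + 1, out);
    }
  }
  return out;
}
```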
The clustering algorithm needed to understand semantic similarity, not just keyword matching. I learned that effective AI systems require careful prompt engineering and context management.
Generating production-ready code from natural language feedback is deceptively difficult. The system must:
- Understand the existing codebase architecture
- Generate code that fits existing patterns and conventions
- Handle edge cases and error scenarios
- Ensure type safety and follow linting rules
- Avoid breaking changes to critical systems (auth, billing, etc.)
I implemented a multi-stage pipeline:
- Planning: Analyze the issue and create a structured implementation plan
- Coding: Generate complete file contents (not patches)
- Verification: Run linting, type checking, and build processes
- Debugging: Use an AI agent to automatically fix errors iteratively
Early versions of Node could generate changes to any part of the codebase. This was dangerous—imagine automatically modifying authentication logic or payment processing based on user feedback!
I implemented strict safety constraints:
- Blocked paths: `.github/workflows`, `.env`, `auth/`, `billing/`, `payment/`, `secrets/`
- Risk flagging system that identifies potentially dangerous changes
- Conservative change strategy: only modify what's necessary
When generated code fails verification (linting errors, type errors, build failures), the system uses an intelligent debugging agent that:
- Diagnoses errors by category (TypeScript, ESLint, import errors, etc.)
- Gathers context from related files
- Plans fixes using multiple strategies (type annotations, null checks, optional chaining, etc.)
- Executes fixes and verifies them
- Reflects on failures to improve future attempts
This creates a self-correcting system that can handle complex error scenarios.
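Condensed into TypeScript, the loop looks roughly like this; `runVerification`, `diagnose`, `gatherContext`, `planFix`, and `applyFix` are stand-ins for the real agent internals, not the actual API:

```typescript
// Sketch of the self-correcting verify -> diagnose -> fix loop.
async function selfCorrect(workspace: string, maxAttempts = 3): Promise<boolean> {
  const reflections: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await runVerification(workspace);   // lint + tsc + build
    if (result.ok) return true;

    const diagnosis = diagnose(result.errors);         // bucket errors by category
    const context = await gatherContext(workspace, diagnosis.files);
    const fix = await planFix(diagnosis, context, reflections);
    await applyFix(workspace, fix);

    // Record what was tried so the next attempt can avoid repeating it
    reflections.push(`Attempt ${attempt}: ${fix.strategy} on ${fix.files.join(", ")}`);
  }
  return false; // surface remaining errors to the build logs for human review
}
```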
Node is built as a Next.js 15 application with the following key components:
```
┌─────────────────┐
│ Reddit Monitor │ → Ingests threads/subreddits
└────────┬────────┘
│
▼
┌─────────────────┐
│ Clustering AI │ → Groups similar feedback, assigns severity
└────────┬────────┘
│
▼
┌─────────────────┐
│ Dashboard │ → Review clusters, approve builds
└────────┬────────┘
│
▼
┌─────────────────┐
│ Build Pipeline │ → Generate PRs automatically
└────────┬────────┘
│
▼
┌─────────────────┐
│ GitHub │ → PR created, ready for merge
└─────────────────┘
```
The ingestion system handles two modes:
- Thread URL Mode: Directly parse Reddit thread URLs (no API needed)
- Subreddit Mode: Monitor entire subreddits using Reddit's API
```typescript
// Simplified ingestion flow
async function ingestThreadUrl(projectId: string, url: string) {
  const thread = await parseRedditThread(url, { maxComments: 500 });
  const items = threadToFeedbackItems(thread, projectId);
  await upsertFeedbackItems(items);
  // The first item is the post itself; the rest are comments
  return { success: true, postsIngested: 1, commentsIngested: items.length - 1 };
}
```

The system normalizes Reddit data into a consistent feedback item format, preserving metadata like upvotes, author, timestamps, and raw JSON for later analysis.
The clustering system groups similar feedback using AI:
- Grouping: Each Reddit post becomes its own cluster (simplified approach for reliability)
- Type Detection: Classifies feedback as `bug`, `feature`, or `ux`
- Severity Calculation: Based on upvotes, engagement, and comment count
- AI Summarization: Generates concise summaries, reproduction steps, and acceptance criteria
The severity score uses a weighted formula:
$$\text{severity} = \min\left(10, \frac{\text{upvotes} \times 2 + \text{comments} + \text{engagement}}{10}\right)$$
Where engagement includes both upvotes and comment count.
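In code, the score is a couple of lines:

```typescript
// Severity per the formula above: min(10, (upvotes*2 + comments + engagement) / 10),
// where engagement = upvotes + comments.
function severityScore(upvotes: number, comments: number): number {
  const engagement = upvotes + comments;
  return Math.min(10, (upvotes * 2 + comments + engagement) / 10);
}

// e.g. a post with 10 upvotes and 5 comments:
// min(10, (20 + 5 + 15) / 10) = 4
```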
The build pipeline orchestrates the entire PR generation process:
Stage 1: Sandboxing
- Create isolated workspace
- Clone repository
- Create feature branch
- Install dependencies
Stage 2: Planning (`lib/openrouter/planner.ts`)
- Analyze cluster context (title, summary, repro steps, acceptance criteria)
- Introspect repository structure
- Generate implementation plan with files to modify
- Identify risks and complexity
Stage 3: Coding (`lib/openrouter/coder.ts`)
- Read current file contents
- Generate complete updated file contents (not patches)
- Handle large files with intelligent truncation
- Ensure code follows existing patterns
Stage 4: Verification
- Run linting (`npm run lint`)
- Run type checking (`tsc --noEmit`)
- Run build process (`npm run build`)
- If any step fails, trigger the debugging agent
Stage 5: Debugging Agent (`lib/agent/agent.ts`)
- Diagnose errors by category
- Gather context from related files
- Plan fixes using multiple strategies
- Execute fixes iteratively (up to 3 attempts)
- Self-reflect on failures
Stage 6: Pushing & PR Creation
- Commit changes with descriptive messages
- Push branch to GitHub
- Create pull request with:
- Title: Generated from cluster title
- Body: Includes summary, evidence, acceptance criteria, and file changes
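Stitched together, the orchestration looks roughly like this; the stage functions are simplified stand-ins for the real pipeline modules:

```typescript
// Condensed sketch of the six-stage pipeline (types and helpers are illustrative).
async function runBuildPipeline(cluster: Cluster, repo: RepoConfig) {
  const sandbox = await createSandbox(repo);                // Stage 1: clone, branch, install
  const plan = await planImplementation(cluster, sandbox);  // Stage 2: files, risks, complexity
  await generateCode(plan, sandbox);                        // Stage 3: full file contents
  let check = await verify(sandbox);                        // Stage 4: lint, tsc, build
  if (!check.ok) {
    check = await runDebugAgent(sandbox, check.errors);     // Stage 5: iterative fixes
    if (!check.ok) throw new Error("Build failed after debugging attempts");
  }
  return pushAndOpenPr(sandbox, cluster);                   // Stage 6: commit, push, PR
}
```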
The sandbox provides an isolated environment for code generation:
- Isolation: Each build runs in a temporary directory
- Rollback Support: File snapshots allow reverting changes
- Command Execution: Run npm/yarn commands with timeout protection
- Repository Introspection: Analyze project structure, dependencies, and configuration
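A minimal sketch of what sandbox creation can look like using Node's built-in APIs; the snapshot/rollback mechanics here are simplified relative to the real implementation:

```typescript
import { mkdtemp, cp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// Create an isolated workspace: temp dir, shallow clone, feature branch,
// dependency install with a timeout, and a snapshot for rollback.
async function createSandbox(repoUrl: string, branch: string) {
  const dir = await mkdtemp(join(tmpdir(), "node-build-"));
  await exec("git", ["clone", "--depth", "1", repoUrl, dir]);
  await exec("git", ["checkout", "-b", branch], { cwd: dir });
  await exec("npm", ["install"], { cwd: dir, timeout: 300_000 }); // timeout protection

  const snapshot = join(tmpdir(), `snapshot-${Date.now()}`);
  await cp(dir, snapshot, { recursive: true }); // rollback point
  return {
    dir,
    rollback: async () => {
      await rm(dir, { recursive: true, force: true });
      await cp(snapshot, dir, { recursive: true });
    },
  };
}
```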
The dashboard provides:
- Clusters View: List all feedback clusters with severity, type, and status
- Cluster Detail: View full context, evidence, and generated PRD
- Builds Queue: Monitor build jobs in real-time with logs
- Integrations: Connect GitHub repositories
- Settings: Configure Reddit sources and project settings
The UI uses:
- Framer Motion for smooth animations
- Recharts for data visualization
- Tailwind CSS with dark theme and glass morphism effects
- Radix UI primitives for accessible components
The full stack:
- Framework: Next.js 15 (App Router)
- Language: TypeScript
- Database: Supabase (PostgreSQL)
- AI: OpenRouter API (access to multiple LLM providers)
- Version Control: GitHub API
- Styling: Tailwind CSS with CSS Variables
- UI Components: Custom shadcn/ui-style components
- Animation: Framer Motion
- Charts: Recharts
Problem: When generating code for large files, the LLM context window would overflow, causing incomplete or malformed code.
Solution: Implemented intelligent file truncation that:
- Always preserves imports (first section)
- Always preserves exports (last section)
- Truncates middle sections while maintaining context
- Limits files to ~8000 characters in prompts
```typescript
// Preserve imports and exports, truncate the middle intelligently
function truncateFileContent(content: string, maxChars: number = 8000): string {
  if (content.length <= maxChars) return content;
  const importSection = extractImports(content);  // first section of the file
  const exportSection = extractExports(content);  // last section of the file
  const remainingSpace = maxChars - importSection.length - exportSection.length;
  const middle = truncateMiddle(content, remainingSpace);
  return `${importSection}\n...\n${middle}\n...\n${exportSection}`;
}
```

Problem: LLMs often generated code using Pages Router patterns (`next/document`, `next/head`) instead of App Router patterns.
Solution: Added explicit system prompts that:
- Clearly distinguish App Router from Pages Router
- Provide correct code examples
- Block incorrect imports in the planner
- Include App Router rules in all code generation prompts
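As an illustration, the rules injected into generation prompts look something like this (paraphrased, not the verbatim production prompt):

```typescript
// Illustrative excerpt of App Router rules prepended to code generation prompts.
const APP_ROUTER_RULES = `
You are generating code for a Next.js 15 App Router project.
- NEVER import from "next/document" or use "next/head"; use the metadata API instead.
- Pages live in app/**/page.tsx; layouts in app/**/layout.tsx.
- Server Components are the default; add "use client" only when the file uses
  hooks, event handlers, or browser APIs.
- Route handlers live in app/**/route.ts, not pages/api/**.
`;
```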
Problem: When generated code failed verification, the error output could be overwhelming (hundreds of TypeScript errors, ESLint violations, etc.).
Solution: Built a sophisticated error diagnosis engine that:
- Categorizes errors (TypeScript, ESLint, import, null/undefined, React hooks, etc.)
- Identifies root causes (often import or type definition issues)
- Prioritizes fixes (fix root cause first, then cascading errors)
- Uses multiple fix strategies (type annotations, optional chaining, null checks, etc.)
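A simplified sketch of the categorization and prioritization logic; the regex patterns here are illustrative, matched against `tsc` and ESLint output:

```typescript
type ErrorCategory = "import" | "typescript" | "null-undefined" | "react-hooks" | "eslint" | "other";

// Bucket a single error line by pattern matching on compiler/linter output.
function categorizeError(line: string): ErrorCategory {
  if (/Cannot find module|has no exported member/.test(line)) return "import";
  if (/possibly 'null'|possibly 'undefined'/.test(line)) return "null-undefined";
  if (/react-hooks\//.test(line)) return "react-hooks";
  if (/error TS\d+/.test(line)) return "typescript";
  if (/eslint/i.test(line)) return "eslint";
  return "other";
}

// Fix root causes (usually import/type issues) before cascading errors.
function prioritize(errors: string[]): string[] {
  const order: ErrorCategory[] = ["import", "typescript", "null-undefined", "react-hooks", "eslint", "other"];
  return [...errors].sort((a, b) => order.indexOf(categorizeError(a)) - order.indexOf(categorizeError(b)));
}
```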
Problem: Reddit API has strict rate limits, and LLM API calls are expensive.
Solution:
- Reddit: Implemented thread URL parsing (no API needed for direct links)
- LLM: Used low temperature settings (0.1-0.3) for more deterministic outputs
- Caching: Cache repository introspection results
- Batching: Process multiple feedback items in single clustering runs
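The introspection cache is conceptually just TTL-based memoization; a sketch, with `introspectRepo` and the `RepoIntrospection` shape as hypothetical stand-ins for the real module:

```typescript
// TTL-based memoization of repository introspection results.
type RepoIntrospection = { framework: string; dependencies: string[] };

const cache = new Map<string, { data: RepoIntrospection; expires: number }>();

async function introspectRepoCached(repoKey: string, ttlMs = 10 * 60_000): Promise<RepoIntrospection> {
  const hit = cache.get(repoKey);
  if (hit && hit.expires > Date.now()) return hit.data; // skip redundant analysis
  const data = await introspectRepo(repoKey);           // expensive: walks the repo tree
  cache.set(repoKey, { data, expires: Date.now() + ttlMs });
  return data;
}
```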
Problem: Automatically modifying codebases is dangerous. One wrong change could break authentication, billing, or critical infrastructure.
Solution: Multi-layered safety system:
- Path Blocking: Hard-coded list of dangerous paths
- Risk Flagging: AI identifies potentially risky changes
- Conservative Planning: Only modify what's necessary
- Verification: Run full build and lint before creating PR
- Human Approval: All PRs require manual review before merging
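The path guard itself is deliberately simple; a sketch, with the list mirroring the blocked paths above:

```typescript
// Hard-coded guard: reject any planned change that touches a sensitive path.
const BLOCKED_PATHS = [".github/workflows", ".env", "auth/", "billing/", "payment/", "secrets/"];

function isPathBlocked(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, "/").toLowerCase();
  return BLOCKED_PATHS.some((blocked) => normalized.includes(blocked.toLowerCase()));
}

// The planner rejects any plan containing a blocked path before code generation starts.
```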
Problem: LLMs need sufficient context to generate good code, but too much context causes token overflow and increased costs.
Solution: Intelligent context engine that:
- Prioritizes files by relevance (error sources, imports, related files)
- Limits context to top 8 most relevant files
- Uses file size and error proximity as relevance signals
- Caches file contents to avoid redundant reads
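A sketch of the relevance scoring; the weights here are illustrative, not the production values:

```typescript
// Rank candidate files: error sources first, then files imported by error
// sources, with a mild penalty for very large files. Keep the top N.
interface FileCandidate {
  path: string;
  sizeBytes: number;
  isErrorSource: boolean;
  isImportedByErrorFile: boolean;
}

function selectContextFiles(candidates: FileCandidate[], limit = 8): FileCandidate[] {
  const score = (f: FileCandidate) =>
    (f.isErrorSource ? 100 : 0) +
    (f.isImportedByErrorFile ? 50 : 0) -
    f.sizeBytes / 10_000;
  return [...candidates].sort((a, b) => score(b) - score(a)).slice(0, limit);
}
```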
The clustering algorithm uses several mathematical concepts:
$$S = \min\left(10, \frac{U \times 2 + C + E}{10}\right)$$

Where:
- $S$ = severity score (0-10)
- $U$ = total upvotes
- $C$ = total comments
- $E$ = engagement score (upvotes + comments)
For future semantic clustering (not yet implemented):
$$\text{similarity}(A, B) = \frac{\text{embedding}(A) \cdot \text{embedding}(B)}{\lVert\text{embedding}(A)\rVert \times \lVert\text{embedding}(B)\rVert}$$

Clusters are formed when $\text{similarity} > \theta$, where $\theta = 0.3$ is the similarity threshold.
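For completeness, cosine similarity is only a few lines of TypeScript:

```typescript
// Cosine similarity over embedding vectors, as in the formula above.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Two feedback items would join the same cluster when
// cosineSimilarity(embeddingA, embeddingB) > 0.3
```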
Looking ahead, several improvements are planned:
- Semantic Clustering: Use embeddings to group truly similar feedback across different posts
- Multi-LLM Strategy: Try different models for different tasks (planning vs. coding vs. debugging)
- Incremental Updates: Instead of regenerating entire files, generate focused patches
- Test Generation: Automatically generate unit tests for generated code
- Feedback Loop: Learn from which PRs get merged vs. rejected to improve future generations
Node represents a vision where community feedback directly drives product development. By combining Reddit monitoring, AI-powered analysis, and automated code generation, it creates a seamless pipeline from user voices to deployed code.
The journey taught me that building production AI systems requires:
- Careful safety constraints
- Iterative debugging capabilities
- Intelligent context management
- Human oversight at critical decision points
While the system isn't perfect (and likely never will be), it demonstrates the potential of AI-assisted development workflows. The future of software development may involve AI agents that can understand user needs, plan implementations, write code, debug errors, and deploy changes, all while maintaining safety and code quality.
Built using Next.js, TypeScript, Supabase, and OpenRouter AI