
feat: Enhanced Task Generation with Scheduling and Templates#301

Merged
frankbria merged 18 commits into main from feature/enhanced-task-generation
Jan 25, 2026

Conversation

@frankbria
Owner

@frankbria frankbria commented Jan 24, 2026

Summary

  • Add effort estimation to task decomposition with hours-based tracking
  • Implement critical path analysis and dependency conflict detection
  • Add CPM-based task scheduling with multi-agent support
  • Create task template system with 7 builtin templates
  • Add CLI commands for schedule management (cf schedule show/predict/bottlenecks)
  • Add CLI commands for templates (cf templates list/show/apply)
  • Add API endpoints for schedule and templates (Phase 6)

Changes

Phase 1: Task Decomposition

  • Tasks now support estimated_hours field
  • Task decomposer generates effort estimates

Phase 2: Dependency Analysis

  • Critical path analysis identifies longest task chains
  • Conflict detection finds circular dependencies
  • DependencyResolver updated for v2 API compatibility
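The critical-path idea above can be sketched in a few lines: the critical path is the longest-duration chain through the dependency DAG. This is a minimal illustrative sketch, not the PR's actual `calculate_critical_path()` implementation; the data shapes (`durations` dict, `depends_on` map) are assumptions.

```python
from functools import lru_cache

def critical_path(durations, depends_on):
    """Longest-duration chain through a dependency DAG.

    durations: {task_id: hours}; depends_on: {task_id: [prerequisite ids]}.
    Returns (total_hours, path_as_list). Assumes the graph is acyclic.
    """
    @lru_cache(maxsize=None)
    def longest(task):
        prereqs = depends_on.get(task, [])
        if not prereqs:
            return durations[task], (task,)
        # Pick the prerequisite chain with the largest accumulated hours.
        hours, path = max(longest(p) for p in prereqs)
        return hours + durations[task], path + (task,)

    total, path = max(longest(t) for t in durations)
    return total, list(path)
```

With tasks a (2h), b (3h), and c (1h, depending on both), the critical path is b → c at 4 hours.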

Phase 3: Task Scheduling

  • TaskScheduler with CPM (Critical Path Method) algorithm
  • Multi-agent parallel scheduling support
  • Completion date prediction
  • Bottleneck identification

Phase 4: Task Templates

  • 7 builtin templates (api-endpoint, ui-component, database-migration, etc.)
  • Template categories (backend, frontend, database, testing, infrastructure)
  • Context variable substitution
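Context variable substitution can be sketched with the stdlib's `string.Template`; this is an illustrative approximation, not the PR's actual `TaskTemplateManager` code, and the `${name}` placeholder syntax is an assumption.

```python
from string import Template

def apply_context(template_tasks, context):
    """Substitute ${var}-style placeholders in template task text.

    template_tasks: list of {"title": ..., "description": ...} dicts;
    context: substitution values. safe_substitute leaves unknown
    placeholders intact instead of raising.
    """
    rendered = []
    for t in template_tasks:
        rendered.append({
            "title": Template(t["title"]).safe_substitute(context),
            "description": Template(t["description"]).safe_substitute(context),
        })
    return rendered
```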

Phase 5: CLI Commands

  • cf schedule show - Display project schedule
  • cf schedule predict - Predict completion date
  • cf schedule bottlenecks - Identify bottlenecks
  • cf templates list - List available templates
  • cf templates show <id> - Show template details
  • cf templates apply <id> - Create tasks from template

Phase 6: API Endpoints

  • Schedule endpoints: GET schedule, predict, bottlenecks
  • Templates endpoints: list, categories, get, apply

Test plan

  • Run uv run pytest -m v2 - 795 tests pass
  • Run uv run pytest tests/planning/ - All scheduling/template tests pass
  • Run uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py - CLI tests pass
  • Run uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py - API tests pass
  • Verify cf schedule show works in initialized workspace
  • Verify cf templates list shows 7 builtin templates

Summary by CodeRabbit

  • New Features

    • Project scheduling: timeline, multi‑agent assignments, optimization, completion prediction, bottleneck & critical‑path analysis.
    • Task templates: browse, view, and apply builtin templates with context substitution and estimated totals.
    • Effort estimation: tasks now carry estimated hours, complexity, uncertainty, and resource hints.
  • API / CLI / UI

    • New CLI commands and HTTP endpoints for schedule and template operations (show, predict, bottlenecks, list, show, apply).
  • Bug Fixes

    • Clearer, typed error responses for invalid phase transitions and missing projects.
  • Chores

    • DB migrations and a transaction API to support new task fields.
  • Tests

    • Extensive unit/integration tests for scheduling, templates, decomposition, and dependency analysis.


Test User added 8 commits January 23, 2026 22:38
Enhance TaskDecomposer to request and parse effort estimation data from Claude:
- Add estimated_hours, complexity_score, uncertainty_level, resource_requirements
  fields to Task model
- Update build_decomposition_prompt() to request complexity (1-5), hours estimate,
  and uncertainty level for each task
- Add parse helper methods: _parse_complexity(), _parse_estimated_hours(),
  _parse_uncertainty() with validation and normalization
- Update database schema with new columns and backward-compatible migration
- Update task repository ALLOWED_TASK_FIELDS, _row_to_task(), and create_task_with_issue()

All 169 planning and persistence tests pass with no regressions.
Enhance DependencyResolver with advanced dependency analysis features:
- calculate_critical_path() using forward/backward pass algorithm
- calculate_task_slack() for float time calculation
- identify_parallel_opportunities() for execution wave grouping
- detect_dependency_conflicts() for bottleneck and long chain detection
- suggest_dependency_resolution() for resolution recommendations

Add dataclasses: TaskTiming, CriticalPathResult, DependencyConflict, ResolutionSuggestion

All 486 agent tests pass with no regressions.
Implements intelligent task scheduling with:
- schedule_tasks(): Assign start times based on dependencies and resources
- optimize_schedule(): Suggest parallelization improvements
- predict_completion_date(): Estimate project end with confidence intervals
- identify_bottlenecks(): Find duration and dependency bottlenecks

Uses Critical Path Method concepts with multi-agent resource allocation.
16 TDD tests passing.
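The wave-based scheduling idea behind schedule_tasks() can be sketched with Kahn's algorithm, grouping tasks into waves where each wave depends only on earlier waves; this is an illustrative sketch under assumed data shapes, not the PR's TaskScheduler code.

```python
def execution_waves(depends_on, all_tasks):
    """Group tasks into dependency-ordered parallel execution waves.

    depends_on: {task: set(prerequisite ids)}; all_tasks: iterable of ids.
    Raises ValueError if a dependency cycle prevents progress.
    """
    remaining = {t: set(depends_on.get(t, set())) for t in all_tasks}
    waves = []
    while remaining:
        # Tasks whose prerequisites are all in earlier waves are ready now.
        ready = sorted(t for t, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("circular dependency detected")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves
```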
Adds schedule_project_tasks() method to LeadAgent that:
- Loads project tasks and builds dependency graph
- Extracts estimated_hours with 1.0h default for missing estimates
- Uses TaskScheduler for CPM-based scheduling
- Supports configurable agent parallelism

10 TDD tests added covering scheduling integration.
…se 4)

Implements a task template system with:
- TaskTemplate and TemplateTask dataclasses
- TaskTemplateManager for managing and applying templates
- 7 builtin templates: api-endpoint, ui-component, database-migration,
  test-suite, feature-flag, bug-fix, refactoring
- Template application with context substitution
- Support for custom template registration

Each template includes predefined tasks with effort estimates, complexity
scores, and dependency relationships.

27 TDD tests passing.
Phase 5 Enhanced CLI Commands:
- Add schedule subcommand with show, predict, bottlenecks commands
- Add templates subcommand with list, show, apply commands
- Update DependencyResolver to support v2 Task model (list-based depends_on)
- Add estimated_hours field to tasks schema and create() API
- Convert schedule/templates commands from legacy to v2 API

Technical changes:
- DependencyResolver now handles both list and string depends_on formats
- DependencyResolver handles both enum and string task status
- Tasks table gains estimated_hours column with migration support
- tasks.create() accepts estimated_hours parameter
- SELECT queries updated to include estimated_hours column

All 795 v2 tests pass.
Phase 6 API Endpoints:
- Add schedule router with GET endpoints:
  - /api/schedule/{project_id} - Get project schedule
  - /api/schedule/{project_id}/predict - Predict completion date
  - /api/schedule/{project_id}/bottlenecks - Identify bottlenecks
- Add templates router with endpoints:
  - GET /api/templates/ - List all templates
  - GET /api/templates/categories - List categories
  - GET /api/templates/{template_id} - Get template details
  - POST /api/templates/{project_id}/apply - Apply template to project
- Register both routers in server.py
- Add unit tests for router imports and endpoint existence

All 11 API tests pass.
- Update status to v2 Phase 4 Complete
- Add task scheduling (CPM-based), templates, and effort estimation to "What's Working Now"
- Update last modified date
@frankbria frankbria linked an issue Jan 24, 2026 that may be closed by this pull request
@coderabbitai
Contributor

coderabbitai bot commented Jan 24, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds Phase‑4 planning: effort‑estimation fields, critical‑path/slack analysis, dependency conflict detection and resolution suggestions, a wave‑based TaskScheduler (scheduling, optimization, prediction, bottleneck analysis), task templates, CLI/UI surfaces, DB migrations and transaction helper, LeadAgent scheduling integration, custom phase exceptions, and extensive tests. (49 words)

Changes

Cohort / File(s) Summary
Docs / Metadata
CLAUDE.md
Updated phase/date and documented scheduling, templates, and effort‑estimation features.
Core models & task CRUD
codeframe/core/models.py, codeframe/core/tasks.py, codeframe/persistence/repositories/task_repository.py
Added task effort fields (estimated_hours, complexity_score, uncertainty_level, resource_requirements); serialization, create/select SQL, repository whitelist and method signatures updated; row parsing made backward‑compatible.
Schema & Workspace migrations
codeframe/core/workspace.py, codeframe/persistence/schema_manager.py
v2 schema and migrations updated to add new task columns with constraints and idempotent add‑column logic.
Dependency analysis
codeframe/agents/dependency_resolver.py, tests/agents/test_dependency_resolver_critical_path.py
Added dataclasses (TaskTiming, CriticalPathResult, DependencyConflict, ResolutionSuggestion) and methods: critical path, slack, waves/parallel identification, conflict detection, and resolution suggestion logic; enhanced dependency parsing and validation.
Task scheduling engine
codeframe/planning/task_scheduler.py, tests/planning/test_task_scheduler.py
New TaskScheduler and dataclasses (TaskAssignment, ScheduleResult, CompletionPrediction, BottleneckInfo, ScheduleOptimization); implements wave‑based scheduling, optimization, completion prediction, and bottleneck identification.
Task templates system
codeframe/planning/task_templates.py, codeframe/cli/app.py, codeframe/ui/routers/templates.py, tests/planning/test_task_templates.py, tests/cli/*, tests/api/*
Added BUILTIN_TEMPLATES and TaskTemplateManager; CLI templates subapp (list/show/apply) and FastAPI templates router implemented; apply flow materializes tasks including effort fields.
Task decomposition
codeframe/planning/task_decomposer.py, tests/planning/test_task_decomposer_effort.py
Claude parsing refactor to extract complexity, estimated_hours, uncertainty; structured parsing helpers; Task creation updated to include effort fields; enforces task count bounds.
LeadAgent integration
codeframe/agents/lead_agent.py, tests/agents/test_lead_agent_scheduling.py
LeadAgent now initializes TaskScheduler and exposes schedule_project_tasks(agents_available) that loads tasks, builds graph, and delegates scheduling; integration tests added.
CLI surface
codeframe/cli/app.py, tests/cli/*
Added Typer subapps schedule and templates with commands: `schedule show
API routers & server
codeframe/ui/routers/schedule.py, codeframe/ui/routers/templates.py, codeframe/ui/server.py, tests/api/*
New FastAPI routers for schedule and templates with response models and endpoints (schedule, predict, bottlenecks, templates list/apply); server includes routers.
Phase manager / error handling
codeframe/core/phase_manager.py, codeframe/ui/routers/tasks.py, tests/core/test_phase_manager.py
Added ProjectNotFoundError and InvalidPhaseTransitionError; tasks router maps them to 404/400 and tests updated to expect typed exceptions.
Database helper
codeframe/persistence/database.py
Added synchronous transaction context manager for explicit transaction scopes with nested‑transaction guard and rollback/commit semantics; switched lock to RLock.
Tests — broad coverage
tests/agents/*, tests/planning/*, tests/cli/*, tests/api/*, tests/core/*
Extensive new tests covering dependency resolver (critical path/slack/conflicts), TaskScheduler, templates manager and API, task decomposition effort parsing, CLI commands, LeadAgent scheduling integration, and phase manager exceptions.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant LeadAgent
    participant Tasks_DB
    participant DependencyResolver
    participant TaskScheduler

    User->>LeadAgent: schedule_project_tasks(agents_available=2)
    LeadAgent->>Tasks_DB: load_all_tasks()
    Tasks_DB-->>LeadAgent: tasks_with_estimates
    LeadAgent->>DependencyResolver: build_dependency_graph(tasks)
    DependencyResolver-->>LeadAgent: dependency_graph
    LeadAgent->>TaskScheduler: schedule_tasks(tasks, durations, resolver, agents=2)
    TaskScheduler->>DependencyResolver: identify_parallel_opportunities()
    DependencyResolver-->>TaskScheduler: waves_by_level
    TaskScheduler->>TaskScheduler: assign_tasks_to_agents()
    TaskScheduler-->>LeadAgent: ScheduleResult(assignments,timeline,total_duration)
    LeadAgent-->>User: schedule_result
sequenceDiagram
    participant User
    participant CLI
    participant TaskTemplateManager
    participant LeadAgent
    participant Tasks_DB

    User->>CLI: templates apply --template api-endpoint --workspace .
    CLI->>TaskTemplateManager: apply_template(template_id, context, issue_number)
    TaskTemplateManager-->>CLI: generated_task_dicts
    CLI->>LeadAgent: create_tasks_from_dicts(task_dicts)
    LeadAgent->>Tasks_DB: insert_tasks_with_effort_fields (transaction)
    Tasks_DB-->>LeadAgent: created_task_ids
    LeadAgent-->>CLI: created_task_summary
    CLI-->>User: "Created N tasks from template"

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Poem

🐇 I nibbled deadlines, counted hours and hops,
I hopped the critical path and cleared the props,
Templates sowed tasks that knew their worth,
Agents danced in waves to shorten project mirth,
Phase‑four carrots bloom — a rabbit’s tidy hearth!

🚥 Pre-merge checks: ✅ 3 passed

  • Description Check — ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check — ✅ Passed: The title clearly and concisely summarizes the main feature additions: task scheduling and template management alongside enhanced task generation. It accurately reflects the primary objectives of the PR.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 98.99%, which is sufficient. The required threshold is 80.00%.



- Remove extraneous f-prefix from string literal
- Remove unused imports (dataclass, Optional)
@macroscopeapp
Contributor

macroscopeapp bot commented Jan 24, 2026

Add task effort estimation to decomposition, schedule tasks via new /api/schedule and CLI commands, and expose reusable task templates via /api/templates with database migrations for estimated_hours (REAL), complexity_score (INTEGER 1–5), and uncertainty_level (TEXT)

Introduce effort fields on Task and parse them in TaskDecomposer, implement TaskScheduler with dependency-aware assignments and predictions, add FastAPI routers for scheduling and templates, extend CLI with schedule commands, and migrate persistence to store effort metadata.

📍Where to Start

Start with scheduling flow in get_project_schedule in schedule.py, then review TaskScheduler.schedule_tasks in task_scheduler.py and effort parsing in TaskDecomposer.parse_claude_response in task_decomposer.py.


Macroscope summarized 86f64d2.

@github-actions
Contributor

Code Review: PR #301 - Enhanced Task Generation with Scheduling and Templates

Overall, this is a well-structured PR that implements v2 Phase 4 (enhanced task generation). The code follows the established architecture patterns and adds significant functionality for project planning and scheduling.

✅ Strengths

  1. Clean Architecture: Code follows v2 patterns with proper separation between core, CLI, and API layers
  2. Idempotent Migrations: Database migrations are safely designed with idempotent add-column-if-not-exists guards
  3. Security: SQL injection protection in schema_manager.py:119-125
  4. Comprehensive Tests: 4 new test files covering scheduler, templates, critical path, and decomposer
  5. Headless Operation: CLI commands work independently of FastAPI server as required
  6. Documentation: Good docstrings and type hints throughout
  7. Builtin Templates: 7 practical templates covering common development patterns

⚠️ Issues to Address

1. Potential Division by Zero (Moderate)

File:

completed_percentage = (
    (completed_hours / total_hours * 100) if total_hours > 0 else 100
)

When total_hours is 0 (no tasks, or all durations are 0), this returns 100%, which is misleading. It should return 0% or handle empty schedules explicitly.

Recommendation: Return 0% when there are no tasks with durations, or add an explicit empty-schedule check at the start of the prediction logic.
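A guarded version of the calculation could look like this; the helper name is illustrative, not the PR's actual code.

```python
def completion_percentage(completed_hours: float, total_hours: float) -> float:
    """Percent complete; an empty schedule reads as 0% rather than 100%."""
    if total_hours <= 0:
        return 0.0
    return completed_hours / total_hours * 100
```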

2. Inconsistent Null/Default Handling (Low)

File:

duration = getattr(task, "estimated_hours", None)
if duration is None or duration <= 0:
    duration = 1.0

This defaults to 1.0 for missing/negative estimates, but CLI code (app.py:743) has similar logic. This inconsistency could lead to different behavior between CLI and API for the same data. Consider centralizing this logic.
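One way to centralize the fallback is a single shared helper that both CLI and API call; this is a sketch, and the names `effective_hours` / `DEFAULT_TASK_HOURS` are hypothetical.

```python
DEFAULT_TASK_HOURS = 1.0  # single place to change the fallback estimate

def effective_hours(task, default: float = DEFAULT_TASK_HOURS) -> float:
    """Return the task's estimate, falling back for missing/invalid values."""
    hours = getattr(task, "estimated_hours", None)
    if hours is None or hours <= 0:
        return default
    return hours
```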

3. Status String Comparison (Low)

File:

if task.status.value == "DONE":

Using string comparison on status values. Since task.status is an enum, comparing enum members directly would be safer. The same pattern also appears in other places, such as app.py:3836, and the code should be consistent.
4. Missing Complexity Score Propagation (Low)

File:

estimated_hours=task_dict.get("estimated_hours"),

When applying templates, complexity_score and uncertainty_level are not being passed to tasks.create(). The template task dictionary includes these fields (see task_templates.py:490-498), but they're not being used when creating tasks.

Recommendation: Add these fields:

estimated_hours=task_dict.get("estimated_hours"),
complexity_score=task_dict.get("complexity_score"),
uncertainty_level=task_dict.get("uncertainty_level"),

5. Performance: O(n²) Task Assignment Loop (Low - Potential)

File:

for agent_idx, agent_end in enumerate(agent_end_times):
    potential_start = max(earliest_start, agent_end)
    if potential_start < best_start:
        best_start = potential_start
        best_agent = agent_idx

For large numbers of tasks and agents, this nested loop could be slow. Consider using a priority queue or heap for finding the earliest available agent more efficiently.
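A heap-based variant of earliest-available-agent selection could look like the following sketch; the function name and data shapes are assumptions, not the PR's TaskScheduler API.

```python
import heapq

def assign_tasks(task_durations, earliest_starts, agents):
    """Greedy earliest-available-agent assignment via a min-heap.

    Replaces an O(agents) scan per task with O(log agents) heap ops.
    task_durations: ordered {task: hours}; earliest_starts: {task: hour}.
    Returns {task: (agent_idx, start_time)}.
    """
    # Heap of (time the agent becomes free, agent index).
    free_at = [(0.0, i) for i in range(agents)]
    heapq.heapify(free_at)
    assignments = {}
    for task, duration in task_durations.items():
        agent_end, agent_idx = heapq.heappop(free_at)
        start = max(earliest_starts.get(task, 0.0), agent_end)
        assignments[task] = (agent_idx, start)
        heapq.heappush(free_at, (start + duration, agent_idx))
    return assignments
```

Popping the earliest-free agent minimizes max(earliest_start, agent_end), so the result matches the linear scan while scaling better with many agents.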

🔍 Test Coverage

The test coverage appears comprehensive with new test files:

  • tests/planning/test_task_scheduler.py (450 lines)
  • tests/planning/test_task_templates.py (367 lines)
  • tests/agents/test_dependency_resolver_critical_path.py (369 lines)
  • tests/cli/test_schedule_commands.py (123 lines)
  • tests/cli/test_templates_commands.py (106 lines)

However, I couldn't verify test results as pytest/ruff are not available in this environment. Please verify:

uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py
uv run ruff check codeframe/planning/ codeframe/cli/app.py

✅ Security Considerations

  • No security issues identified
  • SQL injection protection in schema migrations
  • Input validation seems appropriate for the feature scope

✅ Architecture Compliance

Code follows v2 architecture guidelines:

  • ✅ Core is headless (no FastAPI imports in planning/)
  • ✅ CLI works without server
  • ✅ API routes are thin adapters over core logic
  • ✅ Database migrations are idempotent

📋 Summary

This is a substantial and well-implemented PR that adds valuable scheduling and template functionality. The issues identified are relatively minor and mostly related to edge case handling and consistency.

Recommended Actions Before Merge:

  1. Fix division by zero edge case (issue #1)
  2. Add complexity_score and uncertainty_level propagation in templates apply endpoint (issue #4)
  3. Consolidate null/default handling for estimated_hours (issue #2)
  4. Verify all tests pass
  5. Consider performance optimization for large task sets (issue #5)

Issue #3 (status comparison) is a code quality improvement that could be addressed in a follow-up PR.


Review based on PR files with 4963 additions across 24 files

Comment on lines +399 to +401
predicted_date = start_date + timedelta(days=days_needed)
early_date = start_date + timedelta(days=early_days)
late_date = start_date + timedelta(days=late_days)

The predicted dates are calculated from start_date, but days_needed represents remaining work. If the project is 80% done, adding 1 remaining day to start_date gives a past date. Consider using datetime.now() as the base for these calculations.

Suggested change
-predicted_date = start_date + timedelta(days=days_needed)
-early_date = start_date + timedelta(days=early_days)
-late_date = start_date + timedelta(days=late_days)
+now = datetime.now()
+predicted_date = now + timedelta(days=days_needed)
+early_date = now + timedelta(days=early_days)
+late_date = now + timedelta(days=late_days)


Comment on lines +4084 to +4090
task = tasks.create(
workspace,
title=task_dict["title"],
description=task_dict["description"],
status=TaskStatus.BACKLOG,
estimated_hours=task_dict.get("estimated_hours"),
)

The prd_record is fetched and validated but never used. Consider passing prd_id=prd_record.id to tasks.create() to link tasks to the PRD.

Suggested change
-task = tasks.create(
-    workspace,
-    title=task_dict["title"],
-    description=task_dict["description"],
-    status=TaskStatus.BACKLOG,
-    estimated_hours=task_dict.get("estimated_hours"),
-)
+task = tasks.create(
+    workspace,
+    title=task_dict["title"],
+    description=task_dict["description"],
+    status=TaskStatus.BACKLOG,
+    prd_id=prd_record.id,
+    estimated_hours=task_dict.get("estimated_hours"),
+)


Comment on lines 157 to +171
tasks = []
for idx, match in enumerate(matches, start=1):
if len(match) >= 2:
match[0]
title = match[1].strip()
description = match[2].strip() if len(match) > 2 and match[2] else title

# Create task number: {issue_number}.{task_idx}
task_number = f"{issue.issue_number}.{idx}"

task = Task(
project_id=issue.project_id,
issue_id=issue.id,
task_number=task_number,
parent_issue_number=issue.issue_number,
title=title,
description=description,
status=TaskStatus.PENDING,
depends_on="", # Will be set in create_dependency_chain
can_parallelize=False, # Always FALSE within issue
priority=issue.priority,
workflow_step=issue.workflow_step,
)

tasks.append(task)
for idx, block in enumerate(task_blocks, start=1):
# Parse title and description from the first line
title, description = self._parse_task_title_description(block)

if not title:
continue

# Parse effort estimation data
complexity = self._parse_complexity(block)
estimated_hours = self._parse_estimated_hours(block)
uncertainty = self._parse_uncertainty(block)

# Create task number: {issue_number}.{task_idx}
task_number = f"{issue.issue_number}.{idx}"

Consider using a task counter that increments only when a task is created (instead of idx from enumerate) so skipping empty titles doesn't create numbering gaps.

-        tasks = []
-        for idx, block in enumerate(task_blocks, start=1):
+        tasks = []
+        task_idx = 0
+        for block in task_blocks:
             # Parse title and description from the first line
             title, description = self._parse_task_title_description(block)
 
             if not title:
                 continue
 
+            task_idx += 1
             # Parse effort estimation data
             complexity = self._parse_complexity(block)
             estimated_hours = self._parse_estimated_hours(block)
             uncertainty = self._parse_uncertainty(block)
 
             # Create task number: {issue_number}.{task_idx}
-            task_number = f"{issue.issue_number}.{idx}"


Returns:
ScheduleResult with task assignments and timeline
"""
# Get parallel execution waves

Consider validating agents_available >= 1 at the start of schedule_tasks to avoid indexing an empty agent_end_times.

+        if agents_available < 1:
+            raise ValueError("agents_available must be at least 1")
+
         # Get parallel execution waves


Comment on lines +562 to +565
in_degree = {
task_id: len(self.dependencies.get(task_id, set()))
for task_id in self.all_tasks
}

Tasks depending on unknown task IDs will be silently excluded from results since their in_degree never reaches 0. Consider either filtering out unknown dependencies when computing in_degree, or logging a warning when tasks are omitted.

Suggested change
-in_degree = {
-    task_id: len(self.dependencies.get(task_id, set()))
-    for task_id in self.all_tasks
-}
+in_degree = {
+    task_id: len([d for d in self.dependencies.get(task_id, set()) if d in self.all_tasks])
+    for task_id in self.all_tasks
+}


Comment on lines +497 to +498
"depends_on_indices": template_task.depends_on_indices,
"tags": template_task.tags,

Consider copying depends_on_indices and tags lists to prevent callers from mutating the original TemplateTask objects in BUILTIN_TEMPLATES.

Suggested change
-"depends_on_indices": template_task.depends_on_indices,
-"tags": template_task.tags,
+"depends_on_indices": list(template_task.depends_on_indices),
+"tags": list(template_task.tags),


Comment on lines +89 to +94
self._add_column_if_not_exists(
cursor, "tasks", "complexity_score", "INTEGER"
)
self._add_column_if_not_exists(
cursor, "tasks", "uncertainty_level", "TEXT"
)

The migration adds complexity_score and uncertainty_level without the CHECK constraints defined in _create_issue_task_tables. Migrated databases will allow invalid values while new databases won't. Consider adding the constraints to the migration (e.g., INTEGER CHECK(complexity_score BETWEEN 1 AND 5)) for consistency.

-        self._add_column_if_not_exists(
-            cursor, "tasks", "complexity_score", "INTEGER"
-        )
-        self._add_column_if_not_exists(
-            cursor, "tasks", "uncertainty_level", "TEXT"
-        )
+        self._add_column_if_not_exists(
+            cursor, "tasks", "complexity_score", "INTEGER CHECK(complexity_score BETWEEN 1 AND 5)"
+        )
+        self._add_column_if_not_exists(
+            cursor, "tasks", "uncertainty_level", "TEXT CHECK(uncertainty_level IN ('low', 'medium', 'high'))"
+        )


Returns:
ScheduleResult with task assignments and timeline
"""
# Get parallel execution waves

Accessing agent_end_times[0] fails when agents_available == 0. Suggest validating agents_available >= 1 at the start of schedule_tasks.

-        # Get parallel execution waves
+        # Validate we have at least one agent
+        if agents_available < 1:
+            agents_available = 1
+
+        # Get parallel execution waves


Comment on lines 157 to +167
tasks = []
for idx, match in enumerate(matches, start=1):
if len(match) >= 2:
match[0]
title = match[1].strip()
description = match[2].strip() if len(match) > 2 and match[2] else title

# Create task number: {issue_number}.{task_idx}
task_number = f"{issue.issue_number}.{idx}"

task = Task(
project_id=issue.project_id,
issue_id=issue.id,
task_number=task_number,
parent_issue_number=issue.issue_number,
title=title,
description=description,
status=TaskStatus.PENDING,
depends_on="", # Will be set in create_dependency_chain
can_parallelize=False, # Always FALSE within issue
priority=issue.priority,
workflow_step=issue.workflow_step,
)

tasks.append(task)
for idx, block in enumerate(task_blocks, start=1):
# Parse title and description from the first line
title, description = self._parse_task_title_description(block)

if not title:
continue

# Parse effort estimation data
complexity = self._parse_complexity(block)
estimated_hours = self._parse_estimated_hours(block)

Task numbering gaps occur when empty-title blocks are skipped. Suggest using a counter that increments only when a task is created, not enumerate.

-        tasks = []
-        for idx, block in enumerate(task_blocks, start=1):
+        tasks = []
+        task_idx = 0
+        for block in task_blocks:
             # Parse title and description from the first line
             title, description = self._parse_task_title_description(block)
 
             if not title:
                 continue
 
+            task_idx += 1
             # Parse effort estimation data


@github-actions
Contributor

Code Review: Enhanced Task Generation with Scheduling and Templates

Overall assessment: This is a well-structured PR with comprehensive functionality, good test coverage, and follows the v2 architecture principles. However, there is a critical bug that must be fixed before merging.

✅ Strengths

  1. Architecture Compliance: Core planning modules (task_scheduler.py, task_templates.py, and the updated task_decomposer.py) do not import FastAPI, properly maintaining the headless/core-first design.

  2. Comprehensive Test Coverage: 9 new test files covering:

    • Scheduling algorithms (test_task_scheduler.py)
    • Template system (test_task_templates.py)
    • Critical path analysis (test_dependency_resolver_critical_path.py)
    • CLI commands (test_schedule_commands.py, test_templates_commands.py)
    • API endpoints (test_schedule_api.py, test_templates_api.py)
  3. Solid CPM Implementation: The critical path method (CPM) algorithm in DependencyResolver correctly handles:

    • Forward/backward pass for earliest/latest times
    • Parallel execution waves via identify_parallel_opportunities()
    • Multi-agent resource allocation
    • Slack/float calculation
  4. Well-Designed Templates: 7 builtin templates (api-endpoint, ui-component, database-migration, test-suite, feature-flag, bug-fix, refactoring) with appropriate categories and reasonable effort estimates.

  5. SQL Injection Prevention: All database operations use parameterized queries; the schema migration includes identifier validation in schema_manager.py.

  6. Clean CLI Integration: Schedule and templates commands follow existing Typer patterns in app.py, with consistent help text and error handling.

🐛 Critical Issues

1. Field Name Typo (HIGH PRIORITY)

There's an inconsistent spelling throughout the codebase:

  • Database column: complexity_score (missing 'm')
  • Model field: complexity_score (correct)
  • Schema migration: complexity_score
  • ALLOWED_TASK_FIELDS whitelist: complexity_score
  • Model serialization: complexity_score (line 315)

Impact: This will cause data persistence/retrieval issues. The model's to_dict() method uses complexity_score but the database column is complexity_score.

Fix: Choose one consistent spelling and update all occurrences:

  • codeframe/persistence/schema_manager.py:90, 309
  • codeframe/persistence/repositories/task_repository.py:51, 406, 438
  • codeframe/core/models.py:315

📝 Minor Issues

**2. Architecture Violation in codeframe/core/phase_manager.py**

While the new planning modules correctly avoid FastAPI imports, codeframe/core/phase_manager.py:10 imports FastAPI. This violates the v2 rule that codeframe/core/** must NOT import FastAPI.

Recommendation: Move HTTPException usage to an adapter layer or use a custom exception in core.
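
A later commit in this PR adds ProjectNotFoundError and InvalidPhaseTransitionError domain exceptions for exactly this purpose. A minimal sketch of the pattern (the helper names and the status mapping below are illustrative, not the PR's actual code):

```python
# Core layer: plain domain exceptions, no FastAPI import
class ProjectNotFoundError(Exception):
    """Raised by core code when a project id does not exist."""

class InvalidPhaseTransitionError(Exception):
    """Raised by core code on an illegal phase transition."""

def get_project_or_raise(projects, project_id):
    """Hypothetical core helper: raises a domain error, never an HTTP one."""
    if project_id not in projects:
        raise ProjectNotFoundError(f"project {project_id} not found")
    return projects[project_id]

# Adapter layer: a FastAPI router would catch the domain error and re-raise
# HTTPException(status_code=to_http_status(exc), detail=str(exc)).
def to_http_status(exc):
    """Map a domain exception to an HTTP status code."""
    mapping = {ProjectNotFoundError: 404, InvalidPhaseTransitionError: 409}
    return mapping.get(type(exc), 500)
```

This keeps codeframe/core/** framework-free while the UI router owns the HTTP vocabulary.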

3. Missing Input Validation

  • complexity_score in API response models has no range validation (should be 1-5 per templates)
  • estimated_hours has no minimum/maximum validation
  • uncertainty_level accepts any string (should be enum: "low", "medium", "high")

Fix: Add Pydantic validators:

complexity_score: int = Field(..., ge=1, le=5)
uncertainty_level: Literal["low", "medium", "high"]
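
Since Task itself is a dataclass rather than a Pydantic model (noted further down in this review), the same constraints can also be enforced at construction time. A sketch using a hypothetical value object (EffortEstimate is not a real class in the PR):

```python
from dataclasses import dataclass
from typing import Optional

VALID_UNCERTAINTY = ("low", "medium", "high")

@dataclass
class EffortEstimate:
    """Hypothetical value object enforcing the ranges this review asks for."""
    estimated_hours: Optional[float] = None   # must be positive when set
    complexity_score: Optional[int] = None    # valid range: 1-5
    uncertainty_level: Optional[str] = None   # "low" | "medium" | "high"

    def __post_init__(self):
        if self.estimated_hours is not None and self.estimated_hours <= 0:
            raise ValueError("estimated_hours must be positive")
        if self.complexity_score is not None and not 1 <= self.complexity_score <= 5:
            raise ValueError("complexity_score must be between 1 and 5")
        if self.uncertainty_level is not None and self.uncertainty_level not in VALID_UNCERTAINTY:
            raise ValueError("uncertainty_level must be one of low/medium/high")
```

Either approach works; the important part is that invalid values are rejected before they reach the DB layer's CHECK constraints.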

4. Template Dependency Mapping

In TaskTemplateManager.apply_template(), the depends_on_indices are 0-based indices, but when creating actual tasks, these need to be mapped to real task IDs. The current implementation passes raw indices as depends_on_indices, which may not resolve correctly.

Recommendation: Verify the mapping logic in db.create_task_with_issue() handles index-to-ID conversion properly.
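
The index-to-ID wiring can be done in one pass after task creation. A sketch assuming apply_template returns dicts with a depends_on_indices key (the helper name and dict shape are assumptions):

```python
def wire_template_dependencies(task_dicts, created_ids):
    """Map template-local 0-based depends_on_indices to real task IDs.

    task_dicts: dicts returned by apply_template (hypothetical shape);
    created_ids: DB ids for the created tasks, parallel to task_dicts.
    Returns {task_id: [dependency task ids]} for a follow-up update call.
    """
    depends_on = {}
    for task_dict, task_id in zip(task_dicts, created_ids):
        indices = task_dict.get("depends_on_indices", [])
        # Guard against out-of-range indices from custom templates
        dep_ids = [created_ids[i] for i in indices if 0 <= i < len(created_ids)]
        if dep_ids:
            depends_on[task_id] = dep_ids
    return depends_on
```

The caller would then persist each entry, e.g. via a tasks.update_depends_on(task_id, dep_ids) style call, so the template's ordering survives into the database.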

5. Hardcoded Default Duration

In schedule.py:103-104, 165, 233:

duration = getattr(task, "estimated_hours", None) or 1.0

Using 1.0 as a default for tasks without estimates may skew scheduling calculations. Consider logging a warning or treating as 0 (excluded from critical path).
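
A guarded helper that treats None, zero, and negative estimates uniformly and logs the fallback (the function name and configurable default are illustrative):

```python
import logging
from types import SimpleNamespace  # only used in the usage example below

logger = logging.getLogger(__name__)

DEFAULT_DURATION_HOURS = 1.0  # could be made configurable, per this review

def effective_duration(task, default=DEFAULT_DURATION_HOURS):
    """Return a usable duration, treating None, zero, and negatives the same."""
    hours = getattr(task, "estimated_hours", None)
    if hours is None or hours <= 0:
        logger.warning(
            "task %s has no usable estimate; defaulting to %.1fh",
            getattr(task, "id", "?"), default,
        )
        return default
    return float(hours)
```

This addresses the `or 1.0` pitfall noted later in the review, where a negative estimated_hours would slip through unchanged.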

🔍 Security & Performance

  • Security: ✅ No SQL injection vulnerabilities (parameterized queries used throughout)
  • Performance: ✅ O(V+E) complexity for critical path is optimal for DAG traversal
  • Resource Management: ✅ No obvious memory leaks or unbounded loops

📊 Test Coverage

The PR claims "795 tests pass", and the new functionality itself is well-tested:

  • test_task_scheduler.py: ~450 lines
  • test_task_templates.py: ~367 lines
  • test_dependency_resolver_critical_path.py: ~367 lines

Suggestion: Add integration tests that exercise the full flow:

cf templates apply api-endpoint
cf schedule show
cf schedule predict

📌 Recommendations

  1. Fix the typo before merging (blocker)
  2. Remove FastAPI import from codeframe/core/phase_manager.py (blocker)
  3. Add Pydantic validators for effort estimation fields (high priority)
  4. Verify template dependency mapping logic (medium priority)
  5. Consider configurable default duration instead of hardcoded 1.0 (low priority)

Summary

This is a substantial, well-implemented PR that adds valuable scheduling and template capabilities. The CPM algorithm implementation is solid, and the template system provides good defaults for common patterns. However, the complexity_score field name inconsistency is a critical bug that will cause runtime failures and must be addressed before this can be merged.

Approval: 🟡 Approve with required changes (fix the typo and FastAPI import violation)

@github-actions
Copy link
Contributor

Comment posted successfully to PR #301.


Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codeframe/core/workspace.py (1)

113-129: Backfill estimated_hours for existing workspaces.

The new column is added during fresh initialization, but _ensure_schema_upgrades() doesn’t add it for existing DBs. That can break inserts/selects that now reference estimated_hours. Add a migration in _ensure_schema_upgrades().

✅ Proposed fix
 def _ensure_schema_upgrades(db_path: Path) -> None:
     """Ensure schema upgrades for existing databases."""
     conn = sqlite3.connect(db_path)
     cursor = conn.cursor()
@@
     # Add tech_stack column to workspace table if it doesn't exist
     cursor.execute("PRAGMA table_info(workspace)")
     columns = {row[1] for row in cursor.fetchall()}
     if "tech_stack" not in columns:
         cursor.execute("ALTER TABLE workspace ADD COLUMN tech_stack TEXT")
         conn.commit()
+
+    # Add estimated_hours column to tasks table if it doesn't exist
+    cursor.execute("PRAGMA table_info(tasks)")
+    task_columns = {row[1] for row in cursor.fetchall()}
+    if "estimated_hours" not in task_columns:
+        cursor.execute("ALTER TABLE tasks ADD COLUMN estimated_hours REAL")
+        conn.commit()
🤖 Fix all issues with AI agents
In `@codeframe/agents/dependency_resolver.py`:
- Around line 119-167: The depends_on parsing leaves dep_ids with string IDs
when the source is a JSON array of strings (or a list of strings), causing
mismatches against integer task IDs; update the parsing in
dependency_resolver.py (the block that reads task.depends_on into dep_ids) to
normalize each entry to the canonical task ID type (convert numeric strings to
int, skip/emit a logger.warning for non-numeric or invalid entries) before
iterating and adding to self.dependencies and self.dependents for task_id, and
ensure the same normalization is applied whether depends_on came from a list or
json.loads so comparisons against self.all_tasks work correctly.
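
The normalization described above amounts to one small helper applied to both the list and JSON-string cases (the name normalize_dep_ids is hypothetical):

```python
import json
import logging

logger = logging.getLogger(__name__)

def normalize_dep_ids(depends_on):
    """Normalize a depends_on value (list or JSON string) to integer task ids.

    Numeric strings are converted; non-numeric or invalid entries are skipped
    with a warning, mirroring the fix described above.
    """
    if isinstance(depends_on, str):
        try:
            depends_on = json.loads(depends_on)
        except json.JSONDecodeError:
            logger.warning("unparseable depends_on value: %r", depends_on)
            return []
    dep_ids = []
    for entry in depends_on or []:
        try:
            dep_ids.append(int(entry))
        except (TypeError, ValueError):
            logger.warning("skipping non-numeric dependency id: %r", entry)
    return dep_ids
```

With this in place, entries from either source compare cleanly against integer keys in self.all_tasks.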

In `@codeframe/cli/app.py`:
- Around line 4030-4109: Task creation from templates currently drops
depends_on, complexity_score, and uncertainty_level: call
TaskTemplateManager.apply_template to get depends_on_indices, then after
creating tasks with tasks.create (in templates_apply) map indices -> created
task IDs and call tasks.update_depends_on(task_id, depends_on_ids) for each task
to preserve ordering; for complexity_score and uncertainty_level either add
fields to the Task data model and extend tasks.create/update to accept and
persist them (and update any storage/schema migrations), or stop returning those
values from TaskTemplateManager.apply_template so they are not lost.

In `@codeframe/planning/task_scheduler.py`:
- Around line 340-345: _reschedule_with_agents builds TimelineEntry objects but
omits the agent_id field, causing inconsistency with schedule_tasks; update the
two TimelineEntry constructions inside _reschedule_with_agents to include
agent_id=agent_id (the same identifier used for the scheduled task) for both the
"start" and "end" entries so consumers receive the same fields as in
schedule_tasks.

In `@codeframe/ui/routers/schedule.py`:
- Around line 165-166: get_project_schedule currently treats None and falsy
estimated_hours inconsistently: replace the line that sets duration and
task_durations[task.id] so negative or zero values are handled the same as
earlier logic—retrieve estimated_hours via getattr on task (e.g., val =
getattr(task, "estimated_hours", None)) then set duration = val if val is not
None and val > 0 else 1.0, and assign task_durations[task.id] = duration; update
the assignment near the existing task_durations reference to use this guarded
check.

In `@codeframe/ui/routers/templates.py`:
- Around line 186-278: The apply_template endpoint (apply_template) performs
multiple DB writes (db.create_issue and multiple db.create_task_with_issue)
without transactional safety; wrap the issue creation and all task creations in
a single transaction on the Database so either all writes commit or all are
rolled back on error. Modify apply_template to begin a transaction via the
Database transactional API (e.g., db.transaction() context manager or
db.begin/commit/rollback), perform get_project, optional db.create_issue, and
the loop calling db.create_task_with_issue inside that transaction, commit on
success and rollback on any exception; if your Database class lacks transaction
methods, add a transaction context manager that exposes begin/commit/rollback
and use it from apply_template.
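
If the Database class lacks transaction methods, the suggested context manager is small. A sketch against raw sqlite3 (the real class would wrap its own connection):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transaction(conn):
    """Commit on success, roll back on any exception.

    Sketch of the db.transaction() API this review suggests: all writes
    inside the with-block either persist together or not at all.
    """
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

In apply_template, the get_project lookup, the optional db.create_issue, and the db.create_task_with_issue loop would all run inside one such with-block.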

In `@tests/agents/test_dependency_resolver_critical_path.py`:
- Around line 10-18: Add a module-level v2 pytest marker so these tests run
under the v2 suite: after the existing imports (you can reference the resolver
fixture and DependencyResolver import in this file) add a top-level assignment
pytestmark = pytest.mark.v2 (using the already-imported pytest) to mark the
entire module for v2 tests.

In `@tests/agents/test_lead_agent_scheduling.py`:
- Around line 8-13: The test module for LeadAgent v2 functionality is missing
the required pytest v2 marker; add a module-level marker by defining pytestmark
= pytest.mark.v2 at the top of the file (e.g., above or immediately after the
import block) so tests referencing v2 behavior for LeadAgent and ScheduleResult
run under the v2 marker.
- Around line 72-73: This test must be marked with the v2 pytest marker: either
add `@pytest.mark.v2` alongside the existing `@pytest.mark.unit` decorator on the
test function(s) that contain the temp_db.add_task_dependency call, or add a
module-level line pytestmark = pytest.mark.v2 at the top of the test module;
ensure pytest is imported so the decorator or pytestmark reference resolves.

In `@tests/api/test_schedule_api.py`:
- Around line 9-11: The test file imports pytest but is missing the required v2
marker; add the v2 marker by either decorating the relevant test function(s)
with `@pytest.mark.v2` or (preferably) set a module-level marker by defining
pytestmark = pytest.mark.v2 near the top of tests/api/test_schedule_api.py so
all tests in the file are marked as v2; ensure the existing import pytest is
used for this marker.

In `@tests/api/test_templates_api.py`:
- Around line 10-15: This test module is missing the v2 marker; add
pytest.mark.v2 either as a module-level marker (set pytestmark = pytest.mark.v2
near the top of the file) or as a class decorator on TestTemplatesAPI (add
`@pytest.mark.v2` above the TestTemplatesAPI class) while preserving the existing
`@pytest.mark.unit`; ensure the marker name is exactly pytest.mark.v2 so the test
is properly categorized.

In `@tests/cli/test_schedule_commands.py`:
- Around line 10-23: Add a module-level pytest marker for v2 by declaring
pytestmark = pytest.mark.v2 in the test module
(tests/cli/test_schedule_commands.py). Locate the top of the file near existing
imports (pytest is already imported) and add the pytestmark assignment so the
entire module is marked with `@pytest.mark.v2` for the test runner to pick up.

In `@tests/cli/test_templates_commands.py`:
- Around line 9-19: This test module is missing the v2 marker required for tests
under tests/**; add a module-level marker by defining pytestmark =
pytest.mark.v2 (ensure pytest is imported) or annotate the test class
TestTemplatesListCommand with `@pytest.mark.v2` so the suite recognizes it as v2;
update the top-level imports if needed to reference pytest and place the
pytestmark assignment or class decorator accordingly.

In `@tests/planning/test_task_decomposer_effort.py`:
- Around line 10-15: This test module is missing the required v2 pytest marker;
add a module-level marker by importing pytest and setting pytestmark =
pytest.mark.v2 at the top of tests/planning/test_task_decomposer_effort.py so
the Phase 4 effort estimation tests (e.g., those exercising TaskDecomposer and
related Task/Issue behavior) are correctly labeled for v2 test runs.

In `@tests/planning/test_task_scheduler.py`:
- Around line 1-19: Add the required pytest v2 marker to the module by declaring
pytestmark = pytest.mark.v2 at the top of the test module (above imports) or by
decorating the test functions/classes that target v2 behavior with
`@pytest.mark.v2`; ensure this is applied in the module that contains
TaskScheduler, ScheduleResult, ScheduleOptimization imports (the test module
referencing TaskScheduler/DependencyResolver/Task) so all v2 tests are
discovered as v2.

In `@tests/planning/test_task_templates.py`:
- Around line 10-17: This test module lacks the required v2 marker; add the
pytest v2 marker to the module by importing pytest (if not already) and setting
pytestmark = pytest.mark.v2 at the top of tests/planning/test_task_templates.py
so all tests (that reference TaskTemplate, TaskTemplateManager, TemplateTask,
BUILTIN_TEMPLATES) are annotated as v2; alternatively, add `@pytest.mark.v2` to
each test function in this file.
🧹 Nitpick comments (10)
codeframe/persistence/schema_manager.py (1)

85-97: Consider enforcing CHECK constraints for existing databases.
The migration adds columns without the range/enum checks present in the CREATE TABLE, so existing DBs won’t enforce complexity_score and uncertainty_level constraints. If you want parity, add a follow-up migration (or validate on write).

codeframe/planning/task_templates.py (1)

463-504: Validate depends_on_indices before emitting tasks.
Custom templates with out-of-range indices will generate invalid dependency references later. Consider a quick bounds check in apply_template to fail fast.

🔎 Bounds validation for depends_on_indices
         template = self.get_template(template_id)
         if not template:
             raise ValueError(f"Template '{template_id}' not found")
 
+        max_index = len(template.tasks) - 1
+        for idx, template_task in enumerate(template.tasks):
+            invalid = [
+                dep for dep in template_task.depends_on_indices
+                if dep < 0 or dep > max_index
+            ]
+            if invalid:
+                raise ValueError(
+                    f"Template '{template_id}' task {idx} has invalid depends_on_indices: {invalid}"
+                )
+
         tasks = []
         for idx, template_task in enumerate(template.tasks):
             task_number = f"{issue_number}.{idx + 1}"
codeframe/core/models.py (1)

287-291: Consider adding validation hints in comments.

The fields are well-defined with appropriate types and defaults. Note that the database schema enforces constraints:

  • complexity_score: CHECK between 1 and 5
  • uncertainty_level: CHECK for values ('low', 'medium', 'high')

Since Task is a dataclass (not Pydantic), validation happens at the DB layer. This is acceptable, but consider adding inline comments noting valid ranges for clarity to consumers of this model.

📝 Optional: Add constraint hints
     # Effort estimation fields
-    estimated_hours: Optional[float] = None  # Time estimate in hours
-    complexity_score: Optional[int] = None  # Complexity rating (1-5)
-    uncertainty_level: Optional[str] = None  # "low", "medium", "high"
+    estimated_hours: Optional[float] = None  # Time estimate in hours (positive)
+    complexity_score: Optional[int] = None  # Complexity rating, valid: 1-5
+    uncertainty_level: Optional[str] = None  # Valid: "low", "medium", "high"
     resource_requirements: Optional[str] = None  # JSON string of required skills/tools
tests/planning/test_task_templates.py (2)

262-265: Weak assertion in dependency test.

The assertion has_dependencies or len(tasks) == 1 always passes for single-task templates, which doesn't effectively verify dependency handling for multi-task templates. Consider making the assertion more explicit.

♻️ Suggested improvement
         # Check if any task has depends_on_indices set
-        has_dependencies = any(task.get("depends_on_indices") for task in tasks)
-        # API endpoint template should have some dependencies
-        assert has_dependencies or len(tasks) == 1
+        # API endpoint template has multiple tasks with dependencies
+        assert len(tasks) > 1, "API endpoint template should have multiple tasks"
+        has_dependencies = any(task.get("depends_on_indices") for task in tasks)
+        assert has_dependencies, "Multi-task templates should define dependencies"

267-279: Test doesn't verify context substitution actually occurred.

The test test_apply_template_with_context_substitution applies context but only asserts that tasks were returned. It doesn't verify the substitution worked.

♻️ Suggested improvement
     def test_apply_template_with_context_substitution(self):
         """Test context variables are substituted in tasks."""
         manager = TaskTemplateManager()

         tasks = manager.apply_template(
             template_id="api-endpoint",
             context={"endpoint_name": "/products"},
         )

-        # Check that context was used in task descriptions
-        all_text = " ".join(t["title"] + t["description"] for t in tasks)
-        # Context should influence the output
-        assert len(tasks) > 0
+        # Verify context substitution occurred in task text
+        all_text = " ".join(t["title"] + t["description"] for t in tasks)
+        assert len(tasks) > 0
+        # If template uses {endpoint_name} placeholder, it should be substituted
+        # At minimum, verify no unsubstituted placeholders remain
+        assert "{endpoint_name}" not in all_text
tests/agents/test_lead_agent_scheduling.py (1)

267-277: Hard-coded duration assertion may be fragile.

The assertion assert result.total_duration == 7.0 assumes a specific scheduler behavior: A(2) + max(B(3), C(1)) + D(2) = 7 hours. While the comment explains the calculation, this test could become fragile if scheduler internals change.

Consider making this more resilient by asserting relationships rather than exact values:

♻️ Suggested alternative
     def test_get_project_duration_hours(self, project_with_tasks):
         """Test getting total project duration in hours."""
         db = project_with_tasks["db"]
         project_id = project_with_tasks["project_id"]

         agent = LeadAgent(project_id=project_id, db=db, api_key="sk-ant-test-key")

         result = agent.schedule_project_tasks(agents_available=2)

-        # With 2 agents: A(2) + max(B(3), C(1)) + D(2) = 7 hours
-        assert result.total_duration == 7.0
+        # With 2 agents: A(2) + max(B(3), C(1)) + D(2) = 7 hours
+        # Assert expected duration, accounting for parallel execution of B and C
+        assert result.total_duration == pytest.approx(7.0, abs=0.1)
codeframe/planning/task_decomposer.py (1)

208-233: Consider edge case: task blocks without content.

The _split_into_task_blocks method may produce empty strings in blocks if the regex captures a task number but the content is missing. The caller handles this by checking if not title, but an explicit filter could make intent clearer.

Optional: Filter empty blocks
         # Reconstruct blocks (alternating: empty/preamble, number, content, number, content...)
         blocks = []
         i = 1  # Skip first part (preamble before first task)
         while i < len(parts):
             if i < len(parts) and parts[i].isdigit():
                 block = parts[i + 1] if i + 1 < len(parts) else ""
-                blocks.append(block)
+                if block.strip():  # Skip empty blocks
+                    blocks.append(block)
                 i += 2
             else:
                 i += 1

         return blocks
codeframe/ui/routers/templates.py (1)

90-119: TaskTemplateManager is instantiated per request.

A new TaskTemplateManager() is created on each request. If template loading is expensive (e.g., reading from disk), consider caching or using dependency injection.

Suggested: Use dependency injection for caching
# At module level or in dependencies.py
from functools import lru_cache

@lru_cache()
def get_template_manager() -> TaskTemplateManager:
    return TaskTemplateManager()

# In endpoints:
@router.get("/", response_model=List[TemplateListResponse])
async def list_templates(
    category: Optional[str] = None,
    manager: TaskTemplateManager = Depends(get_template_manager),
) -> List[Dict[str, Any]]:
    ...
codeframe/planning/task_scheduler.py (2)

285-357: Code duplication between schedule_tasks and _reschedule_with_agents.

The scheduling logic in _reschedule_with_agents largely duplicates schedule_tasks. Consider refactoring to share the common implementation.

Suggested: Refactor to share implementation
def _reschedule_with_agents(
    self,
    task_ids: List[int],
    task_durations: Dict[int, float],
    resolver: DependencyResolver,
    agents_available: int,
) -> ScheduleResult:
    """Re-schedule tasks with a different number of agents."""
    # Create minimal task objects or use a filter approach
    # Then delegate to schedule_tasks
    return self.schedule_tasks(
        tasks=[],  # Not used if we filter by task_ids
        task_durations={k: v for k, v in task_durations.items() if k in task_ids},
        resolver=resolver,
        agents_available=agents_available,
    )

The current schedule_tasks doesn't filter by task_ids, so you'd need to adjust the implementation or pass a filter parameter.


469-489: Potential duplicate bottleneck entries.

A task can appear in both the "duration" bottleneck list (lines 449-467) and the "dependencies" bottleneck list (lines 469-489) if it's both long-duration AND has many dependents. Consider deduplicating or documenting this as intentional.

Optional: Track already-reported tasks
         bottlenecks: List[BottleneckInfo] = []
+        reported_tasks: set[int] = set()
 
         # ... duration bottlenecks ...
         for task_id in critical_set:
             duration = task_durations.get(task_id, 0)
 
             if duration > duration_threshold:
                 # ...
                 bottlenecks.append(...)
+                reported_tasks.add(task_id)
 
         # Identify tasks with many dependents
         for task_id in resolver.all_tasks:
             dependent_count = len(resolver.dependents.get(task_id, set()))
 
-            if dependent_count >= dependent_threshold and task_id in critical_set:
+            if dependent_count >= dependent_threshold and task_id in critical_set and task_id not in reported_tasks:
                 # ...

Architecture fixes:
- Remove FastAPI HTTPException import from core/phase_manager.py
- Add ProjectNotFoundError and InvalidPhaseTransitionError domain exceptions
- Update UI router to catch and convert core exceptions to HTTP responses

Input validation:
- Add Field validators to TemplateTaskResponse (complexity_score, uncertainty_level)
- Fix duration handling in schedule.py to properly check for None and <= 0

Bug fixes:
- Add agent_id to TimelineEntry in _reschedule_with_agents for consistency
- Update test assertions to use new domain exceptions

Test markers:
- Add pytestmark = pytest.mark.v2 to 9 test files for proper v2 suite inclusion
@github-actions
Copy link
Contributor

Code Review Summary

This is a well-structured PR implementing significant Phase 4 functionality (scheduling, templates, effort estimation). The code follows v2 architecture patterns with proper separation between core, CLI, and API layers.

🟡 Approve with Required Changes

Critical Issues (Must Fix Before Merge)

1. Missing pytest.mark.v2 Markers in All Test Files

All v2 test files need pytestmark = pytest.mark.v2 at module level to run with pytest -m v2.

2. Template Dependencies Lost During Application
In codeframe/cli/app.py:4109, the depends_on_indices from templates are not being wired to actual task IDs. After creating tasks, map indices to IDs and call tasks.update_depends_on().

3. Architecture Violation in codeframe/core/phase_manager.py:10
This file imports FastAPI (HTTPException), violating the v2 rule that codeframe/core/** must NOT import FastAPI. Move HTTP handling to adapter layer.

4. Missing Transaction Safety in Template Apply
In codeframe/ui/routers/templates.py:186-278, multiple DB operations occur without transactional safety. Wrap issue creation and all task creations in a single transaction.

5. TimelineEntry Missing agent_id in _reschedule_with_agents
In codeframe/planning/task_scheduler.py:340-345, the TimelineEntry objects omit the agent_id field, causing inconsistency with schedule_tasks.

High Priority Issues

6. Inconsistent estimated_hours Handling

  • codeframe/ui/routers/schedule.py:165-166: Uses or 1.0 which doesn't handle negative values
  • Should use explicit if duration is None or duration <= 0: duration = 1.0

7. Status String Comparison Instead of Enum
Multiple locations use string comparison (task.status.value == "DONE"). Should use enum comparison or normalize with upper() for both DONE and COMPLETED.
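
A sketch of the enum-based check (the enum members and their string values here are illustrative; the real TaskStatus lives in codeframe/core/models.py):

```python
from enum import Enum

class TaskStatus(Enum):
    """Illustrative subset of the real enum for demonstration only."""
    PENDING = "pending"
    DONE = "done"
    COMPLETED = "completed"

# Treat both DONE and COMPLETED as finished, per the note above
TERMINAL_STATUSES = frozenset({TaskStatus.DONE, TaskStatus.COMPLETED})

def is_finished(status):
    """Accept an enum member or a raw string, case-insensitively."""
    if isinstance(status, str):
        try:
            status = TaskStatus(status.lower())
        except ValueError:
            return False
    return status in TERMINAL_STATUSES
```

Centralizing the check this way removes the scattered task.status.value == "DONE" comparisons and makes the DONE/COMPLETED equivalence explicit in one place.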

8. Missing Complexity/Uncertainty Field Propagation
Template tasks include complexity_score and uncertainty_level, but these fields don't exist in the Task data model. Either extend Task schema or remove from template returns.

Test Coverage

Comprehensive test coverage exists with 9 new test files. Before merging, verify:

uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py
uv run ruff check codeframe/planning/ codeframe/cli/app.py

✅ Strengths

  1. Clean architecture with proper v2 core/CLI/API separation
  2. Solid CPM algorithm implementation
  3. 7 practical builtin templates
  4. SQL injection prevention with parameterized queries
  5. Idempotent database migrations

Summary

This PR adds valuable scheduling and template functionality. The critical issues are primarily around test markers, dependency wiring, and ensuring v2 architecture compliance. Once these are addressed, this should be ready to merge.

@github-actions
Copy link
Contributor

Review comment posted successfully to PR #301.


CompletionPrediction with predicted date and confidence interval
"""
# Calculate remaining hours
total_hours = schedule.total_duration
Copy link
Contributor


total_hours is the project span (max end time), but completed_hours sums task durations. With parallel tasks, completed_hours can exceed total_hours, yielding >100% completion. Consider either summing all task durations for total_hours, or using elapsed time for both metrics.

Suggested change
total_hours = schedule.total_duration
total_hours = sum(
    a.end_time - a.start_time for a in schedule.task_assignments.values()
)


Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/core/test_phase_manager.py (1)

7-17: Add v2 marker for this test module.

All v2 tests must be marked, but this module isn’t. As per coding guidelines, add pytestmark = pytest.mark.v2 (or decorate classes).

✅ Proposed fix
 import pytest
 from pathlib import Path
 
+pytestmark = pytest.mark.v2
+
 from codeframe.persistence.database import Database
🤖 Fix all issues with AI agents
In `@codeframe/ui/routers/schedule.py`:
- Around line 72-189: Both endpoints accept numeric query params that can be
zero or negative (agents in get_project_schedule and hours_per_day in
predict_completion), which can cause scheduling errors or division-by-zero; add
validation at the API boundary to reject non-positive values. In
get_project_schedule (function get_project_schedule) validate agents > 0 (use
FastAPI Query(gt=0) or explicit check and raise HTTPException 400 with a clear
message) before calling TaskScheduler.schedule_tasks; similarly in
predict_completion (function predict_completion) validate hours_per_day > 0
(Query(gt=0) or explicit check) before using predict_completion_date; keep the
error messages descriptive and return 4xx to avoid 500s from downstream
scheduler/predictor code.

In `@codeframe/ui/routers/templates.py`:
- Around line 29-40: The response model TemplateTaskResponse currently enforces
stricter validation than the underlying TemplateTask (notably estimated_hours
uses Field(..., gt=0) and uncertainty_level uses
Literal["low","medium","high"]), which can cause serialization errors; either
relax TemplateTaskResponse to accept zero-hours (change gt=0 to ge=0 for
estimated_hours) and allow a broader uncertainty type (e.g., str or
Optional[Literal[...]]) or add matching validation in the TemplateTask dataclass
constructor/validator to enforce the same constraints at creation time (validate
estimated_hours >= 0 and restrict/normalize uncertainty_level values) so that
TemplateTask instances always satisfy TemplateTaskResponse before serialization.
♻️ Duplicate comments (3)
codeframe/planning/task_scheduler.py (2)

119-128: Validate agents_available to prevent empty agent pool crashes.

agent_end_times[0] will raise an IndexError when agents_available <= 0. Guarding early avoids a 500 and keeps scheduling deterministic.

🔧 Proposed fix
     ) -> ScheduleResult:
         """
         Schedule tasks based on dependencies and available resources.
@@
-        # Get parallel execution waves
+        if agents_available < 1:
+            raise ValueError("agents_available must be >= 1")
+
+        # Get parallel execution waves
         waves = resolver.identify_parallel_opportunities()
     def _reschedule_with_agents(
         self,
         task_ids: List[int],
         task_durations: Dict[int, float],
         resolver: DependencyResolver,
         agents_available: int,
     ) -> ScheduleResult:
         """Re-schedule tasks with a different number of agents."""
+        if agents_available < 1:
+            raise ValueError("agents_available must be >= 1")
+
         # Get parallel waves
         waves = resolver.identify_parallel_opportunities()

388-411: Completion prediction should be anchored to “now,” not project start.

days_needed represents remaining work, so adding it to start_date can yield past dates when progress > 0. Consider anchoring to the current timestamp (or a provided “as_of” date).

🔧 Proposed fix
-        predicted_date = start_date + timedelta(days=days_needed)
-        early_date = start_date + timedelta(days=early_days)
-        late_date = start_date + timedelta(days=late_days)
+        as_of = datetime.now()
+        predicted_date = as_of + timedelta(days=days_needed)
+        early_date = as_of + timedelta(days=early_days)
+        late_date = as_of + timedelta(days=late_days)
codeframe/ui/routers/templates.py (1)

188-280: Transaction handling concern persists.

The multi-step database operations (create_issue, multiple create_task_with_issue calls) still lack transactional safety. If task creation fails midway, partial data may persist.

🧹 Nitpick comments (2)
tests/planning/test_task_templates.py (1)

269-281: Test does not verify context substitution behavior.

The test computes all_text but never uses it to verify that context variables were actually substituted. The current assertion only checks that tasks were returned, not that {endpoint_name} was replaced with /products.

Consider strengthening the assertion:

Suggested improvement
     def test_apply_template_with_context_substitution(self):
         """Test context variables are substituted in tasks."""
         manager = TaskTemplateManager()
 
         tasks = manager.apply_template(
             template_id="api-endpoint",
             context={"endpoint_name": "/products"},
         )
 
         # Check that context was used in task descriptions
         all_text = " ".join(t["title"] + t["description"] for t in tasks)
-        # Context should influence the output
-        assert len(tasks) > 0
+        # Verify context substitution occurred
+        assert len(tasks) > 0
+        # If template uses {endpoint_name}, it should appear in output
+        assert "/products" in all_text or "{endpoint_name}" not in all_text
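For reference, the substitution under test could be implemented roughly like this (helper name assumed; the manager's actual mechanism may differ):

```python
def apply_context(text, context):
    # Substitute {placeholder} variables; leave the text unchanged when
    # a referenced key is missing from the context.
    try:
        return text.format(**context)
    except (KeyError, IndexError):
        return text

title = apply_context("Implement {endpoint_name} handler", {"endpoint_name": "/products"})
desc = apply_context("No placeholders here", {"endpoint_name": "/products"})
```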
codeframe/ui/routers/templates.py (1)

245-262: Redundant import statement.

TaskStatus is already imported at line 225 within this function. The second import at line 248 is unnecessary.

Suggested fix
         # Create tasks in database
         created_task_ids = []
         for task_dict in task_dicts:
-            from codeframe.core.models import TaskStatus
-
             task_id = db.create_task_with_issue(

Comment on lines +29 to +40
class TemplateTaskResponse(BaseModel):
    """Response model for a template task."""

    title: str
    description: str
    estimated_hours: float = Field(..., gt=0, description="Estimated hours (must be positive)")
    complexity_score: int = Field(..., ge=1, le=5, description="Complexity rating (1-5)")
    uncertainty_level: Literal["low", "medium", "high"] = Field(
        ..., description="Uncertainty level"
    )
    depends_on_indices: List[int]
    tags: List[str]

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: frankbria/codeframe):

# Find where TemplateTask is defined
find . -type f -name "*.py" | xargs grep -l "class TemplateTask"

# Look at the templates.py file to understand the context
head -100 codeframe/ui/routers/templates.py

# Search for TemplateTask dataclass definition to see its constraints
rg "class TemplateTask" -A 20

# Check if there are any template data files or initialization with estimated_hours values
find . -type f \( -name "*.py" -o -name "*.json" -o -name "*.yaml" \) | xargs grep -l "estimated_hours\|TemplateTask" | head -20

# Look at the TaskTemplate and TemplateTask definitions more carefully
cat codeframe/planning/task_templates.py | head -150

# Check the endpoints to see how TemplateTaskResponse is used
sed -n '80,200p' codeframe/ui/routers/templates.py

# Look for template data - check if there are sample templates or fixtures
find . -type f -name "*.py" -path "*/tests/*" | xargs grep -l "TemplateTask\|template" | head -5

# Inspect the rest of the templates module
cat codeframe/planning/task_templates.py | tail -n +180 | head -100

# Look at the API test file
cat tests/api/test_templates_api.py

# Search for any TemplateTask instantiations with estimated_hours=0 or non-standard values
rg "TemplateTask\(" -A 5 | head -100

# Look at the TaskTemplateManager to see how templates are managed
rg "class TaskTemplateManager" -A 50

# Search for any other ways TemplateTask could be instantiated
rg "TemplateTask(" --type py | grep -v "test_" | grep -v "BUILTIN_TEMPLATES" | head -20

# Search for TemplateTask instantiations more carefully
rg "TemplateTask" codeframe/planning/task_templates.py | grep -v "class TemplateTask" | grep -v "List\[TemplateTask\]" | grep -v "^.*#"

# Check if there's a register_template or custom template feature
rg "register_template\|custom.*template" -A 5

# Look for how the endpoints return data and if response_model validation is applied
sed -n '140,180p' codeframe/ui/routers/templates.py


Response model constraints may cause serialization failures if constraints are violated.

The TemplateTaskResponse model has stricter validation than the underlying TemplateTask dataclass:

  • estimated_hours: Field(..., gt=0) will reject tasks with zero hours
  • uncertainty_level: Literal["low", "medium", "high"] will reject any other values

While all current builtin templates conform to these constraints, if custom template support is added or builtin templates are modified with non-conforming values, the API will return a 500 error during response serialization when Pydantic validates against the response model.

Consider either:

  1. Relaxing constraints to ge=0 if zero-hour tasks are valid
  2. Adding validation to the TemplateTask dataclass to enforce the same constraints at creation time
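Option 2 could look roughly like this, using a simplified stand-in for the real dataclass (field set abbreviated, names assumed to match the response model):

```python
from dataclasses import dataclass

ALLOWED_UNCERTAINTY = {"low", "medium", "high"}

@dataclass
class TemplateTask:
    # Simplified stand-in for the real dataclass; other fields omitted.
    title: str
    estimated_hours: float
    uncertainty_level: str = "medium"

    def __post_init__(self):
        # Enforce the same constraints as TemplateTaskResponse so that
        # every instance serializes cleanly through the API.
        if self.estimated_hours <= 0:
            raise ValueError("estimated_hours must be > 0")
        if self.uncertainty_level not in ALLOWED_UNCERTAINTY:
            raise ValueError(f"invalid uncertainty_level: {self.uncertainty_level!r}")

ok = TemplateTask("Write endpoint handler", 2.0, "low")
try:
    TemplateTask("Zero-hour task", 0.0)
    rejected = False
except ValueError:
    rejected = True
```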
🤖 Prompt for AI Agents
In `@codeframe/ui/routers/templates.py` around lines 29 - 40, The response model
TemplateTaskResponse currently enforces stricter validation than the underlying
TemplateTask (notably estimated_hours uses Field(..., gt=0) and
uncertainty_level uses Literal["low","medium","high"]), which can cause
serialization errors; either relax TemplateTaskResponse to accept zero-hours
(change gt=0 to ge=0 for estimated_hours) and allow a broader uncertainty type
(e.g., str or Optional[Literal[...]]) or add matching validation in the
TemplateTask dataclass constructor/validator to enforce the same constraints at
creation time (validate estimated_hours >= 0 and restrict/normalize
uncertainty_level values) so that TemplateTask instances always satisfy
TemplateTaskResponse before serialization.

Template dependency wiring:
- CLI templates apply now maps depends_on_indices to actual task IDs
- API templates apply now calls add_task_dependency for each dependency

Status normalization:
- schedule predict command now handles both enum and string status
- Checks for both "DONE" and "COMPLETED" for consistency
raise HTTPException(status_code=500, detail=str(e))


@router.post("/{project_id}/apply", response_model=ApplyTemplateResponse)

This endpoint is missing authentication and authorization. Consider adding current_user: User = Depends(get_current_user) and checking db.user_has_project_access(current_user.id, project_id) like other task-modifying endpoints.

🚀 Want me to fix this? Reply ex: "fix it for me".

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@codeframe/ui/routers/templates.py`:
- Around line 119-121: The except block in the templates listing endpoint
currently returns str(e) to clients; change it to return a generic message
(e.g., "Failed to list templates" or "Internal server error") in the
HTTPException.detail and stop exposing str(e) to API consumers, and move the
detailed error into server logs by calling logger.error with the exception
context (e.g., logger.error("Error listing templates", exc_info=True)) so
internal details remain in logs while clients receive a safe, non-sensitive
message; update the block that uses logger.error, HTTPException and str(e) in
templates.py accordingly.
♻️ Duplicate comments (3)
codeframe/ui/routers/templates.py (2)

29-40: Response model constraints already flagged.

The TemplateTaskResponse model's stricter validation (gt=0 for estimated_hours) compared to the underlying TemplateTask dataclass was already flagged in a previous review. This mismatch could cause 500 errors during response serialization if any template task has zero hours.


188-294: Transaction handling concern already flagged.

The multi-step database operations (issue creation + multiple task creations + dependency wiring) without transaction support was already identified in a previous review. Partial failures could leave the database in an inconsistent state.

codeframe/cli/app.py (1)

4061-4065: PRD validated but not linked to created tasks.

The prd_record is fetched and validated (lines 4061-4065) but the PRD ID is never passed to tasks.create() (lines 4085-4091). If tasks should be associated with the PRD, consider passing prd_id=prd_record.id to the create call. This was noted in a previous review.

Also applies to: 4085-4091

🧹 Nitpick comments (1)
codeframe/ui/routers/templates.py (1)

225-226: Remove duplicate import of TaskStatus.

TaskStatus is imported twice within the same function body (lines 225 and 246). The second import is redundant.

Suggested fix
         # Create tasks in database with dependency tracking
-        from codeframe.core.models import TaskStatus
-
         created_tasks = []  # List of (task_id, depends_on_indices)

Also applies to: 246-246

@github-actions

Code Review: PR #301 - Enhanced Task Generation with Scheduling and Templates

Overall assessment: This PR implements Phase 4 planning features well. Most issues from prior reviews have been addressed in commits f2baecc and 6083c9d.

✅ Issues Already Fixed

  1. pytest.mark.v2 markers - All test files now have the v2 marker
  2. Template dependencies wired - Both CLI and API properly wire up task dependencies
  3. Phase manager FastAPI import - No FastAPI import in codeframe/core/phase_manager.py
  4. TimelineEntry agent_id - All TimelineEntry instances now include agent_id
  5. Status comparison - Properly handles both DONE and COMPLETED with status.upper()
  6. Estimated hours handling - Consistently handles None and <= 0 values

⚠️ Remaining Issues

1. Missing API Parameter Validation (HIGH)

File: codeframe/ui/routers/schedule.py:75,140

The agents and hours_per_day query parameters lack FastAPI validation:

async def get_project_schedule(
    project_id: int,
    agents: int = 1,  # Should be: agents: int = Query(1, ge=1)
    db: Database = Depends(get_db),
) -> Dict[str, Any]:

async def predict_completion(
    project_id: int,
    hours_per_day: float = 8.0,  # Should be: hours_per_day: float = Query(8.0, gt=0)
    db: Database = Depends(get_db),
) -> Dict[str, Any]:

Impact: Zero or negative values will cause scheduling errors or division-by-zero.

Fix: Add from fastapi import Query and use Query(..., ge=1) for agents, Query(..., gt=0) for hours_per_day.


2. Missing Transaction Safety (MEDIUM)

File: codeframe/ui/routers/templates.py:236-275

The apply_template endpoint performs multiple database operations without transactional safety:

issue_id = db.create_issue(default_issue)  # Commit 1
for task_dict in task_dicts:
    task_id = db.create_task_with_issue(...)  # Commits 2, 3, 4, ...
    created_tasks.append((task_id, ...))
# Wire dependencies (more DB operations)
for task_id, dep_indices in created_tasks:
    db.tasks.add_task_dependency(task_id, dep_id)  # More commits

Impact: If any operation fails after some tasks are created, partial data persists in an inconsistent state.

Fix: Wrap all operations in a transaction. Since using SQLite, add a transaction context manager to Database class:

with db.transaction():  # Wrap all DB ops
    issue_id = db.create_issue(default_issue)
    for task_dict in task_dicts:
        task_id = db.create_task_with_issue(...)
        created_tasks.append((task_id, ...))
    # Wire dependencies
    for task_id, dep_indices in created_tasks:
        for dep_id in depends_on_ids:
            db.tasks.add_task_dependency(task_id, dep_id)

3. Effort Fields Not Persisted from Templates (MEDIUM)

Files: codeframe/cli/app.py:4084-4091, codeframe/ui/routers/templates.py:250-262

Templates generate complexity_score and uncertainty_level, but these fields are not being persisted:

CLI (uses v2 tasks.create()):

task = tasks.create(
    workspace,
    title=task_dict["title"],
    description=task_dict["description"],
    status=TaskStatus.BACKLOG,
    estimated_hours=task_dict.get("estimated_hours"),
    # complexity_score and uncertainty_level NOT passed - fields don't exist on tasks.create()
)

API (uses db.create_task_with_issue()):

task_id = db.create_task_with_issue(
    ...
    estimated_hours=task_dict.get("estimated_hours"),
    # complexity_score and uncertainty_level NOT passed
)

Impact: Template metadata about task complexity and estimate uncertainty is lost during task creation.

Fix Options:

  • Option A: Extend v2 codeframe/core/tasks.py create() to accept and persist these fields
  • Option B: Have templates API/CLI pass these fields to create_task_with_issue() (persistence layer accepts them)
  • Option C: Remove these fields from templates if they're not needed in v2 architecture

4. V2 Architecture Inconsistency (LOW - Informational)

There's an inconsistency between v2 and legacy layers:

  • v2 (codeframe/core/tasks.py): create() does NOT accept complexity_score, uncertainty_level, resource_requirements
  • Legacy (codeframe/persistence/repositories/task_repository.py): create_task_with_issue() DOES accept these fields (lines 169-170)
  • Database (codeframe/persistence/schema_manager.py:309): Columns exist in schema (line 309)
  • Model (codeframe/core/models.py:289-291): Fields exist on Task dataclass

Decision needed: Should v2 headless architecture support effort estimation fields, or should they be removed from templates?


✅ Strengths

  1. Clean architecture with proper v2 core/CLI/API separation
  2. Solid CPM implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (795 tests claimed)
  5. SQL injection prevention with parameterized queries
  6. Dependency normalization properly handles string/int conversions

📋 Summary

This PR adds valuable scheduling and template functionality. Most critical issues from prior reviews have been addressed. The remaining issues are:

  • Missing API validation (high priority, easy fix)
  • Transaction safety (medium priority, requires Database enhancement)
  • Effort fields not persisted (medium priority, requires architectural decision)

Recommended: Address issues #1-3 before merging. Issue #4 is informational and may be deferred to a follow-up PR clarifying v2 architecture for effort estimation.


Review based on HEAD commit of PR #301

@github-actions

Review posted to PR #301: #301 (comment)

The review identifies 4 remaining issues after the fixes in commits f2baecc and 6083c9d:

  1. Missing API Parameter Validation (HIGH) - agents and hours_per_day lack FastAPI Query constraints
  2. Missing Transaction Safety (MEDIUM) - Multiple DB operations in template apply without atomic commit
  3. Effort Fields Not Persisted (MEDIUM) - complexity_score and uncertainty_level from templates are dropped during task creation
  4. V2 Architecture Inconsistency (LOW) - v2 tasks.create() doesn't support effort estimation fields while persistence layer does


API Parameter Validation:
- Add Query validators to schedule endpoints (agents>=1, hours_per_day>0)

Transaction Safety:
- Add transaction() context manager to Database class
- Wrap templates API apply_template in transaction for atomicity

V2 Effort Fields:
- Add complexity_score and uncertainty_level to v2 Task dataclass
- Extend tasks.create() to accept effort fields
- Add columns to v2 workspace schema with migrations
- Update SELECT queries and _row_to_task() for new columns
- CLI templates apply now passes effort fields to tasks.create()
@github-actions

Code Review Summary

This PR implements Phase 4 planning features (scheduling, templates, effort estimation) with comprehensive test coverage and good v2 architecture compliance. Most issues from prior reviews have been addressed in commits f2baecc and 6083c9d.

✅ Issues Already Fixed

Based on the review comments and commit history, these issues have been resolved:

  • ✅ pytest.mark.v2 markers added to all test files
  • ✅ Template dependencies properly wired to task IDs
  • ✅ TimelineEntry agent_id included in both schedule paths
  • ✅ Status comparison normalized to handle DONE/COMPLETED
  • ✅ Estimated hours handling consistent across CLI and API
  • ✅ FastAPI import removed from phase_manager.py

⚠️ Remaining Issues to Address

1. Missing API Parameter Validation (HIGH) — codeframe/ui/routers/schedule.py:72,140

The agents and hours_per_day query parameters lack FastAPI Query validation:

  • agents: int = 1 should be agents: int = Query(1, ge=1)
  • hours_per_day: float = 8.0 should be hours_per_day: float = Query(8.0, gt=0)
Zero/negative values will cause scheduling errors or division-by-zero.

Fix:

from fastapi import Query

async def get_project_schedule(
    project_id: int,
    agents: int = Query(1, ge=1),
    db: Database = Depends(get_db),
) -> Dict[str, Any]:

async def predict_completion(
    project_id: int,
    hours_per_day: float = Query(8.0, gt=0),
    db: Database = Depends(get_db),
) -> Dict[str, Any]:

2. Missing Transaction Safety (MEDIUM) — codeframe/ui/routers/templates.py:236-275

The apply_template endpoint performs multiple DB operations (create_issue + multiple create_task_with_issue calls + dependency wiring) without transactional safety. If any operation fails midway, partial data persists in an inconsistent state.

Recommendation: Wrap all operations in a single transaction using the Database class's transaction API.

3. Effort Fields Not Persisted (MEDIUM) — codeframe/cli/app.py:4084-4091, templates.py:250-262

Templates generate complexity_score and uncertainty_level, but these fields are not being persisted when tasks are created:

  • CLI uses tasks.create(), which doesn't accept these fields
  • API uses db.create_task_with_issue(), which does accept them

Impact: Template metadata about task complexity and estimate uncertainty is lost.

Fix Options:

  • Option A: Extend v2 tasks.create() to accept these fields
  • Option B: Have templates API/CLI pass these fields to the persistence layer
  • Option C: Remove these fields from templates if not needed in v2

4. Avoid Exposing Internal Error Details (LOW) — codeframe/ui/routers/templates.py:119-121

The exception handler returns str(e) directly to API clients, which may leak internal implementation details.

Fix:

except Exception as e:
    logger.error(f"Error listing templates: {e}")
    raise HTTPException(status_code=500, detail="Internal server error")

✅ Strengths

  1. Clean v2 architecture with proper core/CLI/API separation
  2. Solid CPM algorithm implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (9 new test files)
  5. SQL injection prevention with parameterized queries
  6. Idempotent database migrations

📋 Summary

This PR adds valuable scheduling and template functionality. The critical issues from prior reviews (test markers, dependency wiring, phase_manager import) have been addressed. The remaining issues are:

Recommendation: Address issues #1, #3, and #4 before merging. Issue #2 can be addressed in a follow-up if needed.


agents_available=1,
)

bottlenecks = scheduler.identify_bottlenecks(

calculate_critical_path() depends on CriticalPathResult, TaskTiming, etc., but they aren’t defined/imported in dependency_resolver.py. Suggest adding the missing imports/definitions.

🚀 Want me to fix this? Reply ex: "fix it for me".


# Build dependency graph and schedule
resolver = DependencyResolver()
resolver.build_dependency_graph(tasks)

DependencyResolver uses logger but it isn’t defined. Suggest importing logging and defining a module-level logger to avoid NameError.

🚀 Want me to fix this? Reply ex: "fix it for me".
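The fix is the standard module-level logger pattern, e.g. at the top of dependency_resolver.py:

```python
import logging

# Module-level logger; __name__ resolves to the importing module's
# dotted path (codeframe.agents.dependency_resolver in this case).
logger = logging.getLogger(__name__)
logger.debug("dependency graph built")  # no NameError once defined
```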

Replace `detail=str(e)` with generic "Internal server error" message
in 500 error responses to prevent leaking implementation details.
Errors are still logged for debugging.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codeframe/core/workspace.py (1)

102-135: Backfill new task columns for existing workspaces.

Line 124 only runs the column additions inside _init_database, but existing workspaces go through _ensure_schema_upgrades() instead. That means older DBs will be missing depends_on, estimated_hours, complexity_score, and uncertainty_level, and inserts that now include those columns will fail.

🔧 Suggested fix (add upgrades in _ensure_schema_upgrades)
@@ def _ensure_schema_upgrades(db_path: Path) -> None:
+    # Ensure new task effort columns exist (older DBs won't have them)
+    cursor.execute("PRAGMA table_info(tasks)")
+    task_columns = {row[1] for row in cursor.fetchall()}
+    if "depends_on" not in task_columns:
+        cursor.execute("ALTER TABLE tasks ADD COLUMN depends_on TEXT DEFAULT '[]'")
+    if "estimated_hours" not in task_columns:
+        cursor.execute("ALTER TABLE tasks ADD COLUMN estimated_hours REAL")
+    if "complexity_score" not in task_columns:
+        cursor.execute("ALTER TABLE tasks ADD COLUMN complexity_score INTEGER")
+    if "uncertainty_level" not in task_columns:
+        cursor.execute("ALTER TABLE tasks ADD COLUMN uncertainty_level TEXT")
+    conn.commit()
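A runnable version of the same idempotent backfill, against a simplified tasks table:

```python
import sqlite3

def ensure_task_columns(conn):
    # Idempotent backfill: add any effort columns missing from `tasks`.
    cols = {row[1] for row in conn.execute("PRAGMA table_info(tasks)")}
    migrations = {
        "depends_on": "ALTER TABLE tasks ADD COLUMN depends_on TEXT DEFAULT '[]'",
        "estimated_hours": "ALTER TABLE tasks ADD COLUMN estimated_hours REAL",
        "complexity_score": "ALTER TABLE tasks ADD COLUMN complexity_score INTEGER",
        "uncertainty_level": "ALTER TABLE tasks ADD COLUMN uncertainty_level TEXT",
    }
    for name, ddl in migrations.items():
        if name not in cols:
            conn.execute(ddl)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")
ensure_task_columns(conn)
ensure_task_columns(conn)  # safe to run twice: existing columns are skipped
cols = {row[1] for row in conn.execute("PRAGMA table_info(tasks)")}
```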
🤖 Fix all issues with AI agents
In `@codeframe/cli/app.py`:
- Around line 3692-3752: The schedule_show command accepts an agents integer
that isn't validated, which can lead to invalid scheduling or errors; add a
guard at the start of schedule_show to validate the agents parameter (e.g.,
ensure agents is an int >= 1) and exit with a user-friendly message if invalid.
Locate the schedule_show function and check the agents Option value before
creating the DependencyResolver/TaskScheduler and calling
TaskScheduler.schedule_tasks; if agents < 1, call typer.Exit or console.print
with an error and return so downstream code relying on agents_available never
receives zero or negative values.
- Around line 3865-3931: In schedule_bottlenecks, tasks currently set duration
with "task.estimated_hours or 1.0" which lets negative/zero values through;
change the normalization to mirror schedule_show by treating any estimated_hours
that is None or <= 0 as 1.0 (e.g., compute duration = task.estimated_hours if
task.estimated_hours is not None and task.estimated_hours > 0 else 1.0) and
assign that to task_durations[task.id]; this ensures
TaskScheduler.schedule_tasks and identify_bottlenecks receive only positive
durations.
- Around line 3781-3856: The schedule_predict command currently accepts
non-positive hours_per_day and lets negative task.estimated_hours through;
update schedule_predict to validate and reject non-positive hours_per_day (e.g.,
if hours_per_day <= 0: console.print an error and raise typer.Exit(1) or raise
typer.BadParameter) and normalize each task estimate when building
task_durations by replacing the line duration = task.estimated_hours or 1.0 with
logic that uses a positive estimate (e.g., if getattr(task, "estimated_hours",
None) is not None and task.estimated_hours > 0: duration = task.estimated_hours
else: duration = 1.0) so negative or zero estimates are clamped to a sensible
default before calling scheduler.schedule_tasks / predict_completion_date.

In `@codeframe/persistence/database.py`:
- Around line 279-308: transaction() must acquire self._sync_lock and reject
re-entrancy to avoid corrupting shared connection state; wrap the
isolation_level change, BEGIN, commit/rollback and restoration in a lock (use
self._sync_lock.acquire()/release() or a with self._sync_lock) and before
starting set/check self.conn.in_transaction and raise a clear RuntimeError (or
use savepoints if you want nested semantics) to prevent nested transactions;
also use self.conn.execute("BEGIN") instead of creating a cursor, ensure
isolation_level is always restored in a finally block and the lock always
released.
♻️ Duplicate comments (5)
codeframe/ui/routers/templates.py (4)

29-40: Response model validation stricter than source dataclass.

The estimated_hours: float = Field(..., gt=0) constraint rejects zero-hour tasks during response serialization. If any TemplateTask instance has estimated_hours=0, the API will return a 500 error. Consider using ge=0 or enforcing the same constraint in the TemplateTask dataclass.


119-121: Avoid exposing internal error details to API clients.

The generic exception handler returns str(e) directly, which may leak implementation details. Return a generic error message instead.


188-193: Endpoint lacks authentication and authorization.

This endpoint modifies project data but does not verify user identity or project access permissions. Other task-modifying endpoints in the codebase typically include authentication checks.


292-294: Avoid exposing internal error details to API clients.

Same issue as other endpoints—str(e) may leak sensitive information.

codeframe/cli/app.py (1)

4031-4127: PRD linkage is still missing when applying templates.

Line 4061 fetches prd_record, but tasks created in Line 4085 aren’t linked to it. This was flagged earlier and still appears unresolved — consider passing prd_id=prd_record.id to tasks.create().

🧹 Nitpick comments (3)
codeframe/ui/routers/templates.py (3)

104-106: Consider caching or reusing TaskTemplateManager instance.

A new TaskTemplateManager() is instantiated on every request. If the manager loads builtin templates on initialization, this could be inefficient under load. Consider using a module-level singleton or FastAPI dependency injection to reuse the manager.

♻️ Suggested approach using dependency injection
# At module level
_template_manager: Optional[TaskTemplateManager] = None

def get_template_manager() -> TaskTemplateManager:
    global _template_manager
    if _template_manager is None:
        _template_manager = TaskTemplateManager()
    return _template_manager

# In endpoint
@router.get("/", response_model=List[TemplateListResponse])
async def list_templates(
    category: Optional[str] = None,
    manager: TaskTemplateManager = Depends(get_template_manager),
) -> List[Dict[str, Any]]:
    ...

226-226: Consider moving imports to module level.

Importing Issue and TaskStatus inside the function works but deviates from typical Python conventions. Unless this is intentional to avoid circular imports, consider moving these to the top of the file with other imports.


265-273: Dependency index mapping silently skips invalid indices.

The condition if 0 <= idx < len(created_tasks) silently ignores out-of-range dependency indices. If a template contains invalid depends_on_indices, no error is raised and dependencies are partially wired.

Consider logging a warning when an invalid index is encountered:

♻️ Suggested improvement
                     depends_on_ids = [
                         created_tasks[idx][0]
                         for idx in dep_indices
                         if 0 <= idx < len(created_tasks)
                     ]
+                    skipped = [idx for idx in dep_indices if not (0 <= idx < len(created_tasks))]
+                    if skipped:
+                        logger.warning(
+                            f"Task {task_id}: skipped invalid dependency indices {skipped}"
+                        )
                     for dep_id in depends_on_ids:
                         db.tasks.add_task_dependency(task_id, dep_id)

Comment on lines +3781 to +3856
@schedule_app.command("predict")
def schedule_predict(
    repo_path: Optional[Path] = typer.Option(
        None,
        "--workspace", "-w",
        help="Workspace path (defaults to current directory)",
    ),
    hours_per_day: float = typer.Option(
        8.0,
        "--hours-per-day",
        help="Working hours per day",
    ),
) -> None:
    """Predict project completion date.

    Estimates when the project will be complete based on remaining
    tasks and their estimated durations.
    """
    from datetime import datetime
    from codeframe.core.workspace import get_workspace
    from codeframe.core import tasks as task_module
    from codeframe.planning.task_scheduler import TaskScheduler
    from codeframe.agents.dependency_resolver import DependencyResolver

    workspace_path = repo_path or Path.cwd()

    try:
        workspace = get_workspace(workspace_path)

        # Get tasks using v2 API
        task_list = task_module.list_tasks(workspace)
        if not task_list:
            console.print("[yellow]No tasks found.[/yellow]")
            raise typer.Exit(0)

        # Build schedule
        resolver = DependencyResolver()
        resolver.build_dependency_graph(task_list)
        scheduler = TaskScheduler()

        task_durations = {}
        for task in task_list:
            duration = task.estimated_hours or 1.0
            task_durations[task.id] = duration

        schedule = scheduler.schedule_tasks(
            tasks=task_list,
            task_durations=task_durations,
            resolver=resolver,
            agents_available=1,
        )

        # Get current progress
        current_progress = {}
        for task in task_list:
            status = task.status.value if hasattr(task.status, "value") else str(task.status)
            if status.upper() in ("DONE", "COMPLETED"):
                current_progress[task.id] = "completed"

        # Predict completion
        prediction = scheduler.predict_completion_date(
            schedule=schedule,
            current_progress=current_progress,
            start_date=datetime.now(),
            hours_per_day=hours_per_day,
        )

        console.print("\n[bold]Completion Prediction[/bold]\n")
        console.print(f"Predicted Date: [green]{prediction.predicted_date.strftime('%Y-%m-%d')}[/green]")
        console.print(f"Remaining Hours: {prediction.remaining_hours:.1f}h")
        console.print(f"Completed: {prediction.completed_percentage:.1f}%")
        console.print("\nConfidence Interval:")
        console.print(f"  Early: {prediction.confidence_interval['early'].strftime('%Y-%m-%d')}")
        console.print(f"  Late: {prediction.confidence_interval['late'].strftime('%Y-%m-%d')}")
        console.print()


⚠️ Potential issue | 🟡 Minor

Guard hours_per_day and normalize negative estimates.

Line 3823 uses task.estimated_hours or 1.0, which lets negative values through. Also, hours_per_day can be zero/negative and will break prediction math.

🔧 Suggested fix
@@ def schedule_predict(...):
     try:
         workspace = get_workspace(workspace_path)
+
+        if hours_per_day <= 0:
+            console.print("[red]Error:[/red] --hours-per-day must be > 0")
+            raise typer.Exit(1)
@@
-        for task in task_list:
-            duration = task.estimated_hours or 1.0
+        for task in task_list:
+            val = task.estimated_hours
+            duration = val if val is not None and val > 0 else 1.0
             task_durations[task.id] = duration
🤖 Prompt for AI Agents
In `@codeframe/cli/app.py` around lines 3781 - 3856, The schedule_predict command
currently accepts non-positive hours_per_day and lets negative
task.estimated_hours through; update schedule_predict to validate and reject
non-positive hours_per_day (e.g., if hours_per_day <= 0: console.print an error
and raise typer.Exit(1) or raise typer.BadParameter) and normalize each task
estimate when building task_durations by replacing the line duration =
task.estimated_hours or 1.0 with logic that uses a positive estimate (e.g., if
getattr(task, "estimated_hours", None) is not None and task.estimated_hours > 0:
duration = task.estimated_hours else: duration = 1.0) so negative or zero
estimates are clamped to a sensible default before calling
scheduler.schedule_tasks / predict_completion_date.

@github-actions

Code Review Summary: PR #301

This is a well-structured PR implementing Phase 4 planning features (scheduling, templates, effort estimation). Most critical issues from prior reviews have been addressed in commits f2baecc, 6083c9d, and 8487fb0.

✅ Issues Already Fixed

  • pytest.mark.v2 markers added to all test files
  • Template dependencies properly wired to task IDs
  • TimelineEntry agent_id included in both schedule paths
  • Status comparison normalized to handle DONE/COMPLETED
  • Estimated hours handling consistent across CLI and API
  • FastAPI import removed from phase_manager.py
  • API parameter validation added (commit 8487fb0)

⚠️ Remaining Issues to Address

1. Missing Transaction Safety (HIGH) — codeframe/persistence/database.py:279-308

The transaction() context manager lacks thread safety - it doesn't acquire self._sync_lock before modifying self.conn.isolation_level. Concurrent transactions could corrupt connection state. Also missing guard against nested transactions.

Fix needed: Add with self._sync_lock: wrapper and check self.conn.in_transaction before starting.

2. Effort Fields Not Persisted (MEDIUM) — codeframe/cli/app.py:4084-4091, templates.py:250-262

Templates generate complexity_score and uncertainty_level, but these fields aren't being persisted. The v2 tasks.create() API doesn't accept these parameters, while the persistence layer (create_task_with_issue()) does.

Decision needed: Either extend v2 Task schema to store these fields or remove them from template returns.

3. CLI Parameter Validation (MEDIUM) — codeframe/cli/app.py

Three CLI commands lack input validation:

  • schedule_show: agents can be ≤ 0
  • schedule_predict: hours_per_day can be ≤ 0
  • schedule_bottlenecks: negative estimated_hours not normalized

Fix: Add guards at command entry (e.g., if agents < 1: raise typer.Exit(1))

4. Avoid Exposing Internal Error Details (LOW) — codeframe/ui/routers/templates.py:119-121

Returns str(e) directly to API clients, potentially leaking implementation details.

Fix: Replace with generic message like "Internal server error" and log details server-side.
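A runnable sketch of that pattern, framework-free so it stands alone (the `HTTPError` class and `load_template` helper below are hypothetical stand-ins for the router's actual `HTTPException` and template lookup):

```python
import logging

logger = logging.getLogger(__name__)

class HTTPError(Exception):
    """Hypothetical stand-in for the framework's HTTPException."""
    def __init__(self, status_code: int, detail: str):
        self.status_code = status_code
        self.detail = detail

def load_template(template_id: str) -> dict:
    # Hypothetical loader; raises KeyError for unknown ids.
    templates = {"api-endpoint": {"id": "api-endpoint"}}
    return templates[template_id]

def show_template(template_id: str) -> dict:
    try:
        return load_template(template_id)
    except Exception:
        # Full details (traceback, template id) go to the server log only...
        logger.exception("Failed to load template %s", template_id)
        # ...while API clients see a generic message with no internals.
        raise HTTPError(500, "Internal server error")
```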

✅ Strengths

  1. Clean v2 architecture with proper core/CLI/API separation
  2. Solid CPM algorithm implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (9 new test files)
  5. SQL injection prevention with parameterized queries
  6. Idempotent database migrations

📋 Recommendation

Address the transaction safety issue (#1) before merging - this is a data integrity risk. The other issues (#2-#4) can be addressed in a follow-up PR, though addressing them now would be ideal.

Test verification required:

uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py

@github-actions

Review posted to PR #301: #301 (comment)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@codeframe/ui/routers/templates.py`:
- Around line 230-262: The code may attach tasks to a mismatched parent issue
number when reusing an existing issue; update the logic around
db.get_project_issues / Issue creation so you capture the actual issue record
(e.g., matched_issue or created_issue) and use its issue_number when calling
db.create_task_with_issue instead of always passing request.issue_number; if
db.create_issue returns an ID only, change it to return the full Issue (or fetch
the created Issue by ID) so you can set
parent_issue_number=matched_issue.issue_number (and keep issue_id set to
matched_issue.id).
♻️ Duplicate comments (2)
codeframe/ui/routers/templates.py (2)

29-40: Keep response validation aligned with template creation rules.

If TemplateTask instances can ever contain estimated_hours == 0 or non‑{"low","medium","high"} uncertainty values, Pydantic will raise during response serialization and return a 500. Either enforce these constraints at template creation time or relax the response model accordingly.

#!/bin/bash
# Inspect TemplateTask constraints and usages for estimated_hours / uncertainty values
rg -n "class TemplateTask" -A 50 codeframe/planning/task_templates.py
rg -n "TemplateTask\(" -A 8 codeframe/planning/task_templates.py

188-193: Protect apply_template with authz checks.

This endpoint mutates project data but does not enforce authentication/authorization. Align it with other task‑modifying endpoints (e.g., Depends(get_current_user) + project access check).

#!/bin/bash
# Find existing auth patterns in routers to mirror here
rg -n "get_current_user|user_has_project_access" -C2 codeframe/ui/routers
🧹 Nitpick comments (2)
codeframe/ui/routers/schedule.py (2)

232-243: Extract repeated schedule-building logic into a helper function.

The duration extraction and schedule building pattern is duplicated across all three endpoints (lines 100-112, 163-174, 232-243). Extracting this into a helper would reduce duplication and ensure consistent behavior.

♻️ Suggested helper extraction
def _build_project_schedule(
    tasks: List[Any],
    agents_available: int = 1,
) -> tuple[Any, Dict[int, float], Any]:
    """Build dependency graph and schedule for tasks.
    
    Returns:
        Tuple of (resolver, task_durations, schedule)
    """
    resolver = DependencyResolver()
    resolver.build_dependency_graph(tasks)
    scheduler = TaskScheduler()

    task_durations = {}
    for task in tasks:
        val = getattr(task, "estimated_hours", None)
        duration = val if val is not None and val > 0 else 1.0
        task_durations[task.id] = duration

    schedule = scheduler.schedule_tasks(
        tasks=tasks,
        task_durations=task_durations,
        resolver=resolver,
        agents_available=agents_available,
    )
    return resolver, task_durations, schedule

184-189: Use timezone-aware datetime for start_date to match codebase patterns.

datetime.now() returns a naive datetime. The codebase consistently uses timezone-aware datetimes throughout (e.g., datetime.now(UTC) in websocket_broadcasts.py, datetime.now(timezone.utc) in tests), and the returned predicted_date is serialized to the client as an ISO string. Using a naive datetime here creates an inconsistency in API responses and could cause issues if the prediction is compared with timezone-aware datetimes elsewhere.

♻️ Suggested fix
-from datetime import datetime
+from datetime import datetime, timezone
...
         prediction = scheduler.predict_completion_date(
             schedule=schedule,
             current_progress=current_progress,
-            start_date=datetime.now(),
+            start_date=datetime.now(timezone.utc),
             hours_per_day=hours_per_day,
         )
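A small illustration of the difference the diff addresses: naive datetimes carry no offset in their ISO serialization, and mixing them with aware datetimes in comparisons raises, which is the kind of downstream issue being flagged.

```python
from datetime import datetime, timezone

naive = datetime.now()              # no tzinfo attached
aware = datetime.now(timezone.utc)  # explicit UTC offset

print(naive.tzinfo)                          # None
print(aware.isoformat().endswith("+00:00"))  # True

# Comparing naive and aware datetimes is a TypeError in Python 3:
try:
    _ = naive < aware
except TypeError as exc:
    print("comparison failed:", exc)
```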

When applying templates with an existing project issue, use the
matched issue's issue_number for parent_issue_number instead of
request.issue_number to prevent mismatches.

Args:
schedule: Current ScheduleResult
current_progress: Dict mapping task_id to status ("completed", "in_progress")

The current_progress docstring mentions "in_progress" status, but the code only handles "completed". Consider either removing "in_progress" from the docstring or adding logic to handle partially completed tasks.

Suggested change:

- current_progress: Dict mapping task_id to status ("completed", "in_progress")
+ current_progress: Dict mapping task_id to status ("completed")


raise HTTPException(status_code=404, detail=f"Project {project_id} not found")

# Apply template to get task definitions
task_dicts = manager.apply_template(

Task numbers are generated using request.issue_number before determining actual_issue_number. Consider moving the apply_template() call after the issue lookup so task numbers use the correct prefix.


@github-actions

Code Review Summary: PR #301

Overall assessment: This is a well-structured PR implementing Phase 4 planning features (scheduling, templates, effort estimation). Most critical issues from prior reviews have been addressed in recent commits (f2baecc, 6083c9d, 8487fb0, b8beae8, d0a1f92).

✅ Issues Already Fixed

Based on commit history, these issues are now resolved:

  • ✅ pytest.mark.v2 markers added to all 9 new test files
  • ✅ Template dependencies properly wired to task IDs (CLI + API)
  • ✅ TimelineEntry agent_id included in both schedule paths
  • ✅ Status comparison normalized to handle DONE/COMPLETED
  • ✅ Estimated hours handling consistent in schedule_show command
  • ✅ FastAPI import removed from phase_manager.py
  • ✅ API parameter validation added (agents>=1, hours_per_day>0)
  • ✅ Error detail exposure fixed (generic "Internal server error")
  • ✅ Issue number mismatch when reusing existing issue fixed

⚠️ Remaining Issues to Address

1. Missing Transaction Thread Safety (HIGH) — codeframe/persistence/database.py:279-308

The transaction() context manager modifies self.conn.isolation_level without acquiring self._sync_lock, and doesn't guard against nested transactions. Concurrent transactions could corrupt connection state.

Impact: Race conditions could corrupt database state in multi-threaded environments.

Fix:

@contextlib.contextmanager
def transaction(self):
    if not self.conn:
        self.initialize()

    with self._sync_lock:  # Add lock wrapper
        if self.conn.in_transaction:  # Guard against nested transactions
            raise RuntimeError("Nested transactions are not supported")
        
        old_isolation = self.conn.isolation_level
        self.conn.isolation_level = None
        self.conn.execute("BEGIN")  # Use conn.execute instead of cursor
        
        try:
            yield self
            self.conn.commit()
        except Exception:
            self.conn.rollback()
            raise
        finally:
            self.conn.isolation_level = old_isolation
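As a self-contained sketch of that pattern against an in-memory SQLite database (a hypothetical minimal `Database`, not the real class; a later commit in this PR switches the lock to an `RLock` for reentrancy):

```python
import contextlib
import sqlite3
import threading

class Database:
    """Hypothetical minimal stand-in for the real Database class."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:", check_same_thread=False)
        self._sync_lock = threading.Lock()
        self.conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT)")

    @contextlib.contextmanager
    def transaction(self):
        with self._sync_lock:                     # serialize concurrent transactions
            if self.conn.in_transaction:          # guard against nesting
                raise RuntimeError("Nested transactions are not supported")
            old_isolation = self.conn.isolation_level
            try:
                self.conn.isolation_level = None  # autocommit mode; BEGIN is manual
                self.conn.execute("BEGIN")
                yield self
                self.conn.commit()
            except Exception:
                if self.conn.in_transaction:      # rollback only if BEGIN succeeded
                    self.conn.rollback()
                raise
            finally:
                self.conn.isolation_level = old_isolation

db = Database()
with db.transaction():
    db.conn.execute("INSERT INTO tasks (name) VALUES ('a')")

try:
    with db.transaction():
        db.conn.execute("INSERT INTO tasks (name) VALUES ('b')")
        raise ValueError("boom")                  # triggers rollback of 'b'
except ValueError:
    pass

count = db.conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
print(count)  # only 'a' survives; 'b' was rolled back
```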

2. Inconsistent Duration Handling in CLI (MEDIUM) — codeframe/cli/app.py:3823, 3901

Two CLI commands still use the old pattern that doesn't handle negative values:

  • schedule_predict (line 3823): duration = task.estimated_hours or 1.0
  • schedule_bottlenecks (line 3901): duration = task.estimated_hours or 1.0

The schedule_show command was already fixed (lines 3742-3745) to use proper validation.

Fix: Update both commands to match schedule_show:

duration = task.estimated_hours
if duration is None or duration <= 0:
    duration = 1.0

3. CLI Parameter Validation (MEDIUM) — codeframe/cli/app.py

Three CLI commands lack input validation at the entry point:

  • schedule_show (line 3706): agents can be ≤ 0
  • schedule_predict (line 3788): hours_per_day can be ≤ 0
  • schedule_bottlenecks (line 3871): implicitly relies on agents from parent app (no validation)

Fix: Add guards at command entry:

if agents < 1:
    console.print("[red]Error:[/red] --agents must be >= 1")
    raise typer.Exit(1)

4. Missing v2 Test Marker (LOW) — tests/core/test_phase_manager.py

This test module is missing pytestmark = pytest.mark.v2 at module level. All v2 tests under tests/** should have this marker for proper test suite categorization.

✅ Strengths

  1. Clean v2 architecture with proper core/CLI/API separation
  2. Solid CPM algorithm implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (9 new test files, 4K+ lines)
  5. SQL injection prevention with parameterized queries
  6. Idempotent database migrations
  7. Good progress on addressing code review feedback

📋 Recommendation

Address issue #1 (transaction thread safety) before merging - this is a data integrity risk in multi-threaded environments.

Issues #2-#3 are medium priority inconsistencies that should be addressed for quality but wouldn't block merge if resources are constrained.

Issue #4 is low priority and can be addressed in a follow-up PR.

Test verification required:

uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py
uv run ruff check codeframe/planning/ codeframe/cli/app.py

Review based on HEAD commit (d0a1f92)

@github-actions

Review comment posted to PR #301: #301 (comment)


- Add thread safety to transaction() context manager using double-check
  pattern with _sync_lock to prevent concurrent transaction corruption
- Add nested transaction guard with RuntimeError for same-thread calls
- Add CLI parameter validation for schedule show (agents >= 1) and
  schedule predict (hours_per_day > 0 and <= 24) using Typer min/max
- Add explicit validation messages for better error UX

@github-actions

Code review comment posted to PR #301: #301 (comment)

The review summarizes:

  • 9 issues already fixed across commits f2baecc, 6083c9d, 8487fb0, b8beae8, d0a1f92
  • 4 remaining issues: 1 HIGH (transaction thread safety), 2 MEDIUM (CLI consistency), 1 LOW (test marker)
  • Recommendation: Fix transaction safety before merging as it's a data integrity risk


Comment on lines +314 to +321

# SQLite uses autocommit by default; disable it for this transaction
old_isolation = self.conn.isolation_level
self.conn.isolation_level = None  # Manual transaction mode
cursor = self.conn.cursor()
cursor.execute("BEGIN")

try:
    yield self

If cursor() or execute("BEGIN") raises, isolation_level stays None since these calls are outside the try block. Consider moving the isolation level change and BEGIN into the try block so the finally always restores it.

-            # SQLite uses autocommit by default; disable it for this transaction
-            old_isolation = self.conn.isolation_level
-            self.conn.isolation_level = None  # Manual transaction mode
-            cursor = self.conn.cursor()
-            cursor.execute("BEGIN")
-
-            try:
+            old_isolation = self.conn.isolation_level
+            try:
+                # SQLite uses autocommit by default; disable it for this transaction
+                self.conn.isolation_level = None  # Manual transaction mode
+                cursor = self.conn.cursor()
+                cursor.execute("BEGIN")
                 yield self


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@codeframe/persistence/database.py`:
- Around line 314-327: The transaction code sets self.conn.isolation_level =
None then calls cursor.execute("BEGIN") outside the try, so if BEGIN raises the
isolation_level isn't restored; move cursor.execute("BEGIN") into the try block
(immediately after acquiring cursor) so that the finally always restores
isolation_level, and ensure that any failure during BEGIN triggers the except
block to call self.conn.rollback() (guarding rollback if no transaction started)
before re-raising; update references in the method where
self.conn.isolation_level is changed, cursor = self.conn.cursor(),
cursor.execute("BEGIN"), yield self, self.conn.commit(), and
self.conn.rollback() accordingly.
- Around line 298-312: Replace the non-reentrant _sync_lock with a
threading.RLock and move the nested-transaction guard so it runs only after
acquiring the lock: remove the pre-lock check of self.conn.in_transaction,
instantiate self._sync_lock as threading.RLock, then inside the with
self._sync_lock: block perform the conn.in_transaction check and raise the
RuntimeError if needed; this allows reentrant acquisitions by repository methods
(refer to _sync_lock and self.conn.in_transaction in the transaction-starting
code).
♻️ Duplicate comments (2)
codeframe/cli/app.py (2)

3834-3837: Clamp non‑positive estimates before prediction.

task.estimated_hours or 1.0 still allows negative/zero values, which can skew scheduling math. Mirror the guard used in schedule_show.

🔧 Proposed fix
-        for task in task_list:
-            duration = task.estimated_hours or 1.0
-            task_durations[task.id] = duration
+        for task in task_list:
+            val = task.estimated_hours
+            duration = val if val is not None and val > 0 else 1.0
+            task_durations[task.id] = duration

3912-3915: Normalize non‑positive estimates before bottleneck analysis.

Negative/zero estimates can pass through and distort bottleneck scoring.

🔧 Proposed fix
-        for task in task_list:
-            duration = task.estimated_hours or 1.0
-            task_durations[task.id] = duration
+        for task in task_list:
+            val = task.estimated_hours
+            duration = val if val is not None and val > 0 else 1.0
+            task_durations[task.id] = duration

- Replace threading.Lock with threading.RLock to allow reentrant
  acquisitions by repository methods that also use _sync_lock
- Move nested transaction check inside the lock (RLock handles
  same-thread reentry, check prevents actual nested DB transactions)
- Move cursor.execute("BEGIN") into try block so isolation_level
  is always restored in finally even if BEGIN fails
- Guard rollback with in_transaction check to avoid error when
  no transaction was started (e.g., BEGIN failure)
@github-actions

Code Review Summary: PR #301

This is a well-structured PR implementing Phase 4 planning features (scheduling, templates, effort estimation). The code follows v2 architecture patterns with proper separation between core, CLI, and API layers.

✅ Issues Already Fixed (Great progress!)

Based on commit history, the following issues have been resolved:

  • ✅ pytest.mark.v2 markers added to 9 new test files (f2baecc)
  • ✅ Template dependencies properly wired to task IDs (6083c9d)
  • ✅ TimelineEntry agent_id included in both schedule paths (f2baecc)
  • ✅ Status comparison normalized (DONE/COMPLETED) (6083c9d)
  • ✅ Estimated hours handling fixed in schedule_show (f2baecc)
  • ✅ FastAPI import removed from phase_manager.py (f2baecc)
  • ✅ API parameter validation added (agents>=1, hours_per_day>0) (8487fb0)
  • ✅ Error detail exposure fixed (generic "Internal server error") (b8beae8)
  • ✅ Issue number mismatch fixed when reusing existing issue (d0a1f92)
  • ✅ Transaction thread safety improved with RLock and nested checks (8428c8a, 9d03aeb)
  • ✅ CLI parameter validation added for schedule commands (8428c8a)

⚠️ Remaining Issues to Address

1. Inconsistent Duration Handling in CLI (MEDIUM) — codeframe/cli/app.py:3836, 3914

Two CLI commands still use the old pattern that doesn't handle negative values:

  • schedule_predict (line 3836): duration = task.estimated_hours or 1.0
  • schedule_bottlenecks (line 3914): duration = task.estimated_hours or 1.0

The schedule_show command has the correct pattern (lines 3748-3751):

duration = task.estimated_hours
if duration is None or duration <= 0:
    duration = 1.0
task_durations[task.id] = duration

Fix: Update both commands to match schedule_show's pattern.


2. Missing v2 Test Marker (LOW) — tests/core/test_phase_manager.py

This test module is missing pytestmark = pytest.mark.v2 at module level. All v2 tests under tests/** should have this marker for proper test suite categorization.

Fix: Add after imports:

import pytest
from pathlib import Path

pytestmark = pytest.mark.v2

from codeframe.persistence.database import Database
# ... rest of imports

✅ Strengths

  1. Clean v2 architecture with proper core/CLI/API separation
  2. Solid CPM algorithm implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (9 new test files, 4K+ lines)
  5. SQL injection prevention with parameterized queries
  6. Idempotent database migrations
  7. Excellent progress on addressing code review feedback

📋 Summary

This PR adds valuable scheduling and template functionality. Most critical issues from prior reviews have been addressed; the remaining items are the duration-handling inconsistency (#1) and the missing v2 test marker (#2) described above.

The transaction safety issue mentioned in previous reviews appears to have been addressed in recent commits (RLock, nested checks, proper isolation_level restoration).

Recommendation: Address issue #1 before merging for consistency across all scheduling commands. Issue #2 is low priority and can be addressed in a follow-up PR.

Test verification required:
```bash
uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py
uv run ruff check codeframe/planning/ codeframe/cli/app.py
```


Review based on HEAD commit (9d03aeb)

@github-actions

I've reviewed PR #301 and posted a summary comment identifying the remaining issues. The review acknowledges that most critical issues from prior reviews have been addressed (9 issues fixed across recent commits), and highlights 2 remaining items to address before merging:

  1. Inconsistent duration handling in CLI (MEDIUM) - schedule_predict and schedule_bottlenecks commands need to match the proper pattern used in schedule_show
  2. Missing v2 test marker (LOW) - tests/core/test_phase_manager.py needs pytestmark = pytest.mark.v2

The transaction safety issue appears to have been addressed in recent commits (9d03aeb) with RLock, nested transaction checks, and proper isolation_level restoration.

New%20session%20-%202026-01-25T21%3A29%3A36.921Z
opencode session  |  github run

Test User added 2 commits January 25, 2026 14:41
Update schedule_predict and schedule_bottlenecks to match schedule_show's
pattern for handling estimated_hours - explicitly check for None or <= 0
instead of using `or 1.0` which doesn't catch negative values.
Add warning log when schedule_tasks() doesn't schedule all expected tasks.
This can happen when tasks have unknown dependency IDs that cause them
to be excluded from parallel execution waves.
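The coverage check described in this commit might look like the following sketch (hypothetical helper name, not the actual TaskScheduler code): compare the task IDs that were expected against those actually placed into waves, and warn on any gap.

```python
import logging

logger = logging.getLogger(__name__)

def check_schedule_coverage(expected_ids, scheduled_ids):
    """Warn when the scheduler silently drops tasks (e.g. unknown dependency IDs)."""
    missing = set(expected_ids) - set(scheduled_ids)
    if missing:
        logger.warning(
            "schedule_tasks() scheduled %d of %d tasks; unscheduled ids: %s",
            len(scheduled_ids), len(expected_ids), sorted(missing),
        )
    return missing

print(check_schedule_coverage([1, 2, 3], [1, 3]))  # {2}
```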
@github-actions

Code Review Summary: PR #301

This is a well-structured PR implementing Phase 4 planning features (scheduling, templates, effort estimation). The code follows v2 architecture patterns with proper separation between core, CLI, and API layers.

✅ Excellent Progress - All Critical Issues Addressed

Based on commit history, all major issues from prior reviews have been resolved:

  • ✅ pytest.mark.v2 markers added to 9 new test files (f2baecc)
  • ✅ Template dependencies properly wired to task IDs (6083c9d)
  • ✅ TimelineEntry agent_id included in both schedule paths (f2baecc)
  • ✅ Status comparison normalized to handle DONE/COMPLETED (6083c9d)
  • ✅ Estimated hours handling consistent in schedule_show (f2baecc)
  • ✅ FastAPI import removed from phase_manager.py (f2baecc)
  • ✅ API parameter validation added (agents>=1, hours_per_day>0) (8487fb0)
  • ✅ Error detail exposure fixed (generic "Internal server error") (b8beae8)
  • ✅ Issue number mismatch when reusing existing issue fixed (d0a1f92)
  • ✅ Transaction thread safety improved with RLock and nested checks (9d03aeb)
  • ✅ CLI parameter validation added for schedule commands (8428c8a)
  • ✅ Duration handling fixed in all schedule CLI commands (780f46a)

⚠️ One Remaining Minor Issue

Missing v2 Test Marker (LOW) — tests/core/test_phase_manager.py

This test module is missing pytestmark = pytest.mark.v2 at module level. All v2 tests under tests/** should have this marker for proper test suite categorization.

Fix: Add after imports:

import pytest
from pathlib import Path

pytestmark = pytest.mark.v2

from codeframe.persistence.database import Database
# ... rest of imports

✅ Strengths

  1. Clean v2 architecture with proper core/CLI/API separation
  2. Solid CPM algorithm implementation in TaskScheduler
  3. 7 practical builtin templates with appropriate effort estimates
  4. Comprehensive test coverage (9 new test files, 4K+ lines)
  5. SQL injection prevention with parameterized queries
  6. Idempotent database migrations
  7. Excellent responsiveness to code review feedback

📋 Recommendation

This PR is in excellent shape. The only remaining item is adding the v2 test marker to tests/core/test_phase_manager.py. This is a trivial change and the PR can be approved immediately after this fix.

Test verification required:

uv run pytest -m v2
uv run pytest tests/planning/
uv run pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py
uv run pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py
uv run ruff check codeframe/planning/ codeframe/cli/app.py

Review based on HEAD commit (9d03aeb)

@github-actions

Review posted successfully. The PR is in excellent shape with all critical issues addressed. Only one minor item remains: adding pytestmark = pytest.mark.v2 to tests/core/test_phase_manager.py.


@github-actions

Code Review Summary: PR #301

Overall assessment: This is a well-executed PR implementing Phase 4 planning features (scheduling, templates, effort estimation). Excellent progress on addressing code review feedback - 11 issues have been fixed in recent commits.

✅ Issues Already Fixed (Great Progress!)

Based on commit history (86f64d2, 780f46a, 9d03aeb, 8428c8a, d0a1f92, b8beae8, 8487fb0, 6083c9d, f2baecc):

  • ✅ pytest.mark.v2 markers added to all new test files (8 modules)
  • ✅ Template dependencies properly wired to task IDs (CLI + API)
  • ✅ TimelineEntry agent_id included in both schedule paths
  • ✅ Status comparison normalized (handles DONE/COMPLETED)
  • ✅ Estimated hours handling consistent across all schedule CLI commands
  • ✅ FastAPI import removed from phase_manager.py
  • ✅ API parameter validation added (Query(ge=1), Query(gt=0, le=24))
  • ✅ Error detail exposure fixed (generic "Internal server error")
  • ✅ Issue number mismatch fixed (uses actual issue number)
  • ✅ Transaction safety improved (RLock, nested checks, proper isolation_level restoration)
  • ✅ CLI parameter validation added (min=1 in typer, explicit guards)
  • ✅ Warning log added when tasks omitted from schedule

⚠️ Remaining Issues to Address

Only 1 Remaining Issue (LOW Priority)

1. Missing v2 Test Marker — tests/core/test_phase_manager.py

This test module is missing `pytestmark = pytest.mark.v2` at module level. While this file existed before the PR, it now tests v2 functionality (`InvalidPhaseTransitionError`, `ProjectNotFoundError`) and should be marked for proper categorization.

Fix:

```python
import pytest
from pathlib import Path

pytestmark = pytest.mark.v2

from codeframe.persistence.database import Database
# ... rest of imports
```

✅ Strengths

  1. Comprehensive functionality: Scheduling (CPM algorithm), templates (7 builtin), effort estimation
  2. Clean v2 architecture: Proper core/CLI/API separation, no FastAPI in core modules
  3. Solid CPM implementation: Critical path analysis, parallel execution waves, multi-agent support
  4. Transaction safety: RLock-based reentrant locking, nested transaction guards
  5. Input validation: API and CLI parameter validation throughout
  6. Error handling: Generic error messages for API clients, detailed logging server-side
  7. Test coverage: 9 new test files (4K+ lines) covering all new functionality
  8. SQL injection prevention: Parameterized queries throughout
  9. Idempotent migrations: Safe database schema upgrades
  10. Excellent responsiveness: 11 issues fixed across 9 commits since initial review
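The CPM core praised above (longest chain through the dependency DAG) can be sketched in a few lines. This is an illustrative forward pass only, not the actual codeframe TaskScheduler; task names and durations here are invented:

```python
def critical_path(tasks, deps):
    """Longest-duration chain through a dependency DAG (CPM forward pass).

    tasks maps task id -> estimated hours; deps maps task id -> prerequisite ids.
    Returns (total_hours, task ids on the critical path in order).
    """
    memo = {}

    def finish(t):
        # Earliest finish = latest prerequisite finish + own duration.
        if t not in memo:
            start = max((finish(d) for d in deps.get(t, [])), default=0)
            memo[t] = start + tasks[t]
        return memo[t]

    end = max(tasks, key=finish)
    path = [end]
    while deps.get(path[-1]):
        # Walk backward via the prerequisite that finishes latest.
        path.append(max(deps[path[-1]], key=finish))
    return memo[end], path[::-1]

tasks = {"schema": 4, "api": 6, "ui": 3, "tests": 2}
deps = {"api": ["schema"], "ui": ["api"], "tests": ["api"]}
total, path = critical_path(tasks, deps)  # 13 hours: schema -> api -> ui
```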

📋 Recommendation

This PR is APPROVE-ABLE with one minor cleanup:

Add the `pytestmark = pytest.mark.v2` marker to `tests/core/test_phase_manager.py`. This is a low priority, trivial fix that completes the v2 test categorization.

The PR represents significant work (5,160 additions, 28 files) and all previously-identified critical and medium priority issues have been addressed. The code quality is high and follows v2 architecture patterns.

Test verification recommended before merge:
```bash
python -m pytest tests/core/test_phase_manager.py -v
python -m pytest tests/planning/ -v
python -m pytest tests/cli/test_schedule_commands.py tests/cli/test_templates_commands.py -v
python -m pytest tests/api/test_schedule_api.py tests/api/test_templates_api.py -v
```


Review based on HEAD commit (86f64d2) - 18 commits total

@github-actions
Contributor

Review Complete

I've posted a comprehensive review of PR #301 to the pull request. Here's a summary of my findings:

Final Assessment: ✅ APPROVE-ABLE with Minor Cleanup

Excellent progress! This PR has undergone 11 fixes across 9 commits addressing all critical and medium-priority issues from prior reviews:

What Was Fixed

  • ✅ v2 test markers added to 8 new test modules
  • ✅ Template dependencies properly wired to task IDs
  • ✅ TimelineEntry agent_id included consistently
  • ✅ Status comparison normalized (DONE/COMPLETED)
  • ✅ Duration handling consistent across CLI commands
  • ✅ FastAPI import removed from phase_manager.py
  • ✅ API parameter validation (Query(ge=1), Query(gt=0, le=24))
  • ✅ Generic error messages (no internal detail exposure)
  • ✅ Issue number mismatch fixed
  • ✅ Transaction safety improved (RLock, nested checks)
  • ✅ CLI parameter validation (min=1, explicit guards)
  • ✅ Warning log for omitted tasks

Only 1 Remaining Issue (LOW Priority)

Missing v2 test marker in tests/core/test_phase_manager.py

This file now tests v2 functionality (InvalidPhaseTransitionError, ProjectNotFoundError) but lacks pytestmark = pytest.mark.v2 at module level.

Quick fix:

```python
import pytest
from pathlib import Path

pytestmark = pytest.mark.v2  # Add this line

from codeframe.persistence.database import Database
# ... rest of imports
```

Overall Impression

This is substantial, well-executed work (5,160 additions, 28 files) implementing Phase 4 features with:

  • Solid CPM-based scheduling algorithm
  • 7 practical builtin templates
  • Comprehensive test coverage (4K+ lines)
  • Clean v2 architecture (no FastAPI in core)

The code quality is high and follows v2 architecture patterns. All critical issues have been resolved.

New session - 2026-01-25T21:43:07.817Z
opencode session  |  github run

@frankbria frankbria merged commit e6ce1ad into main Jan 25, 2026
11 checks passed
@frankbria frankbria deleted the feature/enhanced-task-generation branch January 25, 2026 21:50


Development

Successfully merging this pull request may close these issues.

[V2] Enhance task generation with dependency analysis and effort estimation

1 participant