ROADMAP 260304

* Evaluation
One of the big problems in this space is that there is no public benchmark for what a thorough review should look like. We need a scalable way to collect one.

* Algorithm design
The current approach uses incremental summarization, which will struggle with long-range dependencies.
Test recursive language modeling for this task.
Turn the key idea into a Claude skill.
Integrate with https://github.com/ChicagoHAI/MechEvalAgent/ for execution-grounded evaluation.
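
To make the long-range dependency concern concrete, here is a minimal sketch of incremental summarization. It is not this project's actual implementation: `chunk_text`, `incremental_summarize`, and `keep_last_words` are hypothetical names, and the summarizer is a toy stand-in for an LLM call that simply keeps the most recent words within a fixed budget.

```python
from typing import Callable, List


def chunk_text(text: str, chunk_words: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def incremental_summarize(
    chunks: List[str], summarize: Callable[[str, str], str]
) -> str:
    """Fold over the chunks, carrying one bounded running summary forward."""
    summary = ""
    for chunk in chunks:
        # Each step sees only the running summary and the current chunk,
        # never the earlier chunks themselves.
        summary = summarize(summary, chunk)
    return summary


def keep_last_words(max_words: int) -> Callable[[str, str], str]:
    """Toy summarizer (assumption): keep only the most recent max_words words."""
    def summarize(summary: str, chunk: str) -> str:
        words = (summary + " " + chunk).split()
        return " ".join(words[-max_words:])
    return summarize


# A term is defined at the start and used again only at the very end.
paper = "definition of alpha " + "filler " * 20 + "result uses alpha"
chunks = chunk_text(paper, chunk_words=5)
final = incremental_summarize(chunks, keep_last_words(8))
print(final)
```

With this toy summarizer, the early definition has dropped out of the running summary by the time the final chunk refers to it. A real LLM summarizer decides what to keep far more selectively, but it works under the same bounded budget, so the same failure can occur.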

* Interaction
This tool would be much more useful if users could address the comments interactively.
