Replies: 1 comment
Hi @Victor-D, thanks for bringing this up. Severity classification is definitely on our radar.

We experimented with this last week by modifying the review prompt directly to include severity levels and categories. The results were mixed: the prompt did produce classified output, but we observed a noticeable drop in F1 score for the review findings themselves. The model seemed to trade detection quality for classification formatting.

We have a few larger features in progress right now, and once those land, we plan to revisit prompt tuning for severity/category support. That said, our current thinking is that a more robust approach would be to run a separate LLM pass after the review to classify and categorize the findings. This decouples classification from detection, so the core review quality stays intact. The only trade-off is slightly higher latency and a small amount of additional token usage.

In the meantime, your [...] We'll keep this discussion updated as we make progress.
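To make the "separate pass" idea concrete, here is a minimal sketch of how classification could be decoupled from detection. It is not the project's implementation: `classify_findings`, `build_prompt`, the JSON reply format, and the `fake_llm` stand-in are all hypothetical names chosen for illustration, and the `llm` argument is assumed to be any callable that takes a prompt string and returns the model's text reply.

```python
import json
from typing import Callable

# Severity labels taken from the workaround instruction in this thread.
SEVERITIES = ("minor", "major", "critical")


def build_prompt(findings: list[str]) -> str:
    """Build the classification-only prompt (hypothetical wording)."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(findings))
    return (
        "Classify each review finding by severity (minor, major, critical).\n"
        'Reply with a JSON list of objects: {"index": int, "severity": str}.\n\n'
        + numbered
    )


def classify_findings(
    findings: list[str], llm: Callable[[str], str]
) -> list[dict[str, str]]:
    """Second pass: attach a severity to findings produced by the review pass.

    The findings themselves are never rewritten, so a bad classification
    reply cannot degrade what the review pass detected.
    """
    if not findings:
        return []
    severities = ["unclassified"] * len(findings)
    try:
        items = json.loads(llm(build_prompt(findings)))
    except json.JSONDecodeError:
        items = []  # unparseable reply: keep every finding, unclassified
    if not isinstance(items, list):
        items = []
    for item in items:
        if not isinstance(item, dict):
            continue
        idx, sev = item.get("index"), item.get("severity")
        if (
            isinstance(idx, int)
            and not isinstance(idx, bool)
            and 0 <= idx < len(findings)
            and sev in SEVERITIES
        ):
            severities[idx] = sev
    return [{"finding": f, "severity": s} for f, s in zip(findings, severities)]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only classifies the first two items."""
    return json.dumps(
        [{"index": 0, "severity": "critical"}, {"index": 1, "severity": "minor"}]
    )


if __name__ == "__main__":
    review_output = [
        "SQL query built by string concatenation",
        "Unused import in utils.py",
        "Missing null check on config value",
    ]
    for row in classify_findings(review_output, fake_llm):
        print(f"[{row['severity']}] {row['finding']}")
```

The design choice worth noting is that the second pass only adds labels. If the classifier reply is malformed or skips an item, the finding is kept and marked `unclassified`, which mirrors the goal stated above of keeping core review quality intact.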
Hello,
I think OCR is currently missing a feature that many coding agents now provide out of the box.
Review findings are not classified by severity, which makes it difficult to distinguish minor issues from critical ones and to prioritize feedback accordingly.
As a workaround, I've been using the -b flag with the following instruction: "You must classify review results by severity (minor, major, critical)". It works well.
Do you think there is a better way to handle this today? Perhaps severity classification could be added directly to the review prompt?
Victor