Add support for XML-style <cmt>comment</cmt> <comment>comment</comment> formats in refine-plan #78

Merged
SihaoLiu merged 4 commits into PolyArch:dev from Lyken17:main on Apr 13, 2026

Lyken17 commented Apr 13, 2026

Summary

Extends the refine-plan command to support three comment annotation formats instead of only CMT:/ENDCMT, so reviewers can annotate plans in whichever style they prefer.

New Comment Formats

  • Classic format (existing): CMT: comment ENDCMT
  • Short tag format (new): <cmt>comment</cmt>
  • Long tag format (new): <comment>comment</comment>

Key Features

  • ✅ All formats support both inline and multi-line usage
  • ✅ Formats can be mixed within the same file
  • ✅ Proper nesting detection and error reporting
  • ✅ Code block and HTML comment exclusion works for all formats
  • ✅ 100% backward compatible with existing workflows

Examples

Inline annotations:

Text before CMT: classic comment ENDCMT text after.
Text before <cmt>short tag comment</cmt> text after.
Text before <comment>long tag comment</comment> text after.

Multi-line annotations:

CMT:
Multi-line classic comment
ENDCMT

<cmt>
Multi-line short tag comment
</cmt>

<comment>
Multi-line long tag comment
</comment>

Mixed formats in one file:

## Goal Description
Update the API CMT: why not use REST? ENDCMT to support GraphQL.

## Acceptance Criteria
- AC-1: <cmt>Should this be split into multiple ACs?</cmt>
- AC-2: <comment>Need to clarify the error handling requirements</comment>
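As a rough illustration of how the three inline formats could be recognized in one pass, here is a minimal awk sketch wrapped in a shell function. It is not the PR's validator: the function name `extract_comments` and the `start`/`stop` tables are hypothetical, and the sketch handles only single-line comments, with no multi-line, nesting, code-block, or HTML-comment handling.

```shell
# Hypothetical sketch: print the body of every inline comment found on stdin.
extract_comments() {
  awk '
    BEGIN {
      # Start/end marker pairs for the three formats named in this PR.
      start[1] = "CMT:";      stop[1] = "ENDCMT"
      start[2] = "<cmt>";     stop[2] = "</cmt>"
      start[3] = "<comment>"; stop[3] = "</comment>"
    }
    {
      rest = $0
      while (1) {
        # Pick whichever start marker appears first in the remaining text.
        best = 0; best_pos = 0
        for (i = 1; i <= 3; i++) {
          p = index(rest, start[i])
          if (p > 0 && (best == 0 || p < best_pos)) { best = i; best_pos = p }
        }
        if (best == 0) break
        rest = substr(rest, best_pos + length(start[best]))
        q = index(rest, stop[best])
        if (q == 0) break   # unterminated on this line; this sketch skips it
        body = substr(rest, 1, q - 1)
        sub(/^ +/, "", body)
        sub(/ +$/, "", body)
        print body
        rest = substr(rest, q + length(stop[best]))
      }
    }'
}

printf '%s\n' \
  'Update the API CMT: why not use REST? ENDCMT to support GraphQL.' \
  '- AC-1: <cmt>Should this be split into multiple ACs?</cmt>' |
  extract_comments
```

Run on the two sample lines, this prints "why not use REST?" and "Should this be split into multiple ACs?" on separate lines. Always advancing `rest` past the consumed marker is what keeps the loop from stalling.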

Test plan

- Verify all three formats are correctly parsed by the validator
- Test inline and multi-line usage for each format
- Confirm mixed formats work in the same file
- Validate error detection for mismatched start/end markers
- Ensure code blocks and HTML comments properly exclude markers
- Run existing test suite to confirm backward compatibility

Implementation Notes

- Enhanced AWK parser with format-specific marker detection
- Added helper functions for marker length calculation and format validation
- Updated all documentation to reflect the new capabilities
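The helper functions mentioned above are not shown in this conversation. The sketch below shows one plausible shape for them, assuming a hypothetical awk function `end_marker_for` that maps a start marker to its matching end marker, with `length()` supplying the marker lengths a scanner needs to advance its position.

```shell
# Hypothetical sketch: report the end marker and both marker lengths
# for a given start marker.
marker_info() {
  awk -v m="$1" '
    # Map a start marker to its end marker; "" means the marker is unknown.
    function end_marker_for(s) {
      if (s == "CMT:")      return "ENDCMT"
      if (s == "<cmt>")     return "</cmt>"
      if (s == "<comment>") return "</comment>"
      return ""
    }
    BEGIN {
      e = end_marker_for(m)
      printf "%s %d %d\n", e, length(m), length(e)
    }'
}

marker_info "<cmt>"   # prints: </cmt> 5 6
```

Keeping the start-to-end mapping in one place means a mismatched pair, such as `<cmt>` closed by `ENDCMT`, can be detected by comparing the end marker found against the one expected.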

Lyken17 and others added 2 commits April 10, 2026 02:28
Extends comment parsing to support three formats:
- Classic: CMT:/ENDCMT (existing)
- Short tag: <cmt></cmt> (new)
- Long tag: <comment></comment> (new)

All formats support inline and multi-line usage and can be mixed within the same file. Updated documentation and error messages to be format-agnostic.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 79714b73d5

if (closest_marker == "") {
    return ""
} else {
    return closest_marker ":" closest_pos

P1: Return marker metadata in an unambiguous format

find_comment_markers concatenates the marker and position as marker:pos, but callers parse it with split(..., ":"); this breaks for the legacy CMT: marker because the split result becomes "CMT", "", "<pos>". In scan_cmt_blocks, that leaves found_marker mismatched and marker_pos empty (0), so pos never advances and the validator loops indefinitely on inputs containing classic CMT: ... ENDCMT comments (reproducible with timeout 5s ./scripts/validate-refine-plan-io.sh --input <file>). This is a regression that hangs existing refine-plan workflows.
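The ambiguity can be reproduced in isolation. This minimal sketch is not taken from the validator script, and the function name `split_demo` is hypothetical. It splits a `marker:pos` pair on ":" when the marker is the classic `CMT:` and the position is 1.

```shell
# Hypothetical repro of the delimiter collision described in the review.
split_demo() {
  awk 'BEGIN {
    # "CMT:" joined to position 1 with ":" gives "CMT::1".
    n = split("CMT::1", parts, ":")
    # Three fields come back: the marker loses its colon, the second
    # field is empty, and the position lands in the third field.
    printf "%d [%s] [%s] [%s]\n", n, parts[1], parts[2], parts[3]
  }'
}

split_demo   # prints: 3 [CMT] [] [1]
```

A caller that reads `parts[1]` as the marker and `parts[2]` as the position therefore sees "CMT" and an empty string, which awk treats as 0 in arithmetic.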

Author

@codex fix

Author

@codex fix

…T: marker

The find_comment_markers awk function used ":" as the delimiter to return
"marker:position" pairs. Since the CMT: marker itself contains a colon,
split("CMT::1", parts, ":") would parse incorrectly, producing
found_marker="CMT" instead of "CMT:" and an empty marker_pos. This caused
pos to never advance, resulting in an infinite loop that hung the CI for
nearly 5 hours.

Switch the internal delimiter from ":" to "|" which does not appear in any
comment marker string.
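A minimal sketch of the fixed convention follows, assuming the pair is returned as `marker|pos`. The function name `parse_pair` is hypothetical and this is not the validator's code. Because "|" appears in none of the three start or end markers, the split always yields exactly two fields.

```shell
# Hypothetical sketch: parse a "marker|pos" pair into its two parts.
parse_pair() {
  awk -v pair="$1" 'BEGIN {
    # A single-character separator is taken literally by split().
    split(pair, parts, "|")
    # parts[2] + 0 forces a numeric position.
    printf "%s %d\n", parts[1], parts[2] + 0
  }'
}

parse_pair "CMT:|1"     # prints: CMT: 1
parse_pair "<cmt>|14"   # prints: <cmt> 14
```

With the marker recovered intact and the position non-zero, the scanning position can advance past each match.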

The PR renamed CMT-specific terminology to generic "comment" terminology
in both refine-plan.md and validate-refine-plan-io.sh, but the test
assertions in test-refine-plan.sh still referenced the old text. Update
all 9 affected assertions to match the current documentation and error
messages.

SihaoLiu merged commit 3bef3ef into PolyArch:dev on Apr 13, 2026
4 checks passed