fix: "\%" early stop by wjiayis · Pull Request #160 · PaperDebugger/paperdebugger

wjiayis · 2026-04-19T17:07:38Z

Previously:

Every % was treated as the start of a comment to strip, which broke on a literal \% in LaTeX documents.

Now:

Handle % properly. Tested for %, \% and \\%.

4ndrelim

Left some comments and suggestions for your consideration, but overall lgtm.

4ndrelim · 2026-04-22T19:34:29Z

+// non-backslash character so that \% (an escaped percent) is preserved. Pairs
+// of backslashes (\\) before % are treated as a line-break followed by a real
+// comment, matching LaTeX semantics.
+var commentRegex = regexp.MustCompile(`(^|[^\\])((?:\\\\)*)%.*$`)


This should cover most cases. I think it's fair not to expect long chains of \, since they hardly serve a practical purpose beyond 2 consecutive backslashes, which represent a line break. 3 consecutive backslashes followed by a % would denote a line break followed by a literal %, which is hardly the intent.

But a more general solution that guards against malicious input is to count the number of backslashes preceding % and only treat the remainder as a comment if the count is even. This logic is actually implemented in the xtramcp backend in LatexParser, since it is not uncommon to encounter cases such as \\\{. https://github.com/PaperDebugger/xtramcp/blob/main/common/latex_parser.py#L780

That said, I personally feel it is reasonable enough and am also ok with this solution.
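For reference, the backslash-counting alternative mentioned above could look roughly like the sketch below. It is only an illustration: `stripComment` is a hypothetical name, the function is neither the code in this PR nor the xtramcp LatexParser, and it assumes the input is a single line.

```go
package main

import "fmt"

// stripComment is a hypothetical helper, not the PR's code or the xtramcp
// parser. It cuts a single line at the first % that is preceded by an even
// number of consecutive backslashes (zero included).
func stripComment(line string) string {
	backslashes := 0 // length of the backslash run ending just before index i
	for i := 0; i < len(line); i++ {
		switch line[i] {
		case '\\':
			backslashes++
		case '%':
			if backslashes%2 == 0 {
				return line[:i] // unescaped %: the comment starts here
			}
			backslashes = 0 // odd run: this % is the escaped literal \%
		default:
			backslashes = 0
		}
	}
	return line
}

func main() {
	for _, s := range []string{`a % b`, `50\% done`, `x \\% y`, `x \\\% y`} {
		fmt.Printf("%q -> %q\n", s, stripComment(s))
	}
}
```

Iterating over bytes is safe here because `\` and `%` are ASCII and never occur inside a multi-byte UTF-8 sequence.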

Actually, on a separate note, do you know why we are removing comments before passing the text to the LLM? I don't know the specifics, but I assume this is the case. In general, even if a user comment is not meant to be displayed, it might serve as a good contextual cue or auxiliary information. The output could be informed by the user's thoughts expressed as comments.

Might it be because some comments are too long or redundant, so this is an attempt to save tokens?

@4ndrelim Sure, I can generalise this solution to count odd and even backslashes.

@Junyi-99 Was there more context as to why comments were removed previously?

@4ndrelim Wait actually this already handles 3 or more backslashes before the % as well. I've improved the test cases to cover 3 or more backslashes regardless.

Oh, I did not realise it already covers that. Nice.
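A quick way to check the point above is to run the pattern from the diff on lines with 0 to 4 backslashes before the %. The sketch below makes two assumptions that the thread does not show: the replacement string `${1}${2}` (keep the preceding character and any backslash pairs, drop the rest) and the hypothetical wrapper name `stripLine`.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in the diff above.
var commentRegex = regexp.MustCompile(`(^|[^\\])((?:\\\\)*)%.*$`)

// stripLine is a hypothetical wrapper. It keeps the two captured groups (the
// preceding character and any backslash pairs) and drops the % and everything
// after it. The input is assumed to be one line without a trailing newline.
func stripLine(line string) string {
	return commentRegex.ReplaceAllString(line, "${1}${2}")
}

func main() {
	inputs := []string{
		`text % c`,   // plain comment
		`50\% off`,   // escaped percent, no comment
		`a\\% c`,     // line break, then a comment
		`a\\\% c`,    // line break, then an escaped percent
		`a\\\\% c`,   // two backslash pairs, then a comment
		`% whole line`,
	}
	for _, s := range inputs {
		fmt.Printf("%q -> %q\n", s, stripLine(s))
	}
}
```

An odd run of backslashes never matches because the character before the pairs must not itself be a backslash. Note that in Go `$` only matches at the end of the whole text unless the `(?m)` flag is set, so this sketch applies the pattern to one line at a time.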

fix: "/%" early stop

5d458c8

wjiayis changed the title ~~fix: "/%" early stop~~ fix: "\%" early stop Apr 19, 2026

wjiayis requested review from 4ndrelim and Junyi-99 April 19, 2026 17:12

4ndrelim previously approved these changes Apr 22, 2026

4ndrelim mentioned this pull request Apr 22, 2026

[BUG] LaTex format identification #52

Open

wjiayis added 2 commits April 23, 2026 23:39

Merge remote-tracking branch 'origin' into fix/gpt-early-stop

b89b0f2

feat: improve test cases

918316e

wjiayis dismissed 4ndrelim’s stale review via 918316e April 23, 2026 15:44

4ndrelim merged commit 9ecf372 into main Apr 24, 2026
1 check passed

4ndrelim deleted the fix/gpt-early-stop branch April 24, 2026 20:39

4ndrelim merged 3 commits into main from fix/gpt-early-stop
