Add support for backtick quotes and multi-quote strings with escaping (fixes #142) #168
Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: #142
This commit implements issue #142:
1. Backtick quotes: Added support for backtick (`) as a third quote type alongside double quotes (") and single quotes (')
2. Multi-quote strings: Support for 1-5+ quote sequences where:
   - N quotes open/close the string
   - 2*N quotes inside become N quotes in the output (escaping)
Examples:
- `hello` -> hello (backtick quoted)
- ``code with ` inside`` -> code with ` inside
- '''text with '' inside''' -> text with '' inside
- """json with " in it""" -> json with " in it
All four language implementations updated:
- JavaScript: Updated grammar.pegjs
- Rust: Updated parser.rs
- C#: Updated Parser.peg
- Python: Updated parser.py
Tests added for all implementations covering:
- All three quote types (", ', `)
- Quote counts from 1 to 5
- Escape sequences
- Complex scenarios (mixed quotes, code blocks, JSON, etc.)
Version bumped to 0.13.0
Closes #142
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
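The escaping rule in this commit message (N quotes open and close the string, 2*N quotes inside collapse to N) can be sketched in Python as below. This is a minimal illustration, not the repository's parser.py: the function name `parse_with_n_quotes`, its return shape, and the choice to check the escape sequence before the closing sequence are all assumptions made for the example.

```python
from typing import Optional, Tuple


def parse_with_n_quotes(text: str, quote_char: str, n: int) -> Optional[Tuple[str, int]]:
    """Hypothetical helper: parse a string opened and closed by n quote chars.

    Returns (value, end_index), or None if text does not start with a
    well-formed n-quote string.
    """
    opener = quote_char * n        # N quotes open and close the string
    escape = quote_char * (2 * n)  # 2*N quotes inside stand for N literal quotes
    if n < 1 or not text.startswith(opener):
        return None
    out = []
    i = n
    while i < len(text):
        if text.startswith(escape, i):
            # Assumption: the escape is tried before the closer.
            out.append(opener)
            i += 2 * n
        elif text.startswith(opener, i):
            return "".join(out), i + n
        else:
            out.append(text[i])
            i += 1
    return None  # ran out of input without a closing sequence
```

Applied to the examples from the commit message, ``parse_with_n_quotes("``code with ` inside``", "`", 2)`` yields the value ``code with ` inside``, and `parse_with_n_quotes('"say ""hi"""', '"', 1)` yields `say "hi"`. How a real implementation treats odd-length quote runs next to the closer is not specified here and may differ.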
This reverts commit b21928f.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost estimation:
The working session has now ended; feel free to review and add any feedback on the solution draft.
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T15:04:44.606Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
Preserve backtick quote support and multi-quote string handling while incorporating style changes from main branch. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Format tests/MultiQuoteParser.test.js with Prettier
- Format Python parser.py and test_multi_quote_parser.py with Black
- Format Rust parser.rs with rustfmt
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix Python import sorting (isort) in test_multi_quote_parser.py
- Fix Rust formatting in multi_quote_parser_tests.rs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
We should support any number of quotes in a series: not only 1-5, but any N. Is it possible to do this with PEG and the other parsers?
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T15:21:21.795Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
Previously, the parsers only supported 1-5 quote characters for quoted strings. This change extends support to any N quotes (6+) across all language implementations:
- JavaScript (PEG.js): Added highQuotedReference rule with procedural parsing
- Python: Changed static range(5,0,-1) to dynamic quote counting
- Rust: Added parse_dynamic_quote_string() function
- C#: Added ParseHighQuoteString() helper method in PEG grammar
The existing 1-5 quote rules are preserved for backwards compatibility and performance. The new unlimited quote support activates for 6+ quotes.
Added comprehensive tests for 6, 7, 8, and 10-quote strings in all parsers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
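The move from a static `range(5, 0, -1)` to dynamic quote counting can be pictured with the short Python sketch below. It is an illustration under assumptions, not the code in parser.py: the helper name `candidate_quote_counts` is hypothetical, and trying the largest count first is inferred from the descending static range rather than confirmed by the source.

```python
def candidate_quote_counts(text: str, quote_char: str, pos: int = 0) -> range:
    """Hypothetical helper: quote counts worth trying at text[pos].

    Instead of a fixed range(5, 0, -1), measure the run of quote characters
    actually present and try counts from that length down to 1.
    """
    run = 0
    while pos + run < len(text) and text[pos + run] == quote_char:
        run += 1
    return range(run, 0, -1)
```

With a fixed upper bound of 5, a string opened by six or more quotes could never be tried at its real nesting level; measuring the run removes that ceiling.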
🔄 Auto-restart 1/3
Detected uncommitted changes from the previous run. Starting a new session to review and commit them.
Uncommitted files:
Auto-restart will stop after changes are committed or after 2 more iterations. Please wait until the working session ends and give your feedback.
These scripts demonstrate and test the unlimited quote parsing capability for 6+ quote sequences in JavaScript, Python, and Rust parsers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Can we have simpler, universal logic for parsing any N quotes for each quote type?
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T15:50:45.353Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
Replace explicit quote rules (1-5 quotes) with a single universal
parseNQuoteString() function that handles any N quotes dynamically.
This change:
- Reduces quote-related grammar rules from ~90 lines to ~30 lines
- Uses a single procedural parser function for all quote types (", ', `)
- Maintains the same behavior: N opening quotes, N closing quotes,
2*N quotes inside escape to N quotes
The universal approach is simpler and more maintainable while
supporting unlimited quote levels.
Note: C# grammar kept with explicit rules due to Pegasus parser
framework limitations with greedy patterns.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The [^"] pattern in PEG.js doesn't match newlines by default. Update content patterns to explicitly include newline characters so multiline quoted strings work correctly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This reverts commit 185c1ef.
🔄 Auto-restart 1/3
Detected uncommitted changes from the previous run. Starting a new session to review and commit them.
Uncommitted files:
Auto-restart will stop after changes are committed or after 2 more iterations. Please wait until the working session ends and give your feedback.
✅ PR Ready for Final Review
I've reviewed the current state of PR #168 and can confirm:
Implementation Status
All requirements from issue #142 are fully implemented across all 4 languages:
Test Results
All tests passing locally:
CI Status
All CI checks passing ✅
Implementation Approaches
JavaScript (Peggy.js): Uses a simplified universal approach with:
Python, Rust, C#: Use a hybrid approach with:
The C# investigation (documented in PR comments) explored using the same universal approach as JavaScript, but Pegasus's constraints require the hybrid approach for proper disambiguation of multiple quoted strings.
Clean State
The PR is ready for final review and merge.
Please collect all data related to the impossibility of simplifying the C# PEG parser in this repository, and make sure we compile that data to Please try all solutions separately, and make sure that each one has a separate folder with a small example that focuses only on the quote parsing feature. We need to try every possible solution we can come up with and document all of them, along with every problem we encounter.
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T18:58:15.388Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
Document comprehensive investigation into whether C# Pegasus can use
universal parsing approach similar to JavaScript Peggy.js for N-quote
string parsing.
Case study includes:
- Timeline of events and experiments conducted
- Root cause analysis (4 identified causes)
- Four solution approaches with documentation:
1. #parse{} expression - Failed due to MSBuild incompatibility
2. Capture-then-validate - Partial success, disambiguation issues
3. Semantic predicates - Failed, no input access
4. Hybrid approach - Current working solution
Key findings:
- Pegasus #parse{} not supported with <PegGrammar> MSBuild tag
- Semantic predicates cannot access input string directly
- PEG greedy operators prevent proper disambiguation
- Hybrid approach (explicit rules 1-5, procedural 6+) is best solution
Related to issue #142 and PR #168.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
📚 Case Study Documentation Added
I've created a comprehensive case study documenting the investigation into simplifying the C# Pegasus PEG parser in
Contents
Main Documentation:
Solution Experiments:
Key Findings
4 Root Causes Identified:
Conclusion: C# Pegasus cannot use the exact same universal approach as JavaScript due to parser generator constraints. The hybrid approach (explicit PEG rules for levels 1-5, procedural parsing for 6+) is the best solution.
Files Added
Commit: 05bc8dc
How small can we make the explicit matching rules? Can we cover only 1 quote, or 2, or at most 3? That would mean less code in the rules and more use of the universal function, even with the hybrid approach in C#.
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T19:19:51.986Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
…r N>=3
Reduced explicit PEG rules from 5 quote levels to just 2:
- N=1 (explicit): Required for disambiguation of multiple strings on same line
- N=2 (explicit): Required for proper escape sequence handling
- N>=3 (procedural): Universal ParseMultiQuoteString() handles unlimited quotes
Grammar size reduction: 188 -> 155 lines (-17.5%)
Key findings documented in case study:
- N=1 explicit rules prevent PEG greedy capture issues with "a" "b"
- N=2 explicit rules handle escape sequences like """" in "" strings
- N>=3 can use procedural because content pattern '"'+ &[^"] works correctly
Added experiments/minimal-peg-rules/ with test harness proving the approach.
All tests pass: C# (180), JS (188), Python (176), Rust (39)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
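The greedy capture issue with "a" "b" named in this commit message can be illustrated by analogy with Python's `re` module. This is only an analogy and not Pegasus behaviour: regex engines backtrack while PEG operators do not, and both patterns below are hypothetical stand-ins rather than rules from Parser.peg. The point is that a "capture everything, then validate" pattern sees one long string where two short ones were intended.

```python
import re

LINE = '"a" "b"'

# Greedy capture: opening quote run, anything at all, then the same run again.
# The middle group stretches to the LAST closing quote on the line.
greedy = re.match(r'("+)(.*)\1', LINE)
greedy_value = greedy.group(2)

# Explicit N=1 style rule: the content may not contain the quote character,
# so each string ends at the first closing quote.
explicit_values = re.findall(r'"([^"]*)"', LINE)

print(repr(greedy_value))  # one merged capture spanning both strings
print(explicit_values)     # two separate strings
```

The second pattern mirrors why an explicit single-quote rule helps: excluding the quote character from the content forces the match to stop at the nearest closer.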
C# Grammar Simplification Complete
Based on the user's question about minimizing explicit PEG rules, I investigated and successfully reduced the C# grammar.
Key Findings
Results
All Tests Pass
The case study documentation has been updated with these findings at
🔄 Auto-restart 1/3
Detected uncommitted changes from the previous run. Starting a new session to review and commit them.
Uncommitted files:
Auto-restart will stop after changes are committed or after 2 more iterations. Please wait until the working session ends and give your feedback.
This commit adds runnable test projects for each solution approach
documented in the case study. Each project demonstrates the specific
behavior (success, partial success, or failure) of that approach:
- Solution 01: #parse{} expression - demonstrates PEG0011 build error
- Solution 02: capture-validate - shows disambiguation failure
- Solution 03: semantic predicates - shows input access limitation
- Solution 04: hybrid approach - full working solution
- Solution 05: minimized hybrid - optimized production implementation
Each project can be run independently with:
cd solutions/<name>/project && dotnet build && dotnet run
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Case Study Complete: C# PEG Parser Simplification
I've completed the comprehensive case study analysis as requested. Here's what was done:
Files Added
All solutions now have standalone test projects in
How to Run Test Projects
Each solution can be tested independently:
    cd docs/case-studies/csharp-peg-simplification/solutions/<name>/project
    dotnet build # Solution 01 will fail here (demonstrating the error)
    dotnet run # Solutions 02-05 will demonstrate their behavior
Key Findings
Root Causes Documented
The case study documents 4 fundamental limitations preventing universal parsing in C# Pegasus:
Solution 05 is Optimal
The minimized hybrid approach (current production implementation) achieves:
All CI checks pass ✅
Based on what we learned in C# and JavaScript, can we apply it to simplify python/links_notation/parser.py and also the Rust version of the parser? Or have we reached the maximum possible simplification?
🤖 AI Work Session Started
Starting automated work session at 2025-12-01T19:54:31.037Z
The PR has been converted to draft mode while work is in progress.
This comment marks the beginning of an AI work session. Please wait for the working session to finish, and provide your feedback.
Analysis: Python and Rust Parser Simplification
Based on the user's question: "Can we apply learnings from C# and JavaScript to simplify Python and Rust parsers?"
Answer: Maximum simplification already achieved! ✅
After analyzing all four parser implementations, here's the current state:
Implementation Comparison Table
Why Python & Rust Are Already Optimal
Rust (lines 196-235 in
    fn parse_dynamic_quote_string(input: &str, quote_char: char) -> IResult<&str, String> {
        // Count opening quotes dynamically
        let quote_count = input.chars().take_while(|&c| c == quote_char).count();
        if quote_count == 0 { return Err(...); }
        parse_multi_quote_string(input, quote_char, quote_count)
    }
Python (lines 405-467 in
    def _parse_multi_quote_string(self, text, quote_char, quote_count):
        # Universal parsing for any N quotes
        open_close = quote_char * quote_count
        escape_seq = quote_char * (quote_count * 2)
        # ... handles all cases with one function
Both implementations:
Why C# Cannot Match This
C# Pegasus has fundamental limitations (documented in
The hybrid approach with explicit N=1,2 rules is the best possible for Pegasus.
Summary
Conclusion: All four implementations have reached their maximum possible simplification given their respective parser generator constraints. The key difference is that Python and Rust are not PEG-based: they use hand-written procedural parsers with nom (Rust) or pure Python, giving them full flexibility for universal parsing.
Summary
This PR implements issue #142 by adding:
Backtick quotes: Added support for backtick (`) as a third quote type alongside double quotes (") and single quotes (')
Multi-quote strings: Support for any N quote sequences where:
Examples:
Implementation Approach
All implementations use a universal procedural parser that:
JavaScript uses a simplified approach with:
- input and offset() to peek at input directly
- parseQuotedStringAt() function for all quote types and counts
Non-PEG parsers (Python, Rust) use the procedural function directly.
Changes
- grammar.pegjs using universal N-quote parsing with global variables
- parser.rs with parse_dynamic_quote_string() function
- Parser.peg with ParseHighQuoteString() helper method
- parser.py with _parse_multi_quote_string() method
Tests
All comprehensive tests pass for:
Version
Bumped to 0.13.0 across all implementations.
Closes #142
🤖 Generated with Claude Code