Fix Telegram MarkdownV2 parsing errors with custom pulldown-cmark renderer #71
Merged
Replace manual markdown parsing with AST-based event-driven rendering using pulldown-cmark for robust Telegram MarkdownV2 formatting.

Core implementation:
- TelegramRenderer with state machine for context-aware escaping
- 19 special chars escaped in text, only \ and ` in code blocks
- Bold/italic/strikethrough/header/link/list/blockquote formatting
- UTF-8 safe chunking with char boundary detection and newline preference
- Dedicated escape_url() for link URLs (only ) and \ are escaped, per spec)

Performance:
- Latency: 2.83µs for 500 chars, 25.73µs for 5KB (3,534x below 10ms target)
- Throughput: 121-970 MiB/s depending on content type
- Single allocation strategy, no excessive copying

Quality:
- 37 tests pass (30 markdown + 5 telegram + 2 memory profiling)
- 99.32% line coverage for markdown.rs
- 0 clippy warnings
- 0 vulnerabilities (cargo deny pass)
- No unsafe code

Dependencies:
- Added pulldown-cmark 0.13.0 (MIT license)
- Added criterion 0.8.0 for benchmarking

Files modified:
- crates/zeph-channels/src/markdown.rs - Complete rewrite with TelegramRenderer
- crates/zeph-channels/src/telegram.rs - UTF-8 safe chunking, removed has_unclosed_code_block()
- crates/zeph-channels/src/lib.rs - Made markdown module public
- crates/zeph-channels/benches/markdown_performance.rs - Comprehensive benchmark suite
- crates/zeph-channels/tests/memory_profile.rs - Memory allocation profiling

Fixes #69
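The UTF-8 safe chunking listed under the core implementation might look roughly like the sketch below. It is not the code from telegram.rs: the function name `chunk_message`, the byte-based limit, and the choice to break right after the last newline in the window are all assumptions made for illustration.

```rust
/// Hypothetical helper: split `text` into chunks of at most `max_bytes` bytes.
/// A chunk never ends inside a multi-byte UTF-8 character, and a break right
/// after a newline is preferred when one exists inside the window.
fn chunk_message(text: &str, max_bytes: usize) -> Vec<&str> {
    // A UTF-8 character is at most 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "limit must be able to hold any UTF-8 character");

    let mut chunks = Vec::new();
    let mut rest = text;

    while rest.len() > max_bytes {
        // Walk back from the limit to the nearest character boundary.
        let mut end = max_bytes;
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        // Newline preference: cut just after the last newline in the window.
        if let Some(pos) = rest[..end].rfind('\n') {
            end = pos + 1;
        }
        let (head, tail) = rest.split_at(end);
        chunks.push(head);
        rest = tail;
    }

    if !rest.is_empty() {
        chunks.push(rest);
    }
    chunks
}

fn main() {
    // The newline at byte 2 is chosen over a hard cut at byte 5.
    println!("{:?}", chunk_message("ab\ncdef", 5));
    // Byte 5 falls inside the third 'é', so the cut moves back to byte 4.
    println!("{:?}", chunk_message("ééé", 5));
}
```

This sketch only looks at bytes and newlines. It does not check whether a cut lands inside a MarkdownV2 entity or an escape sequence, which a real integration would also have to consider.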
Codecov Report
@@ Coverage Diff @@
## main #71 +/- ##
==========================================
+ Coverage 75.60% 77.23% +1.62%
==========================================
Files 20 20
Lines 3489 3654 +165
==========================================
+ Hits 2638 2822 +184
+ Misses 851 832 -19
Summary
Fixes Telegram API parsing errors (Bad Request: can't parse entities) by implementing a custom MarkdownV2 escape function using pulldown-cmark event-driven rendering.
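To make the context-aware escaping concrete, here is a minimal sketch of the three escaping rules this PR describes: 19 special characters in ordinary text, only `\` and `` ` `` in code, and only `)` and `\` in link URLs. Only the name `escape_url` appears in the PR; its signature here, along with `escape_text`, `escape_code`, and `escape_with`, is assumed for illustration, and the real TelegramRenderer may be structured differently.

```rust
/// The 18 MarkdownV2 reserved characters plus the backslash itself (19 total).
const TEXT_SPECIALS: &[char] = &[
    '_', '*', '[', ']', '(', ')', '~', '`', '>', '#', '+', '-', '=', '|', '{', '}', '.', '!',
    '\\',
];

/// Hypothetical shared helper: prefix every character found in `specials`
/// with a backslash and copy everything else through unchanged.
fn escape_with(input: &str, specials: &[char]) -> String {
    let mut out = String::with_capacity(input.len() * 2);
    for c in input.chars() {
        if specials.contains(&c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Ordinary text: all 19 characters are escaped (name is an assumption).
fn escape_text(input: &str) -> String {
    escape_with(input, TEXT_SPECIALS)
}

/// Inline code and code blocks: only backtick and backslash (name is an assumption).
fn escape_code(input: &str) -> String {
    escape_with(input, &['`', '\\'])
}

/// Link URLs: only ')' and backslash (signature is an assumption).
fn escape_url(input: &str) -> String {
    escape_with(input, &[')', '\\'])
}

fn main() {
    println!("{}", escape_text("Version 1.2 (beta)!"));
    println!("{}", escape_code("let s = `x`;"));
    println!("{}", escape_url("https://example.com/a_(b)"));
}
```

An unescaped `.`, `!`, or `(` in plain text is enough for Telegram to reject a message with `can't parse entities`, which is why the text context escapes far more than the code and URL contexts.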
Implementation
Core changes:
- TelegramRenderer with state machine for context-aware escaping (19 special chars in text, only \ and ` in code blocks)
Architecture:
- markdown.rs - Complete rewrite with TelegramRenderer
- telegram.rs - UTF-8 safe chunking integration, removed has_unclosed_code_block()
- markdown_performance.rs - Comprehensive criterion benchmarks
- memory_profile.rs - Allocation overhead profiling
Performance
Quality
Dependencies
Review Process
Phase 2 completed with full rust-lifecycle workflow:
See .local/handoff/ for detailed reports from each phase.
Fixes #69