v0.1.3 - fix: RTF comment decoding and non-ASCII comment sanitization

@ssweber ssweber released this 07 Apr 14:56
· 6 commits to main since this release
465213d

Summary

  • fix(decode): parse RTF comments structurally instead of regex-based stripping, handling nested groups, font/color
    tables, and style toggles correctly
  • fix(decode): handle RTF \uN Unicode escape sequences (N is a signed 16-bit value) and validate ansicpg1252 before decoding
  • fix(csv): preserve quoted string literals in AF token positional args, e.g. copy("d",TXT1) no longer loses its
    quotes during parse
  • feat(encode): auto-sanitize non-ASCII characters (curly quotes, em-dashes, etc.) in rung comments before RTF
    conversion via NFKD normalization
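
As an illustration of the \uN fix above, here is a minimal sketch of signed 16-bit decoding. It is not this project's decoder: the name decode_u_escapes and the regex are hypothetical, and it assumes the default \uc1 setting (exactly one fallback character after each escape). Negative values are mapped back to unsigned UTF-16 code units by adding 65536, and surrogate pairs are then recombined.

```python
import re

# \uN, an optional space delimiter, then one optional fallback character
# (either a \'hh hex escape or a single plain character).
# Assumption: the writer used the default \uc1 (one fallback per escape).
_U_ESCAPE = re.compile(r"\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|[^\\{}])?")


def decode_u_escapes(fragment: str) -> str:
    """Replace RTF \\uN escapes in `fragment` with the characters they encode."""

    def to_code_unit(match: re.Match) -> str:
        n = int(match.group(1))
        if n < 0:
            # RTF stores N as a signed 16-bit integer, so values above
            # 32767 appear as negative numbers.
            n += 0x10000
        return chr(n)

    text = _U_ESCAPE.sub(to_code_unit, fragment)
    # Characters outside the BMP arrive as two escapes (a surrogate pair).
    # A UTF-16 round trip joins the two halves into one code point.
    raw = text.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "replace")
```

Control words such as \ul and \uc1 are left alone because the pattern requires digits directly after \u. A structural parser would also track \ucN and group scope, which this sketch does not.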

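The encode-side sanitization can be sketched as follows. This is an assumption-laden illustration, not the shipped code: sanitize_comment and the mapping table are hypothetical. Note that NFKD on its own does not decompose curly quotes or dashes, so this sketch maps them explicitly first and uses NFKD only for characters that do have a decomposition (accented letters, the ellipsis).

```python
import unicodedata

# Hypothetical mapping for punctuation that has no NFKD decomposition.
_PUNCTUATION = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
})


def sanitize_comment(text: str) -> str:
    """Return an ASCII-only approximation of `text`."""
    text = text.translate(_PUNCTUATION)
    # NFKD splits accented letters into base letter + combining mark and
    # expands compatibility characters such as U+2026 into "...".
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop whatever is still non-ASCII (combining marks, unmapped symbols).
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Anything without a mapping or decomposition is dropped silently here; a real implementation might prefer to warn or substitute a placeholder.
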
Test plan

  • make test — all golden fixtures pass
  • make lint — ruff + ty clean
  • RTF oracle tests validate decode against striprtf reference implementation