feat: add Escape Smart Characters operation #2274
Draft
min23asdw wants to merge 2 commits into gchq:master
Converts smart characters (curly quotes, em/en dashes, arrows, copyright signs, ellipses, etc.) to plain ASCII equivalents. For characters with no obvious ASCII mapping, the user can choose to Include (keep as-is), Remove, or Replace with '.'.
Closes gchq#419
Ref: gchq#1291 (closed due to inactivity)
Summary
Adds a new Escape Smart Characters operation that converts typographic/smart characters to their plain ASCII equivalents.
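- Curly quotes (“ ” ‘ ’) → straight quotes (" ')
- Em/en dashes (— –) → ASCII dashes (-- -)
- Arrows (→ ← ↔ ⇒) → ASCII arrows (--> <-- <-> ==>)
- Copyright and trademark signs (© ® ™) → ASCII ((C) (R) (TM))
- Ellipsis (…) → ...

For characters with no obvious ASCII equivalent (e.g. ☣), the user can choose:
- Include (keep as-is)
- Remove
- Replace with '.'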
This follows the spec from the original issue discussion.
Closes #419
Ref: #1291 (closed due to repository inactivity, not code issues)
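
For reviewers, the mapping described in this summary can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the names `ASCII_MAP` and `escapeSmartCharacters`, the exact option strings, and the em dash → `--` / en dash → `-` split are assumptions made for the example.

```javascript
// Hypothetical sketch of the conversion; names and option strings are assumed.
const ASCII_MAP = new Map([
    ["\u201C", "\""],  // left double quotation mark
    ["\u201D", "\""],  // right double quotation mark
    ["\u2018", "'"],   // left single quotation mark
    ["\u2019", "'"],   // right single quotation mark
    ["\u2014", "--"],  // em dash (assumed mapping)
    ["\u2013", "-"],   // en dash (assumed mapping)
    ["\u2192", "-->"], // rightwards arrow
    ["\u2190", "<--"], // leftwards arrow
    ["\u2194", "<->"], // left right arrow
    ["\u21D2", "==>"], // rightwards double arrow
    ["\u00A9", "(C)"], // copyright sign
    ["\u00AE", "(R)"], // registered sign
    ["\u2122", "(TM)"], // trade mark sign
    ["\u2026", "..."], // horizontal ellipsis
]);

/**
 * Converts smart characters to ASCII.
 * `unmapped` decides what happens to non-ASCII characters without a mapping:
 * "Include" (keep as-is), "Remove", or "Replace with '.'".
 */
function escapeSmartCharacters(input, unmapped = "Include") {
    let output = "";
    // for...of walks the string by code point, so surrogate pairs stay intact.
    for (const ch of input) {
        if (ch.codePointAt(0) < 0x80) {
            output += ch; // already ASCII
        } else if (ASCII_MAP.has(ch)) {
            output += ASCII_MAP.get(ch);
        } else {
            switch (unmapped) {
                case "Remove":
                    break; // drop the character
                case "Replace with '.'":
                    output += ".";
                    break;
                default:
                    output += ch; // "Include": keep as-is
            }
        }
    }
    return output;
}
```

With these assumptions, `escapeSmartCharacters("a☣b", "Remove")` drops the biohazard sign, while the default "Include" setting leaves it untouched. A lookup table keeps the mapping easy to review and extend.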
Changes
- src/core/operations/EscapeSmartCharacters.mjs
- tests/operations/tests/EscapeSmartCharacters.mjs
- src/core/config/Categories.json
- tests/operations/index.mjs
Test plan
- npx grunt lint: passes
- npm test: 1905/1905 passing (Node 18)