Adds start and end character positions to tag structure - available to tag transformers #151
This change adds tag character positions, relative to the original string, to the `Tag` structure. `TagTransformer` transformation functions can then use them.

It may not be immediately obvious why this is useful, but I have found it helpful for extracting content that isn't suitable for attributed string transformation from within content that is.
For example, if the HTML being transformed is mostly transformable content but contains an `iframe` tag or a Twitter `blockquote` somewhere within it, the positions of these tags (opening and/or closing) make it possible to split the string, extract those elements, and treat them accordingly.

I've used emojis with variations in the unit test, so that it includes grapheme clusters, to ensure the `String.Index` values handle these correctly (via UTF16).
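
To illustrate the splitting use case, here is a minimal, self-contained sketch. It does not use the library's API: `PositionedTag`, `firstElement(named:in:)`, and `split(_:around:)` are hypothetical names invented for this example, and the naive substring search only stands in for the positions the parser would report. The point is that `String.Index` ranges slice the original string safely even when multi-scalar emoji precede the tag.

```swift
import Foundation

// Hypothetical stand-in for a tag that carries its position in the source string.
// The property names on the real `Tag` type in this PR may differ.
struct PositionedTag {
    let name: String
    let range: Range<String.Index>  // from the opening "<" to just past the closing ">"
}

// Finds the first `<name ...>...</name>` element and returns its range.
// A naive search for illustration only; this is not an HTML parser.
func firstElement(named name: String, in html: String) -> PositionedTag? {
    guard let open = html.range(of: "<\(name)"),
          let close = html.range(of: "</\(name)>", range: open.upperBound..<html.endIndex)
    else { return nil }
    return PositionedTag(name: name, range: open.lowerBound..<close.upperBound)
}

// Splits the source into the text before the element, the element itself,
// and the text after it, using the tag's start and end positions.
func split(_ html: String, around tag: PositionedTag)
    -> (before: String, element: String, after: String) {
    return (before: String(html[..<tag.range.lowerBound]),
            element: String(html[tag.range]),
            after: String(html[tag.range.upperBound...]))
}

// A family emoji built from three scalars joined by ZWJ: one Character,
// but several UTF-16 code units.
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}"
let html = "Hello \(family) <b>world</b><iframe src=\"https://example.com\"></iframe> bye"

let tag = firstElement(named: "iframe", in: html)!
let parts = split(html, around: tag)

// Character offset and UTF-16 offset of the tag start differ because of the emoji.
let characterOffset = html.distance(from: html.startIndex, to: tag.range.lowerBound)
let utf16Offset = tag.range.lowerBound.utf16Offset(in: html)

// NSRange (as used by NSAttributedString) is expressed in UTF-16 units.
let nsRange = NSRange(tag.range, in: html)

print(parts.element)
print(characterOffset, utf16Offset, nsRange.location)
```

Working with `String.Index` ranges rather than integer offsets avoids cutting a grapheme cluster in half; the conversion to `NSRange` is only needed at the point where the result meets attributed string APIs.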