Adds start and end character positions to tag structure - available to tag transformers #151

SoftwareEngineerChris · 2022-01-01T22:55:14Z

This change introduces tag character positions relative to the original string as part of the Tag structure. These can then be used by TagTransformer transformation functions.

The benefit may not be immediately obvious, but I have found the positions quite useful for extracting content that isn't suitable for attributed string transformation from within content that is.

For example, if the HTML being transformed is mostly transformable content but contains an iframe tag or a Twitter blockquote somewhere within it, the positions of these tags (opening and/or closing) have been useful for splitting the string, extracting those elements, and treating them accordingly.
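A rough sketch of that kind of split, assuming each tag of interest exposes a start and end `String.Index` into the original string. `PositionedTag`, `Segment`, and `split(_:around:)` are hypothetical names made up for illustration, not the types or fields added by this PR, and the positions are located with `range(of:)` here only to keep the example self-contained.

```swift
import Foundation

// Hypothetical stand-in for a tag that carries positions in the original string.
// These names are illustrative only; they are not the PR's actual fields.
struct PositionedTag {
    let name: String
    let start: String.Index  // index of the "<" that opens the element
    let end: String.Index    // index just past the ">" that closes the element
}

enum Segment: Equatable {
    case transformable(String)                  // hand this to the attributed string pipeline
    case extracted(name: String, html: String)  // handle this separately (e.g. a web view)
}

// Splits `html` into alternating transformable and extracted segments.
func split(_ html: String, around tags: [PositionedTag]) -> [Segment] {
    var segments: [Segment] = []
    var cursor = html.startIndex
    for tag in tags.sorted(by: { $0.start < $1.start }) {
        // Skip empty, out-of-bounds, or overlapping ranges.
        guard tag.start >= cursor, tag.start < tag.end, tag.end <= html.endIndex else { continue }
        if cursor < tag.start {
            segments.append(.transformable(String(html[cursor..<tag.start])))
        }
        segments.append(.extracted(name: tag.name, html: String(html[tag.start..<tag.end])))
        cursor = tag.end
    }
    if cursor < html.endIndex {
        segments.append(.transformable(String(html[cursor...])))
    }
    return segments
}

// Usage: pull an iframe out of otherwise transformable markup.
let html = "<p>Hello</p><iframe src=\"x\"></iframe><p>Bye</p>"
let opening = html.range(of: "<iframe")!
let closing = html.range(of: "</iframe>")!
let iframe = PositionedTag(name: "iframe", start: opening.lowerBound, end: closing.upperBound)
let parts = split(html, around: [iframe])
```

With positions available inside a transformer, the `range(of:)` lookups would presumably be unnecessary, since the tag itself would report where it sits in the source string.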

I've used emojis with variations in the unit test so that it includes grapheme clusters and ensures the String.Index values handle these correctly (via UTF-16).
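To illustrate why grapheme clusters matter here, the following sketch compares character offsets with UTF-16 offsets for a string containing a skin-tone emoji. It uses only standard library `String.Index` APIs and is not taken from the PR's test. The sample string is an assumption chosen for the example.

```swift
// "a", waving hand + medium skin tone modifier (one grapheme cluster), "b", then a tag.
let text = "a\u{1F44B}\u{1F3FD}b<br>"

let tagStart = text.firstIndex(of: "<")!

// Offset counted in UTF-16 code units: 1 (a) + 2 + 2 (emoji and modifier) + 1 (b).
let utf16Offset = tagStart.utf16Offset(in: text)

// Offset counted in Characters (grapheme clusters): a, emoji, b.
let characterOffset = text.distance(from: text.startIndex, to: tagStart)

// A UTF-16 offset converts back to a String.Index that points at the same place.
let restored = String.Index(utf16Offset: utf16Offset, in: text)
let tail = String(text[restored...])
```

Here `utf16Offset` is 6 while `characterOffset` is 3, so a position recorded in one unit cannot be used directly as the other. Converting through `String.Index` keeps slicing correct in either case.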


psharanda · 2023-06-04T10:22:09Z

@SoftwareEngineerChris FYI, V5 was introduced recently and includes a new TagTuning API.

SoftwareEngineerChris added 2 commits August 15, 2020 07:18

Merge pull request #1 from psharanda/master

598093c

Pulls changes from upstream

Adds start and end character positions to tag structure - available to tag transformers

93d401c

psharanda force-pushed the master branch from 63db837 to 34cd463 Compare June 4, 2023 09:34

psharanda closed this Jan 14, 2024
