[closure segmenter] Add incremental glyph grouping to the closure segmenter merging loop. #124
This refactors the closure segmenter implementation to allow incremental updates to glyph groupings after a segment merge is applied. Prior to this change, all glyph groupings were fully recalculated from scratch after each merge, which results in O(N^2)-like runtime.
With incremental updates, only the things affected by a merge are recomputed on each iteration. This significantly reduces the amount of work performed and results in O(N)-like runtime instead.
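To illustrate the difference, here is a minimal Python sketch of the idea, not the code in this PR. It assumes a toy model in which each glyph has a set of segment ids that require it, and glyphs are grouped by that set. The names `full_regroup`, `IncrementalGrouping`, and `merge` are hypothetical, and the real segmenter derives its conditions from glyph closure, which this sketch skips.

```python
from collections import defaultdict


def full_regroup(glyph_conditions):
    """Rebuild every group from scratch; visits all glyphs on every call."""
    groups = defaultdict(set)
    for glyph, segments in glyph_conditions.items():
        groups[frozenset(segments)].add(glyph)
    return dict(groups)


class IncrementalGrouping:
    """Toy model (hypothetical): keeps groups up to date across merges."""

    def __init__(self, glyph_conditions):
        self.conditions = {g: set(s) for g, s in glyph_conditions.items()}
        self.groups = defaultdict(set)      # condition -> glyphs
        self.by_segment = defaultdict(set)  # segment -> glyphs that mention it
        for glyph, segments in self.conditions.items():
            self.groups[frozenset(segments)].add(glyph)
            for seg in segments:
                self.by_segment[seg].add(glyph)

    def merge(self, keep, drop):
        """Merge segment `drop` into `keep`; returns how many glyphs were touched."""
        # Only glyphs whose condition mentions `drop` can change groups.
        affected = self.by_segment.pop(drop, set())
        for glyph in affected:
            old_key = frozenset(self.conditions[glyph])
            self.groups[old_key].discard(glyph)
            if not self.groups[old_key]:
                del self.groups[old_key]
            self.conditions[glyph].discard(drop)
            self.conditions[glyph].add(keep)
            self.groups[frozenset(self.conditions[glyph])].add(glyph)
            self.by_segment[keep].add(glyph)
        return len(affected)


# Usage: four glyphs, three segments; merge segment 2 into segment 1.
conditions = {"g1": {1}, "g2": {2}, "g3": {1, 2}, "g4": {3}}
grouping = IncrementalGrouping(conditions)
touched = grouping.merge(keep=1, drop=2)
# The incremental result matches a full rebuild, but "g1" and "g4" were never visited.
assert dict(grouping.groups) == full_regroup(grouping.conditions)
```

In this toy model the cost of a merge scales with the number of affected glyphs rather than with the total glyph count, which is the property the refactor relies on.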
In an example test, the total time for computing the Noto Serif SC segmentation in the IFT demo (which starts with segments of one codepoint each) went from 2m15s to 45s. Profiling this test case shows that after this change, computation is primarily bottlenecked on glyph closure and brotli compression. Further follow-on work is planned which should significantly reduce the number of closure and brotli operations needed.
Additionally, to help better track the flow of information through the various parts of SegmentationContext, a number of classes have been introduced that encapsulate specific groups of information. The following high-level information is stored in the context:
Information flows through these items:
These pieces all support incremental updates. For example, if 1. is updated, we can incrementally update the downstream items 3. and 4., recomputing only the parts that change as a result of the changes in 1.