feat(web): add method that evaluates precomputed tokenizations 🚂 #14874
Merged: jahorton merged 4 commits into epic/autocorrect from feat/web/evaluate-precomputed-tokenization on Oct 16, 2025
User Test Results
Test specification and instructions: User tests are not required
8d9bbdb to fad631a
90b7607 to b7a1f8d
fad631a to 68c8c0c
Relates-to: #14679
Build-bot: skip build:web
Test-bot: skip
68c8c0c to 7ca6f27
jahorton added commits that referenced this pull request on Oct 2, Oct 3, and Oct 7, 2025, each with the same message:
While handy, this method did not adequately account for 'split' / 'merge' edit cases and also required special handling to avoid certain degenerate edit-path cases. The newer method put in place (see #14874) handles such edits and avoids the degeneracy problem that resulted when relying on edit paths for the edited portion of context.
Build-bot: skip build:web
Test-bot: skip
…luate-precomputed-tokenization
ermshiperete approved these changes on Oct 9, 2025
Base automatically changed from feat/web/compute-removed-token-count to epic/autocorrect on October 16, 2025 14:09
Where #14790 introduced the new method for determining how input will transform context-tokenizations when applied, this method actually performs the transformation to produce the new (batched) tokenization. It operates on subsets of inputs that all apply with the same properties, which promotes batching and efficiency in our correction-search goals.

Per #14876, note that this method drops any previously-existing tokens that were erased by incoming input, replacing them if the input transforms have an `insert` string that results in a replacement.

Relates-to: #14679
Build-bot: skip build:web
Test-bot: skip
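
To illustrate the drop-and-replace behavior described above, here is a minimal sketch. It is not the code from this PR: `SimpleTransform`, `tokenize`, and `applyToTokens` are hypothetical names, tokens are modeled as plain strings (alternating word and whitespace runs), `deleteLeft` is counted in UTF-16 code units, and the batching of inputs with matching properties is not modeled at all.

```typescript
// Hypothetical, simplified stand-in for an input transform (assumption,
// not the type used in the PR).
interface SimpleTransform {
  insert: string;     // text added at the end of the context
  deleteLeft: number; // UTF-16 code units erased from the end of the context
}

// Splits text into alternating whitespace and non-whitespace runs.
function tokenize(text: string): string[] {
  return text.match(/\s+|\S+/g) ?? [];
}

// Applies one transform to an existing tokenization.
// Tokens fully erased by deleteLeft are dropped; the inserted text then
// either extends the last surviving token or produces replacement tokens.
function applyToTokens(
  tokens: readonly string[],
  t: SimpleTransform
): string[] {
  const result = [...tokens];
  let remaining = t.deleteLeft;

  while (remaining > 0 && result.length > 0) {
    const last = result[result.length - 1];
    if (last.length <= remaining) {
      // The whole token was erased by the input: drop it.
      remaining -= last.length;
      result.pop();
    } else {
      // Only part of the token was erased: trim it.
      result[result.length - 1] = last.slice(0, last.length - remaining);
      remaining = 0;
    }
  }

  // Re-tokenize the last surviving token together with the inserted text,
  // so an insert can replace erased tokens or start new ones.
  const tail = (result.pop() ?? "") + t.insert;
  return [...result, ...tokenize(tail)];
}
```

For example, under these assumptions, applying `{ insert: "x", deleteLeft: 3 }` to `["hello", " ", "wor"]` drops the erased `"wor"` token and puts `"x"` in its place, while `{ insert: "", deleteLeft: 4 }` drops both `"wor"` and the preceding space with no replacement.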