* Add ByteBPEProcessor
This type of processor applies byte-level BPE encoding. The processor aims for
compatibility with RoBERTa/GPT-2 BPE vocabs.
Fixes #19.
* Apply improvements
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* Fix return type of encode
* Fix doc of encode_as_pieces
* Pass pieces by pair to find_best_pair
* Use range-based for loop
* Add reference to hash_combine docs
* Validate that merges consist of two items
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
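For readers unfamiliar with byte-level BPE, here is a minimal Python sketch of the merge loop that the commit titles hint at. It is not the code from this PR. The names `find_best_pair` and `encode_as_pieces` are borrowed from the commit messages, but their signatures here are assumptions, and the byte-to-character mapping (`chr(b)`) is a simplification rather than the GPT-2/RoBERTa byte table. The sketch also merges one pair occurrence per step, which is simpler than merging every occurrence at once.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]


def find_best_pair(pieces: List[str], ranks: Dict[Pair, int]) -> Optional[int]:
    """Return the index of the adjacent pair with the lowest merge rank, or None."""
    best_idx: Optional[int] = None
    best_rank: Optional[int] = None
    for i in range(len(pieces) - 1):
        rank = ranks.get((pieces[i], pieces[i + 1]))
        if rank is not None and (best_rank is None or rank < best_rank):
            best_idx, best_rank = i, rank
    return best_idx


def encode_as_pieces(token: bytes, merges: List[Tuple[str, ...]]) -> List[str]:
    """Split a token into byte-level pieces and apply merges in rank order."""
    # Validate that every merge consists of exactly two items.
    for merge in merges:
        if len(merge) != 2:
            raise ValueError(f"merge must have two items, got {merge!r}")
    # Earlier merges have a lower rank and are applied first.
    ranks: Dict[Pair, int] = {(m[0], m[1]): i for i, m in enumerate(merges)}
    # Assumption: map each byte to one character with chr(); real GPT-2
    # vocabularies use a different byte-to-unicode table.
    pieces = [chr(b) for b in token]
    while True:
        idx = find_best_pair(pieces, ranks)
        if idx is None:
            break
        # Replace the two adjacent pieces with their concatenation.
        pieces[idx:idx + 2] = [pieces[idx] + pieces[idx + 1]]
    return pieces
```

For example, with the hypothetical merges `[("l", "l"), ("h", "e"), ("he", "ll")]`, `encode_as_pieces(b"hello", merges)` yields `["hell", "o"]`.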
This is definitely lower priority, but it would be nice to have support for BPE decoding, so that we could support RoBERTa.