Tokenizers Support #6272
Codecov Report
@@            Coverage Diff             @@
##             main    #6272      +/-   ##
==========================================
+ Coverage   68.42%   68.55%   +0.13%
==========================================
  Files        1144     1170      +26
  Lines      244991   246881    +1890
  Branches    25411    25666     +255
==========================================
+ Hits       167627   169250    +1623
- Misses      70697    70902     +205
- Partials     6667     6729      +62
Couple of minor questions but otherwise it looks good to me.
We are also going to release the tokenizers as their own NuGet package, right?
Yes, I have included the following item in the csproj: <Import Project="$(RepoRoot)eng/pkg/Pack.props" />. Isn't that enough to produce its own package, or do I need to do anything more?
@michaelgsharp this failure is unrelated, I am wondering if it is a known issue we need to fix.
If I remember correctly, yes, this is one of the ones that is flaky every once in a while. I'll requeue, but we should be good to merge.
This PR introduces the first version of Tokenizers support. This version will include the following:
Features to add later to the tokenizers: