Add Tokenizer base class. #7757

Merged: 13 commits, Jun 15, 2023

pforderique (Contributor) commented:

Add the Tokenizer base class from which BytePairEncoding and other future tokenizers will inherit.
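
For readers unfamiliar with the pattern, here is a minimal sketch of what a tokenizer base class with an inheriting subclass can look like. It is not the code merged in this PR: the method names `tokenize`, `detokenize`, and `call`, the plain string-array types, and the `WhitespaceTokenizer` subclass are all assumptions made for illustration.

```typescript
// Hypothetical sketch only: not the tfjs-layers implementation from this PR.
// All names and signatures below are assumptions for illustration.
abstract class Tokenizer {
  /** Converts a batch of strings into sequences of tokens. */
  abstract tokenize(inputs: string[]): string[][];

  /** Converts sequences of tokens back into strings. */
  abstract detokenize(inputs: string[][]): string[];

  /** Shared entry point that every subclass inherits unchanged. */
  call(inputs: string[]): string[][] {
    return this.tokenize(inputs);
  }
}

// Toy subclass standing in for a real tokenizer such as a byte-pair tokenizer.
class WhitespaceTokenizer extends Tokenizer {
  tokenize(inputs: string[]): string[][] {
    // Split on runs of whitespace and drop empty fragments.
    return inputs.map(s => s.split(/\s+/).filter(t => t.length > 0));
  }

  detokenize(inputs: string[][]): string[] {
    // Join tokens with single spaces.
    return inputs.map(tokens => tokens.join(' '));
  }
}

const tokenizer = new WhitespaceTokenizer();
const tokens = tokenizer.call(['the quick brown fox']);
const roundTrip = tokenizer.detokenize(tokens);
```

The base class fixes the interface and the shared `call` path, so a subclass only has to supply the two conversion methods.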

@Linchenn (Collaborator) left a comment:

LGTM!

tfjs-layers/src/layers/nlp/tokenizers_test.ts (resolved)

@mattsoulanille (Member) left a comment:

This looks good! I have a few minor changes, and some longer comments on some things I've discovered about BytePairTokenizer.

tfjs-layers/src/layers/nlp/tokenizers.ts (outdated, resolved)
tfjs-layers/src/layers/nlp/tokenizers.ts (resolved)
tfjs-layers/src/layers/nlp/tokenizers.ts (resolved)
tfjs-layers/src/layers/nlp/tokenizers.ts (outdated, resolved)
tfjs-layers/src/layers/nlp/tokenizers.ts (outdated, resolved)
tfjs-layers/src/layers/nlp/tokenizers_test.ts (outdated, resolved)

@mattsoulanille (Member) left a comment:

LGTM with a couple of nits.

tfjs-layers/src/layers/nlp/tokenizers.ts (outdated, resolved)
tfjs-layers/src/layers/nlp/tokenizers.ts (resolved)
tfjs-layers/src/layers/nlp/tokenizers_test.ts (outdated, resolved)
pforderique and others added 2 commits June 15, 2023 10:48
Co-authored-by: Matthew Soulanille <matthew@soulanille.net>
@mattsoulanille merged commit 6b94f63 into tensorflow:master on Jun 15, 2023
2 checks passed

3 participants