Added support for electra tokenizer as a special case of BERT tokenizer #28

Merged 5 commits on Mar 24, 2024

Contributor

@KennethEnevoldsen KennethEnevoldsen commented Jan 8, 2024

Added support for the ELECTRA tokenizer. Support for the ELECTRA models themselves is waiting on the ELECTRA PR in curated transformers.
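As a rough illustration of what treating ELECTRA "as a special case of BERT tokenizer" can look like, here is a minimal, self-contained sketch. It is not the code from this PR: `WordPieceConfig`, `load_bert_like`, `LOADERS`, and `load_tokenizer_config` are hypothetical names invented for the example. The only assumptions taken from outside the thread are that Hugging Face exposes the tokenizer classes `BertTokenizerFast` and `ElectraTokenizerFast`, that both are WordPiece tokenizers, and that `do_lower_case` is one of their config keys.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WordPieceConfig:
    """Hypothetical stand-in for the settings a BERT-style tokenizer needs."""

    lowercase: bool
    unk_token: str = "[UNK]"


def load_bert_like(hf_config: dict) -> WordPieceConfig:
    # BERT and ELECTRA both use WordPiece, so one loader can serve both.
    # Defaulting to lowercasing when the key is absent is an assumption.
    return WordPieceConfig(lowercase=hf_config.get("do_lower_case", True))


# Hypothetical registry keyed by Hugging Face tokenizer class name.
LOADERS: Dict[str, Callable[[dict], WordPieceConfig]] = {
    "BertTokenizerFast": load_bert_like,
    "ElectraTokenizerFast": load_bert_like,  # special case: reuse the BERT path
}


def load_tokenizer_config(class_name: str, hf_config: dict) -> WordPieceConfig:
    """Dispatch on the tokenizer class name; reject anything unregistered."""
    try:
        loader = LOADERS[class_name]
    except KeyError:
        raise ValueError(f"Unsupported tokenizer class: {class_name}") from None
    return loader(hf_config)
```

With a registry like this, adding a tokenizer that shares BERT's behavior is a one-line change, which is why the special-case approach keeps a PR of this kind small.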

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.
    • I have not added the documentation required for ELECTRA, as it depends on the curated transformers PR.

Collaborator

@shadeMe shadeMe left a comment

Thanks for the PR! Just a couple of comments.

PS: The BuildKite CI failures can be ignored for now.

spacy_curated_transformers/tokenization/hf_loader.py (2 review comments, now outdated)
@KennethEnevoldsen
Contributor Author

Ah yes, thanks for the suggestions (I'm getting too used to ruff cleaning up after me). I also formatted the code with black, since I saw that the tests failed.

@KennethEnevoldsen
Contributor Author

Hi @shadeMe, it seems like this PR is ready to merge?

@shadeMe shadeMe merged commit e8657ff into explosion:main Mar 24, 2024
7 of 9 checks passed
@KennethEnevoldsen KennethEnevoldsen deleted the add-electra branch March 24, 2024 15:36