Tokenizers docs: Specify which class contains __call__ method #14379

Merged
merged 4 commits into from Nov 28, 2021


xhluca
Contributor

@xhluca xhluca commented Nov 13, 2021

Currently, the docs specify the following:

BatchEncoding holds the output of the tokenizer’s encoding methods (__call__, encode_plus and batch_encode_plus) and is derived from a Python dictionary.

It's not clear what tokenizer class this is referring to. Moreover, the main tokenizer page does not have any documentation for __call__; instead it is found in PreTrainedTokenizerBase.

The proposed change in this PR will make it clear where the user can find documentation about the __call__ function, which is now very widely used.
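
As a rough illustration of why the defining class matters, the sketch below uses two hypothetical stand-in classes (TokenizerBase and SlowTokenizer, which are not the real transformers classes) and a hypothetical helper, defining_class, to show how an inherited __call__ can be traced back to the base class that actually defines and documents it.

```python
# Hypothetical stand-ins for illustration only; these are NOT the real
# transformers classes, and the "encoding" is a toy whitespace split.
class TokenizerBase:
    """Plays the role of a shared base class such as PreTrainedTokenizerBase."""

    def __call__(self, text):
        # Return a plain dict, loosely mirroring a dictionary-like output.
        return {"tokens": text.split()}


class SlowTokenizer(TokenizerBase):
    """Subclass that inherits __call__ without redefining it."""


def defining_class(cls, name):
    """Return the first class in cls's MRO whose own namespace defines `name`."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


owner = defining_class(SlowTokenizer, "__call__")
print(owner.__name__)  # prints "TokenizerBase"
```

With the real library, the same kind of lookup would be expected to point at PreTrainedTokenizerBase, as noted above, though that is not run here.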

@xhluca
Contributor Author

xhluca commented Nov 13, 2021

@n1t0 This is one instance of __call__, but maybe it would be beneficial if every __call__ in the docs linked to this docstring? Let me know your thoughts!

@LysandreJik
Member

Cool, I think this is a welcome change! cc @sgugger

Could you run the style utilities to fix the code quality issues? You can do so by running this from the root of the repository:

pip install -e ".[quality]"
make fixup

@xhluca
Contributor Author

xhluca commented Nov 19, 2021

Hi @LysandreJik I'm having trouble with that command. Got the following issue:

(venv) xhlu@XHL-Desktop:~/dev/transformers$ make fixup
No library .py files were modified
python utils/custom_init_isort.py
python utils/style_doc.py src/transformers docs/source --max_len 119
running deps_table_update
updating src/transformers/dependency_versions_table.py
python utils/check_copies.py
python utils/check_table.py
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
python utils/check_dummies.py
python utils/check_repo.py
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Checking all models are included.
Checking all models are public.
Checking all models are properly tested.
Checking all objects are properly documented.
Checking all models are in at least one auto class.
utils/check_repo.py:400: UserWarning: Full quality checks require all backends to be installed (with `pip install -e .[dev]` in the Transformers repo, the following are missing: PyTorch, TensorFlow, Flax. While it's probably fine as long as you didn't make any change in one of those backends modeling files, you should probably execute the command above to be on the safe side.
  warnings.warn(
python utils/check_inits.py
python utils/tests_fetcher.py --sanity_check
Traceback (most recent call last):
  File "utils/tests_fetcher.py", line 23, in <module>
    from git import Repo
ModuleNotFoundError: No module named 'git'
make: *** [Makefile:42: repo-consistency] Error 1

Not sure what's causing it. I do have git installed, but it seems to be trying to import some git module.

@LysandreJik
Member

I believe the package to install is gitpython; we should add this to the setup.
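
The traceback comes from the line `from git import Repo`: the import name is git, while the package installed with pip is gitpython. A minimal sketch of checking for such a module before running the tooling is shown below; the helper name missing_modules is an assumption made for illustration and is not part of the transformers utilities.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import.

    Hypothetical helper for illustration; not part of the transformers repo.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# "git" is the import name provided by the gitpython package.
missing = missing_modules(["git"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If git is reported as missing, installing gitpython (for example with pip install gitpython) should provide it.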

@LysandreJik
Member

If you rebase on master and re-run the commands above it should work!

@xhluca
Contributor Author

xhluca commented Nov 27, 2021

@LysandreJik thanks. I ran make fixup and committed the change.

@sgugger sgugger merged commit ebbe8cc into huggingface:master Nov 28, 2021
@sgugger
Collaborator

sgugger commented Nov 28, 2021

Thanks again for your PR!

@xhluca xhluca deleted the patch-1 branch November 29, 2021 01:55
@xhluca
Contributor Author

xhluca commented Nov 29, 2021

Glad it was helpful :)

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022