Add type hint, update to pyo3 0.27, add automatic type hint generator #1928

Merged
ArthurZucker merged 52 commits into main from more-typehint
Feb 11, 2026

@ArthurZucker (Collaborator) commented Jan 12, 2026

Ty pyo3!

PyO3/pyo3#5137

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker changed the title from "something that is supposed to work but my env does not allow it, seems to be uv related" to "Add type hint, update to pyo3 0.27, add automatic type hint generator" on Jan 27, 2026
@ArthurZucker mentioned this pull request on Feb 2, 2026

@McPatate (Member) left a comment

Did you make sure the docstrings in the removed pyi files hadn't drifted from the docstrings in the code? As in, is there any additional up-to-date documentation that is not present in the code files?

Looks great other than that! Nice work!
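
One way to run the drift check asked about above, as a minimal sketch: parse the old and the regenerated stub text with `ast` and compare docstrings by qualified name. The helper names `stub_docstrings` and `find_drift` are hypothetical and not part of this PR; the sketch assumes both stubs are available as source strings.

```python
import ast


def stub_docstrings(stub_source: str) -> dict:
    """Map qualified names (e.g. 'Cls.method') to the docstrings found in stub source."""
    docs = {}

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                name = f"{prefix}{child.name}"
                doc = ast.get_docstring(child)  # whitespace-normalized docstring, or None
                if doc:
                    docs[name] = doc
                if isinstance(child, ast.ClassDef):
                    # Descend into classes only, so methods are picked up as 'Cls.method'.
                    visit(child, name + ".")

    visit(ast.parse(stub_source), "")
    return docs


def find_drift(old_stub: str, new_stub: str) -> list:
    """Names whose docstring in the old stub is missing from, or differs in, the new stub."""
    old, new = stub_docstrings(old_stub), stub_docstrings(new_stub)
    return sorted(name for name, doc in old.items() if new.get(name) != doc)
```

Running `find_drift` on a removed `.pyi` file and its regenerated counterpart lists the entries worth reviewing by hand; an empty list means every old docstring survived unchanged.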

Digits = pre_tokenizers.Digits
FixedLength = pre_tokenizers.FixedLength
Metaspace = pre_tokenizers.Metaspace
PreTokenizer = pre_tokenizers.PreTokenizer
Member

did you do that yourself or is it ty that auto-sorts alphabetically?

in any case, 💆🏻

Collaborator Author

yeppp

A tuple with the string representation of the CLS token, and its id
"""
- def __init__(self, sep, cls):
+ def __init__(self, sep: tuple[str, int], _cls: tuple[str, int]):
Member

is _cls used at all? do we leave it for backwards compat?

Comment thread bindings/python/py_src/tokenizers/__init__.pyi
Comment thread bindings/python/src/bin/stub_generation.rs Outdated
Comment thread bindings/python/src/bin/stub_generation.rs Outdated
Comment thread bindings/python/src/utils/pretokenization.rs
Comment thread bindings/python/src/decoders.rs
Comment thread bindings/python/pyproject.toml
Comment thread bindings/python/stub.py
@ArthurZucker marked this pull request as draft on February 2, 2026 16:14

@ArthurZucker (Collaborator, Author) commented:

Waiting for docstring support!

- Replace complex modifications dict with simple insertions list
- Remove nested process_function_or_method function
- Use bottom-to-top line replacement for cleaner logic
- Remove unused importlib import
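
The bottom-to-top idea in the list above can be sketched as follows. This is a minimal illustration, not the actual `stub.py` code: the function name `apply_insertions` and the `(index, new_lines)` shape of the insertions list are assumptions made for the example. Applying the highest index first means each recorded index still points at the original line when its turn comes.

```python
def apply_insertions(lines, insertions):
    """Insert each block of new lines before the given 0-based index of the original list.

    `insertions` is a list of (index, new_lines) pairs whose indices refer to `lines`
    as passed in, before any insertion has been applied.
    """
    result = list(lines)
    # Highest index first: lines added near the bottom never shift the indices above them.
    for index, new_lines in sorted(insertions, key=lambda item: item[0], reverse=True):
        result[index:index] = new_lines
    return result


stub = ["class A:", "    def f(self): ...", "class B:", "    pass"]
patched = apply_insertions(
    stub,
    [(1, ['    """Doc for A."""']), (3, ['    """Doc for B."""'])],
)
```

Applying the same pairs top-to-bottom would need an offset that grows with every inserted line, which is the bookkeeping a flat insertions list avoids.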
@ArthurZucker marked this pull request as ready for review on February 6, 2026 09:00
ArthurZucker and others added 2 commits February 6, 2026 10:22
- Move stub_generation.rs to tools/stub-gen/ as standalone crate
- Remove stub-gen feature and pyo3-introspection from main crate
- Auto-detect PYTHONHOME for uv/venv environments
- Update Makefile and README with new instructions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
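
For readers unfamiliar with the PYTHONHOME issue mentioned in the commit above: inside a virtual environment, `sys.prefix` points at the environment while `sys.base_prefix` points at the base installation that holds the standard library, and an embedded interpreter may need the latter. The sketch below shows one possible detection rule in Python. It is an assumption for illustration only; the `tools/stub-gen` crate is written in Rust and its actual logic is not shown in this conversation, and `detect_python_home` is a hypothetical name.

```python
import os
import sys


def detect_python_home(environ=None, base_prefix=None):
    """Return an explicit PYTHONHOME if set, else the base installation prefix.

    Both parameters exist only to make the function testable; by default the
    real environment and the running interpreter's `sys.base_prefix` are used.
    """
    env = os.environ if environ is None else environ
    explicit = env.get("PYTHONHOME")
    if explicit:
        # Respect a value the user already configured.
        return explicit
    # In a venv this is the base Python installation rather than the venv directory.
    return sys.base_prefix if base_prefix is None else base_prefix
```

Printing the result from the environment's own interpreter, for example `python -c "import sys; print(sys.base_prefix)"`, is a quick manual equivalent.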

@McPatate (Member) left a comment

👌🏻

Member

ty 🙏🏻

@ArthurZucker merged commit 50352f7 into main on Feb 11, 2026
32 checks passed
@ArthurZucker deleted the more-typehint branch on February 11, 2026 13:26
@davidhewitt mentioned this pull request on Feb 12, 2026
MayCXC pushed a commit to MayCXC/tokenizers that referenced this pull request Apr 4, 2026
…huggingface#1928)

* something that is supposed to work but my env does not allow it, seems to be uv related

* ?

* up

* nits

* let' s try

* part of tthe update for pyo3 0.27

* more pyo3 fixes

* update

* does this help?

* help

* finally

* update stub accordingly

* export more of the submodules

* moooore

* add individual .pypi

* cleanup

* update pyo3 signatures and fix warning

* style

* update

* more updates

* sytle

* clippy happy

* does this help?

* fix

* fix

* ?

* what?

* add dwarwub case co

* up?

* update

* clippy and fmt

* this time it works

* remove offending one

* update

* remove shit

* remove more shit that was unwanted

* ?

* simplify a bit

* more verbose?

* more simplification

* fmt

* fix some of the typing in rust directly to please TY (but also just fix some typing.Any

* fix script running

* fix , ignore and exclude

* style

* update

* fmt + add it to style?

* cleanup

* Simplify stub.py docstring injection

- Replace complex modifications dict with simple insertions list
- Remove nested process_function_or_method function
- Use bottom-to-top line replacement for cleaner logic
- Remove unused importlib import

* isolate stub generation into separate tools/stub-gen crate

- Move stub_generation.rs to tools/stub-gen/ as standalone crate
- Remove stub-gen feature and pyo3-introspection from main crate
- Auto-detect PYTHONHOME for uv/venv environments
- Update Makefile and README with new instructions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>