stubgen: emit reduced field()/Field() in class stubs (with optional doc metadata)#3223

Open
tobyh-canva wants to merge 1 commit into facebook:main from tobyh-canva:fix/stubgen-dataclass-field-ellipsis

Conversation

@tobyh-canva

@tobyh-canva tobyh-canva commented Apr 24, 2026

Context

As identified in #3221, stub generation for class body annotated assignments only handled simple RHS values (e.g. literals, ...). dataclasses.field / Pydantic Field calls are not "simple" text, so stubs either dropped the real __init__ / field behavior or were forced into lossy workarounds. Callers that care about kw_only, defaults, and field metadata need something closer to the real declaration without copying the whole module.
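
To make the problem concrete, here is a minimal runtime sketch (not code from this PR). It compares a dataclass as written with the shape a stub effectively describes once the `field(...)` RHS is dropped. The class names `Real` and `Lossy` and the helper `required_params` are hypothetical, chosen only for illustration.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class Real:
    # The declaration as written in the source module.
    name: str
    tags: list[str] = field(default_factory=list)


@dataclass
class Lossy:
    # What a stub effectively describes if the field(...) RHS is dropped.
    name: str
    tags: list[str]


def required_params(cls: type) -> list[str]:
    """Names of __init__ parameters (excluding self) that have no default."""
    sig = inspect.signature(cls.__init__)
    return [
        p.name
        for p in sig.parameters.values()
        if p.name != "self" and p.default is inspect.Parameter.empty
    ]


print(required_params(Real))   # only "name" is required
print(required_params(Lossy))  # "tags" has become required as well
```

A checker reading the lossy form would reject `Real("x")`-style calls that are valid at runtime, which is the `__init__` checking breakage described in #3221.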

Intent

Make .pyi output for field(...) and Field(...) type-checking oriented: keep a reduced call that still reflects init-altering options, strip runtime-only or validation noise, and avoid echoing heavy default_factory bodies. Optional Pydantic schema doc strings on Field should follow the same include_docstrings switch as other stub doc content.

Changes

  • Stub extract (extract.rs)
    • For class AnnAssign RHS, after simple literals, try a structured strip of field / Field (by callee name, including dataclasses.field / pydantic.Field-style attributes).
    • Whitelist kwargs per "kind" (dataclass vs Pydantic Field): defaults, default_factory, init, kw_only (dataclass), aliasing-related kwargs (Pydantic), etc.; drop the rest (e.g. metadata, min_length, …).
    • default_factory: keep only simple callable refs (bare name or a.b chains); otherwise emit default_factory=lambda: ....
    • With include_docstrings, Pydantic Field also keeps description and title.
    • Fallback: if the RHS is not a re-printable field/Field (e.g. *args), use the solver plus ClassField::dataclass_flags_of to emit = ... when the solved field’s init parameter has a default, matching the existing init-default notion (with the documented Pydantic caveat for dual name/alias params).
  • ClassField: dataclass_flags_of is pub(crate) for stubgen.
  • Tests (stubgen/mod.rs): dataclass + Pydantic coverage for stripping, default_factory, complex factory placeholder, mixed fields, and docstring retention under config.
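
The stripping rules above can be sketched in Python with the standard `ast` module. This is only an illustration of the idea: the real implementation lives in Rust in extract.rs, and the whitelists below, the helper name `reduce_field_call`, and the choice to keep positional arguments are all assumptions rather than the PR's actual behavior.

```python
import ast
from typing import Optional

# Hypothetical whitelists; the PR's real lists are in extract.rs and may differ.
DATACLASS_KEEP = {"default", "default_factory", "init", "kw_only"}
PYDANTIC_KEEP = {"default", "default_factory", "alias", "validation_alias"}
PYDANTIC_DOC_KEEP = {"description", "title"}


def _is_simple_ref(node: ast.expr) -> bool:
    """True for a bare name or an a.b.c attribute chain."""
    while isinstance(node, ast.Attribute):
        node = node.value
    return isinstance(node, ast.Name)


def reduce_field_call(src: str, include_docstrings: bool = False) -> Optional[str]:
    """Return a reduced field()/Field() call, or None if it cannot be re-printed."""
    call = ast.parse(src, mode="eval").body
    if not isinstance(call, ast.Call):
        return None
    # Match by callee name, including dataclasses.field / pydantic.Field attributes.
    func = call.func
    if isinstance(func, ast.Name):
        last = func.id
    elif isinstance(func, ast.Attribute):
        last = func.attr
    else:
        return None
    if last == "field":
        keep = set(DATACLASS_KEEP)
    elif last == "Field":
        keep = PYDANTIC_KEEP | (PYDANTIC_DOC_KEEP if include_docstrings else set())
    else:
        return None
    # *args / **kwargs are not re-printable; the caller would fall back to "= ...".
    if any(isinstance(a, ast.Starred) for a in call.args):
        return None
    if any(kw.arg is None for kw in call.keywords):
        return None
    parts = [ast.unparse(a) for a in call.args]  # assumption: positionals are kept
    for kw in call.keywords:
        if kw.arg not in keep:
            continue  # drop metadata, min_length, and other non-whitelisted kwargs
        if kw.arg == "default_factory" and not _is_simple_ref(kw.value):
            parts.append("default_factory=lambda: ...")
        else:
            parts.append(f"{kw.arg}={ast.unparse(kw.value)}")
    return f"{ast.unparse(func)}({', '.join(parts)})"


print(reduce_field_call("field(default_factory=list, metadata={'a': 1}, kw_only=True)"))
print(reduce_field_call("dataclasses.field(default_factory=lambda: [1, 2])"))
print(reduce_field_call("Field(default=None, min_length=3, description='x')", True))
```

Returning `None` for non-re-printable calls mirrors the fallback bullet: in that case the stub generator would consult the solver instead of echoing the call.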

Testing

  • cargo test -p pyrefly stubgen::tests::

Fixes #3221.

…oc metadata)

- Re-emit dataclass field() / Pydantic Field() with type-relevant kwargs; strip metadata/validation
- Simplify default_factory to simple refs or lambda: ... placeholder
- include_docstrings keeps Field description/title; solver fallback for = ...

Made-with: Cursor
@meta-cla

meta-cla Bot commented Apr 24, 2026

Hi @tobyh-canva!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@tobyh-canva tobyh-canva changed the title stubgen: emit reduced field()/Field() in class stubs (with optional doc metadata) stubgen: emit reduced field()/Field() in class stubs (with optional doc metadata) Apr 24, 2026
@tobyh-canva
Author

tobyh-canva commented Apr 24, 2026

I'm not 100% sure this is the best approach. I considered emitting any field with a default as = ..., but unfortunately that loses information such as kw_only=True and init=False. Plus, keeping the field description when --include-docstrings is enabled is kinda neat.

I guess an alternative approach would be for stubgen to emit an explicit __init__() constructor, derived from the fields. Presumably, this would make the type-checker perform better when run on these stubs, since more "inference" work is done ahead of time and materialised in the stubs.

Please let me know what approach you would prefer :)

@NathanTempest
Contributor

Hey @tobyh-canva, thanks for contributing to our repo. Before we can import this into our repo, you need to sign the CLA: https://code.facebook.com/cla

@NathanTempest
Contributor

Also, a quick heads-up: #3225 is trying to achieve the same goal.

Maybe this could be a collaboration between you both. I'd suggest reaching out to them on Discord to prevent redundant commits or conflicts.

@tobyh-canva
Author

tobyh-canva commented Apr 27, 2026

Thanks @NathanTempest, I'll try and get myself added to Canva's corporate CLA tomorrow

Development

Successfully merging this pull request may close these issues.

bug(stubgen): dataclass fields with only field(...) / non-literal defaults lose = ... in generated stubs, breaking __init__ checking