Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for PEP 681: dataclass_transform #7437

Open
mjhaye opened this issue Sep 8, 2022 · 10 comments
Open

Add support for PEP 681: dataclass_transform #7437

mjhaye opened this issue Sep 8, 2022 · 10 comments
Labels
Astroid Related to astroid dataclasses Help wanted 🙏 Outside help would be appreciated, good for new contributors Needs astroid update Needs an astroid update (probably a release too) before being mergable

Comments

@mjhaye
Copy link

mjhaye commented Sep 8, 2022

Bug description

When building my own dataclass decorator based on the stdlib dataclass, the type of the fields initialized via a field call is inferred incorrectly. This triggers errors every time the field is accessed (and thus this cannot be silenced locally).

# pylint: disable=missing-docstring,unnecessary-lambda,too-few-public-methods
from dataclasses import dataclass, field

def mydataclass(cls):
    cls.myfun = lambda self: print(self)
    return dataclass(cls)

@mydataclass
class TestDC:
    a_dict: dict[int, int] = field(default_factory=dict)

dc = TestDC()
dc.a_dict[0] = 0

I did some research in the issues here and at astroid and it appears to be related to the difficulty to determine if an object, TestDC in this case, is a dataclass. It seems to me, as a laymen, that in brains_dataclasses.py it is more or less hard-coded if something is a dataclass (stdlib, pydantic, and marshmallow_dataclass are supported).

A suggestion: Stdlib provides the is_dataclass() function that checks for a __dataclass_fields__ attribute. Maybe this could be a "generic" way of checking if a class is a dataclass. It would work for stdlib, pydantic and marshmallow_dataclass, but also on all homebrew dataclass adoptations that are derived from that.

Configuration

No response

Command used

pylint dataclass_pylint.py

Pylint output

************* Module dataclass_pylint
dataclass_pylint.py:13:0: E1137: 'dc.a_dict' does not support item assignment (unsupported-assignment-operation)

Expected behavior

No error.

Pylint version

pylint 2.15.2
astroid 2.12.9
Python 3.10.7 (tags/v3.10.7:6cc6b13, Sep  5 2022, 14:08:36) [MSC v.1933 64 bit (AMD64)]

OS / Environment

W10

Additional dependencies

No response

@mjhaye mjhaye added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label Sep 8, 2022
@DanielNoord DanielNoord added Regression dataclasses and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Sep 9, 2022
@DanielNoord DanielNoord self-assigned this Sep 9, 2022
@DanielNoord
Copy link
Collaborator

The issue with using __dataclass_fields__ is that we don't have access to it when we determine what is and what isn't a dataclass.

This recognition happens when we look at the definition of a class and check its decorators. However, there is no way to determine that mydataclass sets the __dataclass_fields__ attribute.

This problem is solved by https://docs.python.org/3.11/whatsnew/3.11.html#pep-681-data-class-transforms in Python 3.11. We should probably open an issue on the astroid checker about support for such transforms.

@DanielNoord DanielNoord closed this as not planned Won't fix, can't repro, duplicate, stale Sep 9, 2022
@Pierre-Sassoulas
Copy link
Member

We should probably open an issue on the astroid checker about support for such transforms.

Should we move this issue to astroid ?

@mjhaye
Copy link
Author

mjhaye commented Sep 11, 2022

Ah, I see. Good that 3.11 has a robust solution for this.

Maybe another solution that can work now already: I guess you can detect that a class variable is initialised via the field function of the stdlib dataclass module. Although this clearly does not catch all dataclasses, it does catch the ones that suffer from above false positives and thus could resolve this issue.

Is this a viable suggestion?

@DanielNoord
Copy link
Collaborator

Should we move this issue to astroid ?

We might need to open one there.

Ah, I see. Good that 3.11 has a robust solution for this.

Maybe another solution that can work now already: I guess you can detect that a class variable is initialised via the field function of the stdlib dataclass module. Although this clearly does not catch all dataclasses, it does catch the ones that suffer from above false positives and thus could resolve this issue.

Is this a viable suggestion?

I'm not sure. That seems like a workaround for something that doesn't happen too often. It might incur too much performance loss for the cases where it does happen. In 3.11 this is fixed anyway so I think it might be better to focus on supporting that syntax and then potentially think about such heuristics for lower versions.

@mjhaye
Copy link
Author

mjhaye commented Oct 20, 2022

@DanielNoord
I tried to implement a work-around locally via a pylint plugin. Within the plugin I modified the contents of DATACLASS_MODULES in brain_dataclasses.py to include my own decorator. However, this seems to have no effect. Do you have any suggestions if there is a fix I can do locally (without modifying astroid / pylint themselves)?

@DanielNoord
Copy link
Collaborator

Could you show your changes? Something like that should work..

@mjhaye
Copy link
Author

mjhaye commented Oct 21, 2022

The plugin code looks like this:

from typing import TYPE_CHECKING
import astroid.brain.brain_dataclasses as brain_dataclasses

brain_dataclasses.DATACLASS_MODULES = brain_dataclasses.DATACLASS_MODULES | {"mymodule"}

if TYPE_CHECKING:
    from pylint.lint import PyLinter

def register(linter: "PyLinter") -> None:
    pass

Some debugging shows that the transforms in brain_dataclasses are a being applied before the plugin is imported and thus the patch is applied too late.

Currently my work-around (just in the code, not a plugin) is a bit too hacky for my liking:

mod = ModuleType("marshmallow_dataclass")
setattr(mod, "dataclass", mydataclass)
sys.modules["marshmallow_dataclass"] = mod
from marshmallow_dataclass import dataclass 

@Pierre-Sassoulas
Copy link
Member

Reopening so the latest discussion is not lost.

@Pierre-Sassoulas
Copy link
Member

Here's an useful example from @Apxdono in #8833:

Bug description

Custom dataclass-like decorators (marked with dataclass_transform) exhibit issues similar to ones described in #4899, #7884 etc.

Whenever a dataclass "container" property is declared as field(default_factory=list|dict|etc...), pylint argues that particular methods and operations aren't applicable/available for this property.

Below example will produce 4 distinct pylint errors. The only way to resolve them is to remove = field(...) declarations.

# issue.py

import dataclasses
from dataclasses import field
from typing import Dict, List, Optional, TypeVar, dataclass_transform

_C = TypeVar("_C", bound=type)


@dataclass_transform(field_specifiers=(dataclasses.field,), kw_only_default=True)
def real_class(cls: _C) -> _C:
    data_cls = dataclasses.dataclass(cls, kw_only=True, slots=True)  # type: ignore
    return data_cls


@real_class
class Vehicle:
    model: str
    wheel_count: int = field(default=4)


@real_class
class Parking:
    parked_vehicles: List[Vehicle] = field(default_factory=list)
    spots: Dict[int, Optional[Vehicle]] = field(default_factory=dict)

    def park_vehicle(self, vehicle: Vehicle, spot: int):
        self.parked_vehicles.append(vehicle)
        self.spots[spot] = vehicle


def main():
    parking = Parking()
    parking.park_vehicle(Vehicle(model="lamborghini"), 5)
    parking.park_vehicle(Vehicle(model="prius"), 3)
    parking.park_vehicle(Vehicle(model="volkswagen"), 22)

    for spot, veh in parking.spots.items():
        if veh and veh in parking.parked_vehicles:
            print(f"Spot #{spot}: {veh.model}")


if __name__ == "__main__":
    main()

Command used

$ pylint issue.py

Pylint output

************* Module attrcheck.issue
issue.py:26:8: E1101: Instance of 'Field' has no 'append' member (no-member)
issue.py:27:8: E1137: 'self.spots' does not support item assignment (unsupported-assignment-operation)
issue.py:36:21: E1101: Instance of 'Field' has no 'items' member (no-member)
issue.py:37:26: E1135: Value 'parking.parked_vehicles' doesn't support membership test (unsupported-membership-test)

------------------------------------------------------------------
Your code has been rated at 2.31/10 (previous run: 2.31/10, +0.00)

Expected behavior

$ pylint issue.py
--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

@mjhaye
Copy link
Author

mjhaye commented Aug 10, 2023

Has this now become the formal ticket for request for PEP681 support (I could not find another one)? If yes, maybe it is better to change the name, or open a new one.

@DanielNoord DanielNoord changed the title False positives with modified dataclass decorator and default value via field Add support for PEP 681: ``dataclass_transform§§ Aug 10, 2023
@DanielNoord DanielNoord changed the title Add support for PEP 681: ``dataclass_transform§§ Add support for PEP 681: dataclass_transform Aug 10, 2023
@DanielNoord DanielNoord added Help wanted 🙏 Outside help would be appreciated, good for new contributors Astroid Related to astroid labels Dec 5, 2023
@DanielNoord DanielNoord added the Needs astroid update Needs an astroid update (probably a release too) before being mergable label Dec 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Astroid Related to astroid dataclasses Help wanted 🙏 Outside help would be appreciated, good for new contributors Needs astroid update Needs an astroid update (probably a release too) before being mergable
Projects
None yet
Development

No branches or pull requests

3 participants