Stubtest: Fix `__mypy-replace` false positives #15689

AlexWaygood · 2023-07-16T15:47:12Z

Running stubtest on typeshed's stdlib stubs with the mypy master branch (or the mypy 1.5 branch) produces the following output currently:

error: pstats.FunctionProfile.__mypy-replace is not present at runtime
Stub: in file stdlib\pstats.pyi:32
def (*, ncalls: builtins.str =, tottime: builtins.float =, percall_tottime: builtins.float =, cumtime: builtins.float =, percall_cumtime: builtins.float =, file_name: builtins.str =, line_number: builtins.int =)
Runtime:
MISSING

error: pstats.StatsProfile.__mypy-replace is not present at runtime
Stub: in file stdlib\pstats.pyi:41
def (*, total_tt: builtins.float =, func_profiles: builtins.dict[builtins.str, pstats.FunctionProfile] =)
Runtime:
MISSING

Found 2 errors (checked 541 modules)

This is due to recent changes @ikonst made to mypy's dataclasses plugin, where mypy now generates internal __mypy-replace methods so as to perform more accurate typechecking of dataclasses.replace().

To fix this false positive, I've added a new slot to SymbolTableNode to indicate whether plugin-generated methods actually exist at runtime or not. If we don't want to do this, there are other approaches we could take, such as hardcoding in stubtest.py a list of problematic plugin-generated method names to avoid checking. I went for this approach as it felt more principled, but I'm happy to change the approach if adding a new slot to SymbolTableNode is problematic.

I tried adding this test to mypy/test/teststubtest.py:

diff --git a/mypy/test/teststubtest.py b/mypy/test/teststubtest.py
index 661d46e9f..80ece38ad 100644
--- a/mypy/test/teststubtest.py
+++ b/mypy/test/teststubtest.py
@@ -1842,6 +1842,27 @@ class StubtestUnit(unittest.TestCase):
             error=None,
         )

+    @collect_cases
+    def test_dataclasses(self) -> Iterator[Case]:
+        yield Case(
+            stub="from dataclasses import dataclass",
+            runtime="from dataclasses import dataclass",
+            error=None,
+        )
+        yield Case(
+            stub="""
+            @dataclass
+            class Foo:
+                x: int
+            """,
+            runtime="""
+            @dataclass
+            class Foo:
+                x: int
+            """,
+            error=None,
+        )
+

Unfortunately the test fails due to (I think) the fact that mypy doesn't currently generate __eq__ and __repr__ methods for dataclasses (see #12186). There's also something to do with _DT that I don't really understand:

======================================================================== FAILURES =========================================================================
______________________________________________________________ StubtestUnit.test_dataclasses ______________________________________________________________
[gw3] win32 -- Python 3.11.2 C:\Users\alexw\coding\mypy\venv\Scripts\python.exe

args = (<mypy.test.teststubtest.StubtestUnit testMethod=test_dataclasses>,), kwargs = {}
cases = [<mypy.test.teststubtest.Case object at 0x0000027188E74210>, <mypy.test.teststubtest.Case object at 0x0000027188E74250>], expected_errors = set()
c = <mypy.test.teststubtest.Case object at 0x0000027188E74250>, @py_assert1 = False
@py_format3 = "{'test_module...Foo.__repr__'} == set()\n~Extra items in the left set:\n~'test_module.Foo._DT'\n~'test_module.Foo.__eq__'\n~'test_module.Foo.__repr__'\n~Use -v to get more diff"
@py_format5 = "test_module.Foo._DT\n~test_module.Foo.__eq__\n~test_module.Foo.__repr__\n~\n>assert {'test_module...Foo.__repr__'} ==...he left set:\n~'test_module.Foo._DT'\n~'test_module.Foo.__eq__'\n~'test_module.Foo.__repr__'\n~Use -v to get more diff"
output = 'test_module.Foo._DT\ntest_module.Foo.__eq__\ntest_module.Foo.__repr__\n'
actual_errors = {'test_module.Foo._DT', 'test_module.Foo.__eq__', 'test_module.Foo.__repr__'}

    def test(*args: Any, **kwargs: Any) -> None:
        cases = list(fn(*args, **kwargs))
        expected_errors = set()
        for c in cases:
            if c.error is None:
                continue
            expected_error = c.error
            if expected_error == "":
                expected_error = TEST_MODULE_NAME
            elif not expected_error.startswith(f"{TEST_MODULE_NAME}."):
                expected_error = f"{TEST_MODULE_NAME}.{expected_error}"
            assert expected_error not in expected_errors, (
                "collect_cases merges cases into a single stubtest invocation; we already "
                "expect an error for {}".format(expected_error)
            )
            expected_errors.add(expected_error)
        output = run_stubtest(
            stub="\n\n".join(textwrap.dedent(c.stub.lstrip("\n")) for c in cases),
            runtime="\n\n".join(textwrap.dedent(c.runtime.lstrip("\n")) for c in cases),
            options=["--generate-allowlist"],
        )

        actual_errors = set(output.splitlines())
>       assert actual_errors == expected_errors, output
E       AssertionError: test_module.Foo._DT
E         test_module.Foo.__eq__
E         test_module.Foo.__repr__
E
E       assert {'test_module...Foo.__repr__'} == set()
E         Extra items in the left set:
E         'test_module.Foo._DT'
E         'test_module.Foo.__eq__'
E         'test_module.Foo.__repr__'
E         Use -v to get more diff

mypy\test\teststubtest.py:171: AssertionError
================================================================= short test summary info =================================================================
FAILED mypy/test/teststubtest.py::StubtestUnit::test_dataclasses - AssertionError: test_module.Foo._DT
============================================================== 1 failed, 48 passed in 6.78s ===============================================================

I verified manually, however, that this gets rid of all the new false positives when running stubtest on typeshed's stdlib.

ikonst · 2023-07-16T16:04:52Z

Seems reasonable, maybe I'd pick a name that's either closer to the aspect that interests stubtest ("exists_at_runtime") or closer to the reason it's used in the plugin (a way to serialize/cache types across phases).

Another option is to add an is_valid_identifier property that'll return False for those names since they contain a hyphen. I probably prefer this option :)

AlexWaygood · 2023-07-16T17:09:01Z

exists_at_runtime

I considered this, but felt like it could raise more questions than it answered. If I'm reading nodes.py, it makes me wonder why we might ever create a SymbolTableNode to represent a method/attribute that didn't exist at runtime. So I wanted to include "plugin" in the name somewhere.

or closer to the reason it's used in the plugin (a way to serialize/cache types across phases).

Not hugely keen on this idea either — what if we start generating fictional methods/attributes in a plugin in the future for something that isn't specifically related to this?

Another option is to add an is_valid_identifier property that'll return False for those names since they contain a hyphen.

Yeah, I wondered about just doing:

to_check = {name for name in stub.names if name.isidentifier()}

But it felt slightly less principled/relying on an implementation detail of a separate, unrelated part of mypy.

I probably prefer this option :)

As in, you prefer what I have currently or you'd prefer it if I changed it? :)

ikonst · 2023-07-16T20:47:10Z

Ok, I agree with you now.

Maybe a name that also suggests the use case of a symbol used internally?

How about internal? I assume you'd set plugin_generated anyway, so the flag doesn't have to encompass that aspect.

I'm starting to think we should've added a separate list of "internal symbols" and modified the serializer. This is becoming more bothersome. Another case in point: having to make the symbols class-private because otherwise I'd have a child class make an "incompatible override" of a mypy-replace of a parent class... We should nip this mess at the bud.

AlexWaygood · 2023-07-16T22:25:47Z

I'm starting to think we should've added a separate list of "internal symbols" and modified the serializer. This is becoming more bothersome. Another case in point: having to make the symbols class-private because otherwise I'd have a child class make an "incompatible override" of a mypy-replace of a parent class... We should nip this mess at the bud.

Yes, that sounds like it could be a better approach... Even if we make sure that we fix all bits of mypy now that assume all nodes in the symbol table actually exist at runtime, there's a risk we might introduce other bits of mypy in the future that accidentally make that assumption. So if we just don't add it to the symbol table at all, and store it somewhere else, that's probably much more future-proof

mypy/nodes.py

AlexWaygood · 2023-07-16T23:03:41Z

Maybe a name that also suggests the use case of a symbol used internally?

How about internal? I assume you'd set plugin_generated anyway, so the flag doesn't have to encompass that aspect.

Renamed the flag to internal_fictional.

Are you going to look at the alternative broader-refactoring solution you mentioned? I can try to take a look tomorrow, if not.

github-actions · 2023-07-16T23:22:22Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

ikonst · 2023-07-17T01:04:46Z

sgtm, we'll see how much free time I'll have next week. Starting a new job, so perhaps not much.

hauntsaninja · 2023-07-22T18:49:32Z

For potentially no reason at all, I'm a little nervous about adding fields to SymbolTableNode. I think the alternative of just filtering invalid identifiers in stubtest is quite sane and would be very safe to cherry pick to 1.5

AlexWaygood · 2023-07-22T18:53:36Z

For potentially no reason at all, I'm a little nervous about adding fields to SymbolTableNode. I think the alternative of just filtering invalid identifiers in stubtest is quite sane and would be very safe to cherry pick to 1.5

What's mypy's internal representation of a call-based typeddict? Would Foo here be represented as a TypeInfo, and if so, would x-y be a non-identifier entry in the symbol table that would be incorrectly filtered out if we just filtered out all non-identifer entries in stubtest?

Foo = TypedDict("Foo", {"x-y": int})

hauntsaninja · 2023-07-22T19:19:36Z

Good question! I think all the typeddict related stuff is in the typeddict_type attribute on TypeInfo, not in the symbol table

hauntsaninja · 2023-07-22T19:23:45Z

Okay, I verified that's the case. I also ran all over typeshed seeing when we'd have invalid identifiers, and the thing that came up was #10545 where I think this would be desirable

AlexWaygood · 2023-07-22T19:26:10Z

Okay, I verified that's the case. I also ran all over typeshed seeing when we'd have invalid identifiers, and the thing that came up was #10545 where I think this would be desirable

Okay, great! I'll go with that heuristic for now then

AlexWaygood · 2023-07-22T20:13:25Z

Okay, this is now a purely stubtest-specific workaround!

hauntsaninja

Thanks! We can still pursue the SymbolTableNode change if we want it for other reasons, but I can release this in the 1.5 series

…#15751)

Stubtest: Fix __mypy-replace false positives

9daef78

AlexWaygood requested a review from hauntsaninja July 16, 2023 15:47

AlexWaygood mentioned this pull request Jul 16, 2023

Mypy 1.5 Release Planning #15588

Closed

This comment has been minimized.

Sign in to view

Rename flag to internal_fictional

b1c008b

AlexWaygood commented Jul 16, 2023

View reviewed changes

mypy/nodes.py Outdated Show resolved Hide resolved

Update mypy/nodes.py

54347e5

AlexWaygood added 3 commits July 22, 2023 20:28

Revert alll changes

e83b124

Merge branch 'master' into stubtest-dataclasses

20a17ca

Just filter out all non-identifiers from stubtest names to check

d9ca332

hauntsaninja approved these changes Jul 22, 2023

View reviewed changes

hauntsaninja merged commit 6bdcc92 into python:master Jul 22, 2023
13 checks passed

AlexWaygood deleted the stubtest-dataclasses branch July 22, 2023 20:47

AlexWaygood added a commit to AlexWaygood/mypy that referenced this pull request Jul 22, 2023

stubtest: Fix __mypy-replace false positives (python#15689)

e1599c4

hauntsaninja pushed a commit that referenced this pull request Jul 22, 2023

[release 1.5] stubtest: Fix __mypy-replace false positives (#15689) (…

2ff7c0d

…#15751)

wenkokke mentioned this pull request Aug 21, 2023

Fix for missing stub files Issue #120 wenkokke/py-tree-sitter-talon#121

Open

JelleZijlstra mentioned this pull request Aug 26, 2023

Syntax error in generated stub when using dataclass #15966

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stubtest: Fix `__mypy-replace` false positives #15689

Stubtest: Fix `__mypy-replace` false positives #15689

AlexWaygood commented Jul 16, 2023

ikonst commented Jul 16, 2023

This comment has been minimized.

AlexWaygood commented Jul 16, 2023

ikonst commented Jul 16, 2023

AlexWaygood commented Jul 16, 2023

AlexWaygood commented Jul 16, 2023

github-actions bot commented Jul 16, 2023

ikonst commented Jul 17, 2023

hauntsaninja commented Jul 22, 2023 •

edited

AlexWaygood commented Jul 22, 2023 •

edited

hauntsaninja commented Jul 22, 2023

hauntsaninja commented Jul 22, 2023

AlexWaygood commented Jul 22, 2023

AlexWaygood commented Jul 22, 2023

hauntsaninja left a comment

Stubtest: Fix __mypy-replace false positives #15689

Stubtest: Fix __mypy-replace false positives #15689

Conversation

AlexWaygood commented Jul 16, 2023

ikonst commented Jul 16, 2023

This comment has been minimized.

AlexWaygood commented Jul 16, 2023

ikonst commented Jul 16, 2023

AlexWaygood commented Jul 16, 2023

AlexWaygood commented Jul 16, 2023

github-actions bot commented Jul 16, 2023

ikonst commented Jul 17, 2023

hauntsaninja commented Jul 22, 2023 • edited

AlexWaygood commented Jul 22, 2023 • edited

hauntsaninja commented Jul 22, 2023

hauntsaninja commented Jul 22, 2023

AlexWaygood commented Jul 22, 2023

AlexWaygood commented Jul 22, 2023

hauntsaninja left a comment

Choose a reason for hiding this comment

Stubtest: Fix `__mypy-replace` false positives #15689

Stubtest: Fix `__mypy-replace` false positives #15689

hauntsaninja commented Jul 22, 2023 •

edited

AlexWaygood commented Jul 22, 2023 •

edited