stubtest crash with special unicode characters #19071

Closed
@Avasam

Description

@Avasam
Contributor

Crash Report

I tried running stubtest on typeshed's networkx stubs.

Traceback

networkx... Note: networkx is not currently tested on win32 in typeshed's CI.
(49.63 s) fail

**********************************************************************

Commands run:

C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Scripts\pip.exe install networkx[]==3.4.2 mypy==1.15.0 numpy>=1.20 pandas
MYPYPATH=stubs\networkx C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Scripts\python.exe -m mypy.stubtest --mypy-config-file C:\Users\Avasam\AppData\Local\Temp\tmpyz4as0cw --custom-typeshed-dir . networkx --allowlist stubs\networkx\@tests\stubtest_allowlist.txt

**********************************************************************

Command output:

error: networkx.readwrite.text.AsciiDirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'!'

error: networkx.readwrite.text.AsciiUndirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'|'

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2127, in <module>
    sys.exit(main())
             ~~~~^^
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2123, in main
    return test_stubs(parse_options(sys.argv[1:]))
  File "C:\Users\Avasam\AppData\Local\Temp\stubtest-y0dwwp7i\Lib\site-packages\mypy\stubtest.py", line 2015, in test_stubs
    print(error.get_description(concise=args.concise))
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Avasam\AppData\Roaming\uv\data\python\cpython-3.13.1-windows-x86_64-none\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u257d' in position 263: character maps to <undefined>

**********************************************************************

Python version: Python 3.13.1 (main, Dec 19 2024, 14:38:48) [MSC v.1942 64 bit (AMD64)]

Ran with the following environment:
mypy==1.15.0
mypy_extensions==1.1.0
networkx==3.4.2
numpy==2.2.5
pandas==2.2.3
pip==25.1.1
python-dateutil==2.9.0.post0
pytz==2025.2
six==1.17.0
typing_extensions==4.13.2
tzdata==2025.2

To Reproduce

  1. Clone typeshed at commit ee5fcf264bcfddb14259f9f20601781f84834f29
  2. Comment out the line networkx.readwrite.text.* in stubs/networkx/@tests/stubtest_allowlist.txt
  3. Run ./tests/stubtest_third_party.py networkx

Your Environment

  • Mypy version used: mypy 1.15.0 (compiled: yes)
  • Mypy command-line flags: python.exe -m mypy.stubtest --mypy-config-file C:\Users\Avasam\AppData\Local\Temp\tmpyz4as0cw --custom-typeshed-dir . networkx --allowlist stubs\networkx\@tests\stubtest_allowlist.txt
  • Mypy configuration options from mypy.ini (and other config files): None afaik
  • Python version used: Python 3.13.1
  • Operating system and version: Version 10.0.19045 Build 19045

Could be related to #11031
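
The failure in the traceback can be reproduced without stubtest. The snippet below is a minimal sketch, assuming only what the traceback shows: the character '\u257d' and the cp1252 codec. The helper name can_encode is hypothetical and exists only for illustration.

```python
# Minimal sketch: the character from the traceback has no mapping in cp1252,
# so encoding it for a cp1252 console fails, while UTF-8 handles it.
glyph = "\u257d"  # the character named in the UnicodeEncodeError above


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(can_encode(glyph, "cp1252"))  # False
print(can_encode(glyph, "utf-8"))   # True
```

This suggests the crash depends on the encoding Python picks for stdout, not on networkx or the stubs themselves.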

Activity

sterliakov commented on May 10, 2025

@sterliakov
Collaborator

Does it still crash if you run the Windows equivalent of export PYTHONUTF8=1 (see the Python docs on UTF-8 mode) before running stubtest?

A sensible default for mypy would be to use open(encoding="utf-8") for all files that can contain Python code. Hopefully no one uses # -*- coding comments to set a non-UTF-8 source encoding nowadays, and Python sources are UTF-8 by default (we can add parsing of those comments later if needed). This has the same root cause as #11031 but different symptoms.
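
For reference, the Windows equivalents of the export are `set PYTHONUTF8=1` in cmd.exe and `$env:PYTHONUTF8 = "1"` in PowerShell. The explicit-encoding idea can be sketched as follows. This is an illustration only, not mypy's actual code: the helper read_source and the file name glyphs.py are hypothetical.

```python
import os
import tempfile


def read_source(path: str) -> str:
    """Read a Python source file as UTF-8 (hypothetical helper, not a mypy API)."""
    # Passing encoding explicitly makes the result independent of the
    # locale encoding (cp1252 in the report above) and of PYTHONUTF8.
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "glyphs.py")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write('vertical_edge = "\u257d"\n')
    text = read_source(path)
```

Note that the traceback above fails in print(), so explicit encoding on file reads alone may not cover writes to a cp1252 console.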

Avasam commented on May 12, 2025

@Avasam
Sponsor, Contributor, Author

Adding {"PYTHONUTF8": "1"} to the subprocess.run's env parameter that calls stubtest did work to prevent the crash: python/typeshed@080fb80#diff-468e55534823989bcfd0a85e82ccf8e4376c4115cc80c27a64795e35c1b5853c
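
The linked commit's code is not reproduced here. A rough sketch of the same idea, assuming the child process is a plain Python interpreter and not the real stubtest invocation:

```python
import os
import subprocess
import sys

# Copy the current environment and enable Python's UTF-8 mode for the child.
env = {**os.environ, "PYTHONUTF8": "1"}

# Stand-in for the stubtest call: the child reports its UTF-8 mode flag and
# prints the character that crashed under cp1252.
child_code = "import sys; print(sys.flags.utf8_mode); print('\\u257d')"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    check=True,
)
output = result.stdout.decode("utf-8")
print(output)
```

Merging with os.environ instead of passing only {"PYTHONUTF8": "1"} keeps variables such as PATH available to the child.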

sterliakov commented on May 12, 2025

@sterliakov
Collaborator

Amazing! I'll try to open an exploration PR in a few days. Any file that contains Python code obtained from .py files should be UTF-8-encoded, excluding cases with magic coding comments. Defaulting to the system encoding is worse than defaulting to UTF-8: Python code has been UTF-8 on all platforms for a long time. This won't be a problem in 3.15 (PEP 686), but we probably don't want to wait that long :)

A commit referencing this issue was added on Jun 20, 2025: f97a56e