Severe performance degradation for tracing under 3.11 #93516

Open
nedbat opened this issue Jun 5, 2022 · 59 comments
Labels: 3.11 (only security fixes); interpreter-core (Objects, Python, Grammar, and Parser dirs); type-bug (An unexpected behavior, bug, or error)

Comments

@nedbat
Member

nedbat commented Jun 5, 2022

Bug report

Coverage.py is seeing a significant increase in overhead for tracing code in 3.11 compared to 3.10: nedbat/coveragepy#1287

As an example:

| cov | proj | python3.10 | python3.11 | 3.11 vs 3.10 |
|---|---|---|---|---|
| none | bug1339.py | 0.184 s | 0.142 s | 76% |
| none | bm_sudoku.py | 10.789 s | 9.901 s | 91% |
| none | bm_spectral_norm.py | 14.305 s | 9.185 s | 64% |
| 6.4.1 | bug1339.py | 0.450 s | 0.854 s | 189% |
| 6.4.1 | bm_sudoku.py | 27.141 s | 55.504 s | 204% |
| 6.4.1 | bm_spectral_norm.py | 36.793 s | 67.970 s | 184% |

(This is the output of lab/benchmark.py.)
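
For readers who want to check the last column: it appears to be the 3.11 time expressed as a percentage of the 3.10 time. A small sketch that recomputes it from the rounded timings in the table (the helper name `relative_time` is made up for illustration, and results can differ by a point or so from the table, which was presumably computed from unrounded measurements):

```python
# Timings in seconds copied from the table above: (python3.10, python3.11).
timings = {
    ("none", "bug1339.py"): (0.184, 0.142),
    ("none", "bm_sudoku.py"): (10.789, 9.901),
    ("none", "bm_spectral_norm.py"): (14.305, 9.185),
    ("6.4.1", "bug1339.py"): (0.450, 0.854),
    ("6.4.1", "bm_sudoku.py"): (27.141, 55.504),
    ("6.4.1", "bm_spectral_norm.py"): (36.793, 67.970),
}


def relative_time(t310, t311):
    """Return the 3.11 time as a percentage of the 3.10 time."""
    return 100.0 * t311 / t310


for (cov, proj), (t310, t311) in timings.items():
    print(f"{cov:>5}  {proj:<20} {relative_time(t310, t311):6.1f}%")
```

Values below 100% mean 3.11 is faster (the untraced runs); values near 200% mean tracing roughly doubles the run time.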

Your environment

  • CPython versions tested on: 3.10, 3.11
  • Operating system and architecture: macOS, Intel

@nedbat nedbat added the type-bug An unexpected behavior, bug, or error label Jun 5, 2022
@nedbat
Member Author

nedbat commented Jun 5, 2022

@pablogsal @markshannon

@Fidget-Spinner
Member

Fidget-Spinner commented Jun 5, 2022

Related issue where a user reported that code with cProfile slowed down 1.6x, and the only thing I could pinpoint was that tracing itself slowed down, not cProfile #93381.

@sweeneyde
Member

I now realize this is probably due to using the Python tracer rather than the C tracer, will try again.

@sweeneyde
Member

Okay, now with Code coverage for Python, version 6.4.1 with C extension.

Script used:
import coverage
from time import perf_counter

def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

t0 = perf_counter()
cov = coverage.Coverage()
cov.start()
fib(35)
cov.stop()
cov.save()
t1 = perf_counter()
print(t1 - t0)
Profile on 3.10 branch: took 13.3 seconds

Functions taking more than 1% of CPU:

| Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module |
|---|---|---|---|
| _PyEval_EvalFrameDefault | 12372 (99.78%) | 3540 (28.55%) | python310.dll |
| [External Call] tracer.cp310-win_amd64.pyd | 2769 (22.33%) | 1474 (11.89%) | tracer.cp310-win_amd64.pyd |
| maybe_call_line_trace | 3872 (31.23%) | 1206 (9.73%) | python310.dll |
| _PyCode_CheckLineNumber | 1168 (9.42%) | 988 (7.97%) | python310.dll |
| call_trace | 3431 (27.67%) | 548 (4.42%) | python310.dll |
| PyLong_FromLong | 370 (2.98%) | 369 (2.98%) | python310.dll |
| _PyFrame_New_NoTrack | 385 (3.11%) | 338 (2.73%) | python310.dll |
| lookdict_unicode_nodummy | 324 (2.61%) | 324 (2.61%) | python310.dll |
| call_trace_protected | 2107 (16.99%) | 287 (2.31%) | python310.dll |
| frame_dealloc | 270 (2.18%) | 270 (2.18%) | python310.dll |
| set_add_entry | 264 (2.13%) | 264 (2.13%) | python310.dll |
| PyDict_GetItem | 579 (4.67%) | 255 (2.06%) | python310.dll |
| _PyEval_MakeFrameVector | 633 (5.11%) | 248 (2.00%) | python310.dll |
| _PyEval_Vector | 12372 (99.78%) | 196 (1.58%) | python310.dll |
| PyLineTable_NextAddressRange | 180 (1.45%) | 180 (1.45%) | python310.dll |
| call_function | 12372 (99.78%) | 169 (1.36%) | python310.dll |
| PyObject_RichCompare | 457 (3.69%) | 169 (1.36%) | python310.dll |
| binary_op1 | 486 (3.92%) | 157 (1.27%) | python310.dll |
| long_richcompare | 207 (1.67%) | 157 (1.27%) | python310.dll |

Profile on main branch: took 23.0 seconds

Functions taking more than 1% of CPU:

| Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module |
|---|---|---|---|
| _PyEval_EvalFrameDefault | 22180 (99.87%) | 4448 (20.03%) | python312.dll |
| _PyCode_CheckLineNumber | 5389 (24.27%) | 2583 (11.63%) | python312.dll |
| maybe_call_line_trace | 6461 (29.09%) | 1966 (8.85%) | python312.dll |
| retreat | 2242 (10.10%) | 1644 (7.40%) | python312.dll |
| [External Call] tracer.cp312-win_amd64.pyd | 6274 (28.25%) | 1428 (6.43%) | tracer.cp312-win_amd64.pyd |
| get_line_delta | 1162 (5.23%) | 1162 (5.23%) | python312.dll |
| unicodekeys_lookup_unicode | 844 (3.80%) | 728 (3.28%) | python312.dll |
| _Py_dict_lookup | 1476 (6.65%) | 632 (2.85%) | python312.dll |
| call_trace | 9802 (44.14%) | 630 (2.84%) | python312.dll |
| siphash13 | 442 (1.99%) | 442 (1.99%) | python312.dll |
| PyUnicode_New | 745 (3.35%) | 376 (1.69%) | python312.dll |
| pymalloc_alloc | 362 (1.63%) | 362 (1.63%) | python312.dll |
| _PyType_Lookup | 1786 (8.04%) | 319 (1.44%) | python312.dll |
| set_add_entry | 271 (1.22%) | 271 (1.22%) | python312.dll |
| PyDict_GetItem | 824 (3.71%) | 242 (1.09%) | python312.dll |
| call_trace_protected | 8562 (38.55%) | 236 (1.06%) | python312.dll |
| PyNumber_Subtract | 449 (2.02%) | 221 (1.00%) | python312.dll |

@sweeneyde
Member

This seems to suggest to me that this could be made much faster via a dedicated C API function:

https://github.com/nedbat/coveragepy/blob/master/coverage/ctracer/util.h#L41

#define MyCode_GetCode(co)      (PyObject_GetAttrString((PyObject *)(co), "co_code"))

@sweeneyde
Member

Indeed, on main, if I add printf("lookup %s\n", (const char *)PyUnicode_DATA(key)); to the top of unicodekeys_lookup_unicode and run that fib script, I get the following hot loop repeated over and over:

lookup fib
lookup C:\Users\sween\Source\Repos\cpython2\cpython\cover.py
lookup C:\Users\sween\Source\Repos\cpython2\cpython\cover.py
lookup co_code
lookup fib
lookup C:\Users\sween\Source\Repos\cpython2\cpython\cover.py
lookup C:\Users\sween\Source\Repos\cpython2\cpython\cover.py
lookup co_code
...

Each of these calls to PyObject_GetAttrString(..., "co_code") requires PyUnicode_FromString->unicode_decode_utf8->PyUnicode_New->_PyObject_Malloc, followed by PyObject_GetAttr->_PyObject_GenericGetAttrWithDict->_PyType_Lookup->find_name_in_mro->(unicode_hash+_Py_dict_lookup)->unicodekeys_lookup_unicode, where it should be just a single C API function call.
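
As a rough Python-level analogy only (it does not measure the C tracer, and the absolute numbers will vary by machine and Python version), one can compare looking up `co_code` by name on every event against fetching it once and reusing the result:

```python
import timeit


def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)


code = fib.__code__

# Look the attribute up by name every time, loosely analogous to calling
# PyObject_GetAttrString(co, "co_code") on every trace event.
per_call = timeit.timeit(lambda: getattr(code, "co_code"), number=100_000)

# Fetch once and reuse the value, loosely analogous to caching the result
# of a direct accessor call.
cached = code.co_code
reused = timeit.timeit(lambda: cached, number=100_000)

print(f"lookup each time: {per_call:.4f} s, reuse cached value: {reused:.4f} s")
```

The point is only that a by-name lookup does real work per call (string creation, hashing, dict probes), which adds up when it sits in the per-line hot path.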

@sweeneyde
Member

Ah! PyCode_GetCode exists since #92168

@markshannon
Member

Some performance degradation under tracing is expected, but not as much as reported.
This is a deliberate tradeoff: faster execution when not tracing, for slower execution when tracing.

From the profiles that @sweeneyde gathered it looks like we are seeing slowdowns in:

  • _PyEval_EvalFrameDefault, which is expected
  • Calculation of line numbers. This is probably due to the new line table format.

The new line number table gives better error messages without taking up too much extra space, but it is more complex and thus slower to parse.
We should be able to speed it up, but don't expect things to be as fast as 3.10. Sorry.

Once nedbat/coveragepy#1394 is merged, we can look at the profile again to see if much has changed.

@sweeneyde
Member

sweeneyde commented Jun 6, 2022

New profile of fib_cover.py on main branch, with the PyCode_GetCode addition (took 18.1 seconds)
| Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module |
|---|---|---|---|
| _PyEval_EvalFrameDefault | 19239 (99.66%) | 4481 (23.21%) | python312.dll |
| _PyCode_CheckLineNumber | 5344 (27.68%) | 2676 (13.86%) | python312.dll |
| maybe_call_line_trace | 6397 (33.14%) | 1997 (10.34%) | python312.dll |
| retreat | 2197 (11.38%) | 1629 (8.44%) | python312.dll |
| [External Call] tracer.cp312-win_amd64.pyd | 3073 (15.92%) | 1360 (7.04%) | tracer.cp312-win_amd64.pyd |
| get_line_delta | 1038 (5.38%) | 1038 (5.38%) | python312.dll |
| call_trace | 6630 (34.34%) | 605 (3.13%) | python312.dll |
| unicodekeys_lookup_unicode | 549 (2.84%) | 548 (2.84%) | python312.dll |
| _Py_dict_lookup | 1060 (5.49%) | 511 (2.65%) | python312.dll |
| PyDict_GetItem | 929 (4.81%) | 266 (1.38%) | python312.dll |
| set_add_entry | 265 (1.37%) | 265 (1.37%) | python312.dll |
| PyNumber_Subtract | 447 (2.32%) | 226 (1.17%) | python312.dll |
| call_trace_protected | 5408 (28.01%) | 222 (1.15%) | python312.dll |
| initialize_locals | 199 (1.03%) | 199 (1.03%) | python312.dll |

I wonder: would there be any way to retain some specializations during tracing? Some specialized opcodes are certainly incorrect to use during tracing, e.g., STORE_FAST__LOAD_FAST. However, couldn't others be retained, e.g. LOAD_GLOBAL_MODULE, or any specialized opcode that covers only one unspecialized opcode? (EDIT: maybe not, if it uses NOTRACE_DISPATCH())

@pablogsal
Member

pablogsal commented Jun 6, 2022

We should be able to speed it up, but don't expect things to be as fast as 3.10. Sorry.

Well, two times slower than 3.10, as some of @nedbat's benchmarks show, is not acceptable in my opinion. We should try to get this below 20% at the very least.

I marked this as a release blocker and this will block any future releases.

@markshannon
Member

The problem with saying that something is unacceptable is that it implies there is an acceptable alternative.
What is that alternative?

What does it mean making this a release blocker? Are you not going to release 3.11 if coverage is >20% slower?

I don't think cutoffs, such as 20%, are constructive. We should do as well as we are reasonably able, regardless of whether we achieve some arbitrary number.
I expect that will be better than a 20% slowdown, but I'm not making any pronouncements.

@pablogsal
Member

pablogsal commented Jun 7, 2022

What is that alternative?

I don't know, but there is always the possibility to revert the changes that made coverage slower (worst case scenario).

Are you not going to release 3.11 if coverage is >20% slower?

I will check with the SC, but I can already tell you that I would not be comfortable releasing 3.11 if it is 2 times slower in coverage, unless the SC mandates otherwise.

I don't think cutoffs, such as 20%, are constructive. We should do as well as we are reasonably able, regardless of whether we achieve some arbitrary number.

The problem here is that this tradeoff was not discussed or advertised anywhere, and no one had the opportunity to weigh in on it. We therefore cannot just "decide" out of the blue what is and is not acceptable, because the community didn't have the chance to object.

@markshannon
Member

The largest contributor to the slowdown is computing line numbers. The root cause of that is PEP 657. The proximate cause is my PR to reduce the memory consumed by PEP 657.
Would you revert PEP 657, or just accept a large increase in memory use?

I think it would be a lot more helpful if we fixed the issue, rather than making dire pronouncements.

the community didn't have the chance to object.

Everything we do is public. I even opened nedbat/coveragepy#1287 back in November, so that @nedbat was aware that there might be slowdowns to coverage.py

@pablogsal
Member

pablogsal commented Jun 7, 2022

The largest contributor to the slowdown is computing line numbers. The root cause of that is PEP 657.

That's not true, and I feel it is misleading. When line numbers and column offsets were separated, PEP 657 had 0 overhead on computing line numbers, so the root cause is certainly not PEP 657.

The root cause was merging line numbers and column offsets so you need to pay for decoding both just to get line numbers.

Everything we do is public. I even opened nedbat/coveragepy#1287 back in November, so that @nedbat was aware that there might be slowdowns to coverage.py

That's not enough. If that were true we would not need to do PEPs because "what we do is already public". The point of having discussion on mailing lists is precisely so everyone can be aware of the tradeoff and give us their opinions.

@pablogsal
Member

In any case, there is no point in discussing this in these terms because I am not the one making decisions here. As RM my only authority is to consider the current slowdown a blocker and ask for a bigger discussion to take place if we cannot fix the issue.

Let's see what we can do and then let's discuss how to proceed once we understand our options.

@markshannon
Member

The root cause was merging line numbers and column offsets so you need to pay for decoding both just to get line numbers.

No, that was the proximate cause. It wouldn't have been necessary if not for PEP 657.
I didn't implement it to make things slower, I implemented it because column and endline tables were consuming a lot of memory.

Claiming that PEP 657 had zero overhead is not true. The original implementation had a large overhead in terms of memory use.

@pablogsal
Member

pablogsal commented Jun 7, 2022

I didn't implement it to make things slower, I implemented it because column and endline tables were consuming a lot of memory.

Mark, I understand that and we are on the same page. I know why you made that change, and IIRC I even reviewed it. The problem here is that that change (or other changes) had unexpected consequences on timing, and we need to make decisions: either trying to fix it together if we can, or potentially reverting it if we cannot. That's all I am saying.

Claiming that PEP 657 had zero overhead is not true. The original implementation had a large overhead in terms of memory use.

The claim was that PEP 657 had zero overhead on the time spent computing the line numbers, not on memory. My sentence was:

PEP 657 had 0 overhead on computing line numbers, so the root cause is certainly not PEP 657.

@gpshead
Member

gpshead commented Jun 8, 2022

Marking this as a release or deferred blocker is fine and entirely what I'd expect any RM to want to do. That just means we collect more information and make a decision before we exit beta stages.

I don't personally have a problem with coverage being slower, even by 2x. It's more about understanding what is impacted by this and how, so we can explain our measured rationale to users. I.e., look beyond just coverage, which should only be a developer/CI-time tool rather than something slowing people's "production" release tool/application/notebook code.

Q1: What else uses tracing?

Q1b: Which of those use it at times other than development?

These questions are informational and do not block performance work on reducing the impact to tracing.

@sweeneyde
Member

How bad would it be to store an extra field into code objects that is the "decompressed" line number table, lazily allocated for only those code objects which are actually traced? Could it occupy co_extra or similar?

@Fidget-Spinner
Member

Fidget-Spinner commented Jun 8, 2022

How bad would it be to store an extra field into code objects that is the "decompressed" line number table, lazily allocated for only those code objects which are actually traced? Could it occupy co_extra or similar?

In #93383 (comment) I benchmarked adding an additional field to the code object and there was no slowdown in pyperformance. I do not know how memory consumption is affected.

I'm not sure co_extra is the right place for that info, considering it's for consumers of the PEP 523 API. Maybe I can merge your idea into a simple lazily created _PyCodeTracingInfo * C struct?

PyCodeObject {
    ...
    // lazily created _PyCodeTracingInfo; NULL until the code object is traced
    void *co_tracing_info;
}

struct _PyCodeTracingInfo {
    // cached code bytes object
    PyObject *co_code;
    // decompressed line number table
    ...
};

@markshannon
Member

How bad would it be to store an extra field into code objects that is the "decompressed" line number table, lazily allocated for only those code objects which are actually traced? Could it occupy co_extra or similar?

An early design for the compressed line number table did just this.
My concern, then and now, is that the uncompressed table is a lot bigger, and never goes away.

It isn't just tracing that needs line numbers, tracebacks need them too.
Which means that the uncompressed table would be created every time a code object ends up in a traceback.

I think we should see how much faster we can make tracing with the compressed table first. If that isn't good enough, then we can reconsider the decompressed table.

@markshannon
Member

@gpshead

Q1: What else uses tracing?

Debuggers and profiling tools, like cProfile.

Q1b: Which of those use it at times other than development?

Hopefully, no one. Tracing has always imposed a considerable overhead.
Anyone using a profiler in production should be using a statistical profiler.
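
For context on what "tracing" means here, a minimal sketch of a line tracer built on `sys.settrace`, which is the same hook that debuggers and the pure-Python mode of coverage tools rely on. The names `executed` and `tracer` are illustrative only, and this is far simpler than what any real tool does:

```python
import sys

executed = set()  # (filename, line number) pairs seen by the tracer


def tracer(frame, event, arg):
    # The global trace function is invoked for each new frame ("call").
    # Returning a function installs it as that frame's local tracer, which
    # then receives the per-line events.
    if event == "line":
        executed.add((frame.f_code.co_filename, frame.f_lineno))
    return tracer


def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


previous = sys.gettrace()
sys.settrace(tracer)
try:
    fib(10)
finally:
    # Always restore whatever tracer was installed before.
    sys.settrace(previous)

print(len(executed), "distinct lines executed")
```

Every executed line triggers a callback, which is why the interpreter's cost of mapping instructions to line numbers shows up so prominently in the profiles earlier in this thread.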

@fabioz
Contributor

fabioz commented Jun 24, 2022

@fabioz are you using the latest released beta or the tip of the 3.11 branch?

@pablogsal I've tried with both (the benchmarks I posted are for Python 3.11 beta 3 and Python 3.11 tip which I compiled locally -- note that I did not compile with PGO just release -- i.e.: PCbuild\build.bat -e -p x64)... The differences from the tip to the release version weren't significant (some were a bit slower, some were a bit faster some almost the same, so, I'd say it's in the noise range).

@fabioz
Contributor

fabioz commented Jun 24, 2022

Better formatted table for pydevd benchmarks.

| Benchmark | Python 3.10 (time in s) | Python 3.11 beta 3 (time in s) | Slower than 3.10 in % | Python 3.11 tip (41e4b42) (time in s) | Slower than 3.10 in % |
|---|---|---|---|---|---|
| method_calls_with_breakpoint | 0.25 | 0.383 | 53.20% | 0.38 | 52.00% |
| method_calls_without_breakpoint | 0.247 | 0.381 | 54.25% | 0.368 | 48.99% |
| method_calls_with_step_over | 0.236 | 0.378 | 60.17% | 0.357 | 51.27% |
| method_calls_with_exception_breakpoint | 0.249 | 0.375 | 50.60% | 0.38 | 52.61% |
| global_scope_1_with_breakpoint | 0.557 | 0.854 | 53.32% | 0.92 | 65.17% |
| global_scope_2_with_breakpoint | 0.238 | 0.364 | 52.94% | 0.337 | 41.60% |

@hodgestar

As a random Python developer, I would be unhappy with a 20% slowdown in the speed of coverage, under the assumption that that would make all of my continuous integration runs take 20% longer. :/

@nedbat
Member Author

nedbat commented Jun 24, 2022

I did a benchmark, and the latest tip of 3.11 (commit c966e08) is looking better, though still slower than 3.10:

| cov | proj | v3.10.5 | v3.11.0b3 | 3.11_tip | 3.11b3 vs 3.10 | 3.11 tip vs 3.10 |
|---|---|---|---|---|---|---|
| 6.4.1 | bug1339.py | 0.399 s | 0.832 s | 0.469 s | 208.41% | 117.42% |
| 6.4.1 | bm_sudoku.py | 26.273 s | 55.805 s | 34.877 s | 212.40% | 132.75% |
| 6.4.1 | bm_spectral_norm.py | 32.660 s | 71.043 s | 44.546 s | 217.52% | 136.39% |

@gpshead
Member

gpshead commented Jun 25, 2022

Quite frankly, those 3.11 tip numbers look well within reason to accept.

Tracing is a relatively rare operation in terms of all CPU cycles consumed by CPython processes on all computers around the world. It is worth sacrificing some tracing performance there if deemed necessary in order to gain performance where it is actually statistically significant: in deployed python programs.

@alex
Member

alex commented Jun 25, 2022

As a top line conclusion, I agree that with these improvements this no longer needs to block the release.

That said, I think it's important to note that coverage performance has a substantial impact on developer experience. For many projects tests-with-coverage is a blocking element of their CI pipeline, so this performance ultimately dictates their iteration cycle time. It cannot happen for 3.11, but I think on a longer timeline it's worth exploring what can be done to improve coverage's performance (which perhaps is another external tool like https://github.com/plasma-umass/slipcover) -- particularly if it's going to be in tension with other optimizations we want to land for the general case!

@Fidget-Spinner
Member

We still have Mark's PR #94231. Once that's merged I suspect the benchmark numbers for 3.11 tip will drop to below a 30% slowdown over 3.10.

@pablogsal
Member

@nedbat @fabioz Can you repeat the benchmarks with #94231?

@nedbat
Member Author

nedbat commented Jun 25, 2022

Yup, I just did that. About the same as without the change:

| cov | proj | v3.10.5 | v3.11.0b3 | 94231 | 3.11b3 vs 3.10 | 94231 vs 3.10 |
|---|---|---|---|---|---|---|
| 6.4.1 | bug1339.py | 0.517 s | 1.155 s | 0.572 s | 223.22% | 110.64% |
| 6.4.1 | bm_sudoku.py | 29.552 s | 59.221 s | 36.008 s | 200.40% | 121.85% |
| 6.4.1 | bm_spectral_norm.py | 33.720 s | 77.685 s | 49.937 s | 230.38% | 148.09% |

serhiy-storchaka added a commit that referenced this issue Jun 26, 2022
* GH-93444: remove redundant fields from basicblock: b_nofallthrough, b_exit, b_return (GH-93445)

* netrc: Remove unused "import shlex" (#93311)

* gh-92886: Fix test that fails when running with `-O` in `test_imaplib.py` (#93237)

* Fix missing word in sys.float_info docstring (GH-93489)

* [doc] Correct a grammatical error in a docstring. (GH-93441)

* gh-93442: Make C++ version of _Py_CAST work with 0/NULL. (#93500)

Add C++ overloads for _Py_CAST_impl() to handle 0/NULL.  This will allow
C++ extensions that pass 0 or NULL to macros using _Py_CAST() to
continue to compile.  Without this, you get an error like:

    invalid ‘static_cast’ from type ‘int’ to type ‘_object*’

The modern way to use a NULL value in C++ is to use nullptr.  However,
we want to not break extensions that do things the old way.

Co-authored-by: serge-sans-paille

* gh-93442: Add test for _Py_CAST(nullptr). (gh-93505)

* gh-90473: wasmtime does not support absolute symlinks (GH-93490)

* gh-89973: Fix re.error in the fnmatch module. (GH-93072)

Character ranges with upper bound less than lower bound (e.g. [c-a])
are now interpreted as empty ranges, for compatibility with other glob
pattern implementations. Previously it was re.error.

* Document LOAD_FAST_CHECK opcode (#93498)

* gh-93247: Fix assert function in asyncio locks test (#93248)

* gh-90473: WASI requires proper open(2) flags (GH-93529)

* GH-92308 What's New: list pending removals in 3.13 and future versions (#92562)

* gh-90473: Skip POSIX tests that don't apply to WASI (GH-93536)

* asyncio.Barrier docs: Fix typo (#93371)

taks -> tasks

* gh-83728: Add hmac.new default parameter deprecation (GH-91939)

* gh-90473: Make chmod a dummy on WASI, skip chmod tests (GH-93534)

WASI does not have the ``chmod(2)`` syscall yet.

* Remove action=None kwarg from Barrier docs (GH-93538)

* [docs] fix some asyncio.Barrier.wait docs grammar (GH-93552)

* gh-93475: Expose FICLONE and FICLONERANGE constants in fcntl (#93478)

* gh-89018: Improve documentation of `sqlite3` exceptions (#27645)

- Order exceptions as in PEP 249
- Reword descriptions, so they match the current behaviour

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

* bpo-42658: Use LCMapStringEx in ntpath.normcase to match OS behaviour for case-folding (GH-32010)

* Fix contributor name in WhatsNew 3.11 (GH-93556)

* Grammar fix to socket error string (GH-93523)

* gh-86986: bump min sphinx version to 3.2 (GH-93337)

* gh-79096: Protect cookie file created by {LWP,Mozilla}CookieJar.save() (GH-93463)

Note: This change is not effective on Microsoft Windows.

Cookies can store sensitive information and should therefore be protected
against unauthorized third parties. This is also described in issue #79096.

The filesystem permissions are currently set to 644, so everyone can read the
file. This commit changes the permissions to 600, so only the creator of the file
can read and modify it. This improves security, because it reduces the attack
surface. Now the attacker needs control of the user that created the cookie or
a way to circumvent the filesystem permissions.

This change is backwards incompatible. Systems that rely on world-readable
cookies will break. However, one could argue that those are misconfigured in
the first place.

* gh-93162: Add ability to configure QueueHandler/QueueListener together (GH-93269)

Also, provide getHandlerByName() and getHandlerNames() APIs.

Closes #93162.

* gh-57539: Increase calendar test coverage (GH-93468)

Co-authored-by: Sean Fleming
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* gh-88831: In docs for asyncio.create_task, explain why strong references to tasks are needed (GH-93258)

Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* Shrink the LOAD_METHOD cache by one codeunit. (#93537)

* Fix MSVC compiler warnings in ceval.c (#93569)

* gh-93162: test_config_queue_handler requires threading (GH-93572)

* gh-84461: Emscripten's faccessat() does not accept flags (GH-92353)

* gh-92592: Allow logging filters to return a LogRecord. (GH-92591)

* Fix `PurePath.relative_to` links in the pathlib documentation. (GH-93268)

These are currently broken as they refer to :meth:`Path.relative_to` rather than :meth:`PurePath.relative_to`, and `relative_to` is a method on `PurePath`.

* GH-93481: Suppress expected deprecation warning in test_pyclbr (GH-93483)

* gh-93370: Deprecate sqlite3.version and sqlite3.version_info (#93482)

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>

* GH-93521: For dataclasses, filter out `__weakref__` slot if present in bases (GH-93535)

* gh-93421: Update sqlite3 cursor.rowcount only after SQLITE_DONE (#93526)

* gh-93584: Make all install+tests targets depend on all (GH-93589)

All install targets use the "all" target as synchronization point to
prevent race conditions with PGO builds. PGO builds use recursive make,
which can lead to two parallel `./python setup.py build` processes that
step on each other's toes.

"test" targets now correctly compile PGO build in a clean repo.

* gh-87961: Remove outdated notes from functions that aren't in the Limited API (GH-93581)

* Remove outdated notes from functions that aren't in the Limited API

Nowadays everything that *is* in the Limited API has a note added
automatically.
These notes could mislead people to think that these functions
could never be added to the limited API. Remove them.

* Also remove forgotten note on tp_vectorcall_offset not being finalized

* gh-93180: Update os.copy_file_range() documentation (#93182)

* gh-93575: Use correct way to calculate PyUnicode struct sizes (GH-93602)

* gh-93575: Use correct way to calculate PyUnicode struct sizes

* Add comment to keep test_sys and test_unicode in sync

* Fix case code < 256

* gh-90473: Define HOSTRUNNER for WASI (GH-93606)

* gh-79096: Fix/improve http cookiejar tests (GH-93614)

Fixup of GH-93463:
- remove stray print
- use proper way to check file mode
- add working chmod decorator

Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* gh-93616: Fix env changed issue in test_modulefinder (GH-93617)

* gh-90494: Reject 6th element of the __reduce__() tuple (GH-93609)

copy.copy() and copy.deepcopy() now always raise a TypeError if
__reduce__() returns a tuple with length 6 instead of silently ignoring
the 6th item or producing an incorrect result.

* Doc: Update references and examples of old, unsupported OSes and uarches (GH-92791)

* bpo-45383: Get metaclass from bases in PyType_From* (GH-28748)

This checks the bases of a type created using the FromSpec
API to inherit the bases' metaclasses.  The metaclass's alloc
function will be called as is done in `tp_new` for classes
created in Python.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>

* Improve logging documentation with example and additional cookbook re… (GH-93644)

* gh-90473: disable user site packages on WASI/Emscripten (GH-93633)

* gh-90473: Skip get_config_h() tests on WASI (GH-93645)

* gh-90549: Fix leak of global named resources using multiprocessing spawn (#30617)

Co-authored-by: XD Trol <milestonejxd@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>

* gh-92434: Silence compiler warning in Modules/_sqlite/connection.c on 32-bit systems (#93090)

* gh-90763: Modernise xx template module initialisation (#93078)

Use C APIs such as PyModule_AddType instead of PyModule_AddObject.
Also remove incorrect module decrefs if module fails to initialise.

* gh-93491: Add support tier detection to configure (GH-93492)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>

* gh-93466: Document PyType_Spec doesn't accept repeated slot IDs; raise where this was problematic (GH-93471)

* gh-93671: Avoid exponential backtracking in deeply nested sequence patterns in match statements (GH-93680)

Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* gh-81790: support "UNC" device paths in `ntpath.splitdrive()` (GH-91882)

* GH-93621: reorder code in with/async-with exception exit path to reduce the size of the exception table (GH-93622)

* gh-93461: Invalidate sys.path_importer_cache entries with relative paths (GH-93653)

* gh-91317: Document that Path does not collapse initial `//` (GH-32193)



Documentation for `pathlib` says:

> Spurious slashes and single dots are collapsed, but double dots ('..') are not, since this would change the meaning of a path in the face of symbolic links:

However, it omits that initial double slashes also aren't collapsed.

Later, in documentation of `PurePath.drive`, `PurePath.root`, and `PurePath.name` it mentions UNC but:

- this abbreviation says nothing to a person who is unaware of the existence of UNC (Wikipedia doesn't help either by [giving a disambiguation page](https://en.wikipedia.org/wiki/UNC))
- it shows up only if a person needs to use a specific property or decides to fully learn what the module provides.

For context, see the BPO entry.

* gh-92886: Fix tests that fail when running with optimizations (`-O`) in `test_zipimport.py` (GH-93236)

* gh-92930: _pickle.c: Acquire strong references before calling save() (GH-92931)

* gh-84461: Use HOSTRUNNER to run regression tests (GH-93694)

Co-authored-by: Brett Cannon <brett@python.org>

* gh-90473: Skip test_queue when threading is not available (GH-93712)

* gh-90153:  whatsnew: "z" option in format spec (GH-93624)

Add what's new entry for PEP 682 in Python 3.11.

* gh-86404: [doc] A make suspicious false positive. (GH-93710)

* Change list to view object (#93661)

* gh-84508: tool to generate cjk traditional chinese mappings (gh-93272)

* Remove usage of _Py_IDENTIFIER from math module (#93739)

* gh-91162: Support splitting of unpacked arbitrary-length tuple over TypeVar and TypeVarTuple parameters (alt) (GH-93412)

For example:

  A[T, *Ts][*tuple[int, ...]] -> A[int, *tuple[int, ...]]
  A[*Ts, T][*tuple[int, ...]] -> A[*tuple[int, ...], int]

* gh-93728: fix memory leak in deepfrozen code objects (GH-93729)

* gh-93747: Fix Refleak when handling multiple Py_tp_doc slots (gh-93749)

* GH-90699: use statically allocated strings in typeobject.c (gh-93751)

* Add more FOR_ITER specialization stats (GH-32151)

* gh-89653: PEP 670: Convert PyFunction macros (#93765)

Convert PyFunction macros to static inline functions.

* Remove ANY_VARARGS() macro from the C API (#93764)

The macro was exposed by mistake.

* gh-84623: Remove unused imports in stdlib (#93773)

* gh-91731: Don't define 'static_assert' in C++11 where it is a keyword to avoid UB (GH-93700)

* gh-84623: Remove unused imports in tests (#93772)

* gh-93353: Fix importlib.resources._tempfile() finalizer (#93377)

Fix the importlib.resources.as_file() context manager to remove the
temporary file if destroyed late during Python finalization: keep a
local reference to the os.remove() function. Patch by Victor Stinner.

* gh-84461: Fix parallel testing on WebAssembly (GH-93768)

* gh-89653: PEP 670: Macros always cast arguments in cpython/ (#93766)

Header files in the Include/cpython/ are only included if
the Py_LIMITED_API macro is not defined.

* gh-93353: Add test.support.late_deletion() (#93774)

* gh-93741: Add private C API _PyImport_GetModuleAttrString() (GH-93742)

It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.

Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.

* gh-79512: Fixed names and __module__ value of weakref classes (GH-93719)

Classes ReferenceType, ProxyType and CallableProxyType now have correct
attributes __module__, __name__ and __qualname__.
This makes them (types, not instances) pickleable.

* gh-91810: Fix regression with writing an XML declaration with encoding='unicode' (GH-93426)

Suppress writing an XML declaration in open files in ElementTree.write()
with encoding='unicode' and xml_declaration=None.

If a file path is passed to ElementTree.write() with encoding='unicode',
always open a new file in UTF-8.

* gh-93761: Fix test to avoid simple delay when synchronizing. (GH-93779)

* gh-89546: Clean up PyType_FromMetaclass (GH-93686)



When changing PyType_FromMetaclass recently (GH-93012, GH-93466, GH-28748)
I found a bunch of opportunities to improve the code. Here they are.

Fixes: #89546

Automerge-Triggered-By: GH:encukou

* gh-91321: Fix compatibility with C++ older than C++11 (#93784)

Fix the compatibility of the Python C API with C++ older than C++11.

_Py_NULL is only defined as nullptr on C++11 and newer.

* GH-93662: Make sure that column offsets are correct in multi-line method calls. (GH-93673)

* GH-93516: Store offset of first traceable instruction in code object (GH-93769)

* gh-90473: Include stdlib dir in wasmtime PYTHONPATH (GH-93797)

* GH-93429: Merge `LOAD_METHOD` back into `LOAD_ATTR` (GH-93430)

* gh-93353: regrtest checks for leaked temporary files (#93776)

When running tests with -jN, create a temporary directory per process
and mark a test as "environment changed" if a test leaks a temporary
file or directory.

* gh-79579: Improve DML query detection in sqlite3 (#93623)

The fix involves using pysqlite_check_remaining_sql(), not only to check
for multiple statements, but now also to strip leading comments and
whitespace from SQL statements, so we can improve DML query detection.

pysqlite_check_remaining_sql() is renamed lstrip_sql(), to more
accurately reflect its function, and hardened to handle more SQL comment
corner cases.

* GH-93678: reduce boilerplate and code repetition in the compiler (GH-93682)

* gh-91877: Fix WriteTransport.get_write_buffer_{limits,size} docs (#92338)

- Amend docs for WriteTransport.get_write_buffer_limits
- Add docs for WriteTransport.get_write_buffer_size

* GH-93429: Document `LOAD_METHOD` removal (GH-93803)

* Include freelists in allocation total. (GH-93799)

* gh-93795: Use test.support TESTFN/unlink in sqlite3 tests (#93796)

* Remove LOAD_METHOD stats. (GH-93807)

* Rename 'LOAD_METHOD' specialization stat consts to 'ATTR'. (GH-93812)

* gh-93353: Fix regrtest for -jN with N >= 2 (GH-93813)

* [docs] Fix LOAD_ATTR version changed (GH-93816)

* gh-93814: Add infinite test for itertools.chain.from_iterable (GH-93815)



fix #93814

Automerge-Triggered-By: GH:rhettinger

* gh-93735: Split Docs CI to speed-up the build (GH-93736)

* gh-93183: Adjust wording in socket docs (#93832)

package => packet

Co-authored-by: Victor Norman

* gh-93829: In sqlite3, replace Py_BuildValue with faster APIs (#93830)

- In Modules/_sqlite/connection.c, use PyLong_FromLong
- In Modules/_sqlite/microprotocols.c, use PyTuple_Pack

* Add test.support.busy_retry() (#93770)

Add busy_retry() and sleeping_retry() functions to test.support.

* gh-87260: Update sqlite3 signature docs to reflect actual implementation (#93840)

Align the docs for the following methods with the actual implementation:

- sqlite3.complete_statement()
- sqlite3.Connection.create_function()
- sqlite3.Connection.create_aggregate()
- sqlite3.Connection.set_progress_handler()

* test_thread uses support.sleeping_retry() (#93849)

test_thread.test_count() now fails if it takes longer than
LONG_TIMEOUT seconds.

* Use support.sleeping_retry() and support.busy_retry() (#93848)

* Replace time.sleep(0.010) with sleeping_retry() to
  use an exponential sleep.
* support.wait_process(): reuse sleeping_retry().
* _test_eintr: remove unused variables.

* Update includes in call.c (GH-93786)

* gh-93857: Fix broken audit-event targets in sqlite3 docs (#93859)

Corrected targets for the following audit-events:

- sqlite3.enable_load_extension => sqlite3.Connection.enable_load_extension
- sqlite3.load_extension => sqlite3.Connection.load_extension

* GH-93850: Fix test_asyncio exception ignored tracebacks (#93854)

* gh-93824: Reenable installation of shell extension on Windows ARM64 (GH-93825)

* test_asyncio: run_until() implements exponential sleep (#93866)

run_until() of test.test_asyncio.utils now uses an exponential sleep
delay (max: 1 second), rather than a fixed delay of 1 ms. The design is
similar to the support.sleeping_retry() wait strategy, which applies
exponential backoff.
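A minimal sketch of such an exponential-backoff retry loop is shown below. It is not the test.support implementation; `retry_with_backoff` and its parameters are hypothetical names chosen for this example.

```python
import time


def retry_with_backoff(timeout, init_delay=0.010, max_delay=1.0):
    """Yield repeatedly until the caller breaks out, sleeping longer each time.

    Raises TimeoutError once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    delay = init_delay
    while True:
        yield
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timed out after {timeout:.1f} seconds")
        time.sleep(delay)
        # Double the delay after each attempt, capped at max_delay.
        delay = min(delay * 2, max_delay)


attempts = 0
for _ in retry_with_backoff(timeout=5.0, init_delay=0.001):
    attempts += 1
    if attempts == 3:  # stand-in for "the awaited condition became true"
        break
```

The caller checks its condition in the loop body and breaks when it holds, so a quick success costs almost nothing while a slow one does not spin the CPU.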

* test_asyncore: Optimize capture_server() (#93867)

Remove time.sleep(0.01) in test_asyncore capture_server(). The sleep
was redundant and inefficient, since the loop starts with
select.select() which also implements a sleep (poll for socket data
with a timeout).

* Tests call sleeping_retry() with SHORT_TIMEOUT (#93870)

Tests now call busy_retry() and sleeping_retry() with SHORT_TIMEOUT
or LONG_TIMEOUT (of test.support), rather than hardcoded constants.

Add also WAIT_ACTIVE_CHILDREN_TIMEOUT constant to
_test_multiprocessing.

* gh-84461: Document how to install SDKs manually (GH-93844)

Co-authored-by: Brett Cannon <brett@python.org>

* gh-93820: Fix copy() regression in enum.Flag (GH-93876)



GH-26658 introduced a regression in copy / pickle protocol for combined
`enum.Flag`s. `copy.copy(re.A | re.I)` would fail with
`AttributeError: ASCII|IGNORECASE`.

`enum.Flag` now has a `__reduce_ex__()` method that reduces flags by
combined value, not by combined name.
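The behaviour this fix restores can be checked with a short script like the one below, assuming an interpreter that includes the fix (or a version from before the regression).

```python
import copy
import pickle
import re

# A combined flag has no single member name, so it must round-trip by value.
combined = re.A | re.I

copied = copy.copy(combined)
restored = pickle.loads(pickle.dumps(combined))
```

Both `copied` and `restored` should compare equal to `combined`.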

* Call busy_retry() and sleeping_retry() with error=True (#93871)

Tests no longer call busy_retry() and sleeping_retry() with
error=False: raise an exception if the loop times out.

* gh-87347: Add parenthesis around PyXXX_Check() arguments (#92815)

* gh-91321: Fix test_cppext for C++03 (#93902)

Don't build _testcppext.cpp with -Wzero-as-null-pointer-constant when
testing C++03: only use this compiler flag with C++11.

* gh-91577: SharedMemory move imports out of methods (#91579)

SharedMemory.unlink() uses the unregister() function from resource_tracker. Previously it was imported in the method, but this can fail if the method is called during interpreter shutdown, for example when unlink is part of a __del__() method.

Moving the import to the top of the file means that the unregister() function is available during interpreter shutdown.

The register call in SharedMemory.__init__() can also use this imported resource_tracker.

* gh-92547: Amend What's New (#93872)

* Fix BINARY_SUBSCR_GETITEM stats (GH-93903)

* gh-93847: Fix repr of enum of generic aliases (GH-93885)

* gh-93353: regrtest supports checking tmp files with -j2 (#93909)

regrtest now also implements checking for leaked temporary files and
directories when using -jN for N >= 2. Use tempfile.mkdtemp() to
create the temporary directory. Skip this check on WASI.

* GH-91389: Fix dis position information for CACHEs (GH-93663)

* gh-91985: Ensure in-tree builds override platstdlib_dir in every path calculation (GH-93641)

* GH-83658: make multiprocessing.Pool raise an exception if maxtasksperchild is not None or a positive int (GH-93364)



Closes #83658.

* test_logging: Fix BytesWarning in SysLogHandlerTest (GH-93920)

* gh-91404: Revert "bpo-23689: re module, fix memory leak when a match is terminated by a signal or allocation failure (GH-32283) (#93882)

Revert "bpo-23689: re module, fix memory leak when a match is terminated by a signal or memory allocation failure (GH-32283)"

This reverts commit 6e3eee5.

Manual fixups to increase the MAGIC number and to handle conflicts with
a couple of changes that landed after that.

Thanks for reviews by Ma Lin and Serhiy Storchaka.

* gh-89745: Avoid exact match when comparing program_name in test_embed on Windows (GH-93888)

* gh-93852: Add test.support.create_unix_domain_name() (#93914)

test_asyncio, test_logging, test_socket and test_socketserver now
create AF_UNIX domains in the current directory to no longer fail
with OSError("AF_UNIX path too long") if the temporary directory (the
TMPDIR environment variable) is too long.

Modify the following tests to use create_unix_domain_name():

* test_asyncio
* test_logging
* test_socket
* test_socketserver

test_asyncio.utils: remove unused time import.

* gh-77782: Py_FdIsInteractive() now uses PyConfig.interactive (#93916)

* gh-74953: Add _PyTime_FromMicrosecondsClamp() function (#93942)

* gh-74953: Fix PyThread_acquire_lock_timed() code recomputing the timeout (#93941)

Set timeout, don't create a local variable with the same name.

* gh-77782: Deprecate global configuration variable (#93943)

Deprecate global configuration variables like
Py_IgnoreEnvironmentFlag: the Py_InitializeFromConfig() API should be
used instead.

Fix declaration of Py_GETENV(): use PyAPI_FUNC(), not PyAPI_DATA().

* gh-93911: Specialize `LOAD_ATTR_PROPERTY` (GH-93912)

* gh-92888: Fix memoryview bad `__index__` use after free (GH-92946)

Co-authored-by: chilaxan <35645806+chilaxan@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <3659035+serhiy-storchaka@users.noreply.github.com>

* GH-89858: Fix test_embed for out-of-tree builds (GH-93465)

* gh-92611: Add details on replacements for cgi utility funcs (GH-92792)



Per @brettcannon's [suggestions on the Discourse thread](https://discuss.python.org/t/pep-594-take-2-removing-dead-batteries-from-the-standard-library/13508/51), discussed in #92611 and as a follow-up to PR #92612, this PR adds specific per-function replacement information for the utility functions in the `cgi` module deprecated by PEP 594.

@brettcannon, should this be backported (without the `deprecated-removed`, which I would update accordingly and re-add in my other PR that adds it to the others for 3.11+), or should it just go in 3.11+?

* GH-77403: Fix tests which fail when PYTHONUSERBASE is not normalized (GH-93917)

* gh-91387: Strip trailing slash from tarfile longname directories (GH-32423)

Co-authored-by: Brett Cannon <brett@python.org>

* Add jaraco as primary owner of importlib.metadata and importlib.resources. (#93960)

* Add jaraco as primary owner of importlib.metadata and importlib.resources.

* Align indentation.

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>

* gh-84461: Fix circular dependency on BUILDPYTHON (GH-93977)

* gh-89828: Do not relay the __class__ attribute in GenericAlias (#93754)

list[int].__class__ returned type, and isinstance(list[int], type)
returned True. It caused numerous problems in code that checks
isinstance(x, type).
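Code that needs to tell real classes apart from parameterized generics on interpreters both with and without this change could use a check like the following. It is a sketch; `is_plain_class` is a hypothetical helper, not a standard library function.

```python
import types


def is_plain_class(obj):
    """Return True for real classes, False for aliases such as list[int]."""
    # isinstance() also consults type(obj), so the GenericAlias test works
    # even on versions where list[int].__class__ is relayed as `type`.
    return isinstance(obj, type) and not isinstance(obj, types.GenericAlias)
```

For example, `is_plain_class(list)` is true while `is_plain_class(list[int])` is false regardless of what `isinstance(list[int], type)` reports.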

* gh-84461: Fix pydebug Emscripten browser builds (GH-93982)

wasm_assets script did not take the ABIFLAG flag of sysconfigdata into
account.

* gh-93955: Use unbound methods for slot `__getattr__` and `__getattribute__` (GH-93956)

* gh-91387: Fix tarfile test on WASI (GH-93984)

WASI's rmdir() syscall does not like the trailing slash.

* gh-93975: Nicer error reporting in test_venv (GH-93959)



- gh-93957: Provide nicer error reporting from subprocesses in test_venv.EnsurePipTest.test_with_pip.
- Update changelog

This change does three things:

1. Extract a function for trapping output in subprocesses.
2. Emit both stdout and stderr when encountering an error.
3. Apply the change to `ensurepip._uninstall` check.

* GH-93990: fix refcounting bug in `add_subclass` in `typeobject.c` (GH-93989)

* What's new in 3.10: fix link to issue (#93968)

* What's new in 3.10: fix link to issue

* What's new in 3.10: fix link to GH issue

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>

* gh-93761: Fix test_logging test_config_queue_handler() race condition (#93952)

Fix a race condition in test_config_queue_handler() of test_logging.

* gh-74953: Reformat PyThread_acquire_lock_timed() (#93947)

Reformat the pthread implementation of PyThread_acquire_lock_timed()
using a mutex and a condition variable.

* Add goto to avoid multiple indentation levels and exit quickly
* Use "while(1)" and make the control flow more obvious.
* PEP 7: Add braces around if blocks.

* gh-93937, C API: Move PyFrame_GetBack() to Python.h (#93938)

Move the following functions and type from frameobject.h to pyframe.h,
so the standard <Python.h> provides frame getter functions:

* PyFrame_Check()
* PyFrame_GetBack()
* PyFrame_GetBuiltins()
* PyFrame_GetGenerator()
* PyFrame_GetGlobals()
* PyFrame_GetLasti()
* PyFrame_GetLocals()
* PyFrame_Type

Remove #include "frameobject.h" from many C files. It's no longer
needed.

* gh-93991: Use boolean instead of 0/1 for condition check (GH-93992)

* gh-84461: Fix Emscripten umask and permission issues (GH-94002)

- Emscripten's default umask is too strict, see
  emscripten-core/emscripten#17269
- getuid/getgid and geteuid/getegid are stubs that always return 0
  (root). Disable effective uid/gid syscalls and fix tests that use
  chmod() current user.
- Cannot drop X bit from directory.

* gh-84461: Skip test_unwritable_directory again on Emscripten (GH-94007)

GH-93992 removed geteuid() and enabled the test again on Emscripten.

* gh-93925: Improve clarity of sqlite3 commit/rollback, and close docs (#93926)

Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>

* gh-61162: Clarify sqlite3 connection context manager docs (GH-93890)



Explicitly note that transactions are only closed if there is an open
transaction at `__exit__`, and that transactions are not implicitly
opened during `__enter__`.
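The documented behaviour can be illustrated with a small in-memory database. The table and values are made up for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lang(name TEXT PRIMARY KEY)")

# Leaving the block without an exception commits the open transaction.
with con:
    con.execute("INSERT INTO lang(name) VALUES(?)", ("Python",))

# An exception inside the block rolls the open transaction back.
try:
    with con:
        con.execute("INSERT INTO lang(name) VALUES(?)", ("Python",))
except sqlite3.IntegrityError:
    pass  # duplicate primary key

rows = con.execute("SELECT name FROM lang").fetchall()

# The context manager never closes the connection; that is done explicitly.
con.close()
```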

Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>

Automerge-Triggered-By: GH:erlend-aasland

* gh-79009: sqlite3.iterdump now correctly handles tables with autoincrement (#9621)

Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>

* gh-84461: Silence some compiler warnings on WASM (GH-93978)

* GH-93897: Store frame size in code object and de-opt if insufficient space on thread frame stack. (GH-93908)

* GH-93516: Speedup line number checks when tracing. (GH-93763)

* Use a lookup table to reduce overhead of getting line numbers during tracing.

* gh-90539: doc: Expand on what should not go into CFLAGS, LDFLAGS (#92754)

* gh-87347: Add parenthesis around macro arguments (#93915)

Add unit test on Py_MEMBER_SIZE() and some other macros.

* gh-93937: PyOS_StdioReadline() uses PyConfig.legacy_windows_stdio (#94024)

On Windows, PyOS_StdioReadline() now gets
PyConfig.legacy_windows_stdio from _PyOS_ReadlineTState, rather than
using the deprecated global Py_LegacyWindowsStdioFlag variable.

Fix also a compiler warning in Py_SetStandardStreamEncoding().

* GH-93249: relax overly strict assertion on bounds->ar_start (GH-93961)

* gh-94021: Address unreachable code warning in specialize code (GH-94022)

* GH-93678: refactor compiler so that optimizer does not need the assembler and compiler structs (GH-93842)

* gh-93839: Move Lib/ctypes/test/ to Lib/test/test_ctypes/ (#94041)

* Move Lib/ctypes/test/ to Lib/test/test_ctypes/
* Remove Lib/test/test_ctypes.py
* Update imports and build system.

* gh-93839: Move Lib/unittest/test/ to Lib/test/test_unittest/ (#94043)

* Move Lib/unittest/test/ to Lib/test/test_unittest/
* Remove Lib/test/test_unittest.py
* Replace unittest.test with test.test_unittest
* Remove unittest.load_tests()
* Rewrite unittest __init__.py and __main__.py
* Update build system, CODEOWNERS, and wasm_assets.py

* GH-91432: Specialize FOR_ITER (GH-91713)

* Adds FOR_ITER_LIST and FOR_ITER_RANGE specializations.

* Adds _PyLong_AssignValue() internal function to avoid temporary boxing of ints.

* gh-94028: Clear and reset sqlite3 statements properly in cursor iternext (GH-94042)

* gh-94052: Don't re-run failed tests with --python option (#94054)

* gh-93839: Use load_package_tests() for testmock (GH-94055)



Fixes failing tests on WebAssembly platforms.

Automerge-Triggered-By: GH:tiran

* gh-54781: Move Lib/lib2to3/tests/ to Lib/test/test_lib2to3/ (#94049)

* Move Lib/lib2to3/tests/ to Lib/test/test_lib2to3/.
* Remove Lib/test/test_lib2to3.py.
* Update imports.
* all_project_files(): use different paths and sort files
  to make the tests more reproducible.
* Update references to tests.

* gh-74953: _PyThread_cond_after() uses _PyTime_t (#94056)

pthread _PyThread_cond_after() implementation now uses the _PyTime_t
type to handle properly overflow: clamp to the maximum value.

Remove MICROSECONDS_TO_TIMESPEC() function.

* GH-93841: Allow stats to be turned on and off, cleared and dumped at runtime. (GH-93843)

* gh-86986: Drop compatibility support for Sphinx 2 (GH-93737)

* Revert "bpo-42843: Keep Sphinx 1.8 and Sphinx 2 compatibility (GH-24282)"

This reverts commit 5c1f15b

* Revert "bpo-42579: Make workaround for various versions of Sphinx more robust (GH-23662)"

This reverts commit b63a620.

* gh-94068: Remove HVSOCKET_CONTAINER_PASSTHRU constant because it has been removed from Windows (GH-94069)



Fixes #94068

Automerge-Triggered-By: GH:zware

* Closes gh-94038: Update Release Schedule in README.rst from PEP 664 to PEP 693 (GH-94046)

* gh-93851: Fix all broken links in Doc/ (GH-93853)

* gh-93675: Fix typos in `Doc/` (GH-93676)

Closes #93675

* Minor optimization for Fractions.limit_denominator (GH-93730)

When we construct the upper and lower candidates in limit_denominator,
the numerator and denominator are already relatively prime (and the
denominator positive) by construction, so there's no need to go through
the usual normalisation in the constructor. This saves a couple of
potentially expensive gcd calls.

Suggested by Michael Scott Asato Cuthbert in GH-93477.
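As a quick illustration of the invariant this optimization relies on, the result of limit_denominator() is already in lowest terms:

```python
from fractions import Fraction
from math import gcd

approx = Fraction("3.1415926535897932").limit_denominator(1000)

# The numerator and denominator of the result share no common factor.
common = gcd(approx.numerator, approx.denominator)
```

Here `approx` is `Fraction(355, 113)` and `common` is 1.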

* gh-93240: clarify wording in IO tutorial (GH-93276)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>

* Tutorial: specify match cases don't fall through (GH-93615)

* gh-93021: Fix __text_signature__ for __get__ (GH-93023)

Because of the way wrap_descr_get is written, the second argument
to __get__ methods implemented through the wrapper is always
optional.

* gh-82927: Update files related to HTML entities. (GH-92504)

* DOC: correct bytesarray -> bytearray in comments (GH-92410)

* gh-87389: Fix an open redirection vulnerability in http.server. (#93879)

Fix an open redirection vulnerability in the `http.server` module when
an URI path starts with `//` that could produce a 301 Location header
with a misleading target.  Vulnerability discovered, and logic fix
proposed, by Hamza Avvan (@hamzaavvan).

Test and comments authored by Gregory P. Smith [Google].
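One way to neutralize such paths is to collapse the leading slashes before the path is used to build a redirect. The helper below is a sketch of that idea under that assumption, not necessarily the code used in `http.server`; `collapse_leading_slashes` is a hypothetical name.

```python
def collapse_leading_slashes(path):
    """Reduce any run of leading slashes to a single slash."""
    # A path beginning with "//" can be read by a browser as a
    # scheme-relative URL ("//host/..."), pointing at another site.
    if path.startswith("//"):
        return "/" + path.lstrip("/")
    return path
```

With this, `//example.org/%2f..` becomes `/example.org/%2f..`, which a client resolves against the original host.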

* gh-89336: Remove configparser APIs that were deprecated for 3.12 (#92503)

https://github.com/python/cpython/issue/89336: Remove configparser 3.12 deprecations.

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>

* bpo-30535: [doc] state that sys.meta_path is not empty by default (GH-94098)

Co-authored-by: Windson yang <wiwindson@outlook.com>

* gh-88123: Implement new Enum __contains__ (GH-93298)

Co-authored-by: Ethan Furman <ethan@stoneleaf.us>

* Stats: Add summary of top instructions for misses and deferred specialization. (GH-94072)

* gh-74696: Do not change the current working directory in shutil.make_archive() if possible (GH-93160)

It is no longer changed when creating a zip or tar archive.

It is still changed for custom archivers registered with shutil.register_archive_format()
if root_dir is not None.

Co-authored-by: Éric <merwok@netwok.org>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* gh-94101 Disallow instantiation of SSLSession objects (GH-94102)



Fixes #94101

Automerge-Triggered-By: GH:tiran

* Fix typo in _io.TextIOWrapper Clinic input (#94037)

Co-authored-by: Łukasz Langa <lukasz@langa.pl>

* gh-93951: In test_bdb.StateTestCase.test_skip, avoid including auxiliary importers. (GH-93962)

Co-authored-by: Brett Cannon <brett@python.org>

* gh-91172: Create a workflow for verifying bundled pip and setuptools (GH-31885)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>

* gh-94114: Remove obsolete reference to python.org mirrors (GH-94115)

* gh-84623: Remove unused imports (#94132)

* gh-54781: Move Lib/tkinter/test/test_ttk/ to Lib/test/test_ttk/ (#94070)

* Move Lib/tkinter/test/test_tkinter/ to Lib/test/test_tkinter/.
* Move Lib/tkinter/test/test_ttk/ to Lib/test/test_ttk/.
* Add Lib/test/test_ttk/__init__.py based on test_ttk_guionly.py.
* Add Lib/test/test_tkinter/__init__.py
* Remove old Lib/test/test_tk.py.
* Remove old Lib/test/test_ttk_guionly.py.
* Add __main__ sub-modules.
* Update imports and update references to rename files.

* gh-84623: Move imports in doctests (#94133)

Move imports in doctests to prevent false alarms in pyflakes.

* Add ABI dump Makefile target (#94136)

* gh-84623: Remove unused imports in idlelib (#94143)

Remove commented code in test_debugger_r.py.

Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>

* gh-85308: argparse: Use filesystem encoding for arguments file (GH-93277)

* Closes gh-94152: Update pyvideo.org URL (GH-94075)

The URL is now https://pyvideo.org, which uses HTTPS and avoids a redirect.

* gh-91456: [Enum] Deprecate default auto() behavior with mixed value types (GH-91457)

When used with plain Enum, auto() returns the last numeric value assigned, skipping any incompatible member values (such as strings); starting in 3.13 the default auto() for plain Enums will require all the values to be of compatible types, and will return a new value that is 1 higher than any existing value.

Co-authored-by: Ethan Furman <ethan@stoneleaf.us>

* gh-84461: Fix test_sqlite for Emscripten/WASI (#94125)

* gh-86404: [doc] Fix missing backtick and double target name. (#94120)

* gh-89121: Keep the number of pending SQLite statements to a minimum (#30379)

Make sure statements that have run to completion or errored are
reset and cleared off the cursor for all paths in execute() and
executemany().

* GH-91742: Fix pdb crash after jump  (GH-94171)

* [Enum] fix typo (GH-94158)

* gh-92858: Improve error message for some suites with syntax error before ':' (#92894)

* gh-93771: Clarify how deepfreeze.py is run (#94150)

* gh-91219: Add an index_pages default list and parameter to SimpleHTTPRequestHandler (GH-31985)

* Add an index_pages default list to SimpleHTTPRequestHandler and an
optional constructor parameter that allows the default index pages
list to be overridden.  This makes it easy to set a new index page name
without having to override send_head.
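A hedged sketch of how a custom index page might be configured is shown below. It assumes an interpreter where `index_pages` is honored as a class attribute (Python 3.12 or newer) and overrides it in a subclass rather than relying on the constructor parameter; `HomePageHandler` and `make_server` are names invented for this example.

```python
import http.server


class HomePageHandler(http.server.SimpleHTTPRequestHandler):
    # Assumption: index_pages is consulted when a directory is requested.
    # On older Python versions this attribute is simply ignored.
    index_pages = ("home.html", "index.html")


def make_server(port=8000):
    """Create (but do not start) a server that uses HomePageHandler."""
    return http.server.HTTPServer(("127.0.0.1", port), HomePageHandler)
```

Calling `make_server().serve_forever()` would then serve `home.html` for directory requests when that file exists.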

* [Enum] Remove automatic docstring generation (GH-94188)

* Add ABI dump script (#94135)

* Add more tests for throwing into yield from (GH-94097)

* gh-94169: Remove deprecated io.OpenWrapper (#94170)

Remove io.OpenWrapper and _pyio.OpenWrapper, deprecated in Python
3.10: just use :func:`open` instead. The open() (io.open()) function
is a built-in function. Since Python 3.10, _pyio.open() is also a
static method.

* gh-94199: Remove ssl.RAND_pseudo_bytes() function (#94202)

Remove the ssl.RAND_pseudo_bytes() function, deprecated in Python
3.6: use os.urandom() or ssl.RAND_bytes() instead.

* gh-94196: Remove gzip.GzipFile.filename attribute (#94197)

gzip: Remove the filename attribute of gzip.GzipFile,
deprecated since Python 2.6, use the name attribute instead. In write
mode, the filename attribute added '.gz' file extension if it was not
present.

* gh-93692: remove "build finished successfully" message from setup.py (#93693)

The message was only emitted when the build succeeded _and_ there were
missing modules.

* gh-84461: Fix ctypes and test_ctypes on Emscripten (#94142)

- c_longlong and c_longdouble need experimental WASM bigint.
- Skip tests that need threading
- Define ``CTYPES_MAX_ARGCOUNT`` for Emscripten. libffi-emscripten 2022-06-23 supports up to 1000 args.

* gh-94205: Ensures all required DLLs are copied on Windows for underpth tests (GH-94206)

* gh-84461: Build Emscripten with WASM BigInt support (#94219)

* gh-94172: urllib.request avoids deprecated check_hostname (#94193)

The urllib.request no longer uses the deprecated check_hostname
parameter of the http.client module.

Add private http.client._create_https_context() helper to http.client,
used by urllib.request.

Remove the now redundant check on check_hostname and verify_mode in
http.client: the SSLContext.check_hostname setter already implements
the check.

* IDLE: replace if statement with expression (#94228)

* Docs: Remove `Provides [...]` from `multiprocessing.shared_memory` description (#92761)

* gh-93382: Sync up `co_code` changes with 3.11 (GH-94227)

Sync up co_code changes with 3.11 commit 852b4d4.

* gh-94217: Skip import tests when _testcapi is a builtin (GH-94218)

* gh-85308: Add argparse tests for reading non-ASCII arguments from file (GH-94160)

* bpo-46642: Explicitly disallow subclassing of instances of TypeVar, ParamSpec, etc (GH-31148)

The existing test covering this case passed only incidentally. We
explicitly disallow doing this and add a proper error message.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

* bpo-26253: Add compressionlevel to tarfile stream (GH-2962)

`tarfile` already accepts a compressionlevel argument for creating
files. This patch adds the same for stream-based tarfile usage.
The default is 9, the value that was previously hard-coded.

* gh-70441: Fix test_tarfile on systems w/o bz2 (gh-2962) (#94258)

* gh-94199: Remove ssl.match_hostname() function (#94224)

* gh-94207: Fix struct module leak (GH-94239)

Make _struct.Struct a GC type

This fixes a memory leak in the _struct module, where as soon
as a Struct object is stored in the cache, there's a cycle from
the _struct module to the cache to Struct objects to the Struct
type back to the module. If _struct.Struct is not gc-tracked, that
cycle is never collected.

This PR makes _struct.Struct GC-tracked, and adds a regression test.

* gh-94245: Test pickling and copying of typing.Tuple[()] (GH-94259)

* gh-77560: Report possible errors in restoring builtins at finalization (GH-94255)

Seems in the past the copy of builtins was not made in some scenarios,
and the error was silenced. Write it now to stderr, so we have a chance
to see it.

* gh-90016: Reword sqlite3 adapter/converter docs (#93095)

Also add adapters and converter recipes.

Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

* bpo-39971: Change examples to be runnable (GH-32172)

* gh-70474: [doc] fix wording of GET_ANEXT doc (GH-94048)

* gh-93259: Validate arg to ``Distribution.from_name``. (GH-94270)

Syncs with importlib_metadata 4.12.0.

Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Ulises Ojeda <ulises.odysseus22@gmail.com>
Co-authored-by: jackh-ncl <1750152+jackh-ncl@users.noreply.github.com>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Co-authored-by: Colin Delahunty <72827203+colin99d@users.noreply.github.com>
Co-authored-by: Neil Schemenauer <nas-github@arctrix.com>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
Co-authored-by: Cyker Way <cykerway@gmail.com>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Omer Katz <omer.katz@omerkatz.com>
Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: AN Long <aisk@users.noreply.github.com>
Co-authored-by: Samodya Abeysiriwardane <379594+sransara@users.noreply.github.com>
Co-authored-by: Evorage <owner@evorage.com>
Co-authored-by: Davide Rizzo <sorcio@gmail.com>
Co-authored-by: Pascal Wittmann <mail@pascal-wittmann.de>
Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Andreas Grommek <76997441+agrommek@users.noreply.github.com>
Co-authored-by: Mark Shannon <mark@hotpy.org>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
Co-authored-by: jacksonriley <52106215+jacksonriley@users.noreply.github.com>
Co-authored-by: Kalyan <kalyan.ben10@live.com>
Co-authored-by: Bluenix <bluenixdev@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Co-authored-by: Leo Trol <milestone.jxd@gmail.com>
Co-authored-by: XD Trol <milestonejxd@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: John Belmonte <john@neggie.net>
Co-authored-by: Julien Palard <julien@palard.fr>
Co-authored-by: Pamela Fox <pamela.fox@gmail.com>
Co-authored-by: Dong-hee Na <donghee.na@python.org>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Sanket Shanbhag <TechieBoy@users.noreply.github.com>
Co-authored-by: Jeong YunWon <69878+youknowone@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@python.org>
Co-authored-by: samtygier <samtygier@yahoo.co.uk>
Co-authored-by: Ken Jin <kenjin@python.org>
Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: chilaxan <35645806+chilaxan@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <3659035+serhiy-storchaka@users.noreply.github.com>
Co-authored-by: Chris Fernald <chrisf671@gmail.com>
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
Co-authored-by: Lei Zhang <leizhanghello@gmail.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
Co-authored-by: itssme <itssme3000@gmail.com>
Co-authored-by: Matthias Köppe <mkoeppe@math.ucdavis.edu>
Co-authored-by: MilanJuhas <81162136+MilanJuhas@users.noreply.github.com>
Co-authored-by: luzpaz <luzpaz@users.noreply.github.com>
Co-authored-by: paulreece <96156234+paulreece@users.noreply.github.com>
Co-authored-by: max <36980911+pr2502@users.noreply.github.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Thomas A Caswell <tcaswell@gmail.com>
Co-authored-by: Windson yang <wiwindson@outlook.com>
Co-authored-by: Carl Bordum Hansen <carl@bordum.dk>
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Éric <merwok@netwok.org>
Co-authored-by: chgnrdv <52372310+chgnrdv@users.noreply.github.com>
Co-authored-by: fikotta <81991278+fikotta@users.noreply.github.com>
Co-authored-by: partev <petrosyan@gmail.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Co-authored-by: Oscar R <89599049+oscar-LT@users.noreply.github.com>
Co-authored-by: wookie184 <wookie1840@gmail.com>
Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Myron Walker <myron.walker@hotmail.com>
Co-authored-by: Sam Ezeh <sam.z.ezeh@gmail.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Gregory Beauregard <greg@greg.red>
Co-authored-by: Yaron de Leeuw <me@jarondl.net>
Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
markshannon added a commit that referenced this issue Jun 28, 2022
* Store offset of first traceable instruction to avoid having to recompute it all the time when tracing.
tiran added a commit to tiran/cpython that referenced this issue Jun 29, 2022
The ``assert`` is broken on big endian platforms and not present in the
main branch. Drop it.

The correct version would be ``_PyOpcode_Deopt[_Py_OPCODE(...)]`` instead of
``_PyOpcode_Deopt[...]``.
@Fidget-Spinner
Member

Fidget-Spinner commented Jun 30, 2022

I now agree with Gregory and Alex that the performance regression is within acceptable levels. IMO we can leave this issue open but it shouldn't block the 3.11 release.

@jpe

jpe commented Jul 1, 2022

The current sources seem to be significantly faster than b3.

Can the line number checks be eliminated when frame->f_trace_lines is false? This wouldn't help with coverage but would help debuggers that set f_trace_lines to false when line events aren't needed.
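
For readers unfamiliar with the attribute mentioned above, here is a minimal sketch (not taken from the thread) of how a Python-level trace function can opt a frame out of line events by setting `f_trace_lines` to false. The names `tracer`, `work`, and `events` are hypothetical and exist only for illustration. A real debugger would most likely do this from a C trace function.

```python
import sys

# Collected (function name, event) pairs; hypothetical helper for illustration.
events = []

def tracer(frame, event, arg):
    # Record every event that reaches this tracer.
    events.append((frame.f_code.co_name, event))
    if event == "call":
        # Opt this frame out of per-line events.
        # 'call' and 'return' events are still delivered.
        frame.f_trace_lines = False
    return tracer

def work():
    total = 0
    for i in range(3):
        total += i
    return total

sys.settrace(tracer)
try:
    result = work()
finally:
    # Always remove the tracer, even if work() raises.
    sys.settrace(None)

if __name__ == "__main__":
    print(events)
```

With `f_trace_lines` left at its default of true, the same run would also record one `line` event for each executed line of `work`.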

@fabioz
Contributor

fabioz commented Jul 1, 2022

I just reran the performance benchmarks for pydevd with the current tip. The numbers haven't changed from the last run: it's still 45 to 70 percent slower depending on the use case, so at least for pydevd the current tip measurements are very close to the beta 3 measurements.

| Benchmark | Python 3.10 | Python 3.11 tip (abf5f5c) | Slower than 3.10 in % |
| --- | --- | --- | --- |
| method_calls_with_breakpoint | 0.25 | 0.363 | 45.20% |
| method_calls_without_breakpoint | 0.247 | 0.375 | 51.82% |
| method_calls_with_step_over | 0.236 | 0.376 | 59.32% |
| method_calls_with_exception_breakpoint | 0.249 | 0.389 | 56.22% |
| global_scope_1_with_breakpoint | 0.557 | 0.937 | 68.22% |
| global_scope_2_with_breakpoint | 0.238 | 0.343 | 44.12% |
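
The "slower than 3.10" column appears to be the relative increase of the 3.11 timing over the 3.10 timing. The sketch below recomputes it from the figures listed above, assuming that definition. The helper name `percent_slower` is hypothetical, and the timings are in whatever unit the pydevd benchmark reports.

```python
def percent_slower(baseline, candidate):
    """Relative slowdown of `candidate` versus `baseline`, in percent."""
    return (candidate / baseline - 1.0) * 100.0

# (Python 3.10 timing, Python 3.11 tip timing), copied from the table above.
timings = {
    "method_calls_with_breakpoint": (0.25, 0.363),
    "method_calls_without_breakpoint": (0.247, 0.375),
    "method_calls_with_step_over": (0.236, 0.376),
    "method_calls_with_exception_breakpoint": (0.249, 0.389),
    "global_scope_1_with_breakpoint": (0.557, 0.937),
    "global_scope_2_with_breakpoint": (0.238, 0.343),
}

if __name__ == "__main__":
    for name, (py310, py311) in timings.items():
        print(f"{name}: {percent_slower(py310, py311):.2f}%")
```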

@jpe

jpe commented Jul 1, 2022

Hmm, I may be confusing myself but I'm seeing tip take about 50% of the time of b3 when running one of our test files. I'm testing with a C trace function that doesn't do anything except return 0. Looking at the diffs since b3, I think performance partially depends on the number of code objects.
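
A rough way to reproduce this kind of measurement without writing C is to time a workload with and without a do-nothing trace function installed. The sketch below is a Python-level stand-in, not the C trace function described above, so its absolute overhead will be higher. `noop_tracer`, `workload`, and `time_workload` are hypothetical names.

```python
import sys
import time

def noop_tracer(frame, event, arg):
    # Do nothing, but return itself so local events keep being delivered.
    return noop_tracer

def workload():
    total = 0
    for i in range(10_000):
        total += i * i
    return total

def time_workload(tracer=None, repeat=5):
    """Return the best-of-`repeat` wall time for workload() under `tracer`."""
    previous = sys.gettrace()
    best = float("inf")
    try:
        for _ in range(repeat):
            sys.settrace(tracer)
            start = time.perf_counter()
            workload()
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)
    finally:
        # Restore whatever tracer was active before the measurement.
        sys.settrace(previous)
    return best

if __name__ == "__main__":
    untraced = time_workload(None)
    traced = time_workload(noop_tracer)
    print(f"untraced: {untraced:.6f} s, traced: {traced:.6f} s, "
          f"ratio: {traced / untraced:.1f}x")
```

Running the same script under 3.10 and under a 3.11 build and comparing the ratios gives a coarse view of the tracing overhead alone, independent of what any particular tracer does with the events.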

@P403n1x87
Contributor

On the off chance that some native flame graphs could be of help here: these are taken from running @nedbat's benchmarks for the bm_sudoku case through austinp, comparing the tip of 3.11 (b22f9d6) against 3.10.5. The side view also shows a top-like report, sorted by own time.

[Image: bm_sudoku with 3.10]
[Image: bm_sudoku with 3.11]

@pablogsal
Member

pablogsal commented Jul 6, 2022

Thanks for the report @P403n1x87! Unfortunately, the results you show are not very useful (it is mostly what we are already getting by running perf) because what we need to understand is which linear or non-linear combination of the changes we made is affecting the coverage numbers. The flame graphs and tables show the difference between 3.10.5 and 3.11, which are two completely different code paths for everything (they don't even have the same instructions). What we need to understand is the relative cost of the coverage-related machinery before and after every change we made that could affect this. Unfortunately, tracing over the whole interpreter is too unstable and noisy (and not granular enough) to give us insight.

But thanks a lot for the help 🙏, I am sure that if we need to measure the interaction between the benchmark python code and the interpreter we can use your tool 👍
