Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incrementing class variable about 2**32 times causes the Python 3.12.1 interpreter to crash #113462

Closed
gsbrodal opened this issue Dec 24, 2023 · 19 comments
Labels
3.12 bugs and security fixes 3.13 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@gsbrodal
Copy link

gsbrodal commented Dec 24, 2023

Crash report

What happened?

The below example stores an integer counter as a class variable C.counter. Incrementing the counter using C.counter += 1 between 2^32 - 2^24 and 2^32 times makes the Python 3.12.1 interpreter crash (after about 30 min) .

Tested with Python 3.12.1 on Windows 11.

# Windows 11: Python 3.12.1 (tags/v3.12.1:2305ca5, Dec  7 2023, 22:03:25) [MSC v.1937 64 bit (AMD64)] on win32
# CPython interpreter crashes, last printed value: 4278190080 = 2**32 - 2**24

class C:
    counter = 0  # Class variable
    
while True:
    C.counter += 1  # Increment class variable
    if C.counter & 0b111111111111111111111111 == 0:
        print(C.counter)

(In the original application, the class was a wrapper class implementing __lt__ to count the number of comparisons performed by various algorithms)

CPython versions tested on:

3.12

Operating systems tested on:

Windows

Output from running 'python -VV' on the command line:

Python 3.12.1 (tags/v3.12.1:2305ca5, Dec 7 2023, 22:03:25) [MSC v.1937 64 bit (AMD64)]

Linked PRs

@gsbrodal gsbrodal added the type-crash A hard crash of the interpreter, possibly with a core dump label Dec 24, 2023
@CharlieZhao95
Copy link
Contributor

I tested further. After compiling the code of the main branch in debug mode, the interpreter will prompt the follows when executing the above code.

...
4278190080
python: Python/generated_cases.c.h:3408: _PyEval_EvalFrameDefault: Assertion `type_version != 0' failed.
Aborted

TARGET(LOAD_ATTR_CLASS) {
_Py_CODEUNIT *this_instr = frame->instr_ptr = next_instr;
next_instr += 10;
INSTRUCTION_STATS(LOAD_ATTR_CLASS);
static_assert(INLINE_CACHE_ENTRIES_LOAD_ATTR == 9, "incorrect cache size");
PyObject *owner;
PyObject *attr;
PyObject *null = NULL;
/* Skip 1 cache entry */
// _CHECK_ATTR_CLASS
owner = stack_pointer[-1];
{
uint32_t type_version = read_u32(&this_instr[2].cache);
DEOPT_IF(!PyType_Check(owner), LOAD_ATTR);
assert(type_version != 0);

I think this problem may be related to the bytecode length. BTW, this bug cannot be reproduced on 3.11.

@gaogaotiantian
Copy link
Member

I'm not the expert here, but it seems like it's related to specialization and the following happened:

  1. C.counter += 1 compiles to both LOAD_ATTR and STORE_ATTR
  2. For each LOAD_ATTR, CPython tries to specialize it, which increments interp.types.next_version_tag in _PyType_Lookup
  3. For each STORE_ATTR, as the attribute is rewritten, the type cache is cleared.
  4. For the next LOAD_ATTR, cache miss, increments the global interpreter version again.
  5. The interperter version tag is 32-bit long, so after some time (well a long time), it overflows.

@brandtbucher should be the expert.

@sambhavnoobcoder
Copy link

hi @everyone .I am a beginner looking to contribute to this repo . any of you experienced folks pointing me in the right direction to my first issue would indeed be a great help . thanks .

@gaogaotiantian
Copy link
Member

@sambhavnoobcoder unfortunately this might not be a good first issue to work on. This has something to do with a new and advanced feature in the CPython VM and it probably requires some deep knowledge of how the interpreter works. Up untill now, I'm not even sure if this is a bug that we should fix.

I would suggest to start your first contribution with some easier issues. Maybe look for easy tags. If you don't like docs or tests, something in stdlib that's pure Python might be easier to start with.

Of course, if you are really interested in the Python interpreter itself, you are welcome to take a look at it, but there is a learning curve and you should probably take it slow :)

@Eclips4 Eclips4 added 3.12 bugs and security fixes 3.13 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 25, 2023
@gaogaotiantian
Copy link
Member

Slept on this and now I think this is something we should fix. The core factor is - is this something we could have met in a real-life product, or just a theoretical situation. I'm thinking it's the former.

All the self defined classes(types) share the same global version counter which is 32 bit. Any specialization for LOAD_ATTR (maybe other bytecodes) will increment the version counter. Any STORE_ATTR (which is a simple write to any attribute to the class instance) will will clear the type cache and makes the next specialization for LOAD_ATTR increment the version counter again.

For any program that runs for a reasonably long time, the version counter is destined to overflow. I'm not talking about years, but days or even hours. It's okay if we have a bug that happens after a program runs for 10 years, but 10 days would be too short.

The easiest way to fix this is to make the version 64-bit, but that might have some performance concerns. I don't see a way to slow down (or perhaps safely reset) the version counter easily, but maybe there is one and @brandtbucher might know it :)

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Dec 25, 2023

Thanks for the report. The problem is that we are not checking for overflow in the bytecode specializer. I created an issue 2 years ago and a contributor is already working on a fix #89811

@gaogaotiantian
Copy link
Member

Thanks for the report. The problem is that we are not checking for overflow in the bytecode specializer. I created an issue 2 years ago and a contributor is already working on a fix #89811

It's more complicated than gracefully fail in overflow right? Do we want to keep the program work when the version overflows? As I mentioned above - this could be a real life situation for long-running programs. The current mechanism assumes the version counter will not duplicate otherwise there could be something funny happening I suppose?

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Dec 25, 2023

Thanks for the report. The problem is that we are not checking for overflow in the bytecode specializer. I created an issue 2 years ago and a contributor is already working on a fix #89811

It's more complicated than gracefully fail in overflow right? Do we want to keep the program work when the version overflows? As I mentioned above - this could be a real life situation for long-running programs. The current mechanism assumes the version counter will not duplicate otherwise there could be something funny happening I suppose?

Yes we want the program to keep working after overflowing. The proposed fix in that issue will prevent the specializer from even specializing bytecode once it sees that version overflows, so the LOAD_ATTR_CLASS bytecode won't even execute (this just means slow generic bytecode will always execute). The method cache in _PyType_Lookup already detects if it's an invalid version tag after overflowing, so that's fine too. Before Python 3.11, once the version overflows, the program keeps running, it just becomes slower. IMO, we should continue with that approach.

@gaogaotiantian
Copy link
Member

Yes we want the program to keep working after overflowing. The proposed fix in that issue will prevent the specializer from even specializing bytecode once it sees that version overflows, so the LOAD_ATTR_CLASS bytecode won't even execute (this just means slow generic bytecode will always execute). The method cache in _PyType_Lookup already detects if it's an invalid version tag after overflowing, so that's fine too. Before Python 3.11, once the version overflows, the program keeps running, it just becomes slower. IMO, we should continue with that approach.

Okay so no specialization once the version overflows. That is definitely much better than the current behavior and is obviously acceptable. However, that also means for something like server code which is supposed to be running for days or even months, specialization will only work for a small period of time.

Then I realized it's not only about the size of a global counter, it's also about the cache size of the bytecode, so increasing the size of the counter is much more complicated. Maybe we can figure out some optimizations to slow down the version update to alleviate the issue.

Anyway, I think to stop specializing once the version overflows is the right way to go at this point. Thanks for the info.

@gsbrodal
Copy link
Author

gsbrodal commented Dec 27, 2023

Thanks for quickly looking into the issue.

I would be happy if my programs just continued to run without crashing. I am (obviously) anyway waiting a long time for the programs to finish, so a slowdown (as in prior Python versions) is perfectly acceptable for my use cases.

I am not sure if my application could be considered a "real-life product", but the issue caused the experiments in a research paper of mine not to be able to run to completion under Python 3.12. The experiments were originally done with Python 3.9 and measure the number of comparisons performed by various heap implementations on various input types and sizes. The below example capture my essential use of a class variable to count the number of comparisons performed (by Pythons built-in sorted function) by wrapping each value into an Item object.

class Item:
    comparisons = 0  # class variable
    
    def __init__(self, key):
        self.key = key

    def __lt__(self, other):
        Item.comparisons += 1  # increment class variable
        return self.key < other.key

from random import randint

while True:
    n = 1_000_000
    values = [randint(1, n) for _ in range(n)]
    items = [Item(value) for value in values]
    
    comparisons_before = Item.comparisons
    items = sorted(items)
    comparisons_after = Item.comparisons

    comparisons = comparisons_after - comparisons_before
    print(f'{comparisons} comparisons done to sort {n} items')
    print(f'Total comparisons until now {Item.comparisons}')

The last lines printed with Python 3.12.1, before crashing:

...
18604327 comparisons done to sort 1000000 items
Total comparisons until now 4241787149
18604508 comparisons done to sort 1000000 items
Total comparisons until now 4260391657
18604282 comparisons done to sort 1000000 items
Total comparisons until now 4278995939

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Dec 27, 2023

Thanks for the report. I've marked the root/other issue as a release blocker for 3.12 and 3.13. This means no new versions of 3.12/3.13 can be released until we fix it.

@cfbolz
Copy link
Contributor

cfbolz commented Jan 6, 2024

It's probably not completely easy for cpython to replicate, but what pypy does is to stop changing the version if a class attribute is found to change a lot. That way, a counter on the class still allows the caching of all other class attributes/methods. We keep a set of 'rapidly changing names' per class, and those names aren't cached and mutating them doesn't cause version changes.

@Fidget-Spinner
Copy link
Member

I somewhat recall we had a discussion somewhere to limit the number of version tags each class can consume, to reduce the exhaustion of tags. @markshannon do you recall?

@markshannon
Copy link
Member

markshannon commented Jan 9, 2024

The idea was to have a per-class counter, and when that got to, say, 1000 no longer issue version number for that class.

@markshannon
Copy link
Member

Using a class attribute as a counter will perform poorly, even after this bug is fixed.
I'd suggest not mutating the class, but the counter:

class Item:
    comparisons = [0]  # class variable

    def __lt__(self, other):
        Item.comparisons[0] += 1  # increment class variable

@gsbrodal
Copy link
Author

gsbrodal commented Feb 2, 2024

I made a quick experiment with .sort() comparing the above suggested solution to using a global variable for counting the number of comparisons. Using a global variable was about 9% faster, so if performance is focus, I would just go with a global variable.

comparisons = 0

class Item:
    def __lt__(self, other):
        global comparisons
        comparisons += 1

As a lecturer on Introduction to Programming in Python, I try to draw connections to other languages. When talking about classes, I mention that Java and C++ have static class variables (where having a counter as a static int variable in the class is the canonical example you find in tutorials), and that this is also possible in Python. I would prefer to continue telling this story to help students migrate between languages (and just ignore the fact that this can come with a performance loss in Python).

@cfbolz
Copy link
Contributor

cfbolz commented Feb 2, 2024

Using a class attribute as a counter will perform poorly, even after this bug is fixed. I'd suggest not mutating the class, but the counter:

class Item:
    comparisons = [0]  # class variable

    def __lt__(self, other):
        Item.comparisons[0] += 1  # increment class variable

I understand the reason why this is hard to support in CPython, but this suggestion sounds quite bad to me. It's leakage of implementation details. Having a counter on a class is rather common, after all.

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Feb 2, 2024

Using a class attribute as a counter will perform poorly, even after this bug is fixed. I'd suggest not mutating the class, but the counter:

class Item:
    comparisons = [0]  # class variable

    def __lt__(self, other):
        Item.comparisons[0] += 1  # increment class variable

I understand the reason why this is hard to support in CPython, but this suggestion sounds quite bad to me. It's leakage of implementation details. Having a counter on a class is rather common, after all.

Yeah I completely agree. We should make common code patterns faster in the interpreter, not change the code patterns to suit the interpreter.

@Fidget-Spinner
Copy link
Member

Ok this should be fixed on 3.11, 3.12, 3.13. I shall close this issue. Thanks to @lazorchakp and Mark for the fixes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

8 participants