Potential issue with *.pyc files #589
Comments
Huh, interesting. I don't think the code generation that cattrs does goes into a pyc file, my impression is that it's purely in memory. Would be amazing if you could put together a simple repro case so we can get to the bottom of the issue. attrs and dataclasses do similar code generation at runtime (as do other libs like jinja2), and I've never had this issue myself. Super curious.
You say you get problems mostly in CI, but folks don't commonly commit pyc files, right? So CI runs should be "clean"?
I know, I'm also surprised. We definitely have no pyc files in our git repo, but I also have to say I don't fully understand how exactly the GitHub runners do their thing. That is, no clue what parts of environments / containers are reused when and when not. (Because of these issues, we also deactivated environment caching a while ago, which at first seemed to have solved the problem, but then it came back 🙈 ) Will try and see if I can somehow get a reproducing minimal example.
Hey @Tinche, the weirdness doesn't stop. I was trying to create a minimal example for you by deleting all unnecessary parts from our codebase (had to go top-down because couldn't get it running bottom-up). And now I'm at the weirdest Python behavior I've ever observed, which blocks me from reducing the example any further: I have deleted ~99% from our code base and have a reproducible setting that creates the problem – a colleague of mine (@Scienfitz) confirmed on his machine. Surprisingly, whatever I do from here onwards, even deleting "dead" code that is nowhere used, changes the effect. Perhaps the problem appears only if the byte code exceeds a certain size 🤔 !? In any case, I simply can't get the example any smaller. However, if you are willing to have a look, you can reproduce the issue on our repo using a few lines of simple instructions. I'd be super curious to hear your thoughts and see if you are also as surprised as we are.
Instructions
Installation
Error reproduction
Interesting. I'll be sure to take a look since I'm very curious myself. Does the problem go away if pytest isn't used? I know pytest does some black magic rewriting bytecode for the assert statement to work.
As far as I can tell right now, without pytest the behavior is consistent, regardless of whether the bytecode files are present or not.
Alright, I think I got a simple reproducer on 3.12 with this:

    from baybe.surrogates.base import Surrogate
    from baybe.surrogates.random_forest import RandomForestSurrogate

    surrogate = RandomForestSurrogate()
    string = surrogate.to_json()
    Surrogate.from_json(string)

But the issue here is - a ton of fields in
This didn't fix the problem. The true fix is to change

    from gc import collect

    def find_subclass(base: type, name_or_abbr: str, /):
        """Retrieve a specific subclass of a base class via its name or abbreviation."""
        collect()
        try:
            return next(cl for cl in get_subclasses(base) if refers_to(cl, name_or_abbr))
        except StopIteration:
            raise UnidentifiedSubclassError(
                f"The class name or abbreviation '{name_or_abbr}' does not refer to any "
                f"of the subclasses of '{base.__name__}'."
            )

When you use
We try very hard to make this work transparently, but sometimes it doesn't. One place is
This is also why the problem was sporadic - if a gc run happens before, the problem solves itself.
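A minimal, library-free sketch of the kind of effect described in this comment, assuming the cause is a class decorator that returns a rebuilt class while the discarded original lingers until a garbage collection run. `recreate`, `Base` and `Child` are hypothetical names for illustration only and are not attrs, cattrs or baybe API.

```python
import gc


class Base:
    """Base class whose subclasses are looked up at runtime."""


def recreate(cls):
    """Hypothetical decorator: return a rebuilt copy of ``cls`` instead of ``cls`` itself."""
    # Copy the class body, skipping the descriptors that type() creates on its own.
    namespace = {
        key: value
        for key, value in cls.__dict__.items()
        if key not in ("__dict__", "__weakref__")
    }
    return type(cls)(cls.__name__, cls.__bases__, namespace)


gc.disable()  # keep the demonstration deterministic: no automatic collection
try:
    @recreate
    class Child(Base):
        pass

    # The discarded original sits in a reference cycle, so it is still listed.
    before = [cls.__name__ for cls in Base.__subclasses__()]

    gc.collect()  # an explicit collection frees the orphaned original
    after = [cls.__name__ for cls in Base.__subclasses__()]
finally:
    gc.enable()

print(before, after)
```

With automatic collection enabled, whether the stale class is still listed depends on when the collector last ran, which would be consistent with the sporadic failures reported in this thread.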
Oh wow, that was REALLY fast 😳 That also perfectly explains the behavior I was just about to report to you:
I think you might just have saved us days of work, honestly!!! Let me have a proper look if I can confirm the behavior. Will let you know!
Hey @Tinche, so it indeed seems to solve all problems when running stuff locally. I've also started the first tests in CI but I think in the end only time will show if everything's fixed. However, so far looking really good 🤩 Hence, let us consider the core of the problem solved for now 👏🏼 Many thanks again, your help is much appreciated!! One other related thing: while the
There was a little bit of discussion here: python-attrs/attrs#407 Feel free to open a new issue there and maybe we can figure out a better way. I wonder what's actually keeping the original class alive, and whether we can just break those references.
Fixes the issue discussed [here](python-attrs/cattrs#589).
There are probably some other solutions, like if you can figure out a way to call
Very good question, indeed. I can only say that this is not the first time I stumbled over it but I understand it's a very complicated matter.
Unfortunately not possible. Since we are building a package and not an application, there is no obvious "entry point" for the user and thus it's not clear what parts of the code will be loaded and when. But at least we have a workaround for now, which already solves 99% of my problems 🙃 So let me thank you again and close the issue for now 👍🏼
Description
Hi, I'm not yet 100% certain, but I might have discovered a cattrs bug related to the *.pyc files created when executing code. Not sure if this is in any way related to the caching/compilation mechanism of Converter, and I also haven't been able to create a minimal example yet because my setting is quite involved. But I thought perhaps we can already shed some light on it when I report what I see.
What I Did
In the past, we've already had several CI failures caused by cattrs showing some rather surprising/inconsistent behavior that we were not able to properly reproduce. The two most common types of errors were of the following kind:
The interesting part here is not the errors themselves but rather that they arise more or less sporadically, and mostly in CI and not locally. For instance, as the first screenshot shows, the error happened to occur on Python 3.10 but not on 3.12 in this particular case. And I haven't figured out a clear pattern for this. My only conjecture is that it might have to do with cached/compiled files that are sometimes present and sometimes not.
Now the good news, which seems to confirm this conjecture: I was just able to consistently reproduce the second error on my local machine. The interesting piece: as soon as I delete my *.pyc files, the code runs without problems once. After the execution, the *.pyc files are recreated and the test fails again. Same behavior of course if I use PYTHONDONTWRITEBYTECODE. So this seems to indicate that there is some unexpected interaction between the created byte code and cattrs.
Any idea what could be the root cause or how we could nail down the problem further?
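For anyone trying to repeat the "delete the *.pyc files, then rerun" experiment described above, a small helper along these lines may be handy. It is only a sketch: `remove_bytecode` is a hypothetical name, and the demonstration runs on a throwaway directory rather than on a real checkout.

```python
import pathlib
import tempfile


def remove_bytecode(root):
    """Delete every *.pyc file below ``root`` and return the number removed."""
    removed = 0
    # Materialize the matches first so files are not deleted while the tree is being walked.
    for pyc in list(pathlib.Path(root).rglob("*.pyc")):
        pyc.unlink()
        removed += 1
    return removed


# Self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cache = pathlib.Path(tmp, "pkg", "__pycache__")
    cache.mkdir(parents=True)
    (cache / "module.cpython-312.pyc").write_bytes(b"")
    first_pass = remove_bytecode(tmp)   # removes the one fake bytecode file
    second_pass = remove_bytecode(tmp)  # nothing left to remove
```

To stop the files from being recreated within a single process, setting `sys.dont_write_bytecode = True` before the relevant imports has the same effect as the PYTHONDONTWRITEBYTECODE environment variable.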