-
Notifications
You must be signed in to change notification settings - Fork 93
Different ids for FakeOsModule.<some> Python3.12 #1171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
So you are saying |
Yes full output on python:3.12-alpine image for example
|
Thank you - now I can reproduce the problem. And it happens not always - weird indeed. |
I verified that pyfakefs have always behaved that way (e.g. the different IDs), so this is not new. I still don't understand the behavior, though. |
import shutil
from unittest.mock import patch
from pyfakefs.fake_filesystem import FakeFilesystem
from pyfakefs.fake_os import FakeOsModule
fake_fs = FakeFilesystem()
os_mock = FakeOsModule(fake_fs)
for _ in range(10):
print(id(os_mock))
print(id(os_mock.lstat))
print()
with patch("shutil.os", os_mock):
shutil.rmtree("/foo") And this is how the id's stabilize, I haven't caught a case where they're different yet update I figured out why, there's only 2 of them, and they either change or they don't |
Funny, I did (and found) exactly the same, though I'm still stomped. ids = set()
for _ in range(100000):
ids.add(id(os_mock.lstat))
assert len(ids) == 1 # always passes |
I'm not sure if it's print-dependent and not something weirder. Because with print - I just did a demo after catching this error in shutil (I didn't have any prints) |
I don't think it's specifically dependent on |
Ok, I'm still not sure why ids of function objects behave this way, but this seems to be a general Python behavior: class Foo:
def bar(self):
pass
for _ in range(10):
print(id(foo.bar)) behaves the same way, so this a red herring. I should have checked the actual problem ( This was actually a known problem which is fixed if using the Patcher or related methods, but not if directly patching if hasattr(shutil, "_use_fd_functions"):
shutil._use_fd_functions = False which avoids using this optimized path which does not work with I have to think about how to best handle this without using the patcher... |
It didn't occur to me to check the class object, I checked the class itself :\ After your code I asked chatgpt (o3) with search mode and it gave me some great articles https://discuss.python.org/t/comparing-class-methods-using-is-is-wrong/29105/11 https://discuss.python.org/t/python-weirdness-can-anyone-explain-to-this-python-newbee/11976 As per the suggestion to cache in import types
class Foo:
def bar(self):
pass
def __getattribute__(self, name):
attr = object.__getattribute__(self, name)
if isinstance(attr, types.MethodType) and attr.__self__ is self:
object.__setattr__(self, name, attr)
return attr
foo = Foo()
for _ in range(10):
print(id(foo.bar)) |
Ah, that explains it, thanks! Though it also means that is just not a good idea to compare ids of methods in instances. |
I think the caching approach (if it doesn't break anything) is related to the problem, in that you can make your pyfakefs classes behave like “modules”, even though they are not in fact modules though if in stdlib somebody would want to check for isinstance(..., ModuleType) etc... Anyway, I don't want to go too far, these seem to be really extreme cases |
True, that could help for this specific case, though there are other cases (for example the mentioned |
I'm starting to think that this is not a good idea - whatever I do, it doesn't work reliably on all Python versions and systems. And yes, your idea (caching the methods) would probably work under Linux and Python < 3.14 (as the problem is only the function check), but with macOS or Python 3.14 (in all systems) this is not sufficient, as they use system-level calls per default that cannot be patched and should be avoided. I was thinking about some other approach, but first need to understand your use case better. You wrote in the other issue:
but by injecting the faked os with the fake filesystem into your code, it would use the same fake filesystem in all threads anyway, so the problem persists - or I don't understand something. Note that usually you won't notice problems in multi-threaded code, except if opening the same file for writing from several threads simultanously (or similar scenarios). Can you elaborate on the difference between your approach, and just using the Patcher in your tests in some way? |
In short - I should have multiple filesystem instances in my tests, not just one That is
And there are several running threads that need to use these different file systems. They switch file systems by using set ContextVar inside the thread, and modifying the product code def business_logic():
os = os_ctx.get()
pathlib = pathlib_ctx.get()
... I did it this way because to use patching on the test code side: def test_some():
with Patcher/mock.patch(...):
... # any background code can be executed here, using different file systems in threads actually this applies not only to threads, but also to asyncio coroutines, but I simplify the scenario So I can't use with Pacher() in test code, and similarly I can't use it inside threads in product code in any (hidden) way, because patching changes the global state. a simple demo of how this might look like import shutil
from threading import Thread
from random import random
import time
from pyfakefs.fake_filesystem_unittest import Patcher
def th1():
for _ in range(10):
with Patcher(allow_root_user=False) as patcher:
file_path = "/foo/bar"
patcher.fs.create_dir(file_path)
time.sleep(random())
def th2():
for _ in range(10):
with Patcher(allow_root_user=False) as patcher:
file_path = "/foo/bar"
patcher.fs.create_dir(file_path)
time.sleep(random())
def th3():
shutil.copy("/foo", "/bar")
thread1 = Thread(target=th1)
thread2 = Thread(target=th2)
thread3 = Thread(target=th3)
thread1.start()
thread2.start()
thread3.start()
thread1.join()
thread2.join()
thread3.join() but even this code causes errors. If we leave thread1 - there will be no errors, but if we include other threads (thread3 just for demo), we will get errors related to threads Traceback (most recent call last):
File "/opt/jumpscale7/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/opt/jumpscale7/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/opt/apiserver/jober/t.py", line 20, in th2
patcher.fs.create_dir(file_path)
AttributeError: 'NoneType' object has no attribute 'create_dir' so I went a different way, I try to make isolated module objects, without patching for example for shutil like this (I have already shown, this is how I “memorize” the os module inside the shutil module) open_func_contextvar = ContextVar("open_func_contextvar", default=open)
os_lib_contextvar = ContextVar("os_lib_contextvar", default=os)
shutil_lib_contextvar = ContextVar("shutil_lib_contextvar", default=shutil)
def get_open_func(): # noqa: ANN201
return open_func_contextvar.get()
def get_os_lib(): # noqa: ANN201
return os_lib_contextvar.get()
def get_shutil_lib(): # noqa: ANN201
return shutil_lib_contextvar.get()
def patched_open(*args, **kwargs):
return open_func_contextvar.get()(*args, **kwargs)
builtins.open = patched_open class FakeShutilModule(_FakeShutilModule):
def __init__(
self,
filesystem: FakeFilesystem,
fake_os: FakeOsModule,
) -> None:
# Lock while we mutate global state.
with _patch_module_lock:
with suppress(KeyError):
del sys.modules["shutil"]
sys.modules["os"] = cast("ModuleType", fake_os)
patched_shutil = importlib.import_module("shutil")
sys.modules["os"] = os
del sys.modules["shutil"]
import shutil # noqa: F401 # type: ignore (returns real module to sys.modules)
self.filesystem = filesystem
self._shutil_module = patched_shutil
def disk_usage(self, path: str) -> tuple[int, int, int]:
return self.filesystem.get_disk_usage(path)
def __getattr__(self, name: str) -> Any: # noqa: ANN401
return getattr(self._shutil_module, name) and for each thread/background coroutine fake_fs = FakeFilesystem()
open_func_mock = FakeFileOpen(fake_fs)
os_mock = FakeOsModule(fake_fs)
shutil_mock = FakeShutilModule(fake_fs, os_mock) and for each too token_open_func = open_func_contextvar.set(cast("Any", open_func_mock))
token_os = os_lib_contextvar.set(cast("Any", os_mock))
token_shutil = shutil_lib_contextvar.set(cast("Any", shutil_mock))
token_pathlib = pathlib_lib_contextvar.set(cast("Any", pathlib_mock)) I can now write product code in this way def product_code():
shutil = get_shutil_lib()
... and it becomes thread and coroutine independent and uses its own file systems I did the same thing with pathlib before your last PR, and even for some other modules missing in pyfakefs, like |
More conceptually, imagine that you are running two web servers in threads in the same pytest session, and you need each of them to use a different file system. I did this as setting a ContextVar for each incoming request to each web server. |
Thank you for the explanation! A couple of question:
I'm not sure what is the difference between that and injecting a fake os, which uses a fake filesystem. Are you saying that you want to use the fake filesystem only for some code/threads, but not for others?
|
Ok, I missed your second comment about the web servers... so you are indeed using different file systems for different threads. Hmm... |
It was actually working pretty well until I upgraded to 3.12 python - and there was a problem with os.lstat I can think of a workaraund for this, but at least you now know more about how your library is being used :) The ability to create modules (and more generally, fake objects) separately - I liked it, it's important in tests |
Uh oh!
There was an error while loading. Please reload this page.
On Python3.12 (python:3.12-alpine)
prints (sometimes) different ids and raised
I don't understand what this is related to, why different id's are returned for class
The text was updated successfully, but these errors were encountered: