Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch Python builtins with PyFilesystem #155

Closed
wants to merge 6 commits into from
Closed

Conversation

willmcgugan
Copy link
Member

Work in Progress

Will add a new feature to PyFilesystem, where a filesystem may be patched over the Python builtins. Here's an example of how it will work:

import os
import fs
with fs.patch('s3://mybucket'):
    for filename in os.listdir('/'):
        with open(filename, 'r') as read_file:
            for line in read_file:
                print(line)

Here os.listdir and open will use the s3 filesystem. So PyFilesystem can work with any code not explicitly written with the PyFIlesystem API.

@althonos early days on this one. Feedback welcome.

@coveralls
Copy link

coveralls commented Mar 25, 2018

Coverage Status

Coverage decreased (-93.6%) to 6.336% when pulling 3276026 on patcher into 8694573 on master.

@althonos
Copy link
Member

althonos commented Mar 25, 2018

wizard

No but seriously, this looks great, both for compatibility and usability purposes. (you may not be able to get patch in the top namespace of fs however without breaking extensions 😉)

On later Python versions, you may be interested in patching pathlib as well ! But considering you've got a nice framework already, I don't think it will be too difficult. Don't hesitate if you need help.

@willmcgugan
Copy link
Member Author

I'm not the best of wizards. This magic isn't easy.

If you patch module naively with something like this:

os.path.exists = myexists

It appears to work for the most part. But! If a module imports thusly:

from os.path import exists

The exists will remain a reference to the original method. And the magic will not work. Damn.

I found a solution to this. You can replace the __code__ attribute like this:

os.path.exists.__code__ = myexists.__code__

And then all references to exists will execute the proxy code. Figured I could do that for everything. I was wrong. Functions that are implemented in C don't have a __code__ attribute. So the following does not work:

os.listdir.__code__ = myexists.__code__

It was starting to look like my this magic would not work as well as I'd hoped. But then I recalled that I could use the gc module to get everywhere an object is referenced.

If you call this:

gc.get_referrers(os.listdir)

It returned a list objects that contain a reference to the listdir function, which mostly meant the module's __dict__. I could replace each reference to listdir with the proxy version, and it would work no matter how it was imported.

Naturally its not trivial. It's not enough to replace the "listdir" key, as it could have been imported like this:

from os import listdir as why_would_anyone_rename_this

So I need to replace any keys where value is os.listdir regardless of the key.

And that seems to work. Although it does require the use of get_referrers which has the following in the documentation:

> Avoid using get_referrers() for any purpose other than debugging.

Because of this dark magic, I think I'm going to make it an external project. And make it Python3 only. Python2 compatibility would literally double the work. Will close this PR shortly.

@althonos Don't know if you have any experience monkey patching? Wouldn't want to carry on this way only to find out there is a much simpler solution.

@althonos
Copy link
Member

Impressive work (even if that did not turn as expected)!

I'm not the best when it comes to the interpreter internals, so I cannot really help here. I quickly tested if the mock.patch context would work in the aforementioned case, and it doesn't, so my guess is that this is really tricky to get right.

@willmcgugan
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants