Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make universal extensions installable #124

Merged
merged 29 commits into from
Dec 9, 2020

Conversation

hodgestar
Copy link
Contributor

@hodgestar hodgestar commented Nov 26, 2020

This is an attempt to address #85 and remove many of the caveats in the old hpy.devel distutils hacking by separating the hpy extension building into its own build_hpy_ext command.

What's in the PR:

  • Universal modules now get a stub file written (using the example from @rlamy in Make universal-ABI modules installable #85).
  • Universal modules now build correctly with bdist_wheel.
  • pip install --hpy-abi=universal ... now works correctly.
  • Tests now load universal modules and cpython modules the same way.
  • A test for FatalError has been added that loads the test module in a Python subprocess.
  • The proof-of-concept test harness now installs cpython and universal modules the same way (with --hpy-abi=...)

The pypy MR for these changes is at https://foss.heptapod.net/pypy/pypy/-/merge_requests/783.

@antocuni
Copy link
Collaborator

I didn't look at it in details, but just wanted to note that hpy.devel is also used by pypy to compile its own hpy tests, so before merging this please check that it doesn't break pypy after you do ./update_vendored.sh (or, if it breaks, that you have a fix available :))

@hodgestar hodgestar changed the title WIP: Make universal extensions installable Make universal extensions installable Nov 30, 2020
Copy link
Collaborator

@antocuni antocuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, nice job!
I didn't try things by myself but it looks like this is a BIG improvement over the old situation.
I have added a couple of remarks; most of them are "this code is too obscure, please explain what the hell you are doing", because it's a bit hard to follow all the jumps if you don't know distutils deeply.

It might be worth adding also a comment describing the "general strategy", e.g.: "the first hook to be called is handle_hpy_ext_modules, which fixes the distribution, then at stage X you use build_hpy_ext instead of build_ext, etc. etc.


class HPyDevel:
""" Extra sources for building HPy extensions with hpy.devel. """

_BASE_DIR = Path(__file__).parent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's confusing to have _BASE_DIR here: if we pass a custom base_dir (as pypy does), we end up having self.base_dir and self._BASE_DIR pointing to two different things

spec = Spec(name, so_filename)
return hpy.universal.load_from_spec(spec)

def load_cython_module(self, name, so_filename):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that this method had a type, it should have been load_cPython_module. Good that ti goes away :)

test/support.py Outdated
spec = importlib.util.spec_from_file_location(name, so_filename)
module = importlib.util.module_from_spec(spec)
sys.modules[name] = module
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed? It should at least have a comment, and moreover it looks a bit fragile:

  • what if there is already something in sys.modules[name]?

  • it should be in a try/finally

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, I didn't realize that we are using pkg_resources in the stub.
Is it actually needed? Can't we just say that the .hpy.so file needs to be always side-by-side with the .py stub?
In the past I had bad experiences with it: at some point I had to remove it from pdb++ because calling pkg_resources.get_distribution() took almost a second (!!!) on my machine.

Note, I'm not saying that calling pkg_resources is a bad idea, but I would like to understand:

  1. in which situations it is better that just looking for a file side-by-side
  2. if we are interested in supporting the cases listed by point (1)

However, I don't want to block this PR because of this, so feel free to merge it anyway and leave the investigation/decision for later

test/support.py Outdated
[soname] = cmd_obj.get_outputs()
if hpy_abi == "cpython":
[mod_filename] = [
x for x in cmd_obj.get_outputs() if x.endswith(".so")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to break on windows and mac. Maybe there is a more generic way to get the filename extension of dlls, or maybe you could use if not x.endswith(".py")?

@@ -33,8 +36,15 @@ def test_FatalError(self):
@EXPORT(f)
@INIT
""")
# Calling mod.f() gives a fatal error, ending in abort().
# How to check that? For now we just check that the above compiles
result = subprocess.run([
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice job!
However, this is probably going to do weird things on pypy (I don't know what sys.executable is in app-level tests).
It might be enough to skip it (on pypy) if we don't find a better way to test it properly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside app-level tests the sys module has no executable attribute, so that provides a convenient way to skip the remainder of the test.

I don't understand app-level tests well enough to know if a better plan can be made -- a separate process probably needs to be involved somehow since although SIGABRT can be "caught", restoring the process back to a sane place seems likely to require crazy crazy hacks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic looks good to me.
I think it's fine to just leave it as it is in app-level tests: the full logic will be tested anyway after translation by extra_tests.

I don't understand app-level tests well enough to know if a better plan can be made

the best way to think about app-level test is this:

def test_this_is_an_interp_level_test():
    # this is rpython code
    space = some_magic_to_create_an_objspace()
    space.exec_("""
        # this is app-level code
        import sys # this is the app-level sys, i.e. an instance of W_Module
        a = 2 + 2 # this is a W_IntObject
        assert a == 4
    """)

basically, inside pypy's conftest there is some magic which turns methods of AppTest* classes into something like this. That's why sys.executable doesn't exist, because we don't even have a proper executable (the closest option would be to point to pypy/bin/pyinteractive.py).


dist.__class__.has_ext_modules = dist_has_ext_modules
base_build.has_ext_modules = build_has_ext_modules
idx = [sub[0] for sub in base_build.sub_commands].index("build_ext")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks obscure and I have no idea of what it does without looking at the source code of distutils. Maybe it's worth a comment?

def extensions(self):
return self._extensions

@extensions.setter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obscure. Add a comment explaining why it's needed

"""


class HPyExtensionName(str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice hack ;)
I remember that I and @arigo considered using something like this when we originally wrote this code, but it looked too obscure even for us. But I agree that nowadays it's a better solution than the old collect_hpy_ext_names.

It is worth a comment though, especially explaining why you can't just stick an attribute is_hpy_extension on the ext objects.

Moreover, you might want to consider to add a function is_hpy_extension which does isinstance(HPyExtensionName), it looks cleaner to me

@hodgestar
Copy link
Contributor Author

@antocuni Thank you for the initial review of this obscure hacking. I think I've now addressed everything in one way or another. Would you mind looking at it again?

I am still going to make sure it can be made to work on PyPy before merging (now that PyPy is up to date with hpy and this PR is cleaned up a bit).

@hodgestar
Copy link
Contributor Author

@antocuni Ready for re-review.

Copy link
Collaborator

@antocuni antocuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hodgestar LGTM, thanks for incorporating my suggestions.

I added a couple of remarks in the comments about pkg_resources and global state, but feel free to consider them just a rant :)

@@ -33,8 +36,15 @@ def test_FatalError(self):
@EXPORT(f)
@INIT
""")
# Calling mod.f() gives a fatal error, ending in abort().
# How to check that? For now we just check that the above compiles
result = subprocess.run([
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic looks good to me.
I think it's fine to just leave it as it is in app-level tests: the full logic will be tested anyway after translation by extra_tests.

I don't understand app-level tests well enough to know if a better plan can be made

the best way to think about app-level test is this:

def test_this_is_an_interp_level_test():
    # this is rpython code
    space = some_magic_to_create_an_objspace()
    space.exec_("""
        # this is app-level code
        import sys # this is the app-level sys, i.e. an instance of W_Module
        a = 2 + 2 # this is a W_IntObject
        assert a == 4
    """)

basically, inside pypy's conftest there is some magic which turns methods of AppTest* classes into something like this. That's why sys.executable doesn't exist, because we don't even have a proper executable (the closest option would be to point to pypy/bin/pyinteractive.py).

test/support.py Outdated
spec = importlib.util.spec_from_file_location(name, so_filename)
module = importlib.util.module_from_spec(spec)
sys.modules[name] = module
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, I didn't realize that we are using pkg_resources in the stub.
Is it actually needed? Can't we just say that the .hpy.so file needs to be always side-by-side with the .py stub?
In the past I had bad experiences with it: at some point I had to remove it from pdb++ because calling pkg_resources.get_distribution() took almost a second (!!!) on my machine.

Note, I'm not saying that calling pkg_resources is a bad idea, but I would like to understand:

  1. in which situations it is better that just looking for a file side-by-side
  2. if we are interested in supporting the cases listed by point (1)

However, I don't want to block this PR because of this, so feel free to merge it anyway and leave the investigation/decision for later

# (for cpython modules, the module was already placed in
# sys.modules earlier in this function)
module = sys.modules.pop(name)
del sys.path[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: if the imported module modifies sys.path, this is wrong.
I know that in practice it will never happen because we have control on the test module that we generate, but who knows what someone will write in the future? It is worth at least an assert sys.path[0] == os.path.dirname(mod_filename).

Basically, the thing which I dislike the most is that before this branch importing a test module was "stateless" and self-contained, while now we are depending on global state (either sys.modules or importlib's cache), which is often problematic in tests. But I suppose that we can't do much to avoid it if we want to use the .py stub :(

@hodgestar
Copy link
Contributor Author

@antocuni Thank you for the explanation about app-level tests -- that was useful.

Some responses to other questions (apologies for these not being inline, but GitHub UI didn't make that easy):

  • Re pkg_resources: I think the point of pkg_resources.resource_filename is largely to support extension that have been stored in .zip files, and perhaps other more obscure use cases I don't know about. The point of this particular stub is that it closely follows the one for bdist_egg inside setuptools itself, and I think following a similar approach to that is worthwhile because it's a least a set of weird edge cases that people are used to.

  • Re del sys.path[0]: Let me add the assert, because it might be very annoying to debug later if the assumption is violated. Another options would be to save and restore all of sys.path (it's a bit hard to see why a compiled test module should expect to be able to modify sys.path at all).

@antocuni
Copy link
Collaborator

antocuni commented Dec 9, 2020

Re pkg_resources: I think the point of pkg_resources.resource_filename is largely to support extension that have been stored in .zip files, and perhaps other more obscure use cases I don't know about. The point of this particular stub is that it closely follows the one for bdist_egg inside setuptools itself, and I think following a similar approach to that is worthwhile because it's a least a set of weird edge cases that people are used to.

yes, I think the same. But I think that nowadays eggs and zips are mostly unused/unsupported? I think that wheels explicitly requires the installer to unzip the content (e.g., this is the first result for googling python zipimport wheel).

Moreover, so far the only installation method that we test/support is wheels, I think? I we decide we wanto to explicitly support eggs, it might be a good idea to add a test for that. I'm not saying that we should drop the call to pkg_resources or deliberately break eggs. I just want to make sure that either choice it's a deliberate design decision and not an accident.

That said, I am fine to merge this as is and think about this issue in another PR

Re del sys.path[0]: Let me add the assert, because it might be very annoying to debug later if the assumption is violated.

thank you, +1

Another options would be to save and restore all of sys.path (it's a bit hard to see why a compiled test module should expect to be able to modify sys.path at all).

lesson learned in 14 years of pypy: people's code will always be weirder than you think, even if you take this lesson into account ;)

@hodgestar
Copy link
Contributor Author

@antocuni Thank you for the reviews (again)!

@hodgestar hodgestar merged commit 22a05e8 into master Dec 9, 2020
antocuni added a commit that referenced this pull request Dec 16, 2020
…ommit 22a05e8)

- we need to have different code path to load universal and cpython modules,
  because soon we will introduce loading universal modules in debug mode.

- The idea of having a single function to load both universal and cpython
  modules was nice, but it ended up being too complicated and fragile; in
  particular, we had to monkey-patch sys.path, undo the insertion in
  sys.modules, etc.: the original idea was that tests should interact with the
  importing system as little as possible, to ensure isolation.

- Now we have some very minor code duplication, but the logic in each function
  is way simpler.
@hodgestar hodgestar deleted the feature/make-hpy-universal-extensions-installable branch December 18, 2020 20:32
@FFY00
Copy link

FFY00 commented Nov 23, 2021

@hodgestar is there any specific reason you moved back to pkg_resources (7f83e39) in this PR? This adds a runtime dependency on setuptools. setuptools used to be virtually present on all Python environments during package building, and after as a byproduct, but this is not the case anymore with PEP 517. So, I was wondering if there was any technical reason for the change, I can't find anything.

@hodgestar
Copy link
Contributor Author

@FFY00 -- my reasoning is documented in a comment earlier in this PR -- #124 (comment)

If we only want to support Python 3.7+, then we could use importlib.resources.path instead. Ideally we will eventually add an import hook that allows us to avoid this stub loader entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants