Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Refactor on-demand-import mechanism #3270

Closed

Conversation

neutrinoceros
Copy link
Member

@neutrinoceros neutrinoceros commented May 16, 2021

PR Summary

This is another attempt at refactoring the on-demand-import mechanism (supersedes #3246).
The original branch has become virtually impossible to fix, so I'm retrying here with incremental updates.

fix #3237

TODO

  • migrate every special classes to the new OnDemandImport mechanism
  • drop _ prefixes in on demand import names (reason: right now the prefix is effectively ignored in import statements via the as keyword, so there's a precedent of importing stuff from the on demand module as drop-in replacements)
  • generalise new mechanism to some optional deps that are currently not using the on-demand-import module (bottle, firefly_api, mpi4py, resource, yt_astro_analysis)
  • cleanup (remove) the NotAModule class. should be trivial after the previous step is checked (NOT APPLICABLE, see comment bellow)
  • cleanup on_demand_imports.py (make the _targets list a dictionary, and maybe other nitpicks)
  • add documentation (programming guide).

@neutrinoceros neutrinoceros added the refactor improve readability, maintainability, modularity label May 16, 2021
@neutrinoceros neutrinoceros changed the title implem OnDemandImportClass + optional deps list Refactor on-demand-import mechanism (attempt 2) May 16, 2021
@neutrinoceros neutrinoceros force-pushed the true_ondemand_imports2 branch 2 times, most recently from 8b1a1aa to af41ccb Compare May 16, 2021 23:01
@neutrinoceros neutrinoceros changed the title Refactor on-demand-import mechanism (attempt 2) Refactor on-demand-import mechanism May 17, 2021
@neutrinoceros neutrinoceros force-pushed the true_ondemand_imports2 branch 2 times, most recently from 555dbd4 to 3d3724f Compare May 17, 2021 18:18
@matthewturk
Copy link
Member

Hi -- I know this is draft, but I would prefer we not drop the underscore. I would really like to avoid redefining objects that mirror the external module names.

In general, I guess I also didn't realize this was a priority -- did something change that means we have to address this?

@neutrinoceros
Copy link
Member Author

I found that 2 packages are effectively optional dependencies (in the sense that the corresponding import statements are embedded in try/except blocks), but migrating them to use the actual on-demand-import mechanism isn't trivial:

  • resource: making it an actual on-demand-import lead to a MemoryError in windows CI. It is however used in only one line of the yt codebase. There's probably something to be done here but it will have to wait for next time.
  • yt_astro_analysis: this is a weird case where yt is a natural hard dependency to yt_astro_analysis, so by importing it in yt we create a circular dep. It is however not obvious how the situation could be mitigated since yt_astro_analysis is used in a yt frontend, which is by currently not something that's easy to migrate. I believe the general solution would be to improve the extensibility of yt by allowing users to define extension frontends without modifying the source code. This is obviously way out of scope here.
    pinging @brittonsmith for reference, but if it is ever discussed more thoroughly, we should do it in a specific issue :)

@neutrinoceros
Copy link
Member Author

neutrinoceros commented May 17, 2021

Hi -- I know this is draft, but I would prefer we not drop the underscore. I would really like to avoid redefining objects that mirror the external module names.

@matthewturk I don't understand how that's a problem, could you please develop ? A majority of the import statements we have on the main branch already redefine those with the as keyword, such as follows

from yt.utilities.on_demand_imports import _h5py as h5py

In general, I guess I also didn't realize this was a priority -- did something change that means we have to address this?

It's not a priority, but it is something that I can't solve in one go and want to be done with eventually. It's part of my broader effort to solve #3074

@neutrinoceros neutrinoceros marked this pull request as ready for review May 18, 2021 07:48
@neutrinoceros
Copy link
Member Author

@matthewturk I could be easily convinced that the leading underscore is a good thing, I'm actually more concerned about consistency. I could rewrite every import statement to use as. What's your preference ?

@matthewturk
Copy link
Member

matthewturk commented May 18, 2021 via email

@neutrinoceros
Copy link
Member Author

That sounds fine to me. A concern I do have with the change you're proposing is that it will make this PR even more of a sweeping change, and unless it's merged soon, I will end up maintaining it conflict after conflict for what I can only be assume would be a while. On the other hand, if this goes in as is, there's less of a rush, and I can provide a follow up PR with that change, which will be a breeze to review by itself, so everything can be done in a matter of hours. Does this sound acceptable ?

@matthewturk
Copy link
Member

matthewturk commented May 18, 2021 via email

@neutrinoceros
Copy link
Member Author

neutrinoceros commented May 18, 2021

Wait, what I'm suggesting is actually to preserve the previous behavior
-- import as the real module name, but in the on-demand section, define
them with a leading underscore. It should result in fewer files being
touched.

then I don't understand your argument. :/
If we're not going to use the underscore-prefixed names, then it makes no difference in the code that's using them, except for the fact that import statement get slightly "bloated".

@matthewturk
Copy link
Member

OK -- I would like to engage on this, but I am trying to find the mental bandwidth to work on the 4.0 stuff. I'm sorry; I have to check out just for a little while, and try to get the last little bit of push. :) Would it be possible to return to this after the 4.0 release, and we can have a high-bandwidth low-latency chat over video or audio? And in the interim, freeze the PR for the time being?

@neutrinoceros
Copy link
Member Author

You got it.

@neutrinoceros neutrinoceros added this to the 4.1 milestone May 18, 2021
@brittonsmith
Copy link
Member

I only want to comment for now on the issue of importing yt_astro_analysis. I don't like that we do that. We only do it for the halo plot annotation. I think it might make more sense to move that plot callback into yt_astro_analysis as it's specific for halo analysis anyway. In my view, this is line with moving analysis_modules to yt_astro_analysis in the first place. I probably would have done it, but didn't catch it at the time.

@neutrinoceros
Copy link
Member Author

I 100% agree with this, thanks @brittonsmith

@matthewturk matthewturk changed the title Refactor on-demand-import mechanism REF: Refactor on-demand-import mechanism Oct 9, 2021
@neutrinoceros neutrinoceros changed the title REF: Refactor on-demand-import mechanism ENH: Refactor on-demand-import mechanism Oct 14, 2021
@neutrinoceros neutrinoceros marked this pull request as draft October 14, 2021 10:11
@neutrinoceros neutrinoceros force-pushed the true_ondemand_imports2 branch 3 times, most recently from b6d5b0d to ad23206 Compare October 16, 2021 14:03
@neutrinoceros
Copy link
Member Author

neutrinoceros commented Oct 17, 2021

This is currently stuck because for some reason, tests that require netCDF4 dataset are failing.

I can reproduce this locally by running pytest against this file

from yt.utilities.answer_testing.framework import requires_ds
requires_ds("Orbit/orbit_hdf5_chk_0000")(lambda: None)
Traceback
_________________________________________________ ERROR collecting yt/data_objects/tests/test_particle_trajectories.py _________________________________________________
yt/data_objects/tests/test_particle_trajectories.py:50: in <module>
    ???
yt/utilities/answer_testing/framework.py:1183: in requires_ds
    elif not can_run_ds(ds_fn, file_check):
yt/utilities/answer_testing/framework.py:305: in can_run_ds
    return os.path.isfile(os.path.join(path, ds_fn)) and result_storage is not None
yt/loaders.py:86: in load
    if cls._is_valid(fn, *args, **kwargs):
yt/frontends/nc4_cm1/data_structures.py:180: in _is_valid
    with nc4_file.open_ds(keepweakref=True) as _handle:
../../../.pyenv/versions/3.9.5/lib/python3.9/contextlib.py:117: in __enter__
    return next(self.gen)
yt/utilities/file_handler.py:114: in open_ds
    ds = netCDF4.Dataset(self.filename, mode="r", **kwargs)
yt/utilities/on_demand_imports.py:59: in __getattr__
    return getattr(self.module, key)
../../../.pyenv/versions/3.9.5/lib/python3.9/functools.py:969: in __get__
    val = self.func(instance)
yt/utilities/on_demand_imports.py:48: in module
    return import_module(self._name)
../../../.pyenv/versions/3.9.5/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:855: in exec_module
    ???
<frozen importlib._bootstrap>:228: in _call_with_frames_removed
    ???
../../../.pyenv/versions/yt-dev/lib/python3.9/site-packages/netCDF4/__init__.py:3: in <module>
    from ._netCDF4 import *
src/netCDF4/_netCDF4.pyx:1: in init netCDF4._netCDF4
    ???
E   RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 88 from PyObject

note that these two lines work perfectly when run directly from the python runtime.
env:

  • numpy: 1.19.5
  • netcdf4: 1.5.7
  • pytest: 6.2.4

@neutrinoceros
Copy link
Member Author

close + reopening to retrigger test but I have no particular expectations

@neutrinoceros neutrinoceros reopened this Nov 25, 2021
@neutrinoceros neutrinoceros removed this from the 4.1.0 milestone Nov 25, 2021
- implement a general OnDemandImport class and migrate existing unique classes in yt.utilities.on_demand_import
- extend usage of the new on-demand-import mechanism to second class optional deps
@neutrinoceros neutrinoceros changed the title ENH: Refactor on-demand-import mechanism RFC: Refactor on-demand-import mechanism Nov 26, 2021
@neutrinoceros
Copy link
Member Author

I still can't understand why this is failing NetCDF4 tests. I vaguely feel like the real issue is upstream but in any case this PR is both more controversial than I anticipated at the time and not manageable. Overall it doesn't seem worth the effort of keeping it alive.

@neutrinoceros neutrinoceros deleted the true_ondemand_imports2 branch November 27, 2021 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor improve readability, maintainability, modularity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Out of date requirements (unused optional deps ?)
3 participants