Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Allows pysatCDF to be preferentially used when present. #102

Merged
merged 34 commits into from
Aug 3, 2022

Conversation

rstoneback
Copy link
Collaborator

@rstoneback rstoneback commented Feb 18, 2022

Description

Addresses # (issue)

cdflib performance is not what I'd prefer. Loading C/NOFS takes about 60s with cdflib through pysat while loading via pysatCDF through pysat only takes about 1s. Performance difference matters for science. Working on a slightly tweaked pysatCDF that actually compiles on my modern system.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

I'm developing examples for pysatSeasons, or rather, fixing the existing example. Either way, I've been able to load a 67 day season of C/NOFS IVM data with this branch. Will run pysatNASA unit tests.

  • Tests are run using pysatCDF and Cdflib

Test Configuration

  • Operating system: MacOS
  • Version number: Python3.9
  • Any details about your local setup that are relevant

Checklist:

  • Make sure you are merging into the develop (not main) branch
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • Add a note to CHANGELOG.md, summarizing the changes
  • Update zenodo.json file for new code contributors

@rstoneback
Copy link
Collaborator Author

This goes along with this pysat/pysatCDF#33

@rstoneback
Copy link
Collaborator Author

While trying out the pysatModels example over in pysat/pysatModels#102 turns out that metadata isn't being handled as it should for CINDI, and likely the other NASA instruments.

@rstoneback
Copy link
Collaborator Author

Pushed a commit to pysatCDF and now the match example works fine. Metalabel loading working fine now as well.

@jklenzing jklenzing mentioned this pull request May 3, 2022
18 tasks
@rstoneback
Copy link
Collaborator Author

Eeek!

Collecting pysatCDF

          import numpy as np

[363](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:363)
      ModuleNotFoundError: No module named 'numpy'
[364](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:364)
      [end of output]

@rstoneback
Copy link
Collaborator Author

Good news is tests for this branch locally pass from a pip install of pysatCDF.

@rstoneback
Copy link
Collaborator Author

Eeek!

Collecting pysatCDF

          import numpy as np

[363](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:363)
      ModuleNotFoundError: No module named 'numpy'
[364](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:364)
      [end of output]

@jklenzing I think this is the same error we had with OMMBV. pysatCDF has moved to setup.cfg but since it has actual code in setup.py for the NASA build process there is an import numpy as np at the top. Not sure we can do anything about that in this case. Wasn't there a -no-binary option or something like that somewhere....

@jklenzing jklenzing added this to the 0.0.4 release milestone May 17, 2022
TST: test both pysatCDF and cdflib
@jklenzing
Copy link
Member

I need to think about how the extra tests will be merged in the new test structure. @rstoneback, is this still on the roadmap?

@jklenzing
Copy link
Member

@rstoneback, the pysatCDF install fails on three of the five CI environments. It doesn't work for windows or for the old versions of numpy (this may be due to the env definition, we install pysatCDF then update numpy to an older version).

The current setup should be fine and at least test pysatCDF in the two environments. For the others, we basically run cdflib load tests twice (once while forced, once during nominal runs when pysatCDF fails).

I've pulled pysatCDF out of the requirements lists since this is an optional install, but we should probably have some info on the readme for users who want to employ this option. This means that there will always be a working version for a quick install, and users looking for speed can grab the optional package.

@rstoneback
Copy link
Collaborator Author

rstoneback commented Jul 29, 2022 via email

@jklenzing
Copy link
Member

There's a separate install line in the main action, so it will get installed, just not always correctly.

@rstoneback rstoneback marked this pull request as ready for review August 1, 2022 19:23
@rstoneback
Copy link
Collaborator Author

It sounds like we are as good as it is currently going to get then.

Co-authored-by: Jeff Klenzing <19592220+jklenzing@users.noreply.github.com>
Copy link
Member

@jklenzing jklenzing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tidying up some imports. I think the extra ignore statement in setup.cfg was masking unused imports elsewhere.

pysatNASA/tests/test_instruments.py Outdated Show resolved Hide resolved
pysatNASA/tests/test_instruments.py Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
@@ -30,6 +30,7 @@ jobs:
- name: Install standard dependencies
run: |
pip install -r requirements.txt
pip install pysatCDF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if installing with the no-binary option would help with the bits where it fails?

Copy link
Member

@jklenzing jklenzing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaning up imports based on style used in pysatCDAAC

pysatNASA/tests/test_instruments.py Outdated Show resolved Hide resolved
pysatNASA/tests/test_instruments.py Outdated Show resolved Hide resolved
pysatNASA/tests/test_instruments.py Outdated Show resolved Hide resolved
Copy link
Member

@jklenzing jklenzing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Needs an update to the changelog. Wouldn't hurt to double-check the rest of the checklist.

.github/workflows/main.yml Outdated Show resolved Hide resolved
Copy link
Member

@jklenzing jklenzing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Noting that according to coveralls, pysatCDF still only loads in 2 out of 5 CI environments, but I say we merge and sort out those issues when we can.

@rstoneback rstoneback merged commit 0dd3e50 into develop Aug 3, 2022
@rstoneback rstoneback deleted the optional_pysatcdf branch August 3, 2022 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants