New package #32

Merged
merged 106 commits into from
Apr 21, 2021

wpreimes
Member

Hi,
so here is the version of the package that I have been working on for some time.
It would be good to get feedback and/or to merge and release this.

It uses @sebhahn's classes for stations, networks, sensors etc., but also implements the ISMN_Interface as before (where metadata is stored and loaded, so that using the ISMN data is still fast). This is not fully backward compatible, but I tried to make the new class as similar to the old one as possible, so that people can still use it as before.

Here is a list of changes:

  • Rewrite package, objects for Networks, Stations, Sensors etc.
  • Update ISMN_Interface to use new components, similar behaviour to old interface.
  • Add MetaVar and MetaData modules for ismn metadata handling.
  • Data readers are now collected in filehandler class.
  • Metadata for soil layers is now filtered by the sensor depth.
  • Support reading from zip archive and extracted folders.
  • New python_metadata structure (no absolute paths, no pickle format).
  • Drop support for old ceop format (new ceop_sep format is supported!).
  • Move lookup tables to const.py
  • Update docs and tests.
  • Use GitHub Actions instead of Travis CI.
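
To illustrate the "zip archive and extracted folders" item above, here is a minimal sketch of how one reader function can serve both cases. It uses only the standard library and is not the package's own filehandler code; the function names and the file layout (`NETWORK/station/sensor.stm`) are assumptions made for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def read_text(root, relative_path):
    """Read a text file from either a zip archive or an extracted folder.

    `root` is the archive file or the folder; `relative_path` uses forward
    slashes, which is also how zip member names are written.
    """
    root = Path(root)
    if root.is_file() and zipfile.is_zipfile(root):
        with zipfile.ZipFile(root) as archive:
            return archive.read(relative_path).decode("utf-8")
    return root.joinpath(*relative_path.split("/")).read_text(encoding="utf-8")


def demo():
    """Store the same file in a folder and in a zip, then read it from both."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        member = "NETWORK/station/sensor.stm"  # hypothetical layout
        folder = tmp / "extracted"
        (folder / "NETWORK" / "station").mkdir(parents=True)
        (folder / member).write_text("2021/04/21 00:00 0.25\n", encoding="utf-8")
        archive_path = tmp / "data.zip"
        with zipfile.ZipFile(archive_path, "w") as archive:
            archive.write(folder / member, arcname=member)
        return read_text(folder, member), read_text(archive_path, member)
```

Because the caller only ever passes a root and a relative path, the same metadata can describe a zipped download and its extracted copy.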

Examples on using it are in https://github.com/wpreimes/ismn/blob/master/docs/read_and_plot_ismn_data/interface.ipynb

Things that people might dislike compared to the old package:

  • Metadata generation takes longer and now starts multiple processes, but it no longer includes absolute paths, so the metadata can be shared and moved with the data, and reprocessing it should rarely be necessary.
  • Filehandlers / metadata are NOT loaded from pickled objects anymore; metadata is stored as csv. The disadvantage is that every time the interface class is initialised, we have to iterate over the metadata and build the filehandlers (which takes a few seconds for global data; for smaller sets, or if only a few networks are selected, it is faster): https://github.com/wpreimes/ismn/blob/master/src/ismn/filecollection.py#L275
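
The trade-off described in the two points above can be sketched roughly as follows. This is only an illustration of the idea (csv rows with relative paths, resolved against the data root on every initialisation), not the code in `filecollection.py`; the column names, file names and the `build_handlers` function are hypothetical.

```python
import csv
import io
from pathlib import Path

# Hypothetical metadata table: one row per sensor file, with the path stored
# relative to the data root instead of as an absolute path.
METADATA_CSV = """network,station,file_path
NET_A,station1,NET_A/station1/sensor_sm_0.05.stm
NET_A,station2,NET_A/station2/sensor_sm_0.10.stm
NET_B,station3,NET_B/station3/sensor_sm_0.05.stm
"""


def build_handlers(csv_text, data_root, networks=None):
    """Iterate over the metadata rows and build one record per sensor file.

    Paths are resolved against `data_root` at load time, so the csv stays
    valid when the data is moved. Passing `networks` skips all other rows,
    which is why loading a subset is faster than loading everything.
    """
    data_root = Path(data_root)
    handlers = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if networks is not None and row["network"] not in networks:
            continue
        handlers.append({
            "network": row["network"],
            "station": row["station"],
            "path": data_root.joinpath(*row["file_path"].split("/")),
        })
    return handlers
```

Loading the same csv with a different `data_root` yields records that point at the new location, which is the property the pickled metadata with absolute paths did not have.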

@daberer @Adeaem @sebhahn any feedback?

Let me know what you think we should change. This is probably not perfect yet, but I think the new metadata handling, relative paths and reading from zip in particular are very much needed.

@wpreimes
Member Author

This includes #12 and #31.

@wpreimes
Member Author

image

@wpreimes wpreimes marked this pull request as draft February 8, 2021 19:34
@wpreimes
Member Author

wpreimes commented Feb 8, 2021

Some feedback I got:

  • Check network/station names that have _ or - in name: The correct version is the folder name or the name in the file directly. Still, we should make it so that both names work...
  • Add activate_network function to ISMN_Interface
  • Add option to ISMN_Interface.get_dataset_ids to group the returned ids by networks (e.g. to distribute them to multiple parallel jobs)
  • For a few stations the metadata is not generated correctly
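
As a rough idea of what the grouping option in the list above could return, here is a minimal sketch. It does not use the real `get_dataset_ids` signature; the function name and the input mapping from dataset id to network name are assumptions for the example.

```python
from collections import defaultdict


def group_ids_by_network(id_to_network):
    """Group dataset ids by the network they belong to.

    `id_to_network` is a hypothetical mapping {dataset_id: network_name}.
    Returns {network_name: sorted list of dataset ids}, so that each list
    can be handed to a separate parallel job.
    """
    groups = defaultdict(list)
    for dataset_id, network in id_to_network.items():
        groups[network].append(dataset_id)
    return {network: sorted(ids) for network, ids in groups.items()}
```

Each value of the returned dictionary is an independent work package, so one job per network never touches the files of another network.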

wpreimes and others added 7 commits February 14, 2021 22:14
…groupby metadata field when getting dataset ids
This commit solves the issue of the mismatch between the network name in the IsmnFileCollection class and the name in the folders.
* The name is now parsed from the folder name
* The tests have been updated to check this
* A sensor from the network FR_Aqui has been included in the test dataset to check the mismatch FR_Aqui/FR-Aqui
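
The feedback that both spellings should work (FR_Aqui and FR-Aqui) could be handled with a lookup along these lines. This is a sketch under the assumption that "_" and "-" are the only difference between the two spellings; the function names are hypothetical and not part of the package.

```python
def canonical(name):
    """Treat '-' and '_' as the same character when comparing names."""
    return name.replace("-", "_")


def find_network(requested, folder_names):
    """Return the folder name that matches `requested`, ignoring '-' vs '_'.

    The folder name is returned unchanged, since it is the version parsed
    from the data. Raises KeyError if nothing matches and ValueError if the
    request is ambiguous.
    """
    wanted = canonical(requested)
    matches = [name for name in folder_names if canonical(name) == wanted]
    if not matches:
        raise KeyError(requested)
    if len(matches) > 1:
        raise ValueError(f"ambiguous network name: {requested!r}")
    return matches[0]
```

With this approach the folder name stays the single stored spelling, and the alternative spelling is only accepted at lookup time.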
@wpreimes
Member Author

Each of the following stations has 1 sensor that causes an error in the metadata generation:
FLUXNET-AMERIFLUX\VairaRanch
MONGOLIA\Khutag-under
MONGOLIA\Murun
OZNET\Uri-Park
RISMA\CEF
RISMA\MB10
RISMA\MB1
RISMA\MB11
RISMA\MB4
RISMA\MB2
RISMA\MB5
RISMA\MB6
RISMA\MB8
RISMA\ON4
RISMA\ON5
RISMA\SK1
RISMA\ON6
RISMA\SK2
RISMA\SK4
RUSWET-AGRO\Latvia
RUSWET-AGRO\Oshskaya
RUSWET-AGRO\SyrdarinskayaO
SCAN\Vermillion
SNOTEL\BILLIECREEKDIVIDE

@wpreimes wpreimes marked this pull request as ready for review March 15, 2021 17:03
@wpreimes
Member Author

I think the metadata issue is due to the csv files, and catching errors is what the reader should do, and does, in those cases. So this is open for merging again.


@property
def grid(self):
    return self.networks.grid
Contributor

Shouldn't this be self.collection.grid ?

@wpreimes
Member Author

wpreimes commented Mar 16, 2021 via email

@daberer daberer merged commit 81fe679 into TUW-GEO:master Apr 21, 2021
@wpreimes
Member Author

wpreimes commented Apr 21, 2021

@daberer thanks for merging. Now I'd suggest you check everything again, docs and build. Then, when you are happy with everything and everything is ready for a new release (which should be a new major version, i.e. 1.0), you can try the release script:
https://github.com/TUW-GEO/ismn/blob/master/.github/workflows/release.yml

This will trigger when you draft a new release on GitHub (right side, "Releases" on the overview page), but before that you have to change the owner name in https://github.com/TUW-GEO/ismn/blob/master/.github/workflows/release.yml#L14 from wpreimes to TUW-GEO, and add the PyPI token to the GitHub Actions secrets (unfortunately I don't have the rights to get a token from PyPI).

4 participants