New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the data management toolbox #369

Merged
merged 1 commit into from Oct 21, 2015

Conversation

Projects
None yet
3 participants
@cdeil
Member

cdeil commented Oct 20, 2015

This PR improves the data management toolbox in Gammapy.
The setup is a bit elaborate, but I think it will work nicely for all HESS FITS data, and hopefully it's intuitive and simple to use.

The main idea is to introduce a file index table containing the actual filenames.
This extra level of indirection means that Gammapy doesn't have to support multiple file / folder layouts, it's up to the data provider (HESS in this case) to produce a file index table in the appropriate format (which I'm doing for HAP and PARISANALYSIS).
It will also make it faster and easier to download, check and select data.

I've also implemented #354 here by bundling pathlib (this version: https://pypi.python.org/pypi/pathlib2/, https://github.com/mcmtroffaes/pathlib2/blob/develop/pathlib2.py) as gammapy/extern/pathlib.py.

@cdeil cdeil self-assigned this Oct 20, 2015

@cdeil cdeil added this to the 0.4 milestone Oct 20, 2015

@@ -12,6 +12,10 @@
## from the list of packages for which version numbers are displayed
## when running the tests ... this was added in Astropy 1.0
try:
# Removing h5py for now to get rid of this travis-ci fail
# https://travis-ci.org/gammapy/gammapy/jobs/86565582#L1074
del PYTEST_HEADER_MODULES['h5py']

This comment has been minimized.

@cdeil

cdeil Oct 21, 2015

Member

@astrofrog @mwcraig Since today I'm getting this error from travis-ci:
https://travis-ci.org/gammapy/gammapy/jobs/86570163#L1061

INTERNALERROR>   File "h5py/h5t.pxd", line 14, in init h5py._conv (/home/ilan/minonda/conda-bld/work/h5py/_conv.c:7364)
INTERNALERROR>   File "h5py/numpy.pxd", line 66, in init h5py.h5t (/home/ilan/minonda/conda-bld/work/h5py/h5t.c:20460)
INTERNALERROR> ValueError: numpy.dtype has the wrong size, try recompiling

I tried del PYTEST_HEADER_MODULES['h5py'] in gammapy/conftest.py, but am still getting the error, i.e. pytest is still trying to import h5py to get it's version number.

Have you seen this issue with Astropy or other affiliated packages?
Any idea how to resolve it?

This comment has been minimized.

@astrofrog

astrofrog Oct 21, 2015

Contributor

I think they updated the h5py package to one that's Numpy-version independent, which seems like a bug, so I opened an issue a few hours ago: ContinuumIO/anaconda-issues#486

I'm seeing this in other packages.

I think there's nothing we can do but wait until they fix it, unless we force the version to an older one.

This comment has been minimized.

@cdeil

cdeil Oct 21, 2015

Member

@astrofrog - Thanks.

But surely there should be a way or at least workaround to run Astropy and affiliated package tests even if h5py is broken. Having a dependency which isn't even used in most affiliated package break their continuous integration is bad, no?

If this is really not possible at the moment, I think we should open a ticket to change the setup.
Should this be in the Astropy or Astropy-helpers tracker?

This comment has been minimized.

@astrofrog

astrofrog Oct 21, 2015

Contributor

Does simply uninstalling h5py from the Travis environment fix things?

This comment has been minimized.

@cdeil

cdeil Oct 21, 2015

Member

A simple conda uninstall h5py broke the builds where h5py wasn't installed, because then the uninstall commands returns a non-zero exit code. Not sure how to put this in an if statement to make it work for all builds.
@astrofrog Any advice how to achieve this in the bash script?

Another thing I tried was to remove h5py from the conda install command for optional dependencies.
Unfortunately it seems it still gets installed for some reason:
https://travis-ci.org/gammapy/gammapy/jobs/86619175#L1080
It's a bit frustrating that the executed commands don't show up in the travis-ci logs.
That would help figuring out what's going on.
@astrofrog Any idea how to achieve that?

This comment has been minimized.

@astrofrog

astrofrog Oct 21, 2015

Contributor

@cdeil - I think it's not worth trying to get a workaround. I heard back from Continuum, and the h5py package will be fixed today. So patience is probably the best workaround :)

@cdeil cdeil force-pushed the cdeil:datastore-hess branch from b28ddc6 to da6b870 Oct 21, 2015

@cdeil cdeil force-pushed the cdeil:datastore-hess branch from da6b870 to 38e8ec8 Oct 21, 2015

cdeil added a commit that referenced this pull request Oct 21, 2015

Merge pull request #369 from cdeil/datastore-hess
Improve the data management toolbox

@cdeil cdeil merged commit 08c9f29 into gammapy:master Oct 21, 2015

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
@joleroi

This comment has been minimized.

Contributor

joleroi commented Oct 27, 2015

Hi,

I am testing the DataManager/DataStore at the moment, I found some issues and I am not sure wheater I should fix them or if this is stuff that you're working on

I left some inline comments

elif path2.is_file():
filename = path2
else:
raise FileNotFoundError('File not found at {} or {}'.format(path1, path2))

This comment has been minimized.

@joleroi

joleroi Oct 27, 2015

Contributor

For me (using Python 2.7.10) this raises

NameError: global name 'FileNotFoundError' is not defined

This comment has been minimized.

@cdeil

cdeil Oct 27, 2015

Member

I already fixed this in #364 and changed quite a few other things (e.g. how test data is accessed).
Probably easiest if I merge that PR and then we briefly talk and both continue from master ...

This comment has been minimized.

@cdeil

cdeil Oct 27, 2015

Member

PS: I'll try to get #364 to a state where travis-ci passes and it can be merged before lunch ...

This comment has been minimized.

@joleroi

joleroi Oct 27, 2015

Contributor

Sounds good to me

Command line tools
------------------
* ``gammapy-data-manage`` -- Manage data locally and on servers

This comment has been minimized.

@joleroi

joleroi Oct 27, 2015

Contributor

this points to 'gammapy-data-manage = gammapy.obs.data_manager:main', which does not exist

This comment has been minimized.

@cdeil

cdeil Oct 27, 2015

Member

This command line tool to manage data (download files, generate test files) doesn't exist yet.
I've uncommented that line and added a TODO comment in #364

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment