Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Use Pooch to download sample data #915

Merged
merged 2 commits into from Aug 23, 2018
Merged

WIP: Use Pooch to download sample data #915

merged 2 commits into from Aug 23, 2018

Conversation

leouieda
Copy link
Contributor

@leouieda leouieda commented Aug 10, 2018

Add it to metpy/cbook.py and include the dependency on setup.py. Create the registry file using pooch.make_registry and stored it in the metpy package.

Some tests fail because they use a glob to get all the files in the nids folder. Will try to replace that with a lookup of the Pooch registry.

I'm also not sure how to get it installed on the CIs. I added it to the setup.py. Is that enough?

Fixes fatiando/pooch#8

@leouieda
Copy link
Contributor Author

The CircleCI build is complaining about netcdf4 but it passes locally. Any ideas?

@jrleeman
Copy link
Contributor

CircleCI is an expected fail right now - we're trying to get this in today/over the weekend. Working on it this afternoon.

@dopplershift dopplershift added Area: Infrastructure Pertains to project infrastructure (e.g. CI, linting) Type: Feature New functionality labels Aug 17, 2018
@dopplershift dopplershift added this to the 0.9 milestone Aug 17, 2018
@dopplershift dopplershift added Area: DevTools Pertains to tools for developers and removed Area: Infrastructure Pertains to project infrastructure (e.g. CI, linting) labels Aug 17, 2018
@jrleeman jrleeman mentioned this pull request Aug 17, 2018
Copy link
Member

@dopplershift dopplershift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! This is a great start, just a few things I'd like to see.

MANIFEST.in Outdated
@@ -5,6 +5,7 @@ include LICENSE
include CONTRIBUTING.md
include versioneer.py
include metpy/_version.py
include metpy/registry.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something more descriptive--maybe static-data-manifest.txt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be better. @jrleeman should I fix this or are you gonna edit this branch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editing now 😄

environment.yml Outdated
@@ -35,3 +35,5 @@ dependencies:
- cartopy
- doc8
- recommonmark
- pip:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's vendor in pooch temporarily while things settle just a bit and we work on things like conda packages. Without conda packages for pooch, we won't even be able to update MetPy's packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about that. I think I'll get started on the conda-forge package and we make new releases later with fixes required by metpy.

metpy/cbook.py Outdated

def get_test_data(fname, as_file_obj=True):
"""Access a file from MetPy's collection of test data."""
path = POOCH.fetch(fname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have a way to use our local git checkout of files directly so we can more easily bootstrap adding new files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that while implementing this. The problem is that the directory is versioned. I got around it on the CIs by copying the data to the cache folder. That bypasses the download at least. A static link would be better locally to keep the folders in sync.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way to do this is check "are we in a git repo?" and if so, then we'll set the environment variable to point the cache to the staticdata directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wouldn't work because of the versioning. The env variable only points to the base level directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just got a workaround going - verifying now.

Copy link
Contributor

@jrleeman jrleeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we're actually going to put some hooks in to look into the github directory, if it exists, first. We'll also do some checking to make sure those files do get into the registry.

.travis.yml Outdated
@@ -108,8 +109,9 @@ install:
- pip install ".[$EXTRA_INSTALLS]" --upgrade --upgrade-strategy=eager --no-index $PRE -f file://$PWD/$WHEELDIR $VERSIONS;

script:
# Move the data to the cache location to avoid downloading it
- cp -r $TRAVIS_BUILD_DIR/staticdata/* $TEST_DATA_DIR/master
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out Travis doesn't like indented comments. This will keep it from downloading the data every time.

@leouieda
Copy link
Contributor Author

So I think we're actually going to put some hooks in to look into the github directory, if it exists, first.

Wouldn't it be better to link staticdata to $TEST_DATA_DIR/master? No coding required.

@@ -0,0 +1,48 @@
# pylint: disable=missing-docstring
# Import functions/classes to make the API
from . import version

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 '.version' imported but unused

# pylint: disable=missing-docstring
# Import functions/classes to make the API
from . import version
from .core import Pooch, create

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 '.core.create' imported but unused
F401 '.core.Pooch' imported but unused

# Import functions/classes to make the API
from . import version
from .core import Pooch, create
from .utils import os_cache, file_hash, make_registry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 '.utils.os_cache' imported but unused
F401 '.utils.file_hash' imported but unused
F401 '.utils.make_registry' imported but unused

dirname = os.path.basename(root)
if dirname.startswith(parentdir_prefix):
return {
"version": dirname[len(parentdir_prefix) :],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E203 whitespace before ':'

# starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of
# just "foo-1.0". If we see a "tag: " prefix, prefer those.
TAG = "tag: "
tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E203 whitespace before ':'

for ref in sorted(tags):
# sorting will prefer e.g. "2.0" over "2.0rc1"
if ref.startswith(tag_prefix):
r = ref[len(tag_prefix) :]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E203 whitespace before ':'

tag_prefix,
)
return pieces
pieces["closest-tag"] = full_tag[len(tag_prefix) :]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E203 whitespace before ':'

"""
registry = {
"tiny-data.txt": "baee0894dba14b12085eacb204284b97e362f4f3e5a5807693cc90ef415c1b2d",
"subdir/tiny-data.txt": "baee0894dba14b12085eacb204284b97e362f4f3e5a5807693cc90ef415c1b2d",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E501 line too long (99 > 95 characters)

# Copyright (c) 2015 MetPy Developers.
# Distributed under the terms of the BSD 3-Clause License.
# SPDX-License-Identifier: BSD-3-Clause
"""Vendor packages used in MetPy."""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W292 no newline at end of file

@jrleeman
Copy link
Contributor

@dopplershift - any ideas on why codecov isn't ignoring this? Once things look better, I'll cleanup and rebase.

@dopplershift
Copy link
Member

Looks like globs for codecov are not recursive. They give a solution in the linked docs.

@leouieda
Copy link
Contributor Author

I started conda-forge/staged-recipes#6540 so once that is done we probably don't need the vendoring, right?

@jrleeman
Copy link
Contributor

We'll probably vendor for this version of MetPy to make sure all the dust settles and then go to a conda forge install in 0.10.

@leouieda
Copy link
Contributor Author

👍 just in case you want to try it out, the feedstock is ready https://github.com/conda-forge/pooch-feedstock

@Unidata Unidata deleted a comment from codecov bot Aug 22, 2018
@@ -0,0 +1,557 @@
# pylint: skip-file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to nuke this file and version.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. That's not used in the rest of the package. It is used in one of the tests (https://github.com/fatiando/pooch/blob/master/pooch/tests/test_integration.py#L13) but if you're not running those you could also nuke the entire tests folder.

'dev'

"""
parse = Version(version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would work as:

return fallback if '+' in version else version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, using packaging is probably overkill here.

@leouieda
Copy link
Contributor Author

leouieda commented Aug 22, 2018

What's the advantage of vendoring vs pinning the requirement to a given version? If the concern is pooch updating and breaking things, the pinning should solve that. Plus, nothing else relies on pooch so it wouldn't block anything else from updating (like numpy would).

That's an honest question, BTW. I had to look up "vendor" on wikipedia :)

@dopplershift
Copy link
Member

You make an interesting point about pinning...

leouieda and others added 2 commits August 23, 2018 10:44
Add it to `metpy/cbook.py` and include the dependency on setup.py.
Create the registry file using `pooch.make_registry` and stored it in
the `metpy` package.
@jrleeman
Copy link
Contributor

Okay, just did what should be the final history cleanup here - per review and CI this is ready to rock 🍾

@dopplershift dopplershift merged commit 7ccd12e into Unidata:master Aug 23, 2018
@dopplershift dopplershift mentioned this pull request Aug 23, 2018
@leouieda
Copy link
Contributor Author

🎉 thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: DevTools Pertains to tools for developers Type: Feature New functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants