Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kvl example ok meuse #28

Merged
merged 8 commits into from Dec 21, 2016
Merged

Conversation

kvanlombeek
Copy link
Contributor

Addition of an example how to use the PyKrige package, using the spatial dataset http://spatial-analyst.net/book/meusegrids

@basaks
Copy link
Collaborator

basaks commented Dec 19, 2016

@kvanlombeek , this is a nice example. I quite like it.

A few suggestions:

  1. It adds 10MB+ to the git. May be you can write a python/shell script to download the data rather than add to git.
  2. Clarify that this needs shapely, pandas and whatever else. These are rather large packages, and should not be part of the main package, or, at least, not yet.
  3. Use the examples directory to commit just the notebook and any associated scripts? In fact, consider making gridsampler.py part of the notebook as well.

@rth @bsmurphy may have further suggestions.

@kvanlombeek
Copy link
Contributor Author

Hi @basaks

Thanks for your positive feedback. I have a few questions about your suggestions:

  1. I will do.
  2. What do you mean with clarifying and that the packages pandas and shapely should not be part of the main package? Are you opposed to importing them in this example notebook?
  3. Ok.

@basaks
Copy link
Collaborator

basaks commented Dec 19, 2016

hi @kvanlombeek ,
on no 2, I meant to say that pandas and shapely are not required for PyKrige, but may be you want to clarify that these are required to be installed for this notebook example.

@kvanlombeek
Copy link
Contributor Author

hi @basaks ,

  • The meuse.zip file is now dowloaded and unzipped by python
  • The meuse data files folder, meuse_example_data, is added to the .gitignore file
  • The code to sample a grid is placed in a notebook cell

I don't understand your request to clarify why I need pandas and shapely however. Some interpretations that I see:

  • Clarify here on github to you why I import the packages? Would you rather have me finding ways around so I would not have to import these packages?
  • Clarify inside the code with comments to new users why I import these packages?

I am a bit surprised by this request, as when I follow data science tutorials with notebooks, I have never seen anybody explaining why they import pandas for example.

But I probably just don't understand your request, and am happy to implement it once I understand it.

@rth
Copy link
Contributor

rth commented Dec 20, 2016

@kvanlombeek That's a great example, thanks for contributing!

I don't understand your request to clarify why I need pandas and shapely however.
I am a bit surprised by this request, as when I follow data science tutorials with notebooks, I have never seen anybody explaining why they import pandas for example.

Yeah, it's just that for libraries the general assumption is that once you install the library, you can run the examples without installing anything else (see for instance scikit-learn's examples/ folder, and because PyKrige aims to be a lightweight library we can't just add new dependencies easily.
This notebook however is indeed closer to a tutorial (and it's great) so if you just add in the introduction of the notebook something along the lines that this notebook requires PyKrige 1.4 with additional packages that can be installed with,

     pip install geopandas folium seaborn

(this will also install pandas, pyproj, descartes, click), that should work. Maybe also specify the versions of those dependencies that you are using? just in case they have some backward incompatible updates in the future, and it's not clear which version should be used.

Just a comment regarding PEP8 code style; for functions, the usual code style is f(x=2, y=3, ..) (i.e. no space around = and space after ,). This applies to cells no 8, 12, 13, 15, 17, 20.

Also the person who is going to merge this should use "Squash and merge" button (I think), otherwise the deleted 10MB of files will still end up in the git history of the master branch.

Besides the main notebook is 400kB in size (mostly, I imagine due to images). As far as I understand, after running it on your computer, git will detect it as different, and will try to add the new generated images with the next commit (adding 400kB to the repo size), so care must be taken not to update this notebook (particularly with minor changes) too often. Or maybe we should adapt some ideas from this post in the future... cc @bsmurphy , @basaks

@kvanlombeek
Copy link
Contributor Author

@rth Ok clear, thanks for the great feedback.

Regarding the last point, I can clear all notebook outputs just before saving, this will significantly reduce the size of the file. The downside is that I believe you can't view the notebook and its output in the browser in the github repo, which is a nice feature to convince new users to use this repo. I will ask a git expert if there is a way to have the benefits of both options.

@rth
Copy link
Contributor

rth commented Dec 20, 2016

The downside is that I believe you can't view the notebook and its output in the browser in the github repo, which is a nice feature to convince new users to use this repo.

yes, it's quite nice to have the preview on github (though the map plots don't seem to render)...

Also maybe it would be best rename the example to meuse_kriging.ipynb (or something more explicit than "Example kriging")?

@kvanlombeek
Copy link
Contributor Author

@rth It is good that we talk about this:

  • If we leave the notebook with the outputs:
    -- Downside: Takes around 400 kb every time it is updated. I plan to do some more commits, as I have opened a few issues and I would like to illustrate some of these kriging options in the notebook. So this can make the github repo indeed heavy.
    -- Upside: we can use something like http://nbviewer.jupyter.org/github/kvanlombeek/PyKrige/blob/kvl_example_ok_meuse/examples/Example%20kriging.ipynb to show an render of the latest version of the notebook

  • We clear all the outputs:
    -- Downside: you don't get to see the notebook rendered. In that case I will make myself a repo with the rendered example. This link can then be posted inside this repo, but it has the disadvantage that the rendered example might be out of sync with the latest version of the original package
    -- Upside: filesize is a lot smaller.

Hope this is clear, up to you guys!

@kvanlombeek
Copy link
Contributor Author

I went for the second option, let me know if there are still issues.

@basaks
Copy link
Collaborator

basaks commented Dec 21, 2016

LGTM @kvanlombeek .
Would you simply squash all your commits into one commit? Makes it easier to merge :)

@rth
Copy link
Contributor

rth commented Dec 21, 2016

LGTM as well.
We can always go with option A with full figures later, once you add all the kriging options you wanted to add.
We can squash this at merge, so it's OK (particularly since it's on a separate branch on your fork not the master branch).

@rth rth merged commit 2585fe8 into GeoStat-Framework:master Dec 21, 2016
@kvanlombeek kvanlombeek deleted the kvl_example_ok_meuse branch December 21, 2016 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants