Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Jupyter Notebooks Section: MPAS Plot with Datashader and Geoviews #425

Merged
merged 16 commits into from
Feb 15, 2022

Conversation

erogluorhan
Copy link
Collaborator

@erogluorhan erogluorhan commented Feb 9, 2022

This PR adds @clyne 's Jupyter notebook demonstration of interactive (pan, zoom, etc.) MPAS plotting with Datashader and Holoviews-Geoviews packages.

See here to explore how the output of this PR looks like.

@clyne @anissa111 @michaelavs and others, please let me know what you think, if you find anything missing/wrong with this draft PR, etc.

Due to the nature of this development, there are significant additions to GeoCAT-examples by this PR. Below list shows those as a checklist where the checked off items are currently implemented in this draft PR, but the PR should better be merged with all/most of them checked:

  • Add "nbsphinx" dependency to make galleries directly out of Jupyter notebooks (for now, those that are pre-executed ) just like sphinx-gallery does with executing .py files

    • See conda_environment.yml and conf.py for details
  • Add a new page "Notebook Gallery" to the same level with "Gallery" or "GeoCAT-comp examples" in GeoCAT-examples gallery
    - See docs/index.rst and docs/gallery_nb.rst

  • Add the first Jupyter notebook example "docs/gallery-notebooks/Datashader/MPAS_Datashader_Trimesh.ipynb"
    - It reads in 30km DYAMOND MPAS data from geocat-datafiles

  • Modify installation.rst to describe how readers can execute Jupyter notebooks examples in their local

  • Explore Dask parallelization for this notebook on 3.75 km dyamond_1 dataset (@anissa111 I might ask your help on this as my first attempt didn't end up with better performance than the current implementation). This may be done after merging this PR though.

  • Figure out how we can read-in much bigger datasets. This is not the scope of this PR but just for future reference.

  • Apply dynamically resizing margins in the document pages (what @erinlincoln did for GeoCAT-viz or @hCraker did for GeoCAT-comp) so that we can show larger plots, and the entire content could dynamically adjust to the screen size.

  • Add a download link for the notebook

  • Add a Binder link

  • As a minor implementation, this PR also adds the NSF logo to the doc homepage.

@hCraker
Copy link
Contributor

hCraker commented Feb 9, 2022

RE: dynamically resizing margins
Hi @erogluorhan. It was not intentional to leave GeoCAT-examples as is. @erinlincoln and I made the improvements for -viz and -comp as we were working on those repo, but we never got around to doing the same for -examples. I believe we can make this improvement in -examples without any issue.

@erogluorhan
Copy link
Collaborator Author

RE: dynamically resizing margins Hi @erogluorhan. It was not intentional to leave GeoCAT-examples as is. @erinlincoln and I made the improvements for -viz and -comp as we were working on those repo, but we never got around to doing the same for -examples. I believe we can make this improvement in -examples without any issue.

Great to know; thanks a lot for input!

@jukent
Copy link
Collaborator

jukent commented Feb 9, 2022

What you have so far looks good! Please ignore my review as it will be stale soon I imagine.

Copy link
Collaborator

@clyne clyne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, @erogluorhan! Thanks for putting this together.

A few general comments:

  1. It would be nice to add a download link for the notebook if possible
  2. Maybe consider changing the image width height to something on the
    order of 1200x600. There is a lot of detail in this data and it is
    kind of hard to appreciate at 600x300.
  3. There is a comment under "Load data and coordinates" about figuring
    out how to provide access to larger data. I'm not sure that this
    is entirely necessary. The example certainly demonstrates how to
    plot an MPAS data set with the 30km data. Using higher resolution
    data is simply a matter of changing the path.
  4. For the future you might consider something like nbautoexport to facilitate
    PR reviews of notebooks.

Typos:

"is note used" -> "is not used"

"splits a global" -> "splits a global mesh"

"optimized performance" -> "optimized machine code"

"resolution date" -> "resolution data"

docs/gallery_nb.rst Outdated Show resolved Hide resolved
@clyne
Copy link
Collaborator

clyne commented Feb 10, 2022

Explore Dask parallelization for this notebook on 3.75 km dyamond_1 dataset (@anissa111 I might ask your help on this as my first attempt didn't end up with better performance than the current implementation). This may be done after merging this PR though.

I struggled with getting much benefit Dask either. However, if I monitored the process table while the notebooks was running there were times when multiple cores were clearly active. This might be an issue to follow up on with Jim Bednar. Jim's team were instrumental in helping me get the first version of this notebook together, and it might be a good idea to get their input.

@erogluorhan
Copy link
Collaborator Author

erogluorhan commented Feb 10, 2022

This is awesome, @erogluorhan! Thanks for putting this together.

A few general comments:

  1. It would be nice to add a download link for the notebook if possible
  2. Maybe consider changing the image width height to something on the
    order of 1200x600. There is a lot of detail in this data and it is
    kind of hard to appreciate at 600x300.
  3. There is a comment under "Load data and coordinates" about figuring
    out how to provide access to larger data. I'm not sure that this
    is entirely necessary. The example certainly demonstrates how to
    plot an MPAS data set with the 30km data. Using higher resolution
    data is simply a matter of changing the path.
  4. For the future you might consider something like nbautoexport to facilitate
    PR reviews of notebooks.

Typos:

"is note used" -> "is not used"

"splits a global" -> "splits a global mesh"

"optimized performance" -> "optimized machine code"

"resolution date" -> "resolution data"

This is great input; thanks a lot @clyne! Pleas find my responses below to the general comments:

  1. Yes, this is something I attempted to have but couldn't get it done with nbsphinx yet. For now, I have put this into the checklist of this PR not to forget before merge.
  2. Yes, I agree, and indeed I wanted to do that but the current narrow margin of the page hadn't allowed me to make plots bigger and look fine. Now, I pushed the dynamic resizing feature for the pages, so I also made plots 1200x600. Please check the newly generated RTD page.
  3. Rephrased.
  4. I'd need to explore nbautoexport a bit. Thanks a lot for the recommendation!
  • Typos are all fixed. Thanks for catching these!

@erogluorhan
Copy link
Collaborator Author

Explore Dask parallelization for this notebook on 3.75 km dyamond_1 dataset (@anissa111 I might ask your help on this as my first attempt didn't end up with better performance than the current implementation). This may be done after merging this PR though.

I struggled with getting much benefit Dask either. However, if I monitored the process table while the notebooks was running there were times when multiple cores were clearly active. This might be an issue to follow up on with Jim Bednar. Jim's team were instrumental in helping me get the first version of this notebook together, and it might be a good idea to get their input.

Indeed! I am hoping to get some help from @anissa111 's Dask knowledge on this. (FYI: In my own experience, the execution of Example 1's rasterize block on 3.75 km resolution data took 1 minute in my local and 2:30 minutes with Dask LocalCluster I later commented out).

After discussing this with Anissa, I'd further discuss with Jim et al.

Copy link
Collaborator

@clyne clyne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be great to have this addition!

@erogluorhan erogluorhan marked this pull request as ready for review February 14, 2022 13:59
Copy link
Contributor

@michaelavs michaelavs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I was able to build it locally and see the changes reflected correctly. Definitely a very cool update to add to the gallery!

@erogluorhan
Copy link
Collaborator Author

@anissa111 If you are planning to give this PR a review, please make sure to review the latest changes on the

  • INSTALLATION.md and instal.rst files to reflect the new installation instructions for Jupyter notebooks
  • The renaming of the "\Plots" folder to "\Gallery" (Ths refactoring added several py files to this PR as if they are changed, but nothing's changed in them actually). This renaming could have been left to a new PR, but the modification in INSTALLATION.md triggered me to address it here.

If/oce you approve this PR, I am planning to merge it as is and create new issue(s) for the items that were not checked off in the description (Binder, etc).

@erogluorhan erogluorhan merged commit 508e022 into main Feb 15, 2022
@hCraker hCraker deleted the interactive branch July 14, 2022 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants