Pangeo training material for Big data geosciences #3147

annefou · 2022-01-28T09:26:10Z

We have:

pangeo: Pangeo ecosystem 101 for everyone - Introduction to Xarray Galaxy Tools
pangeo-notebook: Pangeo Notebook in Galaxy - Introduction to Xarray

The first one (pangeo 101) is meant for anyone and does not require any programming skills (it uses Galaxy Tools). It shows what Pangeo and its community are and how to use the Xarray tools in Galaxy.

The second one (pangeo notebook) makes use of the Pangeo JupyterLab interactive tool and is an introduction to Xarray for those who have basic Python programming skills.

yvanlebras · 2022-01-28T10:30:58Z

Amazing! Thank you, Anne! Really top! I did a quick first review and opened a PR on the nordicESMhub repo, but I think I did something wrong: one PR on the main branch and another on the correct pangeo one... Don't hesitate to ask if you have doubts about how to manage it ;)

hexylena · 2022-01-28T11:00:12Z

The second one (pangeo notebook) makes use of the Pangeo JupyterLab interactive tool and is an introduction to Xarray for those who have basic Python programming skills.

Fyi @annefou there is a new format you can opt-in to using, that generates the ipynb files automatically. You can see it in action here: https://training.galaxyproject.org/training-material/topics/data-science/ Anything tagged jupyter-notebook or rmarkdown-notebook has these files automatically generated from its GTN content, if that's interesting to you.

annefou · 2022-01-28T11:11:35Z

Fyi @annefou there is a new format you can opt-in to using, that generates the ipynb files automatically.

Wow!!! This is so cool!!! I was looking for something like that!!! I definitely want it.
Thank you so much!

hexylena · 2022-01-28T11:25:19Z

Oh, I even wrote documentation! https://training.galaxyproject.org/training-material/topics/contributing/tutorials/create-new-tutorial-content/tutorial.html#automatic-jupyter-notebooks

"Do not use the built in citation system" is also outdated; citations work now.

annefou · 2022-02-18T13:34:08Z

Thank you @yvanlebras and Solenne! I'll address the few pending comments (from the previous review too). Many thanks for reviewing this material!

annefou · 2022-02-18T17:28:22Z

Ok. So I think I took into account all your comments. Thanks a lot for your review!

yvanlebras · 2022-02-18T19:45:20Z

Amazing! I will try to test this final version now and validate it! If you think you can, @annefou, you could test mine ;) #3152, that would be amazing!!!! Have a nice weekend!

annefou · 2022-02-18T19:47:17Z

Amazing! I will try to test this final version now and validate it! If you think you can, @annefou, you could test mine ;) #3152, that would be amazing!!!! Have a nice weekend!

Cool. Yes I can review your training material! Thanks.

topics/climate/tutorials/pangeo-notebook/tutorial.md

yvanlebras · 2022-02-18T22:03:43Z

Really sorry... now that I have the dataset, I get an error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/data/jwd/main/041/778/41778869/tmp/ipykernel_354/4122857281.py in <module>
----> 1 dset = xr.open_dataset("CAMS-PM2_5-20211222.netcdf")

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    477 
    478     if engine is None:
--> 479         engine = plugins.guess_engine(filename_or_obj)
    480 
    481     backend = plugins.get_backend(engine)

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/plugins.py in guess_engine(store_spec)
    150         )
    151 
--> 152     raise ValueError(error_msg)
    153 
    154 

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'h5netcdf', 'scipy', 'cfgrib', 'pydap', 'rasterio', 'zarr']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
http://xarray.pydata.org/en/stable/getting-started-guide/installing.html
http://xarray.pydata.org/en/stable/user-guide/io.html

when typing this: dset = xr.open_dataset("CAMS-PM2_5-20211222.netcdf") ...

yvanlebras · 2022-02-18T22:04:06Z

my jupyter notebook FYI https://3525516ba8d111f5-3742e8a717c04bddb0a50d763b550537.interactivetoolentrypoint.interactivetool.ecology.usegalaxy.eu/ipython/lab/tree/Untitled.ipynb

annefou · 2022-02-18T22:16:55Z

my jupyter notebook FYI https://3525516ba8d111f5-3742e8a717c04bddb0a50d763b550537.interactivetoolentrypoint.interactivetool.ecology.usegalaxy.eu/ipython/lab/tree/Untitled.ipynb

I am not sure why. It usually happens when the file's datatype is set to h5 instead of netcdf. Actually, in this case it did not find the file; the error is a bit misleading... Your file is in the data folder:

dset = xr.open_dataset("data/CAMS-PM2_5-20211222.netcdf")
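To make this kind of failure easier to spot, here is a minimal sketch (not part of the tutorial) that checks where the file actually is before handing it to Xarray. The helper names resolve_dataset_path and open_checked, the default search_dirs, and the choice of the netcdf4 engine are assumptions made for illustration; only xr.open_dataset and its engine parameter come from the traceback above.

```python
from pathlib import Path


def resolve_dataset_path(name, search_dirs=(".", "data")):
    """Return the first existing file called `name` in `search_dirs`.

    Raises FileNotFoundError with an explicit message, instead of letting
    the backend-guessing step fail with a less obvious ValueError.
    """
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"{name!r} not found in: {searched}")


def open_checked(name, engine="netcdf4"):
    """Open a dataset after confirming the file exists (hypothetical helper)."""
    import xarray as xr  # imported lazily so the path check works without xarray

    # Passing the engine explicitly skips backend guessing (assumed: netcdf4).
    return xr.open_dataset(resolve_dataset_path(name), engine=engine)
```

In the notebook this would be called as open_checked("CAMS-PM2_5-20211222.netcdf"), which looks in the current folder first and then in data.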

yvanlebras · 2022-02-18T22:30:20Z

OK, I retested from the start and it works now! I can go further! THANK YOU!

yvanlebras · 2022-02-18T23:02:00Z

hey hey! Done!

yvanlebras · 2022-02-18T23:02:26Z

Amazing tutorial! Thank you Anne!!!!!

bgruening · 2022-02-19T06:54:49Z

What a cool tutorial!

shiltemann · 2022-02-21T14:12:54Z

whoo!! so awesome! 🎉

(currently there seems to be a problem with rendering the slides video, but we are working on it!)

annefou · 2022-02-21T15:20:10Z

(currently there seems to be a problem with rendering the slides video, but we are working on it!)

Let me know if there is anything to do on my side.

shiltemann · 2022-02-23T11:25:56Z

@annefou nah, there was a small bug in the video generation, but the videos are up now :) The pronunciation of "Pangeo" is a bit off though, so we will look into teaching it how to pronounce it.

annefou · 2022-02-23T12:14:54Z

Cool! That's awesome!

I found a few "dots" (full stops) that cut sentences, and sometimes it sounds very odd. I guess sentences were far too long. I have started to note precisely where it happens in the first video. Let me know how I can fix these issues.

In the first pangeo video:

1:55 there is a dot at the end that should be removed (probably my fault!): must be scalable ... current and future challenges of big data ..., i.e. no dot between future and challenges.
2:08 we should also remove the dot after use cases, i.e. use cases as well as ...
2:20 remove the dot after be, i.e. cannot be tackled separately.
2:30 remove the dot after define, i.e. developers can define priorities for future...
3:22 remove the dot after interface, i.e. user interface with many functions...
3:54 remove the dot after Galaxy, i.e. from Galaxy Tools can be useful.

We have similar issues in the second pangeo video (for pangeo-notebook).
Also netCDF is not pronounced correctly. I think I should have written net CDF or net-CDF (I forgot about it).

Thanks!

How can I fix these small issues?

hexylena · 2022-02-23T12:16:53Z

Also netCDF is not pronounced correctly. I think I should have written net CDF or net-CDF (I forgot about it).

You can add these in bin/ari-map.yml. Keep writing netCDF in your slides (better for screen readers, etc.), and then the ari-map will map those terms to the way they should be pronounced.
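As an illustration of the idea only, a pronunciation map can be thought of as a lookup applied to the narration text before it reaches the speech engine. The sketch below uses a hypothetical mapping and function name; it does not reproduce the actual format of bin/ari-map.yml or how the GTN applies it, and the "net CDF" spelling is simply the one suggested earlier in this thread.

```python
import re

# Hypothetical map: written form -> spoken form (not the real ari-map content).
PRONUNCIATIONS = {"netCDF": "net CDF"}


def apply_pronunciation_map(text, mapping):
    """Replace whole-word occurrences of each key with its spoken form."""
    if not mapping:
        return text
    # Longest keys first so a longer term wins over a shorter prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], text)
```

The slides keep the written form (netCDF), and only the text sent to the voice is changed.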

hexylena · 2022-02-23T12:21:23Z

I guess sentences were far too long

Ahh, I see what happened: you didn't use bullet points, so each wrapped line was treated as its own sentence. Until now most people have used bullet points, or at least kept a full sentence on a single line rather than wrapping it, and the wrapping is what's causing the error.

If you rearrange the subtitles so that each full sentence sits on a single line in the file, this will fix it.
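If doing this by hand is tedious, the rearrangement could be sketched roughly as below. This assumes the notes are plain text where a blank line separates paragraphs and consecutive non-empty lines belong together; the function name unwrap_lines is hypothetical and this is not a GTN tool.

```python
def unwrap_lines(text):
    """Join hard-wrapped lines so each paragraph sits on a single line.

    Consecutive non-empty lines are merged with single spaces; blank lines
    are kept as paragraph separators.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)
```

Bullet-point notes would need different handling, since each bullet should stay on its own line.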

Anne Fouilloux added 22 commits December 23, 2021 10:04

template for Galaxy Pangeo Tutorial

5851fc4

slide template for Pangeo ecosystem

1f72d97

Add intro for Pangeo slides

3aaa8c0

explain about pangeo

e42aeda

split pangeo training in two (Galaxy tool and Pangeo notebook)

1cb326e

start pangeo notebook tutorial

7078d8d

first final draft for pangeo notebook training.

8decdc5

add conclusion and references

97424d2

Create template for Pangeo with Galaxy Tools

ae9ea1a

Pangeo Galaxy tutorial with Xarray: add first steps

5153adc

add final section for Pangeo tutorial (Xarray Galaxy Tools)

1261fda

update slides for pangeo notebook tutorial

bdf04f2

fix formatting issues

4ec5232

Add Galaxy workflow for Pangeo tutorial

bf3253c

Add placeholder for slides on STAC

70f0b3e

Merge branch 'main' into pangeo

4087757

add STAC and info on pangeo software stack

95188a4

try to clarify pangeo-forge

76186bc

add info on how pangeo and stac interact.

5db12da

add Ryan's suggestions for STAC and Pangeo and intake-stac

b173610

fix contributor list by adding missing contributors

8f62dfd

Add missing annotation for pangeo-notebook workflow

e1474b2

Test basic review (#17)

c797d9d

* Remove duplicated however * Remove duplicates creating history mention

annefou added 3 commits January 28, 2022 18:51

update for automating jupyter notebook

83d577d

add presenter notes for slides "Pangeo 101 for everyone"

056567a

start presenter notes for slides

537a1cf

Anne Fouilloux and others added 6 commits February 18, 2022 14:30

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

d6e4710

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

a8da3a5

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

64d35b9

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

d7cc58d

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

63a803b

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

Update topics/climate/tutorials/pangeo-notebook/tutorial.md

85a4aa0

Co-authored-by: Yvan Le Bras <yvan.le-bras@mnhn.fr>

update to take into account review comments

ce0184d

yvanlebras reviewed Feb 18, 2022

remove unnecessary sentence about installation of packages.

f42c739

gallardoalba approved these changes Feb 18, 2022

yvanlebras approved these changes Feb 18, 2022

gallardoalba merged commit 72087e5 into galaxyproject:main Feb 18, 2022

annefou deleted the pangeo branch February 19, 2022 08:28
