Skip to content

Latest commit

 

History

History
77 lines (41 loc) · 6.52 KB

users-getting-started.md

File metadata and controls

77 lines (41 loc) · 6.52 KB

Users: How to get access to pangeo-eosc services?

In this section you will learn how to register and access pangeo-eosc services.

Registration

You need to create an EGI Check-in account and enroll to the vo.pangeo.eu Virtual Organisation. There are several steps to follow:

  1. Sign up for an EGI Check-in account following these steps. Using ORCID iD to authenticate is recommended.
  2. Enroll in the vo.pangeo.eu Virtual Organisation (VO) by clicking on the enrollment URL using the EGI Check-in account created in the previous step. Review and click on Submit. Please add a note in the statement of purpose when requesting to join the VO explaining why you want to access pangeo-eosc.

Managers of the Virtual Organisations may take several days to approve your petitions to join and also get back to you via email to verify your identity.

Access DaskHub

Access DaskHub via https://pangeo-eosc.vm.fedcloud.eu/ and choose among the 4 available flavors (as shown on the figure below):

Cloud EGI JupyterHub flavors

  • Pangeo Notebook uses a docker image maintained by the Pangeo community. It contains all the Python packages you need to data analysis and visualization. The list of packages and all the Pangeo Notebook environment is made available here; look up the pangeo-notebook folder.
  • Machine Learning Pangeo notebook with GPU enable tensorflow2: similarly, it is maintained by the Pangeo community and the complete computational environment with the list of Python packages is also available at https://github.com/pangeo-data/pangeo-docker-images in the ml-notebook folder. This flavor contains all the packages from the Pangeo Notebook flavor and is GPU-enabled tensorflow2. Choose this flavor if you need GPUs; for instance for training neural networks;
  • Machine Learning Pangeo notebook with GPU enable pytorch: it is the same as ml-notebook but with GPU-enabled pytorch.
  • Datascience Notebook with Python, R and Julia is maintained by the Jupyter community at https://github.com/jupyter/docker-stacks. Look up the datascience-notebook folder. It contains 3 different kernels, namely Python, R and Julia notebooks. Please note that you would probably need to add additional packages as the list of available packages is not exhaustive.

Currently (September 2023) we have configured quotas to host 20 simultaneous users with Jupyter (8 CPUs, 32GB RAM) and a Dask cluster (max: 4 workers, each worker with 8 CPUs and 32 GB RAM). This is subject to change depending on usage and resource availability at CESNET. You need to click on Sign in with EGI Check-in and then use your ORCID iD credentials.

A Dask Gateway is available for scaling your computation. For more details on this deployment, you may want to take a look at Daskhub helm chart.

Access MinIO

Each user has a very small amount of local storage when using the DaskHub as it is not meant to be used for storing large data. Instead a dedicated MinIO Object storage has been setup.

The MinIO console endpoint is: https://pangeo-eosc-minio.vm.fedcloud.eu/. You can authenticate to the MinIO Object Storage in the same way you login to DaskHub. As shown on the Figure below, make sure you "Select Other Authentication Method" and "Login with SSO (checkin)" to access the MinIO console. Then use your ORCID iD to login.

minIO Login

You can create, access and manage your buckets from the minIO console (or use minIO Python package). The figure below shows the GUI (with several tabs on the left; the bucket tab is selected on the figure): initially, you won't have any buckets so please feel free to create public/privates buckets.

minIO buckets

In addition to the MinIO console, the API end point is https://pangeo-eosc-minioapi.vm.fedcloud.eu/ for those who prefer to interact with MinIO via the API.

Please check out this example to get started.

Support

If you need support, please open an issue.

Monitoring

Check out the open grafana dashboard. It is particularly useful to check that there are GPUs available before requesting an environment with GPU.

Weekly coffee meetings

Join the Pangeo community in Europe in a weekly call every Tuesday at 9.30am CET/CEST at: https://meet.jit.si/pangeo-europe

Attend the meeting not only to get to know each other but also to ask questions about how to use the Pangeo ecosystem.

How to acknowledge Pangeo-EOSC

Pangeo-EOSC has benefited from services and resources provided by the EGI-ACE project (funded by the European Union’s Horizon 2020 research and innovation programme under Grant Agreement no. 101017567), and the C-SCALE project (funded by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 101017529), with the dedicated support of CESNET.

The European Open Science Cloud (EOSC)

EOSC logo

The European Open Science Cloud (EOSC) aims at becoming the main environment for hosting and processing research data to support European Science.

Pangeo Europe

Pangeo logo

Pangeo is a worldwide community for Big Data geoscience promoting open, reproducible, and scalable science.

Pangeo Europe aims at highlighting European contributions to the Pangeo Community and at providing a reference deployment for Pangeo on EOSC. The Pangeo deployment on EOSC has been made possible thanks to CESNET in the context of the the EGI-ACE project and the C-SCALE project.