Pangeo Cloud Datastore
Browseable Online Website: https://pangeo-data.github.io/pangeo-datastore/
The master intake catalog URL is https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml
Using this catalog requires package versions that were quite recent as of April 2019.
To open the catalog and load a dataset from Python, you can run the following code:
import intake

cat_url = 'https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml'
cat = intake.open_catalog(cat_url)
ds = cat.atmosphere.gmet_v1.to_dask()
To explore the whole catalog, you can try
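A minimal sketch of one way to list what the catalog contains. It assumes that iterating an intake catalog yields its entry names; the helper names `entry_names` and `print_top_level` are hypothetical and not part of intake or this repository.

```python
CAT_URL = (
    "https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/"
    "master/intake-catalogs/master.yaml"
)


def entry_names(cat):
    """Return the sorted names of the entries directly under a catalog."""
    # Assumption: iterating an intake catalog yields its entry names,
    # the same way iterating a dict yields its keys.
    return sorted(cat)


def print_top_level(url=CAT_URL):
    """Open the catalog at `url` and print its top-level entry names."""
    # Imported here so entry_names() can be used without intake installed.
    import intake

    cat = intake.open_catalog(url)
    for name in entry_names(cat):
        print(name)
```

Calling `print_top_level()` needs network access and a working intake installation. Each printed name can then be opened as an attribute of the catalog, in the same way as `cat.atmosphere` in the loading example.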
Accessing requester pays data
Several of the datasets within the cloud data catalog are contained in requester pays storage buckets. This means that a user requesting data must provide their own billing project (created and authenticated through Google Cloud Platform) to be billed for the charges associated with accessing a dataset. To set up a GCP billing project and use it for authentication in applications:
- Create a project on GCP; if this is the first time using GCP, a prompt will appear to choose a Google account to link to all GCP-related activities.
- Create a Cloud Billing account associated with the project and enable billing for the project through this account.
- Using Google Cloud IAM, add the Service Usage Consumer role to your account, which enables it to make billed requests on behalf of the project.
- From the command line, install the Google Cloud SDK; this can be done using conda:
conda install -c conda-forge google-cloud-sdk
- Initialize the gcloud command line interface, logging into the account used to create the aforementioned project and selecting it as the default project; this will allow the project to be used for requester pays access through the command line:
gcloud auth login
gcloud init
- Finally, use gcloud to establish application default credentials; this will allow the project to be used for requester pays access through applications:
gcloud auth application-default login
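After the steps above, it can be useful to confirm that application default credentials are actually in place. The following is a minimal standard-library sketch, not an official check. It assumes the usual lookup locations: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable first, then `application_default_credentials.json` in the gcloud configuration directory. The helper names `find_adc_file` and `describe_adc` are hypothetical.

```python
import json
import os
from pathlib import Path


def find_adc_file(env=None, home=None):
    """Return the path of the application default credentials file, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # An explicit GOOGLE_APPLICATION_CREDENTIALS setting takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None

    # Otherwise look in the gcloud configuration directory
    # (assumed: %APPDATA%\gcloud on Windows, ~/.config/gcloud elsewhere).
    if os.name == "nt" and env.get("APPDATA"):
        config_dir = Path(env["APPDATA"]) / "gcloud"
    else:
        config_dir = home / ".config" / "gcloud"
    path = config_dir / "application_default_credentials.json"
    return path if path.is_file() else None


def describe_adc(path):
    """Summarize a credentials file without exposing any secrets."""
    with open(path) as f:
        info = json.load(f)
    # Both fields are read defensively; either may be absent from the file.
    return {
        "type": info.get("type"),
        "quota_project_id": info.get("quota_project_id"),
    }
```

If `find_adc_file()` returns `None`, the `gcloud auth application-default login` step has probably not completed for the current user.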
To suggest adding a new dataset, please open an issue.