List of current Pangeo deployments #232

Closed
jhamman opened this Issue May 1, 2018 · 18 comments

jhamman commented May 1, 2018

We'd like to develop a current listing of Pangeo deployments. If you have deployed Pangeo in one form or another, please speak up!

Information we'd like to know:

  1. Cloud or HPC (or other)? Which system (e.g. Google Cloud, NCAR's Cheyenne)?
  2. How are you deploying dask distributed (KubeCluster, dask-jobqueue, dask-mpi, etc.)?
  3. How are you deploying Jupyter (JupyterHub, single user)?
  4. Primary use (Tutorial, individual research, etc.)?

(I'll provide a few examples below; sharing a URL to your JupyterHub deployment is not required.)

xref #229


pangeo.pydata.org (Pangeo EarthCube)

  1. Platform: Google Cloud
  2. Dask: KubeCluster
  3. Jupyter: JupyterHub
  4. Use: Exploratory deployment for Pangeo EarthCube project. Used for demos, tutorials, research.

pangeo-aws.cloudmaven.org (University of Washington)
Note: this deployment will be taken down soon

  1. Platform: Amazon cloud
  2. Dask: KubeCluster
  3. Jupyter: JupyterHub
  4. Use: Demo deployment / proof of concept for proposal.

Cheyenne, Caldera, and Geyser (NCAR)

  1. Platform: HPC
  2. Dask: dask-jobqueue and dask-mpi
  3. Jupyter: Single user notebook servers (JupyterHub coming soon: #26)
  4. Use: Individual research
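
The four questions above map naturally onto a small record type. The following is only a sketch of one way to collect the answers so they can be grouped or rendered later; the `Deployment` class, the `DEPLOYMENTS` list, and `group_by_dask` are hypothetical names made up for illustration and are not part of any Pangeo tooling. The three records are the examples listed above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One deployment report (hypothetical structure, one field per question)."""
    name: str
    platform: str   # 1. Cloud or HPC, and which system
    dask: tuple     # 2. How dask distributed is deployed (may be several methods)
    jupyter: str    # 3. How Jupyter is deployed
    use: str        # 4. Primary use


# The three example deployments given in this comment.
DEPLOYMENTS = [
    Deployment(
        name="pangeo.pydata.org (Pangeo EarthCube)",
        platform="Google Cloud",
        dask=("KubeCluster",),
        jupyter="JupyterHub",
        use="Exploratory deployment for Pangeo EarthCube project",
    ),
    Deployment(
        name="pangeo-aws.cloudmaven.org (University of Washington)",
        platform="Amazon cloud",
        dask=("KubeCluster",),
        jupyter="JupyterHub",
        use="Demo deployment / proof of concept for proposal",
    ),
    Deployment(
        name="Cheyenne, Caldera, and Geyser (NCAR)",
        platform="HPC",
        dask=("dask-jobqueue", "dask-mpi"),
        jupyter="Single user notebook servers",
        use="Individual research",
    ),
]


def group_by_dask(deployments):
    """Map each dask deployment method to the names of the systems using it."""
    groups = defaultdict(list)
    for deployment in deployments:
        for method in deployment.dask:
            groups[method].append(deployment.name)
    return dict(groups)


if __name__ == "__main__":
    for method, names in sorted(group_by_dask(DEPLOYMENTS).items()):
        print(f"{method}: {', '.join(names)}")
```

Storing the dask methods as a tuple lets a system such as Cheyenne, which reports both dask-jobqueue and dask-mpi, appear under each method it uses.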

jhamman commented May 1, 2018


rabernat commented May 1, 2018

NASA Pleiades Cluster

  1. Platform: HPC
  2. Dask: dask-mpi and custom slurm launch scripts
  3. Jupyter: single user notebook servers
  4. Use: Ongoing ocean and climate data analysis

Columbia Habanero Cluster

  1. Platform: HPC
  2. Dask: dask-mpi and custom slurm launch scripts
  3. Jupyter: single user notebook servers
  4. Use: Ongoing ocean and climate data analysis

tjcrone commented May 1, 2018

Lamont Real-Time Earth Pangeo Cluster

  1. Platform: Azure cloud
  2. Dask: KubeCluster
  3. Jupyter: JupyterHub
  4. Use: Earth/ocean science research

guillaumeeb commented May 2, 2018

HAL (CNES)

  1. Platform: HPC (PBSPro)
  2. Dask: dask-jobqueue and custom launch scripts
  3. Jupyter: Single user notebook servers (working on a JupyterHub service)
  4. Use: Demo deployment / proof of concept

rsignell-usgs commented May 2, 2018

Yeti Cluster (USGS)

  1. Platform: HPC (Slurm)
  2. Dask: dask-jobqueue and custom launch scripts
  3. Jupyter: Single user notebook servers
  4. Use: Analysis of coupled ocean, atmosphere, wave and sediment transport model output

mrocklin referenced this issue May 2, 2018: Update docs #50 (Merged)


jgerardsimcock commented May 2, 2018

https://compute.rhg.com/hub/login

  1. Platform: GKE
  2. Dask: KubeCluster
  3. Jupyter: Single-user notebooks spawned from JupyterHub
  4. Use: Climate Research and analysis

mrocklin commented May 2, 2018


rabernat commented May 2, 2018

@rsignell-usgs: didn't you and @jreadey win an Amazon award for AWS credits to deploy a Pangeo cluster? Can you share these details? Is the cluster publicly accessible?


rsignell-usgs commented May 2, 2018

@rabernat, yes we received the AWS award, but I'm a bit ashamed to say we don't have a Pangeo cluster running there yet. 😳

Hopefully this will change soon: @jacobtomlinson has agreed to help us try to deploy it on EKS, live via screenshare on Friday, and we will record the session in case it's useful to others.

If others want to join us live, that's fine also:
Friday, May 4, 2018 (Time: 2:30pm BST, 9:30am EST, 6:30am PST) https://www.gotomeeting.com/join/533510693


rabernat commented May 2, 2018

Considering the amount of data already in S3 (e.g. #234), it would be quite advantageous to have a general-use Pangeo cluster on AWS. We are all happy to help.


rsignell-usgs commented May 2, 2018

There is a lot of NOAA Big Data there, but a lot of it is simply NetCDF files parked on S3, which we then need to rewrite to Zarr or HSDS for efficient access: #234 (comment)


rabernat commented May 2, 2018

@rsignell-usgs - maybe it would be good if you could move that comment to #234, where we can discuss in more detail without taking this thread off topic.


jhamman commented May 3, 2018

Those who have provided a summary of their Pangeo deployment may be interested in #238. Feel free to comment on the description of your favorite cluster/cloud.

@jacobtomlinson and @TomAugspurger - okay if I add your deployments myself?


TomAugspurger commented May 4, 2018


rabernat commented May 4, 2018

Wasn't there someone who came along a few months ago and said they used our setup guides to deploy on a university cluster? I have a distinct memory of this, but I can't find any record of the exchange.

jhamman closed this in #238 May 4, 2018


jhamman commented May 4, 2018

I merged the first iteration of this. I'll reopen and we can add more as they become available. https://pangeo-data.github.io/pangeo/deployments.html points to this issue for adding additional deployments.


stale bot commented Jul 13, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Jul 13, 2018


stale bot commented Jul 20, 2018

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

stale bot closed this Jul 20, 2018
