Project: GEOS-Chem Proof-of-Concept #154

Closed
5 tasks done
cmbz opened this issue Dec 22, 2023 · 15 comments
Labels
Project: GEOS-Chem Proof of concept and other work related to support for GEOS-Chem Proof of Concept Issue relates to a proof-of-concept deliverable Size: 30

Comments

@cmbz
Contributor

cmbz commented Dec 22, 2023

Overview

Two-phase project to investigate and pilot large data and computation support for GEOS-Chem datasets using a containerized Dataverse installation running on Mass Open Cloud resources.

The proof-of-concept will be demoed at the Mass Open Cloud Alliance Conference (2024/02/28).

Participants

Timeline

  • 2024/01/31: Establish a containerized Dataverse installation on Mass Open Cloud resources
  • 2024/02/28: Proof of concept
  • 2024/03: Further data hosting and curation investigation

Tasks

January, 2024 and February, 2024

  • GEOS-Chem team will assemble a slide about the project that Stefano can include in his keynote slide deck
  • Dataverse team will implement containerized Dataverse installation on MOC (using Stefano's account)
  • With assistance from the Dataverse team, the GEOS-Chem team will upload a sample of benchmark output (e.g., 10-year benchmark data) to the Dataverse installation on MOC
  • GEOS-Chem team will create a small Jupyter notebook to perform simple computations on the local data

March, 2024

Related

Resources

@cmbz cmbz self-assigned this Dec 22, 2023
@cmbz cmbz added the Proof of Concept Issue relates to a proof-of-concept deliverable label Jan 3, 2024
@cmbz cmbz mentioned this issue Jan 22, 2024
13 tasks
@cmbz cmbz changed the title Project: GEOS-CHEM Proof-of-Concept Project: GEOS-Chem Proof-of-Concept Jan 23, 2024
@cmbz cmbz added the Project: GEOS-Chem Proof of concept and other work related to support for GEOS-Chem label Jan 24, 2024
@cmbz
Contributor Author

cmbz commented Jan 26, 2024

2024/01/26

@landreev

I'll be using this issue to track the effort of porting the GEOS-Chem supplied notebook into something that can be deployed on NERC and used in the context of the MOC-PoC presentation.

@landreev landreev self-assigned this Feb 12, 2024
@cmbz
Contributor Author

cmbz commented Feb 12, 2024

Thanks, @landreev. But please give me a heads-up before you close the issue.

@landreev

Wasn't planning to close it, no.
@pdurbin and I briefly discussed opening a separate local issue to track the dev. effort needed to port the notebook into the framework of the demo. We decided to use this one instead, since it was already there. But I can still open a new one if you prefer.

@landreev

(I just wanted to have something "in progress" to reflect this task, since this is the focus of the MOC-POC effort at this point).

@cmbz
Contributor Author

cmbz commented Feb 12, 2024

@landreev totally fine to keep on with this issue!

@cmbz cmbz added the Size: 10 label Feb 12, 2024
@pdurbin
Member

pdurbin commented Feb 12, 2024

If it helps, I now have OpenShift running locally on my laptop because I was looking at this PR:

I just started a thread in Slack asking whether it would be helpful or interesting for me to try to run the notebook on my local version of OpenShift, but I'm not sure where to begin.

@pdurbin
Member

pdurbin commented Feb 13, 2024

I'm not sure if this helps or not, but @r1beguin recently demoed launching JupyterLab from Dataverse. Here are some screenshots from his July 2023 community call presentation:

[Three screenshots from the presentation]

The code is here: https://forgemia.inra.fr/dipso/eosc-pillar/dataverse-jupyterhub-connector

I just merged a PR where there's a nice writeup of the tool on our "integrations" page:

Also, from their README, here's a diagram of how it works:

[Screenshot: diagram from the README]

@landreev

JupyterHUB, not "Lab", right?

@pdurbin
Member

pdurbin commented Feb 13, 2024

@landreev whoops, yes, hub not lab.

@landreev

Can this setup be used in our case, for the purposes of the demo? - I don't fully understand this part.

I passed the SSH key for a NERC VM to Bob Yantosca, the author of the notebook, yesterday and asked him to install it, replicating the environment under which he developed it. The data files are already saved on the instance locally. I'm waiting to hear from him. Once it's running like that, we'll at least be able to see what it looks like, and then we can add extras to it: the storage calls, the passing of parameters, and figuring out how it can be deployed in a container. So this is the extent of my current plan.

@landreev

I have a very crude/fake/hard-coded/everything glued together with dog drool kind of a demo that nevertheless ties the pieces together: the dataset with the GEOS-Chem datafiles in it and the "external tool" that sends the user to the statistics notebook, which in turn generates pretty graph images. I will post links/images in the Slack channel as a quick status update, and will continue working on making the whole thing less fake/hard-coded.
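
(Editorial sketch, not the demo's actual code.) For readers unfamiliar with the hand-off described above, the following is a minimal illustration of how a notebook could turn an external-tool launch URL into a file download URL. It assumes the tool is configured to pass `fileId` and `siteUrl` as query parameters and that the file is fetched through the Dataverse Data Access API path `/api/access/datafile/{id}`. The host names, the file id, and the function name `datafile_url_from_launch` are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def datafile_url_from_launch(launch_url: str) -> str:
    """Build a Data Access API download URL from an external-tool launch URL."""
    query = parse_qs(urlparse(launch_url).query)
    # Assumed parameter names: they would be whatever the tool's configuration declares.
    site_url = query["siteUrl"][0].rstrip("/")
    file_id = query["fileId"][0]
    return f"{site_url}/api/access/datafile/{file_id}"


# Hypothetical launch URL, as the notebook front end might receive it.
launch = "https://notebook.example.org/run?fileId=42&siteUrl=https://dataverse.example.org"
print(datafile_url_from_launch(launch))
```

Deriving the download URL from the launch parameters, rather than hard-coding it, is one way the "passing of parameters" mentioned earlier in the thread could work; restricted files would additionally need some form of credential, which this sketch leaves out.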

@landreev

I marked the remaining demo-related items on the checklist as completed and I'm removing my name from the issue (@cmbz you asked me not to close it - so, leaving it as is).
This is under the assumption that this is completed "for the purposes of the demo presentation", as a quick proof of concept only. I will open a new issue in the main repo for working out a real infrastructure setup that will allow users to run arbitrary, non-hard-coded computation code on a cluster. That is the next logical step, and it makes sense to work on this while we have access to the NERC cluster facilities.

@landreev landreev removed their assignment Feb 27, 2024
@pdurbin pdurbin removed their assignment Feb 27, 2024
@pdurbin
Member

pdurbin commented Feb 27, 2024

I'm not actively working on this so I removed my name as well.

@cmbz
Contributor Author

cmbz commented Feb 28, 2024

Closing issue as complete. Follow-up work to create Harvard Dataverse Repository GEOS-Chem collections will continue here: #178

@cmbz cmbz closed this as completed Feb 28, 2024

3 participants