
Research and discuss challenges around access/security policies of institutional clouds #184

Open
4 tasks
choldgraf opened this issue Jul 30, 2021 · 5 comments
Labels
Task Actions that don't involve changing our code or docs.

Comments

@choldgraf
Member

Summary

In the Pangeo cloud deployment (#136 and 2i2c-org/infrastructure#482), we are running into a lot of headaches because we are running the cloud infrastructure on a project that is controlled by Columbia University, rather than our own project.

This is causing a lot of extra work because:

  1. We need to create Columbia accounts for any people that wish to access the infrastructure
  2. We must abide by Columbia's institutional policies regarding deployments on the cloud

Both of these things suggest that running infrastructure in this way is not a sustainable approach. It takes too much special-casing for each institution. While it may be worth it for Pangeo because of the scope of the collaboration, it won't be worth it for most organizations (or it will be prohibitively expensive for them).

What can we do?

We should research and understand our options for avoiding this complexity in the future. The easiest approach seems to be investigating whether university grants can pay for infrastructure on 2i2c projects, rather than having 2i2c access grant-funded infrastructure on the university's cloud project. Perhaps @rabernat could brainstorm this with us a bit as well.

Actions

  • Keep an eye on this issue as we continue to run hub infrastructure for others
  • Run some interviews with people in universities to understand whether Columbia is representative of other universities with respect to cloud infra
  • Brainstorm some ways around this
  • ...next steps here...
@choldgraf choldgraf added the Task Actions that don't involve changing our code or docs. label Jul 30, 2021
@sgibson91
Member

sgibson91 commented Aug 3, 2021

Some notes from a chat I had with Arielle Bennett, project manager of the Tools, Practices and Systems programme at the Turing:

  • mostly dependent on grant conditions and legal requirements of the organisation
  • for contractors, a services agreement should be set up anyway, and we should perhaps use the negotiation of that agreement to establish and agree with an org where we will set up the infrastructure
  • this is all highly variable from institution to institution, and from grant policy to grant policy

Sarah's take:

  • we'll have to go through a (long) negotiation phase before we sign anything or agree to do the work. During that phase we should ask the questions RE 'neutral' cloud project vs. institution cloud project, and what controls are in place if the latter
  • Need to figure out where we draw the lines for
    • We are happy to do this work
    • We are happy to do this work under these conditions, but we need to charge a higher service fee to cover overheads
    • We are not happy to do this work

@rabernat

rabernat commented Aug 3, 2021

Folks, so sorry for the headaches this is causing!

What we could try to do is move the entire cloud budget from Columbia to 2i2c. It would require rewriting our subaward agreement, but we have to do that anyway.

@damianavila
Contributor

> Folks, so sorry for the headaches this is causing!

No need to apologize, IMHO, this is part of the process we need to go through to "define" our service.

> What we could try to do is move the entire cloud budget from Columbia to 2i2c. It would require rewriting our subaward agreement, but we have to do that anyway.

@sgibson91 and @yuvipanda, since you have been more closely involved in this deployment, what are your thoughts on that proposal?

@sgibson91
Member

> @sgibson91 and @yuvipanda, since you have been more closely involved in this deployment, what are your thoughts on that proposal?

I think if rewriting the subaward grant mitigates #136 and 2i2c-org/infrastructure#575, then it's worth it.

The work on private nodes benefits all our hubs by making the nodes more secure without impacting the Jupyter front-end (most likely; I haven't tested it with a hub yet!). I also believe the work on dynamic backends for Terraform could be generalised further to any storage space on any cloud (as opposed to any bucket in GCP), if that's something we end up needing.

@choldgraf
Member Author

Just to echo @damianavila - I think this is a really important learning experience for understanding where the pain points will be in working with universities. For example, we really need to find a way to serve university stakeholders without requiring us to create email accounts for each person that does work there. But I worry that this won't be possible if the university wants us to use their cloud accounts (e.g., because they have institutionally-negotiated cloud rates). This kind of thing definitely won't be unique to Columbia, so we should have a good understanding of the challenges here.


4 participants