limited to google cloud? #29

Open
tachim opened this issue Jan 12, 2017 · 6 comments

Comments

@tachim
tachim commented Jan 12, 2017

Hi, just wondering if this library is limited to Google Cloud or if it can be used on other kubernetes deployments as well (e.g. on EC2)?

@timodonnell
Member

It's limited to Google Cloud, since it only supports Google Cloud Storage buckets and not an AWS storage service like S3. The bucket interface code is simple though, and if you were up for writing an S3 backend we'd be happy to have it. I haven't used Kubernetes on AWS, but besides the storage backend I don't see any reason this wouldn't work.
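
For anyone sizing up the S3 work, here is a rough sketch of what a pluggable bucket backend could look like. The `StorageBackend` interface and its method names (`put`, `get`, `list_names`) are hypothetical and were made up for illustration; they are not kubeface's actual bucket interface. The S3 class uses boto3 calls (`put_object`, `get_object`, and the `list_objects_v2` paginator) and assumes AWS credentials are already configured in the environment. The in-memory class is only there so the interface can be exercised without any cloud account.

```python
from typing import Dict, List, Protocol


class StorageBackend(Protocol):
    """Hypothetical minimal interface; the real kubeface one may differ."""

    def put(self, name: str, data: bytes) -> None: ...
    def get(self, name: str) -> bytes: ...
    def list_names(self, prefix: str = "") -> List[str]: ...


class InMemoryBackend:
    """Stand-in backend for local testing; keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def get(self, name: str) -> bytes:
        return self._objects[name]

    def list_names(self, prefix: str = "") -> List[str]:
        return sorted(n for n in self._objects if n.startswith(prefix))


class S3Backend:
    """Sketch of an S3-backed equivalent (assumes boto3 and AWS credentials)."""

    def __init__(self, bucket: str) -> None:
        import boto3  # imported lazily so the sketch loads without boto3

        self._bucket = bucket
        self._client = boto3.client("s3")

    def put(self, name: str, data: bytes) -> None:
        self._client.put_object(Bucket=self._bucket, Key=name, Body=data)

    def get(self, name: str) -> bytes:
        response = self._client.get_object(Bucket=self._bucket, Key=name)
        return response["Body"].read()

    def list_names(self, prefix: str = "") -> List[str]:
        # Paginate because a single list_objects_v2 call returns at most 1000 keys.
        paginator = self._client.get_paginator("list_objects_v2")
        names: List[str] = []
        for page in paginator.paginate(Bucket=self._bucket, Prefix=prefix):
            names.extend(item["Key"] for item in page.get("Contents", []))
        return sorted(names)
```

With a shape like this, the rest of the code would only need to know which backend object it was handed, which is why the storage layer looks like the only piece standing between the library and AWS.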

@tachim
Author

tachim commented Jan 12, 2017

Sounds good, looks simple enough. The idea here is to support binary uploads/downloads to pass around arguments/return values to the containers? A couple questions:

  1. I see some ACL-related commands there; does kubeface rely on ACLs for permissions between jobs or is that just part of using GCE?
  2. What are the requirements for the base images that kubeface uses? Do they need to have kubeface preinstalled, or just python?

@timodonnell
Member

Awesome. Yep, that's right. The uploads and downloads are used to pass input and output data to and from tasks.
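
As a loose illustration of that data flow (not the actual kubeface code), the sketch below pickles a task's arguments into a bucket, has a "worker" read them back, run the function, and write the pickled return value for the client to fetch. The use of `pickle`, the `input/` and `result/` key layout, and all function names are assumptions made for the example; a plain dict stands in for the storage bucket.

```python
import pickle
from typing import Any, Callable, Dict, Tuple


def upload_input(bucket: Dict[str, bytes], task_id: str, args: Tuple[Any, ...]) -> None:
    """Client side: serialize the task's arguments and store them."""
    bucket[f"input/{task_id}"] = pickle.dumps(args)


def run_task(bucket: Dict[str, bytes], task_id: str, func: Callable[..., Any]) -> None:
    """Worker side: load the arguments, run the function, store the result."""
    args = pickle.loads(bucket[f"input/{task_id}"])
    bucket[f"result/{task_id}"] = pickle.dumps(func(*args))


def download_result(bucket: Dict[str, bytes], task_id: str) -> Any:
    """Client side: fetch and deserialize the task's return value."""
    return pickle.loads(bucket[f"result/{task_id}"])


if __name__ == "__main__":
    bucket: Dict[str, bytes] = {}  # stand-in for a real storage bucket
    upload_input(bucket, "task-0", (2, 10))
    run_task(bucket, "task-0", pow)
    print(download_result(bucket, "task-0"))
```

Only bytes ever cross the bucket boundary here, which is why a storage backend that can upload and download binary blobs is all the scheme needs.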

I updated the README for the project to hopefully give a better overview of things. Let me know if anything is unclear.

  1. We're not doing anything important with ACLs or permissions. They are just there because the google client code seems to want them, but they can be ignored.

  2. Good question. If kubeface is not already installed in the docker image then it will try to install itself via pip when the pod starts up (implemented here). The most efficient thing to do is to have a docker image with whatever libraries your code uses plus kubeface. For getting started and testing though, relying on the automatic pip installation seems to work. I've been using the continuumio/anaconda3 docker image for testing.
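
The install-on-startup idea in point 2 can be sketched roughly as follows. This is not the code the link above points to; `ensure_installed` and its parameters are hypothetical names, and the sketch only handles top-level module names. It checks whether a module is importable and, if not, runs pip through the current interpreter.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(module_name: str, pip_spec: str) -> bool:
    """Install `pip_spec` if `module_name` cannot be imported.

    Returns True if an install was run, False if the module was already present.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    # Use the running interpreter's pip so the package lands in the same environment.
    subprocess.check_call([sys.executable, "-m", "pip", "install", pip_spec])
    return True
```

Baking the library into the image means the check returns immediately, which is the "most efficient" case described above; the pip path costs a download on every pod start.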

@tachim
Author

tachim commented Jan 22, 2017

Thanks. One more question -- does this library provide any support for copying a source tree over to the docker instances before running the function, or is the client responsible for handling that? E.g. if we have a module mymodule and want to use kubeface to run mymodule.experiment(hyperparams) over many instances of hyperparams, does kubeface handle copying mymodule over? What about mymodule's dependencies?

@timodonnell
Member

Good question. We don't currently try to copy any source over to the workers. Trying to do that in general seems error-prone to me, but if you know of a good way to handle it I'd be interested.

What we do support for this case is a set of pip packages to install on the workers before running user code (the --worker-pip-packages flag implemented here). Since pip can install directly from GitHub, my workflow has been to push analysis code to a GitHub branch (with a suitable setup.py so pip can install it) and then specify a URL to the GitHub branch with --worker-pip-packages. No need to actually put the module on PyPI. It's not as seamless as if we transferred it to the worker from your workstation, but it does give you a simple way to handle this.
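
For reference, pip accepts a `git+https://` URL with an `@branch` suffix as an install target. A small helper for building one might look like the sketch below; the function name and the example owner, repository, and branch are placeholders, and the exact way the value is handed to --worker-pip-packages is not shown because it isn't spelled out in this thread.

```python
def github_pip_spec(owner: str, repo: str, branch: str) -> str:
    """Build a pip-installable URL for a branch of a GitHub repository."""
    return f"git+https://github.com/{owner}/{repo}.git@{branch}"


if __name__ == "__main__":
    # Placeholder names; substitute your own account, repository, and branch.
    print(github_pip_spec("someuser", "mymodule", "experiment-1"))
```

The repository still needs a setup.py (or equivalent packaging metadata) at its root for pip to install it this way.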

@tachim
Author

tachim commented Jan 24, 2017

I see, that's a pretty reasonable solution.

In our case researchers will be working in their own docker instances so it seems like we can just commit/push the instance itself and use that as the base image.
