Model Deployment Pipeline #9
Comments
The overall concept looks really good. There are a few things I would change a bit, and a few points we need to discuss:

My points are a bit vague; I'm aware of that. The reason is that I don't yet have a clear picture of the development workflow. We will have to iterate and optimize.
Dockerization

Just throwing in a few links...
Some answers from our recent discussion:
We should get some hands-on experience with dockerized TensorFlow and with deploying Docker containers on GCP/AWS. Once we know our tools, we can define a proper pipeline.
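As a starting point for that hands-on experience, a dockerized TensorFlow training setup can be sketched with a few lines of Dockerfile. This is only an illustration: the base image tag and the `train.py` entry point are assumptions, not part of the actual project layout.

```dockerfile
# Sketch: containerized TensorFlow training environment.
# Assumes the training script lives at ./train.py (hypothetical).
FROM tensorflow/tensorflow:latest-py3

WORKDIR /app
COPY . /app

# Run the training script when the container starts.
ENTRYPOINT ["python", "train.py"]
```

Building and running locally first (`docker build -t model-train . && docker run model-train`) should make the later GCP deployment steps less surprising.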
Today we had our first successful test on GCP. We managed to trigger a build of an image using the image-building functionality of Google Container Registry. As intended, the build was triggered by a simple tag, and the final image was automatically pushed to the Google Container Registry, ready for deployment. I will document all the steps I took to set everything up. Please note that this is only testing; branches such as GCP will be deleted. Catch up here: GCP Building the Pipeline
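For reference, a tag-triggered build like the one described above can be expressed in a minimal `cloudbuild.yaml`. This is a sketch, not the configuration actually used in the test; the image name `model-server` is a hypothetical placeholder, while `$PROJECT_ID` and `$TAG_NAME` are standard Cloud Build substitutions.

```yaml
# Sketch of a cloudbuild.yaml for a tag-triggered image build.
# $TAG_NAME is filled in by Cloud Build with the git tag that
# triggered the build; the resulting image is pushed to GCR.
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/model-server:$TAG_NAME', '.']
images:
  - 'gcr.io/$PROJECT_ID/model-server:$TAG_NAME'
```

The `images` section is what makes the built image land in the registry automatically, matching the behavior observed in the test.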
I tried to access Google Cloud Storage from Python, to save our models, logs, and more in persistent storage on GCP. I used Timo's linear-combination project as a base. Unfortunately, after creating service accounts on GCP, importing the access keys, and installing the required Google Cloud packages, I ran into access problems (HTTP 403).
The issue was resolved without changing anything in the GCP or local configuration. The only difference was the access from a different network (the university network before, a private network now). The following pictures show that TensorFlow automatically reads the path to the key file from the environment variable and successfully saves the trained model. We are also able to save other files.
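The mechanism described above is the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which both TensorFlow (for `gs://` paths) and the `google-cloud-storage` client read automatically. A minimal sketch, assuming a hypothetical bucket name and path layout that are not from the actual project:

```python
import os

# Hypothetical bucket name -- replace with the real project's bucket.
BUCKET = "linear-combination-models"


def gcs_model_path(bucket: str, run_id: str, filename: str) -> str:
    """Build the gs:// URI under which a trained model would be stored."""
    return f"gs://{bucket}/models/{run_id}/{filename}"


def upload_model(local_file: str, run_id: str) -> None:
    """Upload a local model file to GCS.

    The storage client picks up the service-account key from the
    GOOGLE_APPLICATION_CREDENTIALS environment variable -- the same
    mechanism that lets TensorFlow write directly to gs:// paths.
    """
    from google.cloud import storage  # pip install google-cloud-storage

    client = storage.Client()  # reads GOOGLE_APPLICATION_CREDENTIALS
    blob_name = f"models/{run_id}/{os.path.basename(local_file)}"
    client.bucket(BUCKET).blob(blob_name).upload_from_filename(local_file)


print(gcs_model_path(BUCKET, "run-001", "model.ckpt"))
```

Setting the variable once (`export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json`) avoids passing credentials around in code, which likely also explains why no local configuration change was needed when the 403 disappeared.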
Next steps regarding our deployment pipeline:
Within GCP, have you made a decision on whether to use (1) VMs, (2) Kubernetes, or (3) ML Engine? We can also discuss that in #12.
@Simsso I am sorry for the delayed answer; I am very limited in time these days. I have not yet decided whether we are going to use VMs, K8s, or the ML Engine on GCP, as I am still working my way into ML Engine.
Today we made a decision on whether to deploy on GCP or on AWS: we will use GCP, for the reasons already mentioned in #3. Within GCP we are going to use the ML Engine, due to its high abstraction level and its design for ML research. All previous tasks are extended or partly replaced by the tasks defined for @doktorgibson in #18. All results regarding these tasks can be found in #11. This issue should only contain discussion about the abstract design of our pipeline.
The deployment pipeline has been successfully created. The next steps are to build tools that better enable research.
Let's use this issue as a thread to communicate different possible deployment pipelines for ML models. Once we have decided on something, we can go ahead and create a wiki page.