Mount notebooks into a mounted fs volume to avoid data loss #873

timothyjlaurent · 2016-01-25T03:47:02Z

The current way this is done:

docker run -p 8888:8888 -it --rm $USER/assignments

creates an ephemeral container because of the --rm flag. The first notebook performs very expensive operations (gunzipping several GB of data), and all of that data is lost when the container shuts down. Mounting a host directory as a volume lets subsequent sessions access the files already created (e.g. pickled data). This is especially important because this is course material, so we want to avoid time-consuming gotchas for the students.

tensorflow-jenkins · 2016-01-25T03:47:04Z

Can one of the admins verify this patch?

googlebot · 2016-01-25T03:47:06Z

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
If you signed the CLA as a corporation, please let us know the company's name.

timothyjlaurent · 2016-01-25T03:50:49Z

I signed it!

googlebot · 2016-01-25T03:50:51Z

CLAs look good, thanks!

vincentvanhoucke · 2016-01-25T03:55:23Z

@craigcitro any gotcha going down that route?
Should I also be advertising not using --rm for users of the pre-built Docker container?

craigcitro · 2016-01-25T07:14:29Z

No, these both seem pretty reasonable. Two thoughts:

The --rm thing is definitely a smart move -- I tend to spend more time debugging docker images than using them right now. For this case, though, it does make it a bit too easy to lose notebooks.
Mounting the notebooks dir as a volume also makes sense. This is one where mounting it ("I want to make changes that get saved") and not mounting it ("I'm playing around with something and it's a throwaway") both make sense, so maybe some text explaining when you would or wouldn't want the option? As a minor thing, no need to cd -- just pass the subdir path instead of $(pwd)?
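
A minimal sketch of the two modes described above (throwaway versus saved), written as a small wrapper. The function name `notebook_cmd` is hypothetical; the port, the `/notebooks` mount point, and the flags are the ones quoted elsewhere in this thread. It only prints the command, so nothing is started until the output is copied and run. The host path is resolved in a subshell, so the caller never has to cd.

```shell
#!/bin/sh
# Hypothetical helper: print the docker run command for either mode.
#   notebook_cmd IMAGE            -> throwaway session, nothing is saved
#   notebook_cmd IMAGE HOST_DIR   -> HOST_DIR is mounted at /notebooks
notebook_cmd() {
    image=$1
    host_dir=${2:-}
    if [ -n "$host_dir" ]; then
        # -v expects an absolute host path; resolve it in a subshell so the
        # caller's working directory is left unchanged.
        abs_dir=$(cd "$host_dir" && pwd) || return 1
        # Printed for inspection only; quote the path yourself if it has spaces.
        printf 'docker run -p 8888:8888 -v %s:/notebooks -it --rm %s\n' "$abs_dir" "$image"
    else
        printf 'docker run -p 8888:8888 -it --rm %s\n' "$image"
    fi
}
```

For example, `notebook_cmd "$USER/assignments" tensorflow/examples/udacity` prints the persistent variant, and `notebook_cmd "$USER/assignments"` prints the throwaway one.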

vincentvanhoucke · 2016-01-25T14:09:23Z

@timothyjlaurent thanks for the fix. Would you mind changing the doc for the hosted container as well and removing the 'cd'?

timothyjlaurent · 2016-01-25T21:46:15Z

Ok @vincentvanhoucke I updated the readme.

vincentvanhoucke · 2016-01-25T22:08:57Z

tensorflow/examples/udacity/README.md

+
+To avoid losing work between sessions in the container, it is recommended that you mount the `tensorflow/examples/udacity` directory into the container:
+
+    docker run -p 8888:8888 -v </path/to/tesorflow/examples/udacity>:/notebooks -it --rm $USER/assignments


s/tesorflow/tensorflow/

timothyjlaurent · 2016-01-25T22:59:54Z

I fixed the typo

vincentvanhoucke · 2016-01-25T23:30:01Z

LGTM @vrv

vrv · 2016-01-26T07:07:03Z

Thanks! Squash the commits and we'll merge.

timothyjlaurent · 2016-01-26T19:02:53Z

Commits have been squashed.

There is one caveat to this PR: if the user needs to rebuild the Docker image, all the files under the mounted directory will be added to the build context. That is several GB for the first assignment, so it takes a LONG time.

craigcitro · 2016-01-26T21:21:32Z

We aren't running any docker build commands from the top of the tree, though, are we? (Just the one docker run?)

vincentvanhoucke · 2016-01-26T21:31:54Z

Merged. Thanks!

timothyjlaurent · 2016-01-26T21:32:53Z

Well in the readme it says:

 Building a local Docker container
 ---------------------------------

     cd tensorflow/examples/udacity
     docker build -t $USER/assignments .

If you use that same directory to mount the notebooks, it will fill up with data files. Then, if you need to rebuild the image, all of that data has to be sent as part of the build context. When I had to do this, I temporarily moved the data directories up to .. and then moved them back in afterwards.

So maybe we should either recommend copying the udacity dir to some work dir and mounting that into the container, or mention the build context issue and explain that extraneous files should be moved elsewhere while rebuilding the image.
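
The first option (working from a copy so the build directory stays small) could look roughly like this. The helper name `make_workdir` and the destination path in the usage comment are hypothetical; only the `tensorflow/examples/udacity` source path and the docker run flags come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: copy the notebooks to a separate work directory so that
# data files written during a session never land in the docker build context.
make_workdir() {
    src=$1    # e.g. tensorflow/examples/udacity
    dest=$2   # e.g. "$HOME/udacity-work" (assumed location)
    if [ -e "$dest" ]; then
        # Refuse to overwrite an existing work directory and the data in it.
        echo "make_workdir: $dest already exists" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")" && cp -R "$src" "$dest"
}

# Usage (not executed here):
#   make_workdir tensorflow/examples/udacity "$HOME/udacity-work"
#   docker run -p 8888:8888 -v "$HOME/udacity-work":/notebooks -it --rm $USER/assignments
```

With this layout the directory passed to docker build keeps only the original sources, while the unpacked data accumulates in the copy.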

vincentvanhoucke · 2016-01-26T21:38:18Z

I'm not sure why you, as a class user, would ever want to build the container?
These instructions are for us, when we want to update the hosted containers.

timothyjlaurent · 2016-01-26T21:47:32Z

The course materials instructed me to build the image:

If there is an image I could pull, that would definitely be better, but there is no mention of this image in the course materials as far as I can see.

vincentvanhoucke · 2016-01-26T23:07:08Z

Arpan, there should be no need for anyone to build a container. Was the doc updated?

tensorflow-jenkins · 2016-01-26T23:07:12Z

Can one of the admins verify this patch?

timothyjlaurent · 2016-01-26T23:26:54Z

If there is a hosted image that should be used for the class, the command should look something like this:

docker run --rm -v <path to notebooks>:/notebooks tensorflow/udacity

vincentvanhoucke · 2016-01-26T23:39:00Z

@timothyjlaurent Access to the hosted version is described on the first line in the document you edited:
docker run -p 8888:8888 -it --rm b.gcr.io/tensorflow-udacity/assignments
Maybe I should separate the rest of this readme so that it stands out more.

timothyjlaurent · 2016-01-27T00:26:09Z

Oh I see it now that you point it out. Maybe the course should be updated to instruct users to use that image rather than build their own.

timothyjlaurent · 2016-01-27T01:58:22Z

Also, now that I see the first line (although it wasn't shown in the course material): this should probably also mount the working directory as a volume.
It takes maybe 20-30 minutes to decompress the data files. The notebook also loads all of this data into memory; I had to upgrade my Docker Machine VM to 8 GB of RAM to avoid running out of memory. All of this should be disclosed to the student.
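
For anyone hitting the same memory limit, a sketch of how a larger VM might be requested with Docker Machine. This assumes the VirtualBox driver, which the thread does not confirm, and a hypothetical machine name `udacity`. The helper `machine_create_cmd` is also hypothetical and only prints the command.

```shell
#!/bin/sh
# Assumption: Docker Machine with the VirtualBox driver.
# --virtualbox-memory takes the VM's RAM size in MB.
machine_create_cmd() {
    name=$1
    mem_gb=$2
    printf 'docker-machine create --driver virtualbox --virtualbox-memory %d %s\n' \
        "$((mem_gb * 1024))" "$name"
}

# Print the command for an 8 GB VM named "udacity" (hypothetical name).
machine_create_cmd udacity 8
```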

vincentvanhoucke · 2016-01-31T16:14:23Z

@timothyjlaurent we pushed some changes this weekend that should help with memory.

googlebot added the cla: no label Jan 25, 2016

timothyjlaurent changed the title ~~Mount notebooks into a volume to avoid data loss with --rm flag~~ Mount notebooks into a mounted fs volume to avoid data loss Jan 25, 2016

googlebot added cla: yes and removed cla: no labels Jan 25, 2016

vincentvanhoucke assigned craigcitro and vincentvanhoucke and unassigned craigcitro Jan 25, 2016

vincentvanhoucke added the udacity label Jan 25, 2016

vincentvanhoucke reviewed Jan 25, 2016

vincentvanhoucke added a commit that referenced this pull request Jan 26, 2016

Merge pull request #873 from timothyjlaurent/patch-1.

5758d53

vincentvanhoucke closed this Jan 26, 2016

vincentvanhoucke removed their assignment Jan 26, 2016

vincentvanhoucke assigned napratin Jan 26, 2016

vincentvanhoucke reopened this Jan 26, 2016

vincentvanhoucke closed this Jan 31, 2016

tarasglek pushed a commit to tarasglek/tensorflow that referenced this pull request Jun 20, 2017

Merge pull request tensorflow#873 from tensorflow/add-arguments

09e0cdc

Updated calls to '..._cross_entropy_with_logits' to add arguments
