The swapfile and backup tool combine to completely fill the disk of Datalab VMs #1192
Comments
This change alters the way that the persistent disk used for storing notebooks is exposed to the Datalab container. Rather than the entire disk being mounted by the container at `/content`, we just mount the `datalab` subdirectory at `/content/datalab`. That change allows us to create a `tmp` subdirectory, and mount that into the container at `/tmp`. That, in turn, prevents temp files created in the container from filling up the VM's boot disk. This fixes #1192
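The mount layout described above could be sketched roughly as follows (the host-side mount point and the image name are assumptions, not taken from the actual change):

```shell
# Assumed host-side layout: the notebook persistent disk is mounted at
# /mnt/disks/datalab-pd (hypothetical path). Instead of mounting the whole
# disk at /content, mount only the `datalab` subdirectory there, and a
# dedicated `tmp` subdirectory over the container's /tmp.
mkdir -p /mnt/disks/datalab-pd/datalab /mnt/disks/datalab-pd/tmp

docker run \
  -v /mnt/disks/datalab-pd/datalab:/content/datalab \
  -v /mnt/disks/datalab-pd/tmp:/tmp \
  gcr.io/cloud-datalab/datalab:latest
```

With this mapping, temp files written inside the container land on the persistent disk rather than the boot disk, and the swapfile at the root of the persistent disk sits outside the `/content/datalab` tree that gets backed up.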
Are we just backing up the git repo? Or something more? I wouldn't have expected a swap file in the backup adding to the size...
---
We are backing up the `/content` directory.

It was a bug for the swapfile to be included in the backup, but the more important bug was the `/tmp` directory being on the boot disk, which is only 20 GB.

---
Should we be backing up only the git repo directory? I can imagine a user saving some data locally that they pull down from BQ/GCS, and that shouldn't be backed up either. Or do we really want to back up the entire PD?
---
+1 for only the git repo directory.

---
We had a discussion around this and decided to back up the entire disk because it's cheap. We had some ideas around setting a file-size threshold as future work.

For what it's worth, the backup tool does take a path as an argument, and this should be exposed as a Datalab config that the user can set.
--
Yasser Elsayed
---
I had a similar problem today. In my case, the additional disk was out of space (not the boot one). The disk size was 10GB, and there was a 9.7GB swapfile on it. The symptom was also that the VM was completely unresponsive; deleting the file fixed the problem. However, before deleting the file, I tried simply increasing the disk size. I managed to increase it from 10GB to 40GB, but when I run `df` on the VM, it shows only 10GB. I'm now wondering if Google is charging me for a 40GB disk while offering only 10GB. I tried resetting the VM, but it didn't recognize the 40GB. Do you know what I'm missing? Thanks

---
@ramurti resizing the disk doesn't grow the filesystem on it; you have to do that manually. If you don't, you'll wind up with a 40GB disk that has a 10GB filesystem on it. Instructions for doing that are included in the "Resizing the file system or partitions on a persistent disk" section of this page
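As a sketch, the manual resize step looks like this (the device name `/dev/sdb` and the assumption that the disk is ext4 without a partition table are guesses about the setup; confirm with `lsblk` first):

```shell
# After resizing the persistent disk in GCE, grow the filesystem to match.
sudo resize2fs /dev/sdb        # disk formatted without a partition table

# If the disk has a partition table instead, grow the partition first:
# sudo growpart /dev/sdb 1 && sudo resize2fs /dev/sdb1

df -h                          # should now report the full 40GB
```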
There is a new issue with Datalab instances that have a machine type with lots of RAM (e.g. n1-standard-4).
The symptom is that the VM can become completely unresponsive. Someone digging in to the issue would see that the boot disk was out of free space because of /var/lib/docker/overlay files taking up all of the available space.
The root issue is that the new version of the CLI creates a swapfile on the persistent disk used for storing notebooks. The size of that swapfile is based on the amount of RAM in the host (so this doesn't affect the default machine type).
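A swapfile sized to match the host's RAM is typically created along these lines (an illustrative sketch, not the CLI's actual code; the path is hypothetical):

```shell
# Read total RAM from /proc/meminfo (value is in kB) and create a
# swapfile of the same size on the notebook persistent disk.
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
sudo fallocate -l "${mem_kb}K" /mnt/disks/datalab-pd/.swapfile
sudo chmod 600 /mnt/disks/datalab-pd/.swapfile
sudo mkswap /mnt/disks/datalab-pd/.swapfile
sudo swapon /mnt/disks/datalab-pd/.swapfile
```

On a machine type with 15 GB of RAM (e.g. n1-standard-4), this yields a roughly 15 GB file on the persistent disk, which is why high-memory machine types hit this where the default type does not.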
The backup utility tries to back up all of the contents of the notebook persistent disk, and as a result winds up copying them into the `/tmp` directory (potentially multiple times, if the hourly, daily, and weekly backups are all running at the same time).
The result is that two issues combine to fill up the boot disk:

1. The swapfile on the notebook persistent disk greatly inflates the amount of data to back up.
2. The backup utility copies that data into `/tmp`, which lives on the boot disk (potentially several copies at once).
For now, the simplest workaround is to disable backups when creating the VM by passing the `--no-backups` flag.

I think the fix should be to change the volume mounts from the host to the Datalab Docker container so that we have the following mapping:

- `<persistent disk>/datalab` -> `/content/datalab`
- `<persistent disk>/tmp` -> `/tmp`
This would give two benefits:

1. Temp files created in the container land on the persistent disk instead of the boot disk.
2. The swapfile at the root of the persistent disk is no longer part of the backed-up tree.
Similarly, this would also have the benefit that the `lost+found` directory does not show up in the Datalab file listing.