fresh Debian VM with big private storage fails to start #3758

Closed
thejh opened this Issue Mar 29, 2018 · 12 comments

thejh commented Mar 29, 2018

Qubes OS version:

R4.0

Affected component(s):

probably core-agent-linux?


Steps to reproduce the behavior:

  1. Set up a clean Qubes 4.0 installation, as described in the update instructions.
  2. Create a new qube, with template debian-9, with the box "launch settings after creation" ticked.
  3. Set the private storage max size to 110000MiB.
  4. Save the settings.
  5. Attempt to start the newly created VM.

Expected behavior:

The new VM should start, going into Transient state first, then into Running state.

Actual behavior:

The new VM starts and goes into Transient state, uses a lot of CPU power for some time, then turns off again. The logfile shows that the VM spends around a minute in the stage "A start job is running for initiali...unt /rw and /home".

General notes:

I think the problem might be that the VM attempts to initialize the whole disk or so while some other system component is applying a timeout to VM startup? Creating VMs with smaller storage sizes (like 3GiB) seems to work fine, including Debian VMs.
I ended up working around this by first creating a VM with normal private storage size, then resizing the private storage at runtime.
(The reason I'm creating such a gigantic VM is that I need a place to which I can download my Qubes backup from R3.)
My disk speed (when dd'ing from /dev/xvdb to /dev/null in a VM) is around 200-300MB/s.


Related issues:

Member

marmarek commented Mar 30, 2018

On first VM startup, mkfs needs to run. To be absolutely sure that no data is overwritten, the script compares the whole device to /dev/zero (this comparison is actually the trigger for mkfs). This takes time on a large device...
Enlarging the private volume after the first VM startup avoids this problem, as you've already noticed.
Any better idea for a reliable method of checking whether the device is empty?
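
For a sense of scale, here is a rough back-of-the-envelope estimate of how long a full-device comparison could take. It assumes the check reads the volume sequentially at the 200-300 MB/s reported in the original post and that "MB" there means decimal megabytes; the function name is made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one MiB


def full_scan_seconds(size_mib: int, throughput_mb_per_s: float) -> float:
    """Seconds needed to read size_mib MiB sequentially at the given
    throughput, expressed in decimal megabytes per second (assumption)."""
    return size_mib * MIB / (throughput_mb_per_s * 1_000_000)


if __name__ == "__main__":
    # 110000 MiB is the private storage size from the reproduction steps.
    for speed in (200, 300):
        secs = full_scan_seconds(110_000, speed)
        print(f"{speed} MB/s: about {secs / 60:.1f} minutes")
```

Under these assumptions the scan alone takes several minutes, noticeably longer than the roughly one minute the VM was observed to spend in that startup stage before shutting down.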

thejh commented Mar 31, 2018

From dom0, I think it should be doable by checking whether the allocated size of the disk image file / LVM logical volume is zero.

For LVM logical volumes, I think the DM_TABLE_STATUS ioctl can be used on the control device to determine the allocated size; in the kernel, it goes via retrieve_status, which does an indirect call to a function like thin_status, which should give you a string with the number of used sectors and the index of the highest used sector AFAIU. The LVM userspace library also has some code for interacting with that; this code is used by the userspace tool lvdisplay for calculating the "Mapped size" line.

For disk image files, AFAIU you could use lseek(fd, 0, SEEK_DATA) to determine whether any part of the file is allocated.
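
A minimal sketch of the lseek(fd, 0, SEEK_DATA) idea for disk image files, written in Python for brevity. The helper name is hypothetical and this is not code from Qubes. It relies on lseek failing with ENXIO when no data region exists at or after the given offset.

```python
import errno
import os


def has_allocated_data(path: str) -> bool:
    """Return True if lseek(SEEK_DATA) finds a data region in the file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seek to the first data region at or after offset 0.
        os.lseek(fd, 0, os.SEEK_DATA)
        return True
    except OSError as exc:
        # ENXIO: there is no data at or beyond the offset (file is one big hole).
        if exc.errno == errno.ENXIO:
            return False
        raise
    finally:
        os.close(fd)
```

Note that this errs on the conservative side: a filesystem without hole support may report the whole file as data, and an allocated region can still contain only zeros, so a True result does not prove that real content exists.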

Member

marmarek commented Mar 31, 2018

But that is not as easy (I think?) from inside the VM...

Member

marmarek commented Apr 4, 2018

Another idea is to relax the check and compare, for example, only the first 1 GB or so. This takes only a few seconds. It is not as reliable as a full-device check, but IMO still much better than the heuristics for known filesystems employed by tools like blkid.
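
The relaxed check could look roughly like the sketch below. It is written in Python purely for illustration (the actual script, setup-rwdev.sh, is a shell script), and the function name, chunk size, and default limit are assumptions rather than anything taken from the Qubes code.

```python
from typing import BinaryIO

CHUNK = 1024 * 1024  # read 1 MiB at a time (arbitrary choice)


def head_is_zero(dev: BinaryIO, limit: int = 1024 ** 3) -> bool:
    """Return True if the first `limit` bytes (fewer if EOF comes first)
    of the stream are all zero."""
    remaining = limit
    while remaining > 0:
        block = dev.read(min(CHUNK, remaining))
        if not block:
            break  # reached end of device before the limit
        if block.count(0) != len(block):
            return False  # found a non-zero byte: device is not empty
        remaining -= len(block)
    return True
```

It could be applied to the private volume by opening the block device (for example /dev/xvdb) in binary read mode and passing the file object in. Anything stored beyond the limit goes unnoticed, which is the reliability trade-off mentioned above.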

pgerber added a commit to pgerber/qubes-core-agent-linux that referenced this issue Jul 3, 2018

setup-rwdev.sh: Only check first 1 GiB for zeros
As described in QubesOS/qubes-issues#3758, a VM may fail to start
if the volume is large. This is because the whole volume was read to
ensure it's empty (all zeros).

This limits the check to the first 1 GiB of the private volume. As
additional safety, a scan with blkid is done.

Fixes QubesOS/qubes-issues#3758

pgerber referenced this issue in QubesOS/qubes-core-agent-linux Jul 3, 2018

Merged

setup-rwdev.sh: Only check first 1 GiB for zeros #124

qubesos-bot Jul 8, 2018

Automated announcement from builder-github

The package core-agent-linux has been pushed to the r4.0 testing repository for the CentOS centos7 template.
To test this update, please install it with the following command:

sudo yum update --enablerepo=qubes-vm-r4.0-current-testing

Changes included in this update

qubesos-bot Jul 8, 2018

Automated announcement from builder-github

The package qubes-core-agent_4.0.32-1+deb9u1 has been pushed to the r4.0 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing stretch-testing (or appropriate equivalent for your template version), then use the standard update command:

sudo apt-get update && sudo apt-get dist-upgrade

Changes included in this update

qubesos-bot Jul 8, 2018

Automated announcement from builder-github

The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-4.0.32-1.fc26) has been pushed to the r4.0 testing repository for the Fedora template.
To test this update, please install it with the following command:

sudo yum update --enablerepo=qubes-vm-r4.0-current-testing

Changes included in this update

marmarek added a commit to QubesOS/qubes-core-agent-linux that referenced this issue Jul 18, 2018

setup-rwdev.sh: Only check first 1 GiB for zeros
As described in QubesOS/qubes-issues#3758, a VM may fail to start
if the volume is large. This is because the whole volume is read to
ensure it's empty (=all zeros).

This change limits the check to the first 1 GiB of the private
volume. As additional safety, a scan with blkid is done.

Fixes QubesOS/qubes-issues#3758

(cherry picked from commit 8ff9cbe)

qubesos-bot referenced this issue in QubesOS/updates-status Jul 18, 2018

Open

core-agent-linux v3.2.31 (r3.2) #599

qubesos-bot Jul 18, 2018

Automated announcement from builder-github

The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-3.2.31-1.fc26) has been pushed to the r3.2 testing repository for the Fedora template.
To test this update, please install it with the following command:

sudo yum update --enablerepo=qubes-vm-r3.2-current-testing

Changes included in this update

qubesos-bot Jul 18, 2018

Automated announcement from builder-github

The package qubes-core-agent_3.2.31-1+deb9u1 has been pushed to the r3.2 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing stretch-testing (or appropriate equivalent for your template version), then use the standard update command:

sudo apt-get update && sudo apt-get dist-upgrade

Changes included in this update

qubesos-bot Jul 24, 2018

Automated announcement from builder-github

The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-4.0.33-1.fc26) has been pushed to the r4.0 stable repository for the Fedora template.
To install this update, please use the standard update command:

sudo yum update

Changes included in this update

qubesos-bot Jul 24, 2018

Automated announcement from builder-github

The package core-agent-linux has been pushed to the r4.0 stable repository for the CentOS centos7 template.
To install this update, please use the standard update command:

sudo yum update

Changes included in this update

qubesos-bot Jul 24, 2018

Automated announcement from builder-github

The package qubes-core-agent_4.0.33-1+deb9u1 has been pushed to the r4.0 stable repository for the Debian template.
To install this update, please use the standard update command:

sudo apt-get update && sudo apt-get dist-upgrade

Changes included in this update
