Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to run 'sfdisk -d /dev/xvdc' when bosh-agent partitions ephemeral disk #28

Closed
edwardstudy opened this issue Feb 8, 2018 · 3 comments

Comments

@edwardstudy
Copy link
Contributor

Hi, we were trying to build SoftLayer xenial stemcell with this based OS image: https://github.com/cloudfoundry/bosh-linux-stemcell-builder/blob/master/bosh-stemcell/os_image_versions.json#L4. It failed during compiler vm creation:

02:11:30 | Preparing deployment: Preparing deployment (00:00:00)
02:11:30 | Preparing package compilation: Finding packages to compile (00:00:00)
02:11:30 | Compiling packages: batlight/dd894d36ab4d50ec504c09be6715c04f4077afee (0)
02:26:50 | Creating missing vms: mutable/ad72b6a1-a1bd-4546-9ba3-1781595aedef (0) (00:15:20)
            L Error: Timed out pinging to e4526a3f-9639-4b9d-a24c-51fdb45c01d2 after 600 seconds

02:26:51 | Error: Timed out pinging to e4526a3f-9639-4b9d-a24c-51fdb45c01d2 after 600 seconds

This error was caused due to the bosh-agent(v2.65.0) failed to partition ephemeral disk in the compilation VM:
[main] 2018/02/08 02:43:01 ERROR - Agent exited with error: Running bootstrap: Setting up ephemeral disk: Partitioning ephemeral disk: Partitioning ephemeral disk '/dev/xvdc': Getting partitions for /dev/xvdc: Shelling out to sfdisk when getting partitions: Running command: 'sfdisk -d /dev/xvdc', stdout: '', stderr: 'sfdisk: failed to dump partition table: Success

And 'sfdisk -l' result seems normal:

/:/var/vcap/bosh# sfdisk -l /dev/xvdc
Disk /dev/xvdc: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Hope detailed logs help you:

2018-02-08_02:42:49.10993 ********************
2018-02-08_02:42:49.11021 [settingsService] 2018/02/08 02:42:49 DEBUG - Loading settings from fetcher
2018-02-08_02:42:49.11023 [File System] 2018/02/08 02:42:49 DEBUG - Reading file
2018-02-08_02:42:49.11023 [registryProvider] 2018/02/08 02:42:49 DEBUG - Using file registry at /var/vcap/bosh/user_data.json
2018-02-08_02:42:49.11023 [File System] 2018/02/08 02:42:49 DEBUG - Reading file /var/vcap/bosh/user_data.json
2018-02-08_02:42:49.11024 [File System] 2018/02/08 02:42:49 DEBUG - Read content
2018-02-08_02:42:49.11024 ********************
2018-02-08_02:42:49.11024 {"agent_id":"91787558-2d2b-4d64-a99e-7b1736f75cf9","vm":{"name":"vm-91787558-2d2b-4d64-a99e-7b1736f75cf9","id":"vm-91787558-2d2b-4d64-a99e-7b1736f75cf9"},"mbus":"nats://nats:mer0itet8msechlek4sl@10.112.116.16:4222","ntp":["time1.google.com","time2.google.com","time3.google.com","time4.google.com"],"blobstore":{"provider":"dav","options":{"endpoint":"http://10.112.116.16:25250","password":"sp4uvvhgop3nrv9387nc","user":"agent"}},"networks":{"default":{"type":"dynamic","dns":["10.112.166.140"],"default":["dns","gateway"],"preconfigured":true,"cloud_properties":{"PrimaryBackendNetworkComponent":{"NetworkVlan":{"Id":1.292651e+06}},"PrimaryNetworkComponent":{"NetworkVlan":{"Id":1.292653e+06}}}}},"disks":{"ephemeral":"/dev/xvdc","persistent":null},"env":{"bosh":{"group":"bats-director-bat-compilation-36bfae6b-a4fd-4445-a51a-9e384c850217","groups":["bats-director","bat","compilation-36bfae6b-a4fd-4445-a51a-9e384c850217","bats-director-bat","bat-compilation-36bfae6b-a4fd-4445-a51a-9e384c850217","bats-director-bat-compilation-36bfae6b-a4fd-4445-a51a-9e384c850217"],"keep_root_password":true,"password":"$6$4cbef42d659bd6ea$6EvArRU9G9haaLC.1UTbNMyP7JNWxoddj6jVgGsF8fglPPUWGv1jJ2lteG3mmdoe6EA6W5m0lfagvICYn0zF8/"}}}
2018-02-08_02:42:49.11026 ********************
2018-02-08_02:42:49.11066 [settingsService] 2018/02/08 02:42:49 DEBUG - Successfully received settings from fetcher
2018-02-08_02:42:49.11099 [File System] 2018/02/08 02:42:49 DEBUG - Making dir /var/vcap/bosh with perm 0777
2018-02-08_02:42:49.11100 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Running command 'route -n'
2018-02-08_02:42:49.11194 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Successful: true (0)
2018-02-08_02:42:49.11250 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Running command 'usermod -p $6$4cbef42d659bd6ea$6EvArRU9G9haaLC.1UTbNMyP7JNWxoddj6jVgGsF8fglPPUWGv1jJ2lteG3mmdoe6EA6W5m0lfagvICYn0zF8/ vcap'
2018-02-08_02:42:49.13426 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Stdout:
2018-02-08_02:42:49.13428 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Stderr:
2018-02-08_02:42:49.13428 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Successful: true (0)
2018-02-08_02:42:49.13453 [File System] 2018/02/08 02:42:49 DEBUG - Writing /etc/resolvconf/resolv.conf.d/base
2018-02-08_02:42:49.13454 [File System] 2018/02/08 02:42:49 DEBUG - Making dir /etc/resolvconf/resolv.conf.d with perm 0777
2018-02-08_02:42:49.13454 [File System] 2018/02/08 02:42:49 DEBUG - Write content
2018-02-08_02:42:49.13455 ********************
2018-02-08_02:42:49.13455 # Generated by bosh-agent
2018-02-08_02:42:49.13455 nameserver 10.112.166.140
2018-02-08_02:42:49.13456
2018-02-08_02:42:49.13456 ********************
2018-02-08_02:42:49.13456 [File System] 2018/02/08 02:42:49 DEBUG - Symlinking oldPath /run/resolvconf/resolv.conf with newPath /etc/resolv.conf
2018-02-08_02:42:49.13456 [File System] 2018/02/08 02:42:49 DEBUG - Lstat '/etc/resolv.conf'
2018-02-08_02:42:49.13500 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Running command 'resolvconf -u'
2018-02-08_02:42:49.14989 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Stdout:
2018-02-08_02:42:49.14991 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Stderr:
2018-02-08_02:42:49.14991 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Successful: true (0)
2018-02-08_02:42:49.14991 [File System] 2018/02/08 02:42:49 DEBUG - Writing /var/vcap/bosh/etc/ntpserver
2018-02-08_02:42:49.15016 [File System] 2018/02/08 02:42:49 DEBUG - Making dir /var/vcap/bosh/etc with perm 0777
2018-02-08_02:42:49.15017 [File System] 2018/02/08 02:42:49 DEBUG - Write content
2018-02-08_02:42:49.15018 ********************
2018-02-08_02:42:49.15019 time1.google.com time2.google.com time3.google.com time4.google.com
2018-02-08_02:42:49.15019 ********************
2018-02-08_02:42:49.15019 [Cmd Runner] 2018/02/08 02:42:49 DEBUG - Running command 'sync-time'
2018-02-08_02:43:01.21297 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stdout:
2018-02-08_02:43:01.21300 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stderr:
2018-02-08_02:43:01.21300 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Successful: true (0)
2018-02-08_02:43:01.21300 [linuxPlatform] 2018/02/08 02:43:01 INFO - Setting up raw ephemeral disks
2018-02-08_02:43:01.21301 [linuxPlatform] 2018/02/08 02:43:01 INFO - Setting up ephemeral disk...
2018-02-08_02:43:01.21301 [File System] 2018/02/08 02:43:01 DEBUG - Glob '/var/vcap/data/*'
2018-02-08_02:43:01.21301 [File System] 2018/02/08 02:43:01 DEBUG - Making dir /var/vcap/data with perm 0750
2018-02-08_02:43:01.21302 [linuxPlatform] 2018/02/08 02:43:01 INFO - Creating swap & ephemeral partitions on ephemeral disk...
2018-02-08_02:43:01.21302 [linuxPlatform] 2018/02/08 02:43:01 DEBUG - Getting device size of `/dev/xvdc'
2018-02-08_02:43:01.21303 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Running command 'sfdisk -s /dev/xvdc'
2018-02-08_02:43:01.21453 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stdout: 104857600
2018-02-08_02:43:01.21455 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stderr:
2018-02-08_02:43:01.21456 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Successful: true (0)
2018-02-08_02:43:01.21456 [linuxPlatform] 2018/02/08 02:43:01 DEBUG - Calculating partition sizes of `/dev/xvdc', with available size 107374182400B
2018-02-08_02:43:01.21558 [linuxPlatform] 2018/02/08 02:43:01 INFO - Partitioning `/dev/xvdc' with [[Type: swap, SizeInBytes: 8368222208] [Type: linux, SizeInBytes: 99005960192]]
2018-02-08_02:43:01.21559 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Running command 'sfdisk -d /dev/xvdc'
2018-02-08_02:43:01.22030 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stdout:
2018-02-08_02:43:01.22031 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Stderr: sfdisk: failed to dump partition table: Success
2018-02-08_02:43:01.22032 [Cmd Runner] 2018/02/08 02:43:01 DEBUG - Successful: false (1)
2018-02-08_02:43:01.22033 [main] 2018/02/08 02:43:01 ERROR - App setup Running bootstrap: Setting up ephemeral disk: Partitioning ephemeral disk: Partitioning ephemeral disk `/dev/xvdc': Getting partitions for /dev/xvdc: Shelling out to sfdisk when getting partitions: Running command: 'sfdisk -d /dev/xvdc', stdout: '', stderr: 'sfdisk: failed to dump partition table: Success
2018-02-08_02:43:01.22034 ': exit status 1
2018-02-08_02:43:01.22035 [main] 2018/02/08 02:43:01 ERROR - Agent exited with error: Running bootstrap: Setting up ephemeral disk: Partitioning ephemeral disk: Partitioning ephemeral disk `/dev/xvdc': Getting partitions for /dev/xvdc: Shelling out to sfdisk when getting partitions: Running command: 'sfdisk -d /dev/xvdc', stdout: '', stderr: 'sfdisk: failed to dump partition table: Success
2018-02-08_02:43:01.22037 ': exit status 1

Thank you so much. :)

@tjvman
Copy link
Member

tjvman commented Feb 8, 2018

@edwardstudy Hm, I'm surprised that you're seeing sfdisk being used - these two files should combine to make the partitioner type be parted. Can you confirm that /var/vcap/bosh/agent.json has Platform.Linux.PartitionerType: parted set?

We've seen the above issue before; it's caused by an incompatability between the bosh agent and newer versions of sfdisk. Using parted instead fixes the problem, hence my previous question

@edwardstudy
Copy link
Contributor Author

@tjvman Your approach worked.
But there still had a problem about open-iscsi service which used by softlayer builder stage.

On trusty, open-iscsi is 'Starts and stops the iSCSI initiator services and logs into default targets'. But open-iscsi is 'Login to default iSCSI targets at boot and log out of all iSCSI targets at shutdown' on xenial. However, iscsid is 'The iSCSI initiator daemon takes care of monitoring iSCSI connections to targets. It is also the daemon providing the interface for the iscisadm tool to talk to when administering iSCSI connections.' on xenial.

I just send a bug report: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1753353

Could you have any idea about this issue?

@ramonskie
Copy link
Contributor

Closing this issue as there is no activity for a very long time
you can reopen this issue if the issue still persists

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants