"Failed creating partition: exit status 1" - log missing essential information #50
Comments
/proc/partitions shows that sda1 was created.
Manually creating another partition (by calling …
I could not reproduce the error :-/
This just happened again. A …
udev needs several seconds to settle. 😉
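For reference, "settling" here means waiting for udev to finish processing its event queue before touching the disk again; a minimal sketch of such a wait (the device name is just an example):

```bash
# Block until udev has processed all pending events (e.g. the new partition node),
# giving up after 30 seconds instead of waiting forever.
udevadm settle --timeout=30

# The freshly created partition node should now be visible.
ls -l /dev/sda1
cat /proc/partitions
```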
I made several attempts and could not reproduce it; I'd like to actually get a failure with the output while the debug flag is enabled. #58 ensures the debug flag on elemental-cli is used if the install spec includes a …
I am not fully convinced this is a timing issue between udev and parted, as this happened when creating the first partition of the disk, right after creating the partition table. The command sequence should be equivalent to:
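The actual snippet did not make it into this thread; the following is only an assumed reconstruction of such a sequence, with the device, label type and partition boundaries picked purely for illustration:

```bash
# 1) Create a new partition table on the target disk.
parted --script /dev/sda mklabel gpt

# 2) Create the first partition, right after the partition table.
parted --script /dev/sda mkpart part1 ext4 1MiB 65MiB

# 3) Create the following partition.
parted --script /dev/sda mkpart part2 ext4 65MiB 4GiB
```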
According to the logs above, I guess elemental-cli failed on the third call. Between the first and the second calls we could add a … I have some doubts that the root issue here is a timing one. I'd appreciate it if anyone can reproduce it with debug messages enabled (requires #58) and share the logs before merging rancher/elemental-cli#286
Happened again, now with more detail: …
Partition got created: …
Awesome, thanks for the report including the logs... 🤔 I actually managed to reproduce something similar; to me it feels like some sort of race condition during boot. In my case it fails on the … At least for the issue I experienced, I can assess that the changes in rancher/elemental-cli#286 have no effect.
@kkaempf here is a request to update elemental-operator.service: https://build.opensuse.org/request/show/990105 From my tests it looks like elemental-operator.service has some sort of bad interaction with cos-setup-network.service; both are executed in parallel. This brings me to the question of whether the service unit file should belong to this repository instead of OBS, or, going further, whether we actually need the service unit file at all. Wouldn't it be more consistent to call …
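As an illustration of the parallel-execution point above: systemd only serializes units that declare an ordering dependency on each other, so this can be checked on a booted node with standard systemd tooling (unit names taken from the comment above):

```bash
# Ordering constraints systemd knows about for each unit; if neither lists the
# other under After=/Before=, the two services are started in parallel.
systemctl show -p After -p Before elemental-operator.service
systemctl show -p After -p Before cos-setup-network.service

# Chain of units this service actually waited for during boot.
systemd-analyze critical-chain elemental-operator.service
```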
I'm fine to drop the …
The solution provided in https://build.opensuse.org/request/show/990105 was incomplete, my bad... Now the service does not start automatically. In my local tests I just changed the service unit file but did not realize that the symlinks to multi-user.target were already created, hence new builds including https://build.opensuse.org/request/show/990105 do not start elemental-operator at boot... My bad, I am about to provide a new SR.
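For context, whether a unit starts at boot is governed by the symlinks in the target's .wants directory, which `systemctl enable` creates from the unit's [Install] section; a quick way to check their state (paths are the usual systemd defaults):

```bash
# Is the unit wired into a target at all?
systemctl is-enabled elemental-operator.service

# The symlink that makes the service start with multi-user.target; if it was
# already baked into the image, the service starts at boot regardless of the
# unit file placed on top later.
ls -l /etc/systemd/system/multi-user.target.wants/elemental-operator.service
```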
Ah, that explains it 😆
My bad, sorry.... 🙏 In my local env I was just placing the new unit file on top, and I did not realize that there was some spurious symlink already created during the image build while running … Gonna add a card to drop this service and call elemental-operator based on our cloud-init config files; I think this approach is way more explicit and easier to maintain. However, it requires a few changes regarding /etc/motd and finding a proper way to store and read elemental-operator logs. That's why a separate, specific card makes sense.
👍🏻
After doing more than half a dozen installs without problems, I'd consider this done.
After the change to running elemental-operator as part of cloud-init, this issue appeared again. Not very often, but still... These are the relevant logs I get from time to time: …
IMHO this also raises something we should probably consider: what is the expected behavior if the installation fails? Certainly we need to understand and fix the former issue; however, installation failures will certainly happen once this goes up to scale at some point. So I was wondering if some sort of reboot-and-retry mechanism is desirable for the registration and provisioning process.
I am pretty much convinced the issue is with the datasource plugin, which is executed at …
I added davidcassany/linuxkit#5 as an attempt to solve it, however I can't understand why all devices seem to be opened by multiple threads.... 🤔 In any case, the traces below disappear as soon as the …
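For illustration, the kind of check referred to below can be done with `lsof` against the target disk; the device name and PID are just examples:

```bash
# Processes that still hold the target disk or any of its partitions open;
# leftover descriptors here are what was suspected to interfere with
# repartitioning the same disk from the installer process.
lsof /dev/sda*

# Or inspect the descriptors of one specific process directly.
PID=1234
ls -l /proc/$PID/fd | grep '/dev/sd'
```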
* Do not use cdrom provider datasource
Avoid the use of the 'cdrom' provider for the datasource plugin. This provider opens all available block devices to probe them; however, an `lsof` call shows they are not closed afterwards. This might cause issues if, within the same process, we intend to repartition a disk for installation. This is a workaround for rancher/elemental-operator#50 Signed-off-by: David Cassany <dcassany@suse.com>
* Completely remove datasource plugin
We are not targeting public cloud infrastructure for now, hence there is no need to include such a stage. Signed-off-by: David Cassany <dcassany@suse.com>
Closing again since there is already a workaround in place in rancher/elemental#212. Note this is not solved; it is just a workaround. A proper fix will likely require a dive into the linuxkit and go-diskfs libraries to properly release devices after probing them.
Output from `journalctl -u elemental-operator` after booting an ISO. There are several things missing from this log: …
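For reference, a more complete log for a report like this can usually be captured with standard journalctl options (nothing project-specific):

```bash
# Full elemental-operator log for the current boot, with precise timestamps and
# no pager, so it can be attached to the issue as-is.
journalctl -u elemental-operator -b --no-pager -o short-precise > elemental-operator.log
```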