This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

Rewrite the kiwi without MicroOS and remove non essential parts #354

Merged
merged 8 commits into from Apr 27, 2021

AvengerMoJo
Contributor

Same as #316, but redone on a new branch to drop all the old commits.
Remove the dependency on MicroOS.
Create a LiveISO image type with persistent storage.
Update the vagrant image and add VirtualBox support.

Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>

@ml8mr
Contributor

ml8mr commented Mar 30, 2021

@AvengerMoJo I just tried this (using your branch) and the boot process left me at a login prompt - with no sign of aquarium starting.

@AvengerMoJo
Contributor Author

@ml8mr which image type are you using?

@ml8mr
Contributor

ml8mr commented Mar 31, 2021

@ml8mr which image type are you using?

I used the live image called project-aquarium-microOS.x86_64-16.0.0.iso. (I was multi-tasking so if there was another image I should have used I missed it.)

@AvengerMoJo
Contributor Author

I used the live image called project-aquarium-microOS.x86_64-16.0.0.iso. (I was multi-tasking so if there was another image I should have used I missed it.)

I get the following log from my run: https://paste.opensuse.org/8988607

I believe the bootstrap is not being started.

@AvengerMoJo
Contributor Author

I used the live image called project-aquarium-microOS.x86_64-16.0.0.iso. (I was multi-tasking so if there was another image I should have used I missed it.)

I get the following log from my https://paste.opensuse.org/8988607

I believe the bootstrap is not being started.
@ml8mr after I waited a little longer and restarted the aquarium service (not sure which of the two was the required step), the bootstrap process was able to start correctly.
[screenshot: chrome-capture (15)]

@jecluis
Member

jecluis commented Mar 31, 2021

@AvengerMoJo I just tried this (using your branch) and the boot process left me at a login prompt - with no sign of aquarium starting.

Aquarium starts as a web service, not on the node's terminal, if that's what you meant.

Member

@jecluis jecluis left a comment

For some reason my machine is getting stuck on boot with

dracut-initqueue: mount: /run/overlay/overlayfs: can't find LABEL="cow"
dracut-initqueue: The device contains 'iso9660' signature and it may remain on the device. It is recommended to wipe the device with wipefs(8) or fdisk --wipe, in order to avoid possible collisions.
dracut-initqueue: mke2fs 1.46.2 (28-Feb-2021)

And that's it. Stuck forever.

Other than this, the config looks sound. I'll try to figure out what's wrong, and approve once I get it working.

images/microOS/config.xml (outdated, resolved)
images/microOS/config.xml (outdated, resolved)
Member

@jecluis jecluis left a comment

Pretty sure we need the apparmor stuff included, but others might have a more informed opinion than "things break if it's not installed".

Additionally, it feels like the image is relying on RAM for its disk, instead of a persistent partition on the drive. I'm unsure whether this is due to how I created the stick, using dd if=foo.iso of=/dev/myusbdev.

The reality is that, even though the disk has the two expected partitions (root and a ginormous partition at the end), the root fs ends up with very little space available.

[ceph: root@localhost /]# df -h
Filesystem      Size  Used Avail Use% Mounted on
LiveOS_rootfs   7.8G  7.8G   27M 100% /
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G   84K  7.8G   1% /dev/shm
tmpfs           3.2G   43M  3.1G   2% /run/lvm
LiveOS_rootfs   7.8G  7.8G   27M 100% /dev/init
LiveOS_rootfs   7.8G  7.8G   27M 100% /etc/ceph/ceph.conf

And this is a 128GB usb stick :)
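
To make the symptom above easier to spot in a script, here is a minimal sketch that parses `df -h` style output and flags nearly full mounts. The sample text is the output quoted above; the function names and the 95% threshold are hypothetical choices for illustration, not part of Aquarium.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    filesystem: str
    size: str
    used: str
    avail: str
    use_percent: int
    mounted_on: str


def parse_df(output: str) -> List[Mount]:
    """Parse `df -h` style text into Mount records, skipping the header row."""
    mounts = []
    for line in output.strip().splitlines()[1:]:
        # Split into at most 6 fields so a mount point containing spaces survives.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue
        fs, size, used, avail, pct, target = fields
        mounts.append(Mount(fs, size, used, avail, int(pct.rstrip("%")), target))
    return mounts


def nearly_full(mounts: List[Mount], threshold: int = 95) -> List[Mount]:
    """Return mounts whose usage is at or above `threshold` percent (assumed cutoff)."""
    return [m for m in mounts if m.use_percent >= threshold]


# Sample taken from the df output quoted in this comment.
DF_OUTPUT = """\
Filesystem      Size  Used Avail Use% Mounted on
LiveOS_rootfs   7.8G  7.8G   27M 100% /
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G   84K  7.8G   1% /dev/shm
tmpfs           3.2G   43M  3.1G   2% /run/lvm
LiveOS_rootfs   7.8G  7.8G   27M 100% /dev/init
LiveOS_rootfs   7.8G  7.8G   27M 100% /etc/ceph/ceph.conf
"""

for mount in nearly_full(parse_df(DF_OUTPUT)):
    print(f"{mount.mounted_on}: {mount.use_percent}% used, {mount.avail} available")
```

On the sample, only the three `LiveOS_rootfs` entries are reported, which matches the observation that the root overlay, not the stick itself, is what runs out of space.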

images/microOS/config.xml (resolved)
@jecluis
Member

jecluis commented Mar 31, 2021

For some reason my machine is getting stuck on boot with

dracut-initqueue: mount: /run/overlay/overlayfs: can't find LABEL="cow"
dracut-initqueue: The device contains 'iso9660' signature and it may remain on the device. It is recommended to wipe the device with wipefs(8) or fdisk --wipe, in order to avoid possible collisions.
dracut-initqueue: mke2fs 1.46.2 (28-Feb-2021)

FWIW, looks like this only happens on the first run. Upon rebooting it works fine. It might be that the second partition needs to be created prior to using the stick.

@ml8mr
Contributor

ml8mr commented Mar 31, 2021

@AvengerMoJo I just tried this (using your branch) and the boot process left me at a login prompt - with no sign of aquarium starting.

Aquarium starts as a web service, not on the node's terminal, if that's what you meant.

In the older version, wasn't there a message saying "Connect to Aquarium via https://[IP_ADDRESS]:8443" (or something like that)?

@jecluis
Member

jecluis commented Apr 1, 2021

@AvengerMoJo I just tried this (using your branch) and the boot process left me at a login prompt - with no sign of aquarium starting.

Aquarium starts as a web service, not on the node's terminal, if that's what you meant.

In the older version, wasn't there a message saying "Connect to Aquarium via https://[IP_ADDRESS]:8443" (or something like that)?

I think the first PoC did, but that was back when we were doing the bootstrap on node start. We don't do that anymore, although we should!

(edit: to be clear, we should show that thing, not the bootstrap part ;) )

@jecluis
Member

jecluis commented Apr 1, 2021

My latest findings:

  1. using ext4 as filesystem for the image makes it really slow, and first boot takes ages on account of creating the third partition with ext4;
  2. using xfs instead makes it much quicker.
  3. everything is painfully slow in general, regardless of filesystem above -- installing vim on the machine is taking ages now, and I'm pretty sure it's a bandwidth problem with the drive (drive is USB3.0, but port is only USB2.0, so 🤷 )
  4. podman doesn't seem to play well with the overlayfs being used, and gets stuck creating containers we need to properly run Aquarium.

Regardless, running on real hardware managed to surface a couple of bugs that shall be fixed soon, that otherwise would likely not have been caught.

I'm also wondering if this might have anything to do with running on a 128G usb stick, but I find that hard to believe.

@jecluis
Member

jecluis commented Apr 1, 2021

Let's kick this down to next week. I'll test the image more thoroughly once I get back home and have physical access to my lab.

@AvengerMoJo
Contributor Author

For some reason my machine is getting stuck on boot with

dracut-initqueue: mount: /run/overlay/overlayfs: can't find LABEL="cow"
dracut-initqueue: The device contains 'iso9660' signature and it may remain on the device. It is recommended to wipe the device with wipefs(8) or fdisk --wipe, in order to avoid possible collisions.
dracut-initqueue: mke2fs 1.46.2 (28-Feb-2021)

FWIW, looks like this only happens on the first run. Upon rebooting it works fine. It might be that the second partition needs to be created prior to using the stick.

Correct, after the first boot an expanded partition will be created on your USB device and mounted as an overlay on top of the read-only ISO, so all the configuration and data will be persistent.
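
A quick way to tell a first boot from a later one is to check whether a partition labelled `cow` (the label named in the dracut log above) already exists. The sketch below assumes a Linux system where udev populates `/dev/disk/by-label`; the helper name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_labelled_device(label: str, by_label_dir: str = "/dev/disk/by-label") -> Optional[Path]:
    """Return the device node behind a filesystem label, or None if it is absent.

    udev exposes each labelled filesystem as a symlink named after the label,
    so resolving the symlink yields the underlying block device.
    """
    link = Path(by_label_dir) / label
    if not link.exists():  # also False for a dangling symlink
        return None
    return link.resolve()


device = find_labelled_device("cow")
if device is None:
    print("No partition labelled 'cow' yet; the persistent overlay has not been created.")
else:
    print(f"Persistent overlay partition found at {device}")
```

The directory is a parameter only so the check can be exercised against a scratch directory instead of the real `/dev`.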

@AvengerMoJo
Contributor Author

AvengerMoJo commented Apr 1, 2021

In the older version, wasn't there a message saying "Connect to Aquarium via https://[IP_ADDRESS]:8443" (or something like that)?

@ml8mr I can put something there, e.g. a hint about connecting to the vagrant setup at http://localhost:1337, or something like that.

@AvengerMoJo
Contributor Author

My latest findings:

  1. using ext4 as filesystem for the image makes it really slow, and first boot takes ages on the account of creating the third partition with ext4;
  2. using xfs instead makes it much quicker.
  3. everything is painfully slow in general, regardless of filesystem above -- installing vim on the machine is taking ages now, and I'm pretty sure it's a bandwidth problem with the drive (drive is USB3.0, but port is only USB2.0, so 🤷 )
  4. podman doesn't seem to play well with the overlayfs being used, and gets stuck creating containers we need to properly run Aquarium.

Regardless, running on real hardware managed to surface a couple of bugs that shall be fixed soon, that otherwise would likely not have been caught.

I'm also wondering if this might have anything to do with running on a 128G usb stick, but I find that hard to believe.

I can update the filesystem for sure.

Correct, I see the same when comparing a USB stick with a USB SSD: the SSD works and the stick fails. Both were tested using a VM.
I believe the podman write load is too much for the stick to handle.

@jecluis
Member

jecluis commented Apr 1, 2021

My latest findings:

  1. using ext4 as filesystem for the image makes it really slow, and first boot takes ages on the account of creating the third partition with ext4;
  2. using xfs instead makes it much quicker.
  3. everything is painfully slow in general, regardless of filesystem above -- installing vim on the machine is taking ages now, and I'm pretty sure it's a bandwidth problem with the drive (drive is USB3.0, but port is only USB2.0, so 🤷 )
  4. podman doesn't seem to play well with the overlayfs being used, and gets stuck creating containers we need to properly run Aquarium.

Regardless, running on real hardware managed to surface a couple of bugs that shall be fixed soon, that otherwise would likely not have been caught.
I'm also wondering if this might have anything to do with running on a 128G usb stick, but I find that hard to believe.

I can update the filesystem for sure.

Correct when I'm using USB stick vs USB SSD. SSD is working and stick fail. Both are using VM.
I believe the podman write request is too much for stick to handle.

@tserong this has me thinking that we can't get away without writing state to a partition, with or without a live persistent image, which might make the persistence on the stick a bit moot. Should we consider an install instead, or does live + mounts still appeal to you more?

@AvengerMoJo
Contributor Author

I can update the filesystem for sure.
Correct when I'm using USB stick vs USB SSD. SSD is working and stick fail. Both are using VM.
I believe the podman write request is too much for stick to handle.

@tserong this has me thinking that we can't get away without writing state to a partition, with or without a live persistent image, which might make the persistence on the stick a bit moot. Should we consider an install instead, or does live + mounts still appeal to you more?

@jecluis let me also test the USB stick on a physical node and see the result.

@AvengerMoJo
Contributor Author

Replaced ext4 with xfs, adapted aqrdev (removing the 3.8 requirement for the build host, as well as fixing the datetime conversion error), and added a VirtualBox template.

Member

@jecluis jecluis left a comment

The latest commit (14f810d) should be split in at least 2 different commits:

  1. a microOS-related one, with the changes to config.sh and config.xml, which should then be squashed with the original commit;
  2. the aqrdev / vagrantfile template changes.

Regardless, thanks for the great work you've been putting towards this! :)

@AvengerMoJo
Contributor Author

The latest commit (14f810d) should be split in at least 2 different commits:

  1. a microOS-related one, with the changes to config.sh and config.xml, which should then be squashed with the original commit;
  2. the aqrdev / vagrantfile template changes.

Regardless, thanks for the great work you've been putting towards this! :)

Sure, let me try to figure out how to split that. So the idea is to roll back and maybe push to another PR?

@jecluis
Member

jecluis commented Apr 2, 2021

The latest commit (14f810d) should be split in at least 2 different commits:

  1. a microOS-related one, with the changes to config.sh and config.xml, which should then be squashed with the original commit;
  2. the aqrdev / vagrantfile template changes.

Regardless, thanks for the great work you've been putting towards this! :)

Sure let me try to figure out how to split that, so the idea is rollback and maybe push to another PR?

Not at all. We can keep this PR open. You just need to re-push to this branch once you're done tidying it up. Let me know if you'd like some help with the squashing. :)

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@AvengerMoJo
Contributor Author

AvengerMoJo commented Apr 13, 2021

@kshtsk I still don't know how to resolve the jenkins error. Please advise, thanks

@jecluis jecluis dismissed their stale review April 17, 2021 21:03

comments were addressed

@jecluis
Member

jecluis commented Apr 18, 2021

Vagrant image currently breaks when running with 2 or more network interfaces.

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifdown 'eth1' || true
mv '/tmp/vagrant-network-eth1-1618773225-0' '/etc/sysconfig/network/ifcfg-eth1'
/sbin/ifup 'eth1'


Stdout from the command:

eth1            no-device


Stderr from the command:

wicked: ifdown: no matching interface

This does not happen with the MicroOS image we currently have.

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@AvengerMoJo
Contributor Author

I can see the difference now. One major difference is that the old image does see the new interface, but I also get a failure with vagrant:

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifdown 'eth1' || true
mv '/tmp/vagrant-network-eth1-1618972267-0' '/etc/sysconfig/network/ifcfg-eth1'
/sbin/ifup 'eth1'

Stdout from the command:

eth1 setup-in-progress

Stderr from the command:

wicked: skipping eth1 interface: device is not configured by wicked yet
wicked: ifdown: no matching interfaces

@AvengerMoJo
Contributor Author

in the old image,

cat /etc/sysconfig/network/ifcfg-eth1

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO='dhcp'
STARTMODE='auto'
DEVICE='eth1'
#VAGRANT-END

So it is exactly the last step, where vagrant moves the configuration into place and brings the new interface up. I'm still trying to figure out why it is not working in my image, but it should be OK.
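
For comparing the interface configuration between the old and new images, a small parser for sysconfig-style `KEY='value'` files can help. This is only a sketch: the sample text mirrors the `ifcfg-eth1` contents shown above, and `parse_ifcfg` is a hypothetical helper name.

```python
import shlex
from typing import Dict


def parse_ifcfg(text: str) -> Dict[str, str]:
    """Parse sysconfig-style KEY='value' lines, ignoring comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        # shlex handles the shell-style quoting used in these files.
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


# Sample based on the file contents quoted in this comment.
IFCFG_ETH1 = """\
#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO='dhcp'
STARTMODE='auto'
DEVICE='eth1'
#VAGRANT-END
"""

print(parse_ifcfg(IFCFG_ETH1))
```

Running the same function over the file from each image and diffing the resulting dictionaries would show whether vagrant wrote the same settings in both cases.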

@kshtsk
Contributor

kshtsk commented Apr 21, 2021

A rebase and conflict resolution are required.

Create LiveISO image type with persistent storage
Updated vagrant image and add VirtualBox support.

Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
remove conditional check for OS using pyenv

Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
Including motd for network interface display IP
Provide suse vm guest setting to allow faster startup time.

Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
tserong and others added 2 commits April 23, 2021 18:07
Using `tar --owner root --group root` means the tarballs are created
with files owned by root:root, so when kiwi extracts them, they have
the correct uid:gid.  This means we don't need to copy the files
around first.  I've also split the root tarball creation off from
the aquarium dist tarball, to make it a bit clearer what each tarball
is for.

Signed-off-by: Tim Serong <tserong@suse.com>
tools/build-image.sh: use --owner/--group with tar instead of sudo
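
The same idea as `tar --owner root --group root` can be sketched with Python's standard `tarfile` module, for readers who want to see what the ownership rewrite amounts to. This is an illustration only, not what `tools/build-image.sh` does; the function names and the gzip compression are assumptions.

```python
import tarfile


def _as_root(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Rewrite ownership metadata so the archived entry appears owned by root:root."""
    info.uid = 0
    info.gid = 0
    info.uname = "root"
    info.gname = "root"
    return info


def make_root_owned_tarball(src_dir: str, dest_tar: str, arcname: str = ".") -> None:
    """Archive src_dir into dest_tar with every entry recorded as root:root.

    Only the metadata inside the archive changes; the files on disk keep their
    real owner, so no sudo or intermediate copy is needed.
    """
    with tarfile.open(dest_tar, "w:gz") as tar:
        tar.add(src_dir, arcname=arcname, filter=_as_root)
```

A call such as `make_root_owned_tarball("images/microOS/root", "root.tar.gz")` (paths hypothetical) would then produce an archive whose entries extract with uid:gid 0:0 when unpacked by a privileged process.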
@tserong
Member

tserong commented Apr 23, 2021

Just for the record, we found the problem with multi-nic vagrant deployments - the images/microOS/root/etc/udev/rules.d/70-persistent-net.rules file is what messes things up. If we remove that, vagrant works just fine again. (I wonder if we should also remove images/microOS/root/usr/lib/systemd/network/80-mac-net.link, as that wasn't present before either? That one at least didn't seem to cause me any trouble with vagrant though)
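
To guard against these files creeping back into the image overlay, a small check along these lines could be run before building. It is a sketch under assumptions: the two relative paths are the ones named above, while the function name and the idea of running it as a pre-build check are hypothetical.

```python
from pathlib import Path
from typing import List

# Files named in this comment, relative to the image's root overlay directory.
SUSPECT_FILES = (
    "etc/udev/rules.d/70-persistent-net.rules",
    "usr/lib/systemd/network/80-mac-net.link",
)


def find_suspect_files(overlay_root: str) -> List[str]:
    """Return the suspect network-naming files that exist under overlay_root."""
    root = Path(overlay_root)
    return [rel for rel in SUSPECT_FILES if (root / rel).exists()]


found = find_suspect_files("images/microOS/root")
if found:
    print("Network naming overrides present in overlay:", ", ".join(found))
```

An empty result for `images/microOS/root` would indicate that neither override is shipped in the overlay.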

…g interface being create correctly

remove the additional udev update.
Add config.vm.guest = suse to provide faster vm detection

Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
Signed-off-by: Alex Lau (AvengerMoJo) <alau@suse.com>
@AvengerMoJo
Contributor Author

AvengerMoJo commented Apr 24, 2021

jenkins run tumbleweed

1 similar comment
@tserong
Member

tserong commented Apr 26, 2021

jenkins run tumbleweed

@tserong
Member

tserong commented Apr 26, 2021

(looks like the tumbleweed run Alex triggered yesterday has been stuck for, like, a day on the ">>> + git fetch --tags --progress https://github.com/aquarist-labs/aquarium '+refs/heads/*:refs/remotes/origin/*'" step, so I'm hoping asking for a re-run will unstick it...)

Member

@tserong tserong left a comment

I've just re-tested the vagrant build with 2 NICs, and it works fine. As discussed with @AvengerMoJo on Slack, further improvements can happen in another branch / subsequent PRs, so I reckon we ought to merge this one now. Any objections @jecluis ?

@kshtsk
Contributor

kshtsk commented Apr 26, 2021

jenkins test tumbleweed

1 similar comment
@kshtsk
Contributor

kshtsk commented Apr 26, 2021

jenkins test tumbleweed

@tserong tserong added enhancement Iteration over existing code or feature images Related to microOS, vagrant, etc, images tooling Related with tools supporting development or deployment labels Apr 26, 2021
@tserong tserong merged commit 66c94db into aquarist-labs:main Apr 27, 2021
@AvengerMoJo AvengerMoJo deleted the final_liveiso-persistent branch April 28, 2021 07:48
5 participants