
Bump toolchain/luet to 0.16.6 #260

Closed
wants to merge 2 commits

Conversation

cOS-cibot (Contributor) added the documentation label

cOS-cibot force-pushed the bump_luet_toolchain branch 2 times, most recently from 01d6030 to 1e4fc5b on June 13, 2021 20:10
mudler (Contributor) left a comment

Waiting for CI tests

mudler (Contributor) commented Jun 15, 2021

I've opened an issue for #277. The timeout issues look like a recurring problem, which seems genuine to me.

mudler self-assigned this Jun 16, 2021
mudler (Contributor) commented Jun 16, 2021

I'm having a closer look; it looks like it fails 100% of the time during the upgrade tests.

mudler added this to 💡 Untriaged in Releases via automation Jun 16, 2021
mudler moved this from 💡 Untriaged to 🏃🏼‍♂️ In Progress in Releases Jun 16, 2021
davidcassany (Contributor) commented Jun 16, 2021

Yes, something odd is happening. Note that #272 also fails on the upgrade tests, but with a different error, something related to xattrs not being copied by rsync... @_@ I was looking at the upgrade tests now too, and of course they pass locally 😢

mudler (Contributor) commented Jun 16, 2021

> Yes, something odd is happening. Note that #272 also fails on the upgrade tests, but with a different error, something related to xattrs not being copied by rsync... @_@ I was looking at the upgrade tests now too, and of course they pass locally 😢

I've opened a separate PR for that 👍 #283

mudler (Contributor) commented Jun 16, 2021

Ok, something is definitely not right; this is what I get trying it manually:

```
cos:~ # luet --version
luet version 0.16.6-ga7b4ae67c9b86b22bd706df7cde43fdfd5121772 2021-06-16 09:14:14 UTC
cos:~ # cos-upgrade
Upgrading system..
-> Upgrade target: active.img
3240+0 records in
3240+0 records out
3397386240 bytes (3.4 GB, 3.2 GiB) copied, 5.25743 s, 646 MB/s
mke2fs 1.43.8 (1-Jan-2018)
Discarding device blocks: done
Creating filesystem with 829440 4k blocks and 207584 inodes
Filesystem UUID: 32acc853-434a-44e8-a4bb-fbc55f3e7d00
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

 Enabled plugins:
         image-mtree-check
 Downloading quay.io/costoolkit/releases-opensuse:repository.yaml
 Pulled: sha256:46605c2f5987d28433b921b9b7c3bc3dad07e66b6d66a4465e905328937008f7
 Size: 640B
 Downloading quay.io/costoolkit/releases-opensuse:tree.tar.zst
 Pulled: sha256:ec728a638d26d1a3c12d5b573d755005a037fb9499a161009e7114b28cd01b24
 Size: 2.547KiB
 Downloading quay.io/costoolkit/releases-opensuse:repository.meta.yaml.tar.zst
 Pulled: sha256:c510d8ab087499df16cf90d188b1e867d0d8f9824712f8becbb2f9a0a853ed01
 Size: 198.6KiB
  Repository cOS revision: 16 - 2021-06-12 2030 +0000 UTC
 Repository: cos Priority: 1 Type: docker
 Packages that are going to be installed in the system:
  cos-system-0.5.3+1 (cos)
 Downloading image quay.io/costoolkit/releases-opensuse:cos-system-0.5.3-1
 Pulled: sha256:1787478f4c27e46d8865e511039506d45de6cfa1ca0e5d575e25160d7a8facb4
 Size: 638.8MiB
  Package  system/cos-0.5.3+1 downloaded
  Package  system/cos-0.5.3+1 installed
 Cleaned:  0 packages.
tune2fs 1.43.8 (1-Jan-2018)
tune2fs 1.43.8 (1-Jan-2018)
Flush changes to disk
Upgrade done, now you might want to reboot
cos:~ # reboo^C
cos:~ # cat /etc/os-release
ANSI_COLOR="0;32"
BUG_REPORT_URL="https://github.com/mudler/cOS/issues"
DOCUMENTATION_URL="https://github.com/mudler/cOS"
HOME_URL="https://github.com/mudler/cOS"
ID="cOS"
NAME="cOS"
PRETTY_NAME="cOS v0.5.3+2"
cos:~ # reboot
Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed.
```

```
~/_git/cOS/tests cos-ci-forks-bump_luet_toolchain* 2m 50s
❯ vagrant ssh
Password:
```

mudler (Contributor) commented Jun 16, 2021

To note, the release version after booting is fine. It's just that I can no longer log in via ssh with the vagrant user. I can log in on the terminal with root/cos just fine..

mudler (Contributor) commented Jun 16, 2021

[screenshot: VirtualBox_tests_default_1623836200067_3343_16_06_2021_11_44_56]

Even more puzzling, the user is there.. and so are the ssh keys

[screenshot: VirtualBox_tests_default_1623836200067_3343_16_06_2021_11_46_26]

davidcassany (Contributor) commented Jun 16, 2021

Is the IP the same one after reboot? This is probably a known_hosts issue; I am seeing it in libvirt. If so, persisting some of the volatile wicked config files might do the trick.
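
If it is a stale known_hosts entry, clearing the cached key for the forwarded port usually stops the prompt. A minimal sketch (it operates on a scratch file with a dummy key for illustration; drop `-f` to act on the real `~/.ssh/known_hosts`):

```shell
# Clear a stale cached host key for the VM forwarded on 127.0.0.1:2222.
# The key below is a dummy value, only the host token matters for -R.
kh=$(mktemp)
echo '[127.0.0.1]:2222 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIGJ6rqXvmVZwGfF5g6mXr0e9T6hYjGQ0C3mFjVuS1Gx1' > "$kh"
ssh-keygen -R '[127.0.0.1]:2222' -f "$kh" >/dev/null 2>&1
if ! grep -q '2222' "$kh"; then echo "stale entry removed"; fi
# prints: stale entry removed
```

`ssh-keygen -R` keeps a backup of the original file as `known_hosts.old`, so the removal is easy to undo.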

mudler (Contributor) commented Jun 16, 2021

> Is the IP the same one after reboot? This is probably a known_hosts issue; I am seeing it in libvirt. If so, persisting some of the volatile wicked config files might do the trick.

yup, I always get 10.0.2.15 in the VM. The weird thing is that if I set a password for the vagrant user, and later ssh-copy-id my identity, I can log in just fine..

I've also checked all permissions and tried to restore them manually; nothing wrong there :/
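
For reference, these are the modes sshd's `StrictModes` check expects before it will honor a key; a quick sanity sketch against a scratch directory (the paths are illustrative, not from this PR):

```shell
# sshd with StrictModes (the default) rejects keys when ~/.ssh material is
# group- or world-accessible. Expected: ~/.ssh = 700, authorized_keys = 600,
# and $HOME itself not group/world-writable.
d=$(mktemp -d)
mkdir -p "$d/.ssh"
touch "$d/.ssh/authorized_keys"
chmod 700 "$d/.ssh"
chmod 600 "$d/.ssh/authorized_keys"
printf '%s %s\n' "$(stat -c '%a' "$d/.ssh")" "$(stat -c '%a' "$d/.ssh/authorized_keys")"
# prints: 700 600
```

When the bits are wrong, sshd logs a "bad ownership or modes" line in the journal, which is quicker to spot than probing by hand.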

mudler (Contributor) commented Jun 16, 2021

It's definitely something happening after a reboot. If I switch back to the fallback partition, which is the same one I was booting from at the beginning, I can't log in from there either.

mudler (Contributor) commented Jun 16, 2021

Another weird thing I'm noticing: even though /etc/machine-id seems to persist correctly, across reboots I get prompted to re-trust the host every time. Note I always connect to it via 127.0.0.1:2222.

davidcassany (Contributor)

> Another weird thing I'm noticing: even though /etc/machine-id seems to persist correctly, across reboots I get prompted to re-trust the host every time. Note I always connect to it via 127.0.0.1:2222.

SSH host key fingerprints are recreated on each boot (see /etc/ssh); probably we should persist those too.
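
One way to make the generated host keys survive the image swap is to bind-mount the key directory from the persistent partition. A sketch only; the `/usr/local/.state` path is an assumption about the cOS layout, not something taken from this PR:

```
# illustrative fstab entry: keep the generated host keys on the persistent
# partition and bind them over the ephemeral /etc/ssh
/usr/local/.state/etc-ssh  /etc/ssh  none  bind  0  0
```

With the keys persisted, the host presents the same fingerprint after every reboot and upgrade, so clients stop prompting to re-trust it.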

mudler (Contributor) commented Jun 16, 2021

> Another weird thing I'm noticing: even though /etc/machine-id seems to persist correctly, across reboots I get prompted to re-trust the host every time. Note I always connect to it via 127.0.0.1:2222.
>
> SSH host key fingerprints are recreated on each boot (see /etc/ssh); probably we should persist those too.

oh right! I just assumed we persisted all of those. Going to try that 👍

davidcassany (Contributor) commented Jun 16, 2021

What I can't understand is why this is popping up right now; it's been like that since the very beginning 🤷‍♂️

mudler (Contributor) commented Jun 16, 2021

> What I can't understand is why this is popping up right now; it's been like that since the very beginning 🤷‍♂️

Yep, indeed..

It could have also been some update on the GHA runners; it looks like things degraded significantly since the last outage :/

davidcassany (Contributor)

I just saw another failure that might cause this sort of issue: one of my local test executions failed to log in after reboot. I logged into the machine directly on tty0 from VirtualBox and saw that the dbus service had failed to start because it hit the restart limit, complaining about too many repeated restarts in a short time. This causes a wicked service failure too, so the machine did not get any IP...
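
If the restart rate limit itself is what takes dbus down, a systemd drop-in can relax it while debugging. A sketch; the drop-in path follows the standard systemd convention and is not something from this PR:

```
# /etc/systemd/system/dbus.service.d/10-limit.conf (hypothetical drop-in)
[Unit]
# disable start rate limiting so repeated early-boot restarts
# don't leave dbus (and, transitively, wicked) permanently failed
StartLimitIntervalSec=0
```

After dropping the file in, `systemctl daemon-reload` followed by `systemctl reset-failed dbus` clears the failed state; this only masks the symptom, though, so the repeated restarts still need a root cause.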

mudler (Contributor) commented Jun 16, 2021

I've tried moving back to the old mechanism we used to unpack images (mudler/luet@8780e4f), but testing locally yields the same results.

mudler (Contributor) commented Jun 16, 2021

Created #286 for persisting the ssh keys generated on first boot.

mudler (Contributor) commented Jun 16, 2021

I'll check whether the issue is still there with 0.16.5. If that works, I'll add a switch via an env var to enable unprivileged installs from packages in containers.

mudler mentioned this pull request Jun 16, 2021
mudler (Contributor) commented Jun 17, 2021

Alright, it indeed looks like an issue with the unprivileged installs introduced in mudler/luet@796967c. A switch might do it for the time being (see #287).

mudler closed this Jun 17, 2021
Releases automation moved this from 🏃🏼‍♂️ In Progress to ✅ Done Jun 17, 2021
davidcassany added commits that referenced this pull request Jun 16, 2022
Signed-off-by: David Cassany <dcassany@suse.com>
frelon pushed a commit to frelon/elemental-toolkit that referenced this pull request May 12, 2023
…ancher#260)

This commit ensures the repository architecture is properly propagated from the
`--repo` flag down to the luet repository configuration.

In addition, a repository without any architecture is set to the current
target architecture in v1.Config.

Finally, a `Sanitize()` method is added to the `v1.BuildConfig` type.

Signed-off-by: David Cassany <dcassany@suse.com>