DNM: debug extensions build #12

Closed
wants to merge 2 commits into from

LorbusChris
Contributor

@LorbusChris LorbusChris commented Sep 19, 2022

The extensions build that is run in the SCOS base container fails with:

[cosa-build : fetch-and-build] (rpm-ostree compose extensions:1): libdnf-WARNING **: 04:28:55.246: failed to setup monitor: Operation not supported
[cosa-build : fetch-and-build] error: Unknown rpm-md repository: baseos

This seems to come from:
https://github.com/GNOME/glib/blob/main/gio/gfile.c#L5680
via
https://github.com/rpm-software-management/libdnf/blob/dnf-4-master/libdnf/dnf-repo-loader.cpp#L553

The c9s.repo is indeed present and contains the baseos repo. Logs from PipelineRun on operate-first (https://console-openshift-console.apps.smaug.na.operate-first.cloud/k8s/ns/okd-team/tekton.dev~v1beta1~PipelineRun/okd-coreos-all-y6y30e/logs):

[1/3] STEP 7/7: RUN ls -al && cat c9s.repo && rpm-ostree compose extensions --rootfs=/ --output-dir=/usr/share/rpm-ostree/extensions/ {manifest,extensions}.yaml
total 364
drwxr-xr-x.  1 root root   4096 Sep 19 14:13 .
dr-xr-xr-x.  1 root root   4096 Sep 19 14:13 ..
-rw-rw-r--.  1 root root     41 Sep 19 13:41 .fedora-coreos-config-base
drwxrwxr-x.  9 root root   4096 Sep 19 13:48 .git
-rw-rw-r--.  1 root root     27 Sep 19 13:41 .gitignore
-rw-rw-r--.  1 root root    142 Sep 19 13:41 .gitmodules
-rwxrwxr-x.  1 root root     49 Sep 19 13:41 .prow.sh
-rw-rw-r--.  1 root root    909 Sep 19 13:41 OWNERS
-rw-rw-r--.  1 root root   1269 Sep 19 13:41 README.md
-rw-rw-r--.  1 root root   1653 Sep 19 13:41 c9s.repo
drwxrwxr-x.  2 root root   4096 Sep 19 13:41 ci
-rw-rw-r--.  1 root root   1304 Sep 19 13:41 common-el9.yaml
-rw-rw-r--.  1 root root   9382 Sep 19 13:41 common.yaml
drwxrwxr-x.  2 root root   4096 Sep 19 13:41 docs
drwxrwxr-x.  4 root root   4096 Sep 19 14:07 extensions
-rw-rw-r--.  1 root root   2001 Sep 19 13:41 extensions-c9s.yaml
-rw-rw-r--.  1 root root   2023 Sep 19 13:41 extensions-rhel-8.6.yaml
-rw-rw-r--.  1 root root   1976 Sep 19 13:41 extensions-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     24 Sep 19 13:41 extensions-rhel-coreos-8.yaml -> extensions-rhel-8.6.yaml
lrwxrwxrwx.  1 root root     24 Sep 19 13:41 extensions-rhel-coreos-9.yaml -> extensions-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     19 Sep 19 13:41 extensions-scos.yaml -> extensions-c9s.yaml
lrwxrwxrwx.  1 root root     20 Sep 19 13:41 extensions.yaml -> extensions-scos.yaml
drwxrwxr-x.  8 root root   4096 Sep 19 13:41 fedora-coreos-config
-rw-rw-r--.  1 root root   4851 Sep 19 13:41 go.mod
-rw-rw-r--.  1 root root 190200 Sep 19 13:41 go.sum
-rw-rw-r--.  1 root root    575 Sep 19 13:41 group
lrwxrwxrwx.  1 root root     19 Sep 19 13:41 image-c9s.yaml -> image-rhel-9.0.yaml
-rw-rw-r--.  1 root root    847 Sep 19 13:41 image-rhel-8.6.yaml
-rw-rw-r--.  1 root root    695 Sep 19 13:41 image-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     19 Sep 19 13:41 image-rhel-coreos-8.yaml -> image-rhel-8.6.yaml
lrwxrwxrwx.  1 root root     19 Sep 19 13:41 image-rhel-coreos-9.yaml -> image-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     14 Sep 19 13:41 image-scos.yaml -> image-c9s.yaml
lrwxrwxrwx.  1 root root     15 Sep 19 13:41 image.yaml -> image-scos.yaml
-rw-rw-r--.  1 root root   1848 Sep 19 13:41 kola-denylist.yaml
drwxrwxr-x.  4 root root   4096 Sep 19 13:41 live
-rw-rw-r--.  1 root root   4256 Sep 19 14:07 manifest-c9s.yaml
-rw-rw-r--.  1 root root   6199 Sep 19 13:41 manifest-rhel-8.6.yaml
-rw-rw-r--.  1 root root   4442 Sep 19 13:41 manifest-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     22 Sep 19 13:41 manifest-rhel-coreos-8.yaml -> manifest-rhel-8.6.yaml
lrwxrwxrwx.  1 root root     22 Sep 19 13:41 manifest-rhel-coreos-9.yaml -> manifest-rhel-9.0.yaml
lrwxrwxrwx.  1 root root     17 Sep 19 13:41 manifest-scos.yaml -> manifest-c9s.yaml
lrwxrwxrwx.  1 root root     18 Sep 19 13:41 manifest.yaml -> manifest-scos.yaml
-rw-rw-r--.  1 root root    295 Sep 19 13:41 oscontainer.yaml
drwxrwxr-x. 11 root root   4096 Sep 19 13:41 overlay.d
-rw-rw-r--.  1 root root   1342 Sep 19 13:41 passwd
-rw-rw-r--.  1 root root   4743 Sep 19 13:41 platforms.yaml
-rw-rw-r--.  1 root root    742 Sep 19 13:41 rhcos-packages.yaml
drwxrwxr-x.  2 root root   4096 Sep 19 13:41 scripts
drwxrwxr-x.  4 root root   4096 Sep 19 13:41 tests
[baseos]
name=CentOS Stream 9 - BaseOS
baseurl=http://mirror.stream.centos.org/9-stream/BaseOS/$basearch/os
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-Official

[appstream]
name=CentOS Stream 9 - AppStream
baseurl=http://mirror.stream.centos.org/9-stream/AppStream/$basearch/os
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-Official

[nfv]
name=CentOS Stream 9 - NFV
baseurl=http://mirror.stream.centos.org/9-stream/NFV/$basearch/os
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-Official

[rt]
name=CentOS Stream 9 - RT
baseurl=http://mirror.stream.centos.org/9-stream/RT/$basearch/os
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-Official

[sig-nfv]
name=CentOS Stream 9 - SIG NFV
baseurl=http://mirror.stream.centos.org/SIGs/9-stream/nfv/$basearch/openvswitch-2/
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-SIG-NFV

[sig-virtualization]
name=CentOS Stream 9 - SIG Virtualization
baseurl=http://mirror.stream.centos.org/SIGs/9-stream/virt/x86_64/kata-containers/
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-SIG-Virtualization

[okd-copr]
name=OKD COPR
baseurl=https://download.copr.fedorainfracloud.org/results/@OKD/okd/centos-stream-9-$basearch/
gpgcheck=1
repo_gpgcheck=0
enabled=1
gpgkey=https://download.copr.fedorainfracloud.org/results/@OKD/okd/pubkey.gpg

[artifacts]
name=OKD RPM artifacts
baseurl=file:///workspace/coreos/rpms/
repo_gpgcheck=0
gpgcheck=0
enabled=1


(rpm-ostree compose extensions:1): libdnf-WARNING **: 14:13:47.883: failed to setup monitor: Operation not supported
error: Unknown rpm-md repository: baseos
Error: error building at STEP "RUN ls -al && cat c9s.repo && rpm-ostree compose extensions --rootfs=/ --output-dir=/usr/share/rpm-ostree/extensions/ {manifest,extensions}.yaml": error while running runtime: exit status 1
error: exit status 1

@LorbusChris
Contributor Author

cc @travier @cgwalters
Any idea where the libdnf failure to set up the monitor might come from?
Note that this still pulls in rpm-ostree 2022.12, because 2022.13 is still gated behind c9s-gate. I'm not sure how long packages have to wait there.

@cgwalters

Filed coreos/rpm-ostree#4029 for the monitor thing, but that's just cosmetic; it's not the real issue.

The real issue is:

error: Unknown rpm-md repository: baseos

@LorbusChris
Contributor Author

@cgwalters hm, is this not the monitor looking for *.repo files in the compose dir? How else would rpm-ostree know about the yum repos? As the logs show, the c9s.repo is present and defines the baseos repo. If the failed monitor is not the reason, why is it not picked up and what can we do to fix this?

@cgwalters

What's the mounted filesystem here? Can you add e.g. a `stat -f .` here? Are there any strict seccomp policies in play here?

Looking at the code, it doesn't seem like inotify failing should be fatal, but maybe.

I think it's more likely that the .repo file isn't being found at all for some reason. We could try to add some more debugging info to print out which repo files we loaded.
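
As a stopgap until such debugging output exists, a build step could list which repo IDs the `.repo` files in a directory actually define. The following is a minimal sketch, not rpm-ostree or libdnf code: the function name `list_repo_ids` is hypothetical, and it assumes the `.repo` files are plain INI files like the `c9s.repo` shown in the logs above.

```python
import configparser
from pathlib import Path


def list_repo_ids(directory):
    """Map each *.repo file in `directory` to the repo IDs (INI section names) it defines."""
    found = {}
    for repo_file in sorted(Path(directory).glob("*.repo")):
        # interpolation=None keeps a literal '%' in a URL from raising an error.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(repo_file)
        found[repo_file.name] = parser.sections()
    return found
```

Calling `list_repo_ids(".")` in the compose directory and checking whether any value contains `baseos` would show whether the file is readable from where the build runs. It would not show which directory rpm-ostree itself searches.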

@LorbusChris
Contributor Author

> What's the mounted filesystem here? Can you add e.g. a `stat -f .` here? Are there any strict seccomp policies in play here?

Running a new build now, will post logs here

> We could try to add some more debugging info to print out which repo files we loaded.

That would be very helpful indeed!

@LorbusChris
Contributor Author

It's OverlayFS:

[1/3] STEP 7/7: RUN stat -f . && ls -al && cat c9s.repo && rpm-ostree compose extensions --rootfs=/ --output-dir=/usr/share/rpm-ostree/extensions/ {manifest,extensions}.yaml
  File: "."
    ID: f0528d92d52841ea Namelen: 255     Type: overlayfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 2574551    Free: 1876684    Available: 1745612
Inodes: Total: 655360     Free: 584536

Full logs:
cosa-build.log
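
Where `stat -f` is not available in a build image, the same question can be answered from a mount table. The sketch below is an illustration only: `fs_type_for` is a hypothetical helper, and it assumes input in the `/proc/self/mounts` format (device, mount point, filesystem type, options, per line). It does not unescape mount points that contain spaces.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount containing `path`, or None.

    `mounts_text` is text in /proc/self/mounts format:
    "<device> <mount point> <fstype> <options> <dump> <pass>" per line.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # The longest matching mount point wins; ">=" lets a later mount
        # stacked on the same mount point override an earlier one.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type
```

On Linux this could be called as `fs_type_for(".", open("/proc/self/mounts").read())`.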

@cgwalters

Somewhat related to this: I personally consider the extensions container stuff a short-term crutch specifically for RHEL/OCP. The most load-bearing extension by far is kernel-rt, but I want to do https://issues.redhat.com/browse/MCO-378, and then we could probably just disable extensions support via MachineConfig for OKD. Anyone who wants to add custom content (usbguard, etc.) can build their own images in a fully supported and sane way.

@LorbusChris
Contributor Author

That makes a lot of sense to me! Maybe in SCOS, instead of jumping through these hoops now, we could simply never ship the extensions image. As of today we're not able to build it here anyway, and would thus have to disable it for the MVP release next week.

@LorbusChris
Contributor Author

The pipeline will continue to attempt extension builds, but won't fail if unsuccessful. I've opened openshift/machine-config-operator#3356 to remove the ext image ref from MCO for the time being.

I'm closing this PR. This work can be picked up elsewhere.
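
The "attempt the build, but don't fail the pipeline" behavior described above can be illustrated generically. This is a sketch under assumptions, not how the Tekton pipeline in question implements it: `run_optional_step` is a hypothetical wrapper that runs a command, reports a failure, and lets the caller continue.

```python
import subprocess
import sys


def run_optional_step(name, argv):
    """Run `argv`; report a failure on stderr but never raise. Returns True on success."""
    try:
        result = subprocess.run(argv, check=False)
    except OSError as exc:  # e.g. the executable does not exist
        print(f"[{name}] could not start: {exc} (continuing)", file=sys.stderr)
        return False
    if result.returncode != 0:
        print(f"[{name}] exited with {result.returncode} (continuing)", file=sys.stderr)
        return False
    return True
```

Returning a boolean instead of raising keeps the failure visible in the logs while leaving the decision about later steps to the caller.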

@cgwalters

So we just now hit this in https://issues.redhat.com/browse/COS-2000 ...should have debugged it fully at the time 😦

Is the current status quo that OKD/SCOS is still not shipping the extensions container?

@LorbusChris
Contributor Author

Yes, unfortunately :(

@travier
Member

travier commented Jan 24, 2023

We likely need https://github.com/coreos/coreos-assembler/blob/main/src/build-extensions-container.sh#L18 to copy the in-tree c9s.repo file too
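
The idea of copying in-tree `.repo` files alongside the other build inputs can be sketched as follows. This is not the logic of `build-extensions-container.sh` (a shell script whose contents are not quoted in this thread); `copy_repo_files` and both directory arguments are hypothetical names used for illustration.

```python
import shutil
from pathlib import Path


def copy_repo_files(config_dir, dest_dir):
    """Copy every *.repo file from `config_dir` into `dest_dir`; return the copied names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for repo_file in sorted(Path(config_dir).glob("*.repo")):
        # copy2 also preserves timestamps and permission bits.
        shutil.copy2(repo_file, dest / repo_file.name)
        copied.append(repo_file.name)
    return copied
```

Globbing for `*.repo` rather than naming `c9s.repo` explicitly is a design choice of this sketch, so that any repo definitions added to the config tree later are picked up as well.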
