OpenShift compatibility #44

Closed
tjungblu opened this issue Jan 6, 2022 · 25 comments

Comments

@tjungblu
Contributor

tjungblu commented Jan 6, 2022

Hey,

I'm currently looking into this project for OpenShift. For ARO/ROSA you mention:

building compatible binaries seems to be the next step

What is missing to get this to work with RHCOS?

Cheers,
Thomas

@No9
Collaborator

No9 commented Jan 6, 2022

Hey @tjungblu
To be honest I'm not 100% sure what's required for RHCOS.
I ran a couple of very quick tests on ARO/ROSA and they didn't work.
Specifically the agent that uploads the core-dumps worked as expected but the cores weren't collected by the exe on the host.

Some aspects I was going to investigate further:

The core-dump-composer is copied to a shared host location during the deployment. Is this actually supported by CoreOS?

a) If copying the binary to the host is supported, does it require compatible builds, or will RHEL UBI 7/8 builds work OK?
[edit] It may just require copying to a specific write location which can be changed with the daemonset.hostDirectory and daemonset.coreDirectory options.

b) If copying the file to the host isn't supported, the best practices for supplying host services need to be researched. From initial reading, it seems that host services are provided as containers that must be defined at cluster creation time, but I may have that totally wrong.

I don't think the issues are ARO/ROSA specific. My next step was going to be to set up a local cluster and investigate further, but I haven't had time to get around to it.

I'm also conscious that the way this project works might impact how Red Hat provides general support for aborting processes so that's another area that will need clarification.

[edit]
Somewhat related: it would be useful to confirm whether the manual SCC config is required, as I hit this bug a while back and haven't gone back to revalidate:
openshift/origin#20788

Sorry to give you more questions than answers, but that's the status of where I got to.

Any help would be greatly appreciated.

@travier

travier commented Jan 7, 2022

If I understand correctly, this program sets itself up to handle coredumps from the kernel, taking over from systemd-coredump. As the binary is launched directly by the kernel, the easiest option is to place it directly on the host, for example in /usr/local/bin, which is a writable path on RHCOS. Then the agent can run as a container with access to the path where the dumps are stored.

RHCOS is based on RHEL so building binaries in UBI 7/8 will give compatible binaries for RHCOS.

Another option is to have the agent talk directly to systemd-coredump via its socket (/run/systemd/coredump) and process the results. That would require less kernel configuration.
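
For anyone checking which handler the kernel is currently using, the setting lives in /proc/sys/kernel/core_pattern; a value starting with `|` means the kernel pipes the dump to that program instead of writing a file. Below is a minimal Python sketch of reading it. The function name `describe_core_pattern` is made up for illustration and is not part of core-dump-handler.

```python
from pathlib import Path

CORE_PATTERN_PATH = Path("/proc/sys/kernel/core_pattern")


def describe_core_pattern(pattern: str) -> dict:
    """Classify a core_pattern value as a piped handler or a file template."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # After the pipe comes the handler's path, then its arguments
        # (usually %-specifiers such as %P for the PID).
        parts = pattern[1:].split()
        return {
            "piped": True,
            "program": parts[0] if parts else "",
            "args": parts[1:],
        }
    # Without a leading pipe the value is a file name template for the core file.
    return {"piped": False, "program": "", "args": [], "template": pattern}


if __name__ == "__main__":
    # Only meaningful on a Linux host where /proc is available.
    if CORE_PATTERN_PATH.exists():
        print(describe_core_pattern(CORE_PATTERN_PATH.read_text()))
```

Run on a node before and after deploying the chart, this should show whether the handler entry has actually changed.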

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

Thanks for your help here @travier, much appreciated.
I just had some time and tried this out with our cluster-bot installing 4.9 on AWS directly (Red Hat Enterprise Linux CoreOS 49.84.202201042103-0) and it works flawlessly with just these adjustments:

in values.yaml
hostDirectory: "/mnt/core-dump-handler"
coreDirectory: "/mnt/core-dump-handler/cores"

and the scc grant:

oc adm policy add-scc-to-user privileged -z core-dump-admin -n observe


Do you think we should add a switch to the Helm chart to make it work on OpenShift? I haven't tried adding the SCC change to the chart yet, but I'm sure we can fix this somehow. Otherwise we can just use a post-installation hook job that runs the oc command.

@No9
Collaborator

No9 commented Jan 7, 2022

@tjungblu can you confirm the contents of the zip - there should be 7 files in there.
Just want to confirm that crictl is being picked up properly
Or did you use https://github.com/IBM/core-dump-handler/blob/main/integration/run.sh with a .env file set up in the root of the project?

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

@No9 interesting, I got only 5:

-r--r--r--. 1 tjungblu tjungblu 229376 Jan  7 10:53 6327835c-532b-447b-98be-40dfb46bb130-dump-1641552839-segfaulter-segfaulter-1-4.core
-r--r--r--. 1 tjungblu tjungblu    293 Jan  7 10:53 6327835c-532b-447b-98be-40dfb46bb130-dump-1641552839-segfaulter-segfaulter-1-4-dump-info.json
-r--r--r--. 1 tjungblu tjungblu    596 Jan  7 10:53 6327835c-532b-447b-98be-40dfb46bb130-dump-1641552839-segfaulter-segfaulter-1-4-pod-info.json
-r--r--r--. 1 tjungblu tjungblu    996 Jan  7 10:53 6327835c-532b-447b-98be-40dfb46bb130-dump-1641552839-segfaulter-segfaulter-1-4-ps-info.json
-r--r--r--. 1 tjungblu tjungblu  27059 Jan  7 10:53 6327835c-532b-447b-98be-40dfb46bb130-dump-1641552839-segfaulter-segfaulter-1-4-runtime-info.json

The coredump looks fine though with objdump (as far as I can tell)

I ran the segfaulter directly after installing the helm chart (no .env file setup):

kubectl run -it segfaulter --image=quay.io/icdh/segfaulter --restart=Never
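
A quick way to compare an archive against the expected file count is to list its members. Here is a minimal Python sketch, assuming the zip has been downloaded locally. `summarize_dump_zip` is a hypothetical helper, and the default of 7 is simply the number mentioned above.

```python
import os
import zipfile


def summarize_dump_zip(path, expected_count=7):
    """Report how many files a dump archive holds and their name suffixes."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
    # commonprefix compares character by character, so with very few files
    # it can swallow part of a suffix; treat the suffixes as a rough guide.
    prefix = os.path.commonprefix(names)
    suffixes = sorted(name[len(prefix):] for name in names)
    return {
        "count": len(names),
        "missing": max(expected_count - len(names), 0),
        "suffixes": suffixes,
    }


if __name__ == "__main__":
    import sys

    print(summarize_dump_zip(sys.argv[1]))
```

For the listing above this would report five files with two missing, which matches what was observed by hand.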

@No9
Collaborator

No9 commented Jan 7, 2022

OK, the good news is that crictl appears to be there and functioning, but you are missing the image info.
Can you redeploy the chart with --set daemonset.composerCrioImageCmd="images" and rerun the coredump?

[Edit] @tjungblu To be clear, the test includes the original zip file, so it's just the image file that's missing.

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

yep, that's now on the daemonset:

COMP_CRIO_IMAGE_CMD = images

and in the logs

[2022-01-07T11:27:02Z INFO  core_dump_agent] Creating /mnt/core-dump-handler/.env file with LOG_LEVEL=Warn
[2022-01-07T11:27:02Z INFO  core_dump_agent] Writing composer .env
    LOG_LEVEL=Warn
    IGNORE_CRIO=false
    CRIO_IMAGE_CMD=images
    USE_CRIO_CONF=false

[2022-01-07T11:27:02Z INFO  core_dump_agent] Executing Agent with location : /mnt/core-dump-handler/cores

I'm afraid the image file isn't there however:

Archive:  aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4.zip
  inflating: aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4-dump-info.json  
  inflating: aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4.core  
  inflating: aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4-pod-info.json  
  inflating: aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4-runtime-info.json  
  inflating: aa08a83e-5b13-4a32-9ed4-e8670d738d83-dump-1641554896-segfaulter-segfaulter-1-4-ps-info.json  

@No9
Collaborator

No9 commented Jan 7, 2022

OK, in the pod can you run cat /mnt/core-dump-handler/composer.log?
That is the log for the composer; hopefully there is a clue there.
If there isn't, we can redeploy with --set daemonset.composerLogLevel="Debug" to get verbose output.
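
Once the composer log is available, it can help to split it into entries and filter by level. The following is a minimal Python sketch, assuming each entry starts with `LEVEL - timestamp - message` (the shape of the composer output posted later in this thread) and that any other line continues the previous entry. Only INFO and DEBUG appear in that output; the other level names in the pattern are assumptions, and `parse_composer_log` is a hypothetical helper.

```python
import re

# INFO and DEBUG are seen in the thread; the remaining names are guesses.
ENTRY_RE = re.compile(r"^(TRACE|DEBUG|INFO|WARN|ERROR) - (\S+) - (.*)$")


def parse_composer_log(text):
    """Split composer.log text into entries with level, time and message."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            level, time, message = match.groups()
            entries.append({"level": level, "time": time, "message": message})
        elif entries:
            # Multi-line payloads (e.g. pretty-printed JSON) belong to the
            # entry that started them.
            entries[-1]["message"] += "\n" + line
    return entries


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as fh:
        for entry in parse_composer_log(fh.read()):
            if entry["level"] != "DEBUG":
                print(entry["time"], entry["level"], entry["message"])
```

Filtering out DEBUG entries this way keeps the large JSON payloads out of view while scanning for the step that fails.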

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

debug did the trick, here's the log output:

sh-4.4# cat /mnt/core-dump-handler/composer.log
INFO - 2022-01-07T11:44:48.149442800+00:00 - Loading .env
INFO - 2022-01-07T11:44:48.149491609+00:00 - Set logfile to: "/var/mnt/core-dump-handler/composer.log"
DEBUG - 2022-01-07T11:44:48.149576456+00:00 - Creating dump for 39817ae8-aa0e-4887-b22d-a20d3f3deb7e-dump-1641555888-segfaulter-segfaulter-1-4
INFO - 2022-01-07T11:44:48.152804767+00:00 - Running crictl ["pods", "--name", "segfaulter", "-o", "json"]
DEBUG - 2022-01-07T11:44:48.173483374+00:00 - Using runtime_file_name:39817ae8-aa0e-4887-b22d-a20d3f3deb7e-dump-1641555888-segfaulter-segfaulter-1-4-pod-info.json
DEBUG - 2022-01-07T11:44:48.173901132+00:00 - pod object {"items":[{"annotations":{"kubernetes.io/config.seen":"2022-01-07T11:44:46.300339165Z","kubernetes.io/config.source":"api"},"createdAt":"1641555886634313048","id":"be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93","labels":{"io.kubernetes.container.name":"POD","io.kubernetes.pod.name":"segfaulter","io.kubernetes.pod.namespace":"default","io.kubernetes.pod.uid":"6a52412a-0a32-425f-830f-87a95abb57d4","run":"segfaulter"},"metadata":{"attempt":0,"name":"segfaulter","namespace":"default","uid":"6a52412a-0a32-425f-830f-87a95abb57d4"},"runtimeHandler":"","state":"SANDBOX_READY"}]}
DEBUG - 2022-01-07T11:44:48.173929712+00:00 - Using pod_id:be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93
INFO - 2022-01-07T11:44:48.173933462+00:00 - Running crictl ["inspectp", "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93"]
DEBUG - 2022-01-07T11:44:48.195405333+00:00 - inspectp_output status: exit status: 0
DEBUG - 2022-01-07T11:44:48.195432985+00:00 - inspectp_output stderr, 
DEBUG - 2022-01-07T11:44:48.195438941+00:00 - Using runtime_file_name:39817ae8-aa0e-4887-b22d-a20d3f3deb7e-dump-1641555888-segfaulter-segfaulter-1-4-runtime-info.json
DEBUG - 2022-01-07T11:44:48.195687868+00:00 - inspectp_output: {
  "status": {
    "id": "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93",
    "metadata": {
      "attempt": 0,
      "name": "segfaulter",
      "namespace": "default",
      "uid": "6a52412a-0a32-425f-830f-87a95abb57d4"
    },
    "state": "SANDBOX_READY",
    "createdAt": "2022-01-07T11:44:46.634313048Z",
    "network": {
      "additionalIps": [],
      "ip": "10.128.2.16"
    },
    "linux": {
      "namespaces": {
        "options": {
          "ipc": "POD",
          "network": "POD",
          "pid": "CONTAINER",
          "targetId": ""
        }
      }
    },
    "labels": {
      "io.kubernetes.container.name": "POD",
      "io.kubernetes.pod.name": "segfaulter",
      "io.kubernetes.pod.namespace": "default",
      "io.kubernetes.pod.uid": "6a52412a-0a32-425f-830f-87a95abb57d4",
      "run": "segfaulter"
    },
    "annotations": {
      "kubernetes.io/config.seen": "2022-01-07T11:44:46.300339165Z",
      "kubernetes.io/config.source": "api"
    },
    "runtimeHandler": ""
  },
  "info": {
    "runtimeSpec": {
      "ociVersion": "1.0.2-dev",
      "process": {
        "user": {
          "uid": 0,
          "gid": 0
        },
        "args": [
          "/usr/bin/pod"
        ],
        "env": [
          "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
          "TERM=xterm"
        ],
        "cwd": "/",
        "capabilities": {
          "bounding": [
            "CAP_CHOWN",
            "CAP_DAC_OVERRIDE",
            "CAP_FSETID",
            "CAP_FOWNER",
            "CAP_SETGID",
            "CAP_SETUID",
            "CAP_SETPCAP",
            "CAP_NET_BIND_SERVICE",
            "CAP_KILL"
          ],
          "effective": [
            "CAP_CHOWN",
            "CAP_DAC_OVERRIDE",
            "CAP_FSETID",
            "CAP_FOWNER",
            "CAP_SETGID",
            "CAP_SETUID",
            "CAP_SETPCAP",
            "CAP_NET_BIND_SERVICE",
            "CAP_KILL"
          ],
          "inheritable": [
            "CAP_CHOWN",
            "CAP_DAC_OVERRIDE",
            "CAP_FSETID",
            "CAP_FOWNER",
            "CAP_SETGID",
            "CAP_SETUID",
            "CAP_SETPCAP",
            "CAP_NET_BIND_SERVICE",
            "CAP_KILL"
          ],
          "permitted": [
            "CAP_CHOWN",
            "CAP_DAC_OVERRIDE",
            "CAP_FSETID",
            "CAP_FOWNER",
            "CAP_SETGID",
            "CAP_SETUID",
            "CAP_SETPCAP",
            "CAP_NET_BIND_SERVICE",
            "CAP_KILL"
          ]
        },
        "oomScoreAdj": -998,
        "selinuxLabel": "system_u:system_r:container_t:s0:c465,c727"
      },
      "root": {
        "path": "/var/lib/containers/storage/overlay/62ed5caf9aaf57b459b5ef9ad219519a714e78a9ab11f95f98266af70d389c48/merged",
        "readonly": true
      },
      "hostname": "segfaulter",
      "mounts": [
        {
          "destination": "/proc",
          "type": "proc",
          "source": "proc",
          "options": [
            "nosuid",
            "noexec",
            "nodev"
          ]
        },
        {
          "destination": "/dev",
          "type": "tmpfs",
          "source": "tmpfs",
          "options": [
            "nosuid",
            "strictatime",
            "mode=755",
            "size=65536k"
          ]
        },
        {
          "destination": "/dev/pts",
          "type": "devpts",
          "source": "devpts",
          "options": [
            "nosuid",
            "noexec",
            "newinstance",
            "ptmxmode=0666",
            "mode=0620",
            "gid=5"
          ]
        },
        {
          "destination": "/dev/mqueue",
          "type": "mqueue",
          "source": "mqueue",
          "options": [
            "nosuid",
            "noexec",
            "nodev"
          ]
        },
        {
          "destination": "/sys",
          "type": "sysfs",
          "source": "sysfs",
          "options": [
            "nosuid",
            "noexec",
            "nodev",
            "ro"
          ]
        },
        {
          "destination": "/etc/resolv.conf",
          "type": "bind",
          "source": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/resolv.conf",
          "options": [
            "ro",
            "bind",
            "nodev",
            "nosuid",
            "noexec"
          ]
        },
        {
          "destination": "/dev/shm",
          "type": "bind",
          "source": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/shm",
          "options": [
            "rw",
            "bind"
          ]
        },
        {
          "destination": "/etc/hostname",
          "type": "bind",
          "source": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/hostname",
          "options": [
            "ro",
            "bind",
            "nodev",
            "nosuid",
            "noexec"
          ]
        }
      ],
      "annotations": {
        "io.kubernetes.pod.uid": "6a52412a-0a32-425f-830f-87a95abb57d4",
        "io.kubernetes.cri-o.Image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c56c86d030185bda241514593970e80f75ae75afd9bc6288388944bc2a1dfb1f",
        "io.kubernetes.cri-o.ShmPath": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/shm",
        "io.kubernetes.cri-o.PortMappings": "[]",
        "run": "segfaulter",
        "io.kubernetes.cri-o.SeccompProfilePath": "runtime/default",
        "io.kubernetes.cri-o.NamespaceOptions": "{\"pid\":1}",
        "io.container.manager": "cri-o",
        "org.systemd.property.CollectMode": "'inactive-or-failed'",
        "kubernetes.io/config.seen": "2022-01-07T11:44:46.300339165Z",
        "io.kubernetes.cri-o.RuntimeHandler": "",
        "io.kubernetes.cri-o.ResolvPath": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/resolv.conf",
        "io.kubernetes.cri-o.CgroupParent": "kubepods-besteffort-pod6a52412a_0a32_425f_830f_87a95abb57d4.slice",
        "io.kubernetes.pod.name": "segfaulter",
        "io.kubernetes.cri-o.MountPoint": "/var/lib/containers/storage/overlay/62ed5caf9aaf57b459b5ef9ad219519a714e78a9ab11f95f98266af70d389c48/merged",
        "io.kubernetes.cri-o.HostnamePath": "/run/containers/storage/overlay-containers/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93/userdata/hostname",
        "io.kubernetes.cri-o.IP.0": "10.128.2.16",
        "io.kubernetes.cri-o.SandboxID": "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93",
        "io.kubernetes.cri-o.Created": "2022-01-07T11:44:46.634313048Z",
        "io.kubernetes.pod.namespace": "default",
        "io.kubernetes.cri-o.Spoofed": "true",
        "io.kubernetes.cri-o.Annotations": "{\"kubernetes.io/config.seen\":\"2022-01-07T11:44:46.300339165Z\",\"kubernetes.io/config.source\":\"api\"}",
        "io.kubernetes.cri-o.ContainerID": "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93",
        "io.kubernetes.cri-o.CNIResult": "{\"cniVersion\":\"0.4.0\",\"interfaces\":[{\"name\":\"eth0\",\"sandbox\":\"/var/run/netns/d8ad3ec9-82db-4993-99b0-413bd1235127\"}],\"ips\":[{\"version\":\"4\",\"interface\":0,\"address\":\"10.128.2.16/23\"}],\"routes\":[{\"dst\":\"0.0.0.0/0\",\"gw\":\"10.128.2.1\"},{\"dst\":\"224.0.0.0/4\"},{\"dst\":\"10.128.0.0/14\"}],\"dns\":{}}",
        "io.kubernetes.cri-o.HostNetwork": "false",
        "kubernetes.io/config.source": "api",
        "io.kubernetes.container.name": "POD",
        "io.kubernetes.cri-o.Metadata": "{\"Name\":\"segfaulter\",\"UID\":\"6a52412a-0a32-425f-830f-87a95abb57d4\",\"Namespace\":\"default\",\"Attempt\":0}",
        "io.kubernetes.cri-o.Name": "k8s_segfaulter_default_6a52412a-0a32-425f-830f-87a95abb57d4_0",
        "io.kubernetes.cri-o.PrivilegedRuntime": "false",
        "io.kubernetes.cri-o.ContainerType": "sandbox",
        "io.kubernetes.cri-o.ContainerName": "k8s_POD_segfaulter_default_6a52412a-0a32-425f-830f-87a95abb57d4_0",
        "io.kubernetes.cri-o.HostName": "segfaulter",
        "io.kubernetes.cri-o.KubeName": "segfaulter",
        "io.kubernetes.cri-o.Labels": "{\"io.kubernetes.container.name\":\"POD\",\"io.kubernetes.pod.uid\":\"6a52412a-0a32-425f-830f-87a95abb57d4\",\"io.kubernetes.pod.namespace\":\"default\",\"io.kubernetes.pod.name\":\"segfaulter\",\"run\":\"segfaulter\"}",
        "io.kubernetes.cri-o.LogPath": "/var/log/pods/default_segfaulter_6a52412a-0a32-425f-830f-87a95abb57d4/be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93.log",
        "io.kubernetes.cri-o.Namespace": "default"
      },
      "linux": {
        "sysctl": {
          "net.ipv4.ping_group_range": "0 2147483647"
        },
        "resources": {
          "devices": [
            {
              "allow": false,
              "access": "rwm"
            }
          ],
          "cpu": {
            "shares": 2
          }
        },
        "cgroupsPath": "kubepods-besteffort-pod6a52412a_0a32_425f_830f_87a95abb57d4.slice:crio:be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93",
        "namespaces": [
          {
            "type": "pid"
          },
          {
            "type": "network",
            "path": "/var/run/netns/d8ad3ec9-82db-4993-99b0-413bd1235127"
          },
          {
            "type": "ipc",
            "path": "/var/run/ipcns/d8ad3ec9-82db-4993-99b0-413bd1235127"
          },
          {
            "type": "uts",
            "path": "/var/run/utsns/d8ad3ec9-82db-4993-99b0-413bd1235127"
          },
          {
            "type": "mount"
          }
        ],
        "seccomp": {
          "defaultAction": "SCMP_ACT_ERRNO",
          "defaultErrnoRet": 38,
          "architectures": [
            "SCMP_ARCH_X86_64",
            "SCMP_ARCH_X86",
            "SCMP_ARCH_X32"
          ],
          "syscalls": [
            {
              "names": [
                "bdflush",
                "io_pgetevents",
                "kexec_file_load",
                "kexec_load",
                "migrate_pages",
                "move_pages",
                "nfsservctl",
                "nice",
                "oldfstat",
                "oldlstat",
                "oldolduname",
                "oldstat",
                "olduname",
                "pciconfig_iobase",
                "pciconfig_read",
                "pciconfig_write",
                "sgetmask",
                "ssetmask",
                "swapcontext",
                "swapoff",
                "swapon",
                "sysfs",
                "uselib",
                "userfaultfd",
                "ustat",
                "vm86",
                "vm86old",
                "vmsplice"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "_llseek",
                "_newselect",
                "accept",
                "accept4",
                "access",
                "adjtimex",
                "alarm",
                "bind",
                "brk",
                "capget",
                "capset",
                "chdir",
                "chmod",
                "chown",
                "chown32",
                "clock_adjtime",
                "clock_adjtime64",
                "clock_getres",
                "clock_getres_time64",
                "clock_gettime",
                "clock_gettime64",
                "clock_nanosleep",
                "clock_nanosleep_time64",
                "clone",
                "clone3",
                "close",
                "close_range",
                "connect",
                "copy_file_range",
                "creat",
                "dup",
                "dup2",
                "dup3",
                "epoll_create",
                "epoll_create1",
                "epoll_ctl",
                "epoll_ctl_old",
                "epoll_pwait",
                "epoll_pwait2",
                "epoll_wait",
                "epoll_wait_old",
                "eventfd",
                "eventfd2",
                "execve",
                "execveat",
                "exit",
                "exit_group",
                "faccessat",
                "faccessat2",
                "fadvise64",
                "fadvise64_64",
                "fallocate",
                "fanotify_mark",
                "fchdir",
                "fchmod",
                "fchmodat",
                "fchown",
                "fchown32",
                "fchownat",
                "fcntl",
                "fcntl64",
                "fdatasync",
                "fgetxattr",
                "flistxattr",
                "flock",
                "fork",
                "fremovexattr",
                "fsconfig",
                "fsetxattr",
                "fsmount",
                "fsopen",
                "fspick",
                "fstat",
                "fstat64",
                "fstatat64",
                "fstatfs",
                "fstatfs64",
                "fsync",
                "ftruncate",
                "ftruncate64",
                "futex",
                "futex_time64",
                "futimesat",
                "get_robust_list",
                "get_thread_area",
                "getcpu",
                "getcwd",
                "getdents",
                "getdents64",
                "getegid",
                "getegid32",
                "geteuid",
                "geteuid32",
                "getgid",
                "getgid32",
                "getgroups",
                "getgroups32",
                "getitimer",
                "get_mempolicy",
                "getpeername",
                "getpgid",
                "getpgrp",
                "getpid",
                "getppid",
                "getpriority",
                "getrandom",
                "getresgid",
                "getresgid32",
                "getresuid",
                "getresuid32",
                "getrlimit",
                "getrusage",
                "getsid",
                "getsockname",
                "getsockopt",
                "gettid",
                "gettimeofday",
                "getuid",
                "getuid32",
                "getxattr",
                "inotify_add_watch",
                "inotify_init",
                "inotify_init1",
                "inotify_rm_watch",
                "io_cancel",
                "io_destroy",
                "io_getevents",
                "io_setup",
                "io_submit",
                "ioctl",
                "ioprio_get",
                "ioprio_set",
                "ipc",
                "keyctl",
                "kill",
                "lchown",
                "lchown32",
                "lgetxattr",
                "link",
                "linkat",
                "listen",
                "listxattr",
                "llistxattr",
                "lremovexattr",
                "lseek",
                "lsetxattr",
                "lstat",
                "lstat64",
                "madvise",
                "mbind",
                "memfd_create",
                "mincore",
                "mkdir",
                "mkdirat",
                "mknod",
                "mknodat",
                "mlock",
                "mlock2",
                "mlockall",
                "mmap",
                "mmap2",
                "mount",
                "move_mount",
                "mprotect",
                "mq_getsetattr",
                "mq_notify",
                "mq_open",
                "mq_timedreceive",
                "mq_timedreceive_time64",
                "mq_timedsend",
                "mq_timedsend_time64",
                "mq_unlink",
                "mremap",
                "msgctl",
                "msgget",
                "msgrcv",
                "msgsnd",
                "msync",
                "munlock",
                "munlockall",
                "munmap",
                "name_to_handle_at",
                "nanosleep",
                "newfstatat",
                "open",
                "openat",
                "openat2",
                "open_tree",
                "pause",
                "pidfd_getfd",
                "pidfd_open",
                "pidfd_send_signal",
                "pipe",
                "pipe2",
                "pivot_root",
                "pkey_alloc",
                "pkey_free",
                "pkey_mprotect",
                "poll",
                "ppoll",
                "ppoll_time64",
                "prctl",
                "pread64",
                "preadv",
                "preadv2",
                "prlimit64",
                "pselect6",
                "pselect6_time64",
                "pwrite64",
                "pwritev",
                "pwritev2",
                "read",
                "readahead",
                "readdir",
                "readlink",
                "readlinkat",
                "readv",
                "reboot",
                "recv",
                "recvfrom",
                "recvmmsg",
                "recvmmsg_time64",
                "recvmsg",
                "remap_file_pages",
                "removexattr",
                "rename",
                "renameat",
                "renameat2",
                "restart_syscall",
                "rmdir",
                "rseq",
                "rt_sigaction",
                "rt_sigpending",
                "rt_sigprocmask",
                "rt_sigqueueinfo",
                "rt_sigreturn",
                "rt_sigsuspend",
                "rt_sigtimedwait",
                "rt_sigtimedwait_time64",
                "rt_tgsigqueueinfo",
                "sched_get_priority_max",
                "sched_get_priority_min",
                "sched_getaffinity",
                "sched_getattr",
                "sched_getparam",
                "sched_getscheduler",
                "sched_rr_get_interval",
                "sched_rr_get_interval_time64",
                "sched_setaffinity",
                "sched_setattr",
                "sched_setparam",
                "sched_setscheduler",
                "sched_yield",
                "seccomp",
                "select",
                "semctl",
                "semget",
                "semop",
                "semtimedop",
                "semtimedop_time64",
                "send",
                "sendfile",
                "sendfile64",
                "sendmmsg",
                "sendmsg",
                "sendto",
                "setns",
                "set_mempolicy",
                "set_robust_list",
                "set_thread_area",
                "set_tid_address",
                "setfsgid",
                "setfsgid32",
                "setfsuid",
                "setfsuid32",
                "setgid",
                "setgid32",
                "setgroups",
                "setgroups32",
                "setitimer",
                "setpgid",
                "setpriority",
                "setregid",
                "setregid32",
                "setresgid",
                "setresgid32",
                "setresuid",
                "setresuid32",
                "setreuid",
                "setreuid32",
                "setrlimit",
                "setsid",
                "setsockopt",
                "setuid",
                "setuid32",
                "setxattr",
                "shmat",
                "shmctl",
                "shmdt",
                "shmget",
                "shutdown",
                "sigaltstack",
                "signalfd",
                "signalfd4",
                "sigreturn",
                "socketcall",
                "socketpair",
                "splice",
                "stat",
                "stat64",
                "statfs",
                "statfs64",
                "statx",
                "symlink",
                "symlinkat",
                "sync",
                "sync_file_range",
                "syncfs",
                "sysinfo",
                "syslog",
                "tee",
                "tgkill",
                "time",
                "timer_create",
                "timer_delete",
                "timer_getoverrun",
                "timer_gettime",
                "timer_gettime64",
                "timer_settime",
                "timer_settime64",
                "timerfd_create",
                "timerfd_gettime",
                "timerfd_gettime64",
                "timerfd_settime",
                "timerfd_settime64",
                "times",
                "tkill",
                "truncate",
                "truncate64",
                "ugetrlimit",
                "umask",
                "umount",
                "umount2",
                "uname",
                "unlink",
                "unlinkat",
                "unshare",
                "utime",
                "utimensat",
                "utimensat_time64",
                "utimes",
                "vfork",
                "wait4",
                "waitid",
                "waitpid",
                "write",
                "writev"
              ],
              "action": "SCMP_ACT_ALLOW"
            },
            {
              "names": [
                "personality"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 0,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "personality"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 8,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "personality"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 131072,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "personality"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 131080,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "personality"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 4294967295,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "arch_prctl"
              ],
              "action": "SCMP_ACT_ALLOW"
            },
            {
              "names": [
                "modify_ldt"
              ],
              "action": "SCMP_ACT_ALLOW"
            },
            {
              "names": [
                "open_by_handle_at"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "bpf",
                "fanotify_init",
                "lookup_dcookie",
                "perf_event_open",
                "quotactl",
                "setdomainname",
                "sethostname",
                "setns"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "chroot"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "delete_module",
                "init_module",
                "finit_module",
                "query_module"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "acct"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "kcmp",
                "process_madvise",
                "process_vm_readv",
                "process_vm_writev",
                "ptrace"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "iopl",
                "ioperm"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "settimeofday",
                "stime",
                "clock_settime",
                "clock_settime64"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "vhangup"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 1
            },
            {
              "names": [
                "socket"
              ],
              "action": "SCMP_ACT_ERRNO",
              "errnoRet": 22,
              "args": [
                {
                  "index": 0,
                  "value": 16,
                  "op": "SCMP_CMP_EQ"
                },
                {
                  "index": 2,
                  "value": 9,
                  "op": "SCMP_CMP_EQ"
                }
              ]
            },
            {
              "names": [
                "socket"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 2,
                  "value": 9,
                  "op": "SCMP_CMP_NE"
                }
              ]
            },
            {
              "names": [
                "socket"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 0,
                  "value": 16,
                  "op": "SCMP_CMP_NE"
                }
              ]
            },
            {
              "names": [
                "socket"
              ],
              "action": "SCMP_ACT_ALLOW",
              "args": [
                {
                  "index": 2,
                  "value": 9,
                  "op": "SCMP_CMP_NE"
                }
              ]
            }
          ]
        },
        "mountLabel": "system_u:object_r:container_file_t:s0:c465,c727"
      }
    }
  }
}

INFO - 2022-01-07T11:44:48.196130494+00:00 - Running crictl ["ps", "-o", "json", "-p", "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93"]
DEBUG - 2022-01-07T11:44:48.216846591+00:00 - ps_output status: exit status: 0
DEBUG - 2022-01-07T11:44:48.216880171+00:00 - ps_output stderr, 
DEBUG - 2022-01-07T11:44:48.216886910+00:00 - ps_output: {
  "containers": [
    {
      "id": "d521d46f5a54477537feaa4c27e13a30655aeb4c4e81439f7d0afc45baf6a43d",
      "podSandboxId": "be1703e196f6449e5a42ee2eab0bf1e05f97a3dc5115325aea4a3ffb14919a93",
      "metadata": {
        "name": "segfaulter",
        "attempt": 0
      },
      "image": {
        "image": "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd",
        "annotations": {
        }
      },
      "imageRef": "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd",
      "state": "CONTAINER_RUNNING",
      "createdAt": "1641555888115767207",
      "labels": {
        "io.kubernetes.container.name": "segfaulter",
        "io.kubernetes.pod.name": "segfaulter",
        "io.kubernetes.pod.namespace": "default",
        "io.kubernetes.pod.uid": "6a52412a-0a32-425f-830f-87a95abb57d4"
      },
      "annotations": {
        "io.kubernetes.container.hash": "1b45ece1",
        "io.kubernetes.container.restartCount": "0",
        "io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
        "io.kubernetes.container.terminationMessagePolicy": "File",
        "io.kubernetes.pod.terminationGracePeriod": "30"
      }
    }
  ]
}

DEBUG - 2022-01-07T11:44:48.217176763+00:00 - Successfully got the process details
DEBUG - 2022-01-07T11:44:48.217181203+00:00 - found img_id "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
INFO - 2022-01-07T11:44:48.217184162+00:00 - Running crictl ["images", "-o", "json"]
DEBUG - 2022-01-07T11:44:48.237318408+00:00 - Found 30 images

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

running the crictl command directly gives:

sh-4.4# ./crictl images -o json
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock]. As the default settings are now deprecated, you should set the endpoint instead. 
ERRO[0002] connect endpoint 'unix:///var/run/dockershim.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded 
ERRO[0004] connect endpoint 'unix:///run/containerd/containerd.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded 
FATA[0006] connect: connect endpoint 'unix:///run/crio/crio.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded 

@No9
Collaborator

No9 commented Jan 7, 2022

Hmm but it's reporting DEBUG - 2022-01-07T11:44:48.237318408+00:00 - Found 30 images
30 is larger than the error output which would suggest it's running ok https://github.com/IBM/core-dump-handler/blob/main/core-dump-composer/src/main.rs#L458
But the mapping of ids isn't working https://github.com/IBM/core-dump-handler/blob/main/core-dump-composer/src/main.rs#L460
Typical that the debugging doesn't capture the comparison.
Let me put a build together with specific logging and we can take a closer look

@No9
Collaborator

No9 commented Jan 7, 2022

@tjungblu Building ... https://quay.io/repository/icdh/core-dump-handler/build/e99996b6-6385-4d72-8e7c-2aa180f6c326
I'll run the integration test once the build is complete to confirm everything is in order and drop a note in here.
[Edit] Looking at the code and output, I think this is going to come down to how different k8s flavours populate "imageRef".

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

sounds good, let me spin up another cluster for this

@tjungblu
Contributor Author

tjungblu commented Jan 7, 2022

I took the liberty to run it already, since it's Friday :)

here's the output:

INFO - 2022-01-07T15:08:30.879247169+00:00 - Running crictl ["ps", "-o", "json", "-p", "981abd6f7c6cf3fab61aeba4d1bcc34e90cae5f73885046d7c1c67159ff8dcc6"]
DEBUG - 2022-01-07T15:08:30.903168753+00:00 - ps_output status: exit status: 0
DEBUG - 2022-01-07T15:08:30.903209056+00:00 - ps_output stderr, 
DEBUG - 2022-01-07T15:08:30.903215958+00:00 - ps_output: {
  "containers": [
    {
      "id": "4286a6ab9fd70f1b443905696c74b44586e0ee0729709c51d14b1a236f46c29d",
      "podSandboxId": "981abd6f7c6cf3fab61aeba4d1bcc34e90cae5f73885046d7c1c67159ff8dcc6",
      "metadata": {
        "name": "segfaulter",
        "attempt": 0
      },
      "image": {
        "image": "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd",
        "annotations": {
        }
      },
      "imageRef": "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd",
      "state": "CONTAINER_RUNNING",
      "createdAt": "1641568110589056944",
      "labels": {
        "io.kubernetes.container.name": "segfaulter",
        "io.kubernetes.pod.name": "segfaulter",
        "io.kubernetes.pod.namespace": "default",
        "io.kubernetes.pod.uid": "12a448e5-9978-415f-b48f-c2b55a997899"
      },
      "annotations": {
        "io.kubernetes.container.hash": "b72928a9",
        "io.kubernetes.container.restartCount": "0",
        "io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
        "io.kubernetes.container.terminationMessagePolicy": "File",
        "io.kubernetes.pod.terminationGracePeriod": "30"
      }
    }
  ]
}

DEBUG - 2022-01-07T15:08:30.903536444+00:00 - Successfully got the process details
DEBUG - 2022-01-07T15:08:30.903540804+00:00 - found img_id "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
INFO - 2022-01-07T15:08:30.903543625+00:00 - Running crictl ["images", "-o", "json"]
DEBUG - 2022-01-07T15:08:30.926063517+00:00 - Found image list:
 {"images":[{"id":"f37769f487c171c99f84ef0db018ce2055f0ae3a350721ba3e9b98d8c7860563","repoDigests":["quay.io/icdh/core-dump-handler@sha256:958f48e4e18bc822ad9b7feb054d445dae3d92cd2cd84d45ce51f72543fe1c33"],"repoTags":["quay.io/icdh/core-dump-handler:img-logger"],"size":"576651861","spec":null,"uid":null,"username":""},{"id":"d8087c58ebe51554d52054e955680805d86969dc9b6917f5e3fa3ecb81c86e33","repoDigests":["quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"],"repoTags":["quay.io/icdh/segfaulter:latest"],"size":"10229047","spec":null,"uid":null,"username":""},{"id":"9fe6cec96704ffdf512ad2755c42ddfd36f2ab2aec3a27bae4cce42a8c480e14","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0213887c325d3967d122fe875b45c9259e2b8388db9dc4e0a25c0561414b8737"],"repoTags":[],"size":"400637400","spec":null,"uid":null,"username":""},{"id":"33ef73131becd5dbc3d8f913659a9d82fc6584f22aba85d3840226f891d8a16a","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:03cbd8048e00835d29ced43ddb4548e979e51dda727239cfdf027c9ef47339cf"],"repoTags":[],"size":"416462685","spec":null,"uid":null,"username":""},{"id":"28ea52b98c63aa5dd899d67bf267a3b7dd623f5a694b97a56793bb12597e2de9","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:051c76b923d9826bd577aeb1e176a4096a9af47e9a7c1819158282a5a417170b"],"repoTags":[],"size":"493445698","spec":null,"uid":null,"username":""},{"id":"9efb1f8bb8ab8197515e03b151f90d9828726c9c53564497b25a28bdd7a9753d","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0b8a09ab3c370f7ef89319f3ff66cc346bbfc1cc48b58c2d40ef7d61b33a349c"],"repoTags":[],"size":"293771129","spec":null,"uid":null,"username":""},{"id":"06bcdf9e5bffca01d0395f349a5c6fe8522560425b81adca4f6d54b2e6b8e854","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0fdd27a12ee71d1268ef2e7c4cfe8dbdd3a86e3010f77db3f2e530b928fa2a42"],"repoTags":[],"size":"338690057","spec":null,"uid":null,"
username":""},{"id":"5e77e74e95e2dbff030da2f1d1f6d8913893735a609211561fa72896d11d0069","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1267dd9b35b81041888212be03415b2fab37d1ac9e0fb4d9ddcf60c72f7a99ad"],"repoTags":[],"size":"549557329","spec":null,"uid":null,"username":""},{"id":"762b58d25362b9b53b71a0330ebb197d079fd7d5c7556bb20941b96598b7e20e","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:22e31d53a7c6a92176d5a183fd213fff4e2e68c343ccf6cca9c7fc1363e34836"],"repoTags":[],"size":"480226761","spec":null,"uid":null,"username":""},{"id":"1d3b81473a678baf01f66f2d7ad2e31406bfbeb6f2d0c29a2889eb0282290fa5","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2fc2c9d9fae3070c00239ea3d0d1d9fa7477c99c296b17b2fc352794c535912e"],"repoTags":[],"size":"358933144","spec":null,"uid":null,"username":""},{"id":"2cb68ba3a6a2704c8c8b171b643dda06525437744f72cbf9430bb3bb3d06b6cd","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:426f99588fbce292824dba75372675b83daea64a1cf5d321fb5e4182fc43867e"],"repoTags":[],"size":"444563870","spec":null,"uid":null,"username":""},{"id":"f8517523838468766fe503f52b6909274a3e96d9779c1b8a6caf01f56c308dc5","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4b14ffcea52f9eb5952546634cb26bfb1d523a4dd81382021c71673fed91efa2"],"repoTags":[],"size":"648096880","spec":null,"uid":{"value":"1001"},"username":""},{"id":"bc8fbc6cfc5c904a48c69b1c8939312ff8edb2c57f3a79dfa08b5b0ee7b2b2c0","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4f4021f6a725ee1bc3c393535742b720ee2fc5ffc978849a2b67fc437debc283"],"repoTags":[],"size":"305719765","spec":null,"uid":null,"username":"nobody"},{"id":"7a846eb1c95bae86701ec53973c5f8e5e51298e14ac19902be92bce44025bc52","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:57d82a9ceb60734194a2edf462de97285e271cf9d63776eca95da92bef96ce11"],"repoTags":[],"size":"450245265","spec":null,"uid":null,"username":""},{
"id":"d1a9e73e12ad162d62471317fb715eaa01cad24145a5cf48345ff7e41cb37d4d","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5e33f9d095952866b9743cc8268fb740cce6d93439f00ce333a2de1e5974837e"],"repoTags":[],"size":"365861279","spec":null,"uid":{"value":"65534"},"username":""},{"id":"4e80d22d9377aa6c13076868d997de1dd71dad1117e92169b11961bec39553ee","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7d78a787b5655292d4440942ae018ad5d1c985881c4e9d95d887d4f8450c7899"],"repoTags":[],"size":"337131856","spec":null,"uid":null,"username":""},{"id":"12e74538ccea688b6f2b9bab20d680a6409317e23643a91cf640f168f201614c","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:810097f053d1859d516ab784d975d41ae435ee91f5eaa7c90a02e643620c18fb"],"repoTags":[],"size":"480455294","spec":null,"uid":null,"username":""},{"id":"51f1f8de7be3bdf89050b4e69e8f42876311556ec1bd83857d5609cd40735c60","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a6322f222f98adc1585ff8777c73140a56ac3cdcd8a6949309884b79496bcbb6"],"repoTags":[],"size":"605665179","spec":null,"uid":null,"username":""},{"id":"08bc210159fafe42e9b1bfe3d494f3dd42ba73b03890a050445dc75f28186302","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aaf0854244f356ccb3f1cd70356362ac5ef7a9f227e08128740306430fd75497"],"repoTags":[],"size":"387003238","spec":null,"uid":null,"username":""},{"id":"55425c0237e89acd2523f9a24f3fe21c9aa7df00ce5f490bc722794b6e2e10ee","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b98f4e019aa3cae1ac333a8241bafdbe52caccfcdcd7f640d1a7410dd33dd788"],"repoTags":[],"size":"457157493","spec":null,"uid":null,"username":""},{"id":"abd5ea3a48e346ec0480185c10c1c747300b38cad4b98e52205324375ff838a1","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c017f8b24e9d9a913456373c69dec63e3315ddae052ac6ad9cee25f856abe502"],"repoTags":[],"size":"393788198","spec":null,"uid":{"value":"1001"},"username":""},{"id":"f7
4a3835778df0df7489a77b7532f4ebbbd449b9930b0795485d21988de84137","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c56c86d030185bda241514593970e80f75ae75afd9bc6288388944bc2a1dfb1f"],"repoTags":[],"size":"323372661","spec":null,"uid":null,"username":""},{"id":"85eb1eba8745c22b36bd85cf97febb02567f13a5c98e5decc38ed726a6167c87","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:df22255abc474a2b61b323925f76126fa2f8c99affb1c48c2a0eb16c4b4a1056"],"repoTags":[],"size":"398242909","spec":null,"uid":null,"username":""},{"id":"66c3e8e94022ed1a02ec9197196195fdc4272f8e8498947bc3360f5a83a74b4b","repoDigests":["quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e146605c1b75917d26c07268b361134aeda68983b2e2b060c202420b8267aa45"],"repoTags":[],"size":"331811979","spec":null,"uid":null,"username":""},{"id":"2797d420788fb40db6638bec2b5688dab9b0fdc23c211eb073a7cc67eb1b5971","repoDigests":["registry.redhat.io/redhat/certified-operator-index@sha256:bea26c044ebfcbd29fd543c54c1370098462cec13233ad0a5630e0b3a09d8e42","registry.redhat.io/redhat/certified-operator-index@sha256:f7cfa84674fd6b5a9c071ef029dcf1a529fdd35def12884190449ce6048c2f73"],"repoTags":["registry.redhat.io/redhat/certified-operator-index:v4.9"],"size":"708788997","spec":null,"uid":{"value":"1001"},"username":""},{"id":"5852d7cd10d9ac8586c182357ef598bb556e4336e87d51ba04a839d158affd74","repoDigests":["registry.redhat.io/redhat/redhat-marketplace-index@sha256:37b18b852ec1ddc14e211fd801d7d59fac2208ff34231612873b899238577410","registry.redhat.io/redhat/redhat-marketplace-index@sha256:f366e8f7bdd010cf5779659f063b57ff0d478ee12eb1ca5888f19c55a279bd04"],"repoTags":["registry.redhat.io/redhat/redhat-marketplace-index:v4.9"],"size":"697620658","spec":null,"uid":{"value":"1001"},"username":""},{"id":"54488905263c2e726a32a23362addc373eab1582fb708317a339374013a28e0c","repoDigests":["registry.redhat.io/redhat/redhat-operator-index@sha256:caefc33c79258eb4604df24bbd4fc99c0915dad22e354ed2bb1569116bebce88
","registry.redhat.io/redhat/redhat-operator-index@sha256:ea5696af4e6ef9827b45a8b89cb88630af4fba363ec18aa7720e1ad1a4fcc9d8"],"repoTags":["registry.redhat.io/redhat/redhat-operator-index:v4.9"],"size":"735415686","spec":null,"uid":{"value":"1001"},"username":""}]}
DEBUG - 2022-01-07T15:08:30.926134795+00:00 - Found 27 images
DEBUG - 2022-01-07T15:08:30.926140967+00:00 - Matching "f37769f487c171c99f84ef0db018ce2055f0ae3a350721ba3e9b98d8c7860563" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926146966+00:00 - Matching "d8087c58ebe51554d52054e955680805d86969dc9b6917f5e3fa3ecb81c86e33" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926150959+00:00 - Matching "9fe6cec96704ffdf512ad2755c42ddfd36f2ab2aec3a27bae4cce42a8c480e14" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926154549+00:00 - Matching "33ef73131becd5dbc3d8f913659a9d82fc6584f22aba85d3840226f891d8a16a" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926157991+00:00 - Matching "28ea52b98c63aa5dd899d67bf267a3b7dd623f5a694b97a56793bb12597e2de9" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926161463+00:00 - Matching "9efb1f8bb8ab8197515e03b151f90d9828726c9c53564497b25a28bdd7a9753d" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926164899+00:00 - Matching "06bcdf9e5bffca01d0395f349a5c6fe8522560425b81adca4f6d54b2e6b8e854" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926168239+00:00 - Matching "5e77e74e95e2dbff030da2f1d1f6d8913893735a609211561fa72896d11d0069" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926171626+00:00 - Matching "762b58d25362b9b53b71a0330ebb197d079fd7d5c7556bb20941b96598b7e20e" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926177011+00:00 - Matching "1d3b81473a678baf01f66f2d7ad2e31406bfbeb6f2d0c29a2889eb0282290fa5" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926180492+00:00 - Matching "2cb68ba3a6a2704c8c8b171b643dda06525437744f72cbf9430bb3bb3d06b6cd" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926184217+00:00 - Matching "f8517523838468766fe503f52b6909274a3e96d9779c1b8a6caf01f56c308dc5" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926187964+00:00 - Matching "bc8fbc6cfc5c904a48c69b1c8939312ff8edb2c57f3a79dfa08b5b0ee7b2b2c0" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926191526+00:00 - Matching "7a846eb1c95bae86701ec53973c5f8e5e51298e14ac19902be92bce44025bc52" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926195229+00:00 - Matching "d1a9e73e12ad162d62471317fb715eaa01cad24145a5cf48345ff7e41cb37d4d" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926198740+00:00 - Matching "4e80d22d9377aa6c13076868d997de1dd71dad1117e92169b11961bec39553ee" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926202137+00:00 - Matching "12e74538ccea688b6f2b9bab20d680a6409317e23643a91cf640f168f201614c" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926205444+00:00 - Matching "51f1f8de7be3bdf89050b4e69e8f42876311556ec1bd83857d5609cd40735c60" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926208798+00:00 - Matching "08bc210159fafe42e9b1bfe3d494f3dd42ba73b03890a050445dc75f28186302" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926212100+00:00 - Matching "55425c0237e89acd2523f9a24f3fe21c9aa7df00ce5f490bc722794b6e2e10ee" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926215511+00:00 - Matching "abd5ea3a48e346ec0480185c10c1c747300b38cad4b98e52205324375ff838a1" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926218960+00:00 - Matching "f74a3835778df0df7489a77b7532f4ebbbd449b9930b0795485d21988de84137" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926222295+00:00 - Matching "85eb1eba8745c22b36bd85cf97febb02567f13a5c98e5decc38ed726a6167c87" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926225642+00:00 - Matching "66c3e8e94022ed1a02ec9197196195fdc4272f8e8498947bc3360f5a83a74b4b" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926229297+00:00 - Matching "2797d420788fb40db6638bec2b5688dab9b0fdc23c211eb073a7cc67eb1b5971" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926233133+00:00 - Matching "5852d7cd10d9ac8586c182357ef598bb556e4336e87d51ba04a839d158affd74" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"
DEBUG - 2022-01-07T15:08:30.926236908+00:00 - Matching "54488905263c2e726a32a23362addc373eab1582fb708317a339374013a28e0c" to 0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"

full file, copied from the pod:
composer.log

@No9
Collaborator

No9 commented Jan 7, 2022

Love a Friday vibe :)
Looks like the issue is imageRef should be matched with the repoDigests array rather than just mapping to id.

I was hoping the id in the image collection was based on the hash in the imageRef, but alas not.

As a fix I will also iterate over the repoDigests for a match, as that will have the least impact on the current functionality.
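
For readers following along, here is a minimal sketch of that fallback in Rust. It is not the code in core-dump-composer: the `Image` struct and the `digest_of` and `find_image` helpers are hypothetical names made up for illustration, and the sample values are copied from the log output earlier in this thread. It first tries the existing comparison (digest against `id`) and only then looks for the full `imageRef` in `repoDigests`.

```rust
/// Minimal stand-in for one entry of the `crictl images -o json` output.
/// Only the two fields discussed in this thread are modelled (hypothetical type).
struct Image {
    id: String,
    repo_digests: Vec<String>,
}

/// Returns the hex digest after "@sha256:" in an image reference, if present.
fn digest_of(image_ref: &str) -> Option<&str> {
    image_ref.split_once("@sha256:").map(|(_, digest)| digest)
}

/// Finds the image a container's `imageRef` points at.
/// Tries the digest against `id` first, then falls back to an exact
/// match of the full reference in `repoDigests`.
fn find_image<'a>(images: &'a [Image], image_ref: &str) -> Option<&'a Image> {
    let digest = digest_of(image_ref).unwrap_or(image_ref);
    images
        .iter()
        .find(|img| img.id.as_str() == digest)
        .or_else(|| {
            images
                .iter()
                .find(|img| img.repo_digests.iter().any(|d| d.as_str() == image_ref))
        })
}

fn main() {
    // Values taken from the crictl output posted above.
    let images = vec![
        Image {
            id: "f37769f487c171c99f84ef0db018ce2055f0ae3a350721ba3e9b98d8c7860563".to_string(),
            repo_digests: vec![
                "quay.io/icdh/core-dump-handler@sha256:958f48e4e18bc822ad9b7feb054d445dae3d92cd2cd84d45ce51f72543fe1c33".to_string(),
            ],
        },
        Image {
            id: "d8087c58ebe51554d52054e955680805d86969dc9b6917f5e3fa3ecb81c86e33".to_string(),
            repo_digests: vec![
                "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd".to_string(),
            ],
        },
    ];
    let image_ref =
        "quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd";

    match find_image(&images, image_ref) {
        Some(img) => println!("matched image id {}", img.id),
        None => println!("no match for {}", image_ref),
    }
}
```

Keeping the `id` comparison as the first attempt means platforms where it already matches should behave as before, and the `repoDigests` lookup only kicks in when it doesn't.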

I'll ping when a test image is ready

@No9
Collaborator

No9 commented Jan 7, 2022

ok @tjungblu that's building here
https://quay.io/repository/icdh/core-dump-handler/build/8942eba5-352d-447f-b19f-ae419caa422e
Glad we have got that in as it will be needed for consistent post processing and the client cli

Let me know if you want to add the scc fix to this and I will hold off merging this into main and bundle both aspects as a single release.

@tjungblu
Contributor Author

@No9 awesome. Sadly I couldn't figure out which container image tag was built (I can't see the build logs), but I took the last tag, img-logs, and I now have proper image info in my archive:

-r--r--r--. 1 tjungblu tjungblu 229376 Jan 10 08:54 <uuid>-segfaulter-1-4.core
-r--r--r--. 1 tjungblu tjungblu    293 Jan 10 08:54 <uuid>-segfaulter-1-4-dump-info.json
-r--r--r--. 1 tjungblu tjungblu    288 Jan 10 08:54 <uuid>-segfaulter-1-4-image-info.json
-r--r--r--. 1 tjungblu tjungblu   1184 Jan 10 08:54 <uuid>-segfaulter-1-4-pod-info.json
-r--r--r--. 1 tjungblu tjungblu    996 Jan 10 08:54 <uuid>-segfaulter-1-4-ps-info.json
-r--r--r--. 1 tjungblu tjungblu  27059 Jan 10 08:54 <uuid>-segfaulter-1-4-runtime-info.json
{"id":"d8087c58ebe51554d52054e955680805d86969dc9b6917f5e3fa3ecb81c86e33","repoDigests":["quay.io/icdh/segfaulter@sha256:0630afbcfebb45059794b9a9f160f57f50062d28351c49bb568a3f7e206855bd"],"repoTags":["quay.io/icdh/segfaulter:latest"],"size":"10229047","spec":null,"uid":null,"username":""}

now we're at six files; you mentioned earlier there should be seven. Anything crucial missing here?

> Let me know if you want to add the scc fix to this and I will hold off merging this into main and bundle both aspects as a single release.

I can send you a separate PR towards the end of the week once I got helm properly working. You're fine with adding a post-installation hook job for this?

@No9
Collaborator

No9 commented Jan 10, 2022

Hey @tjungblu
So it was 7 files including the .zip file.
There are 6 files inside the zip so we are in excellent shape (Sorry I mis-read my test)
If you're happy with a post-hook then I am happy to take it, as long as it doesn't interfere with the xKS platforms.

How do you want to deal with the recommendation to use /run/systemd/coredump? I'm happy to capture it as a separate issue and look at it when someone actually comes looking for it.

@tjungblu
Contributor Author

> If you're happy with a post-hook then I am happy to take it, as long as it doesn't interfere with the xKS platforms.

I think that's the least invasive; I reckon we put it behind an enableOpenShift value flag in helm. I'll send you a PR later this week :)

> How do you want to deal with the recommendation to use /run/systemd/coredump? I'm happy to capture it as a separate issue and look at it when someone actually comes looking for it.

yeah, I'd suggest we do it that way - it seems to work for now :)

@No9
Collaborator

No9 commented Jan 10, 2022

This is actually a little tricky as there is currently a --set daemonset.vendor=rhel7 to support ROKS
So for enableOpenShift to work across all providers it would need to do the following:

  1. Set the host directories: hostDirectory: "/mnt/core-dump-handler" and coreDirectory: "/mnt/core-dump-handler/cores"
  2. Apply the scc
  3. Determine the host OS to set the VENDOR env_var on the daemonset

Step 1 should be OK, though it will need testing to validate. Step 3 may need to establish the OS of the Node programmatically beforehand, as it's used to determine which binary to copy to the host, so it might not be something that works as part of the post hook.
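
To make Step 3 a bit more concrete, here is a rough Rust sketch of one way the node OS could be detected, assuming the host's os-release file is readable from wherever this runs (from inside a pod that would mean mounting the host file, not the container's own). The `parse_os_release` and `vendor_for` helpers are hypothetical and not part of the project; `rhel7` is the vendor value mentioned above, while `default` is just a placeholder fallback for this illustration.

```rust
use std::collections::HashMap;

/// Parses the KEY=value lines of an os-release file into a map,
/// stripping optional surrounding double quotes from values.
fn parse_os_release(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .filter_map(|line| {
            let line = line.trim();
            // Skip blank lines and comments.
            if line.is_empty() || line.starts_with('#') {
                return None;
            }
            let (key, value) = line.split_once('=')?;
            Some((key.to_string(), value.trim_matches('"').to_string()))
        })
        .collect()
}

/// Maps os-release fields to a vendor string.
/// "rhel7" comes from the thread; "default" is a hypothetical fallback.
fn vendor_for(os_release: &HashMap<String, String>) -> &'static str {
    let id = os_release.get("ID").map(String::as_str).unwrap_or("");
    let version = os_release
        .get("VERSION_ID")
        .map(String::as_str)
        .unwrap_or("");
    if id == "rhel" && (version == "7" || version.starts_with("7.")) {
        "rhel7"
    } else {
        "default"
    }
}

fn main() {
    // Falls back to a small sample when the file is not available.
    let contents = std::fs::read_to_string("/etc/os-release")
        .unwrap_or_else(|_| String::from("ID=\"rhel\"\nVERSION_ID=\"7.9\"\n"));
    let vendor = vendor_for(&parse_os_release(&contents));
    println!("VENDOR={}", vendor);
}
```

Something like this would still have to run before the binary is copied to the host, which is why it probably doesn't fit into a post-install hook.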

It might be better to have a flag like --set target=aws-openshift --set target=azure-openshift --set target=ibm-openshift?
This would also leave it open to implement the other providers in the compatibility matrix, as well as providing an entry point for other provider-specific items down the line, e.g. provisioning storage.
https://github.com/IBM/core-dump-handler/#public-cloud-kubernetes-service-compatibility.

Or you can just add an addSccToUser flag and document the directory options in the compatibility matrix.
This is how the likes of sysdig do it today.
https://charts.sysdig.com/charts/sysdig/

My preference would be for the --set target or addSccToUser, as the user needs to specify some flags anyway in order to provide the storage configuration. I'm not against enableOpenShift, but it's probably not the easiest to support.

There are likely other options that merge these so feel free to suggest some ideas :)

@tjungblu
Contributor Author

tjungblu commented Jan 11, 2022

great point, after all your explanation it seems that enableOpenShift is certainly too broad.

> It might be better to have a flag like --set target=aws-openshift --set target=azure-openshift --set target=ibm-openshift?

I'm wondering how many platforms we really need here, it seems that ROKS is different because it uses RHEL instead of RHCOS.

I can see where you want to go with the different providers, especially in relation to the "provider-local" storage options. Another solution would be to have different values files for the respective environments you want to support (which is just a textual representation of your --set directives).

> Or you can just add an addSccToUser flag and document the directory options in the compatibility matrix.

that certainly sounds like a better and more composable solution, I just have to figure out how to patch the scc from Helm :)
If I can't get it to work, I will send you a README update nevertheless.

Looking into the feature, I think we (OpenShift) should build an operator to wrap this to have proper support on OpenShift across all envs - that also solves the issue in (3) as we can easily detect the environment and operating systems.

@No9
Collaborator

No9 commented Jan 11, 2022

> I'm wondering how many platforms we really need here, it seems that ROKS is different because it uses RHEL instead of RHCOS.

The other xKS services such as GCP/AWS seem to offer "own brand linux" by default and an Ubuntu option for their nodes, so I think this will have wider utility.

> Another solution would be to have different values files for the respective environments you want to support (which is just a textual representation of your --set directives).

I really like the idea of a different values file! Let's go with that along with the addSccToUser flag.

Agree with the operator - There is a helm wrapper for this project here https://github.com/IBM/core-dump-operator but it's really just a stub at the moment.

If OpenShift folks are going to pick up an operator it would be great to understand what the plan is so I can either shut that repo down or grant access - whatever makes sense.

@tjungblu
Contributor Author

I couldn't get the job to work that would patch the existing SCC, which makes sense as this would be an easy privilege escalation path. I could make it work by creating a new SCC - so please have a look at #46 :)

> I really like the idea of a different values file! Let's go with that along with the addSccToUser flag

awesome! then let's add a couple more, let me know what you think about the naming in the PR - I just bluntly named it openshift again.

> Agree with the operator - There is a helm wrapper for this project here https://github.com/IBM/core-dump-operator but it's really just a stub at the moment. If OpenShift folks are going to pick up an operator it would be great to understand what the plan is so I can either shut that repo down or grant access - whatever makes sense.

Nice! The reason I came here is that we have a lighthouse customer that wants this functionality - I'm meeting them on Thursday and we'll decide on the operator aspects based on that. Generally we would work upstream in your operator project if it already exists, so we can discuss that when we get there.

Thanks for your help so far, much appreciated. 🚀

@No9
Collaborator

No9 commented Jan 11, 2022

ok - away from keyboard for the rest of the day but will look at the PR in the morning so you will have an update for Thursday.
Delighted this is a customer use case.
Sounds like we can work something out on the operator if required.

@No9
Collaborator

No9 commented Jan 21, 2022

OK - I've merged the work associated with this issue and we have created separate issues for the follow-on work, so I am closing this.
Please track the release project for updates.
https://github.com/IBM/core-dump-handler/projects/1

@No9 No9 closed this as completed Jan 21, 2022