WIP: Bug 2062899 #5723

Closed
wants to merge 212 commits into from

Conversation

harche
Contributor

@harche harche commented Mar 15, 2022

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


fidencio and others added 30 commits April 19, 2021 12:34
There's no reason for us to keep maintaining our own copy of typeurl,
let's directly rely on the `github.com/containerd/typeurl` instead.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
As this function is only used by runtimeVM::ContainerStats(), let's move
it to the runtime_vm.go file, making our life easier when doing upcoming
changes on runtimeVM::ContainerStats().

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
CRI-O has been using `containers/libpod/pkg/cgroups` to get metrics
and, later on, convert them to CRI Stats. Although this approach is
fine (and desired) for the OCI runtime type, we can't rely on it for
the VM runtime type, as the data sent by Kata Containers comes from
`containerd/cgroup`.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
WorkingSetBytes is the bit that needs to be set in order to provide
kubelet the pod's memory information.

Although it's calculated in a slightly different way for "oci" runtime
type, the logic is quite similar for the "vm" runtime type, with the
only difference being where the TotalInactiveFile information comes
from.

This is the last bit needed in order to have `kubectl top pod $pod`
working, as shown below:
```
[fidencio@localhost cri-o]$ kubectl get pods
NAME             READY   STATUS    RESTARTS   AGE
example-fedora   1/1     Running   0          130m

[fidencio@localhost cri-o]$ kubectl get pod example-fedora -o yaml | grep runtimeClassName
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"app":"example-fedora-app"},"name":"example-fedora","namespace":"default"},"spec":{"containers":[{"args":["-m","http.server","8080"],"command":["python3"],"image":"fedora:33","name":"example-fedora","ports":[{"containerPort":8080}]}],"runtimeClassName":"kata"}}
        f:runtimeClassName: {}
  runtimeClassName: kata

[fidencio@localhost cri-o]$ kubectl top pod
NAME             CPU(cores)   MEMORY(bytes)
example-fedora   1m           9Mi
```

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
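The working-set calculation described in the commit above can be sketched as follows. This is a minimal illustration, not the CRI-O code: `workingSetBytes` and its parameter names are hypothetical, and clamping at zero is an assumption borrowed from the common convention of subtracting the inactive file cache from total memory usage.

```go
package main

import "fmt"

// workingSetBytes is a hypothetical helper: it subtracts the inactive file
// cache from total memory usage and clamps the result at zero (assumption).
// For the "vm" runtime type, only the source of totalInactiveFileBytes would
// differ from the "oci" case, as the commit message describes.
func workingSetBytes(usageBytes, totalInactiveFileBytes uint64) uint64 {
	if totalInactiveFileBytes >= usageBytes {
		return 0
	}
	return usageBytes - totalInactiveFileBytes
}

func main() {
	// 12 MiB in use, 3 MiB of which is inactive file cache.
	fmt.Println(workingSetBytes(12<<20, 3<<20)) // prints 9437184 (9 MiB)
	// Inactive file cache larger than usage: clamp instead of underflowing.
	fmt.Println(workingSetBytes(1<<20, 2<<20)) // prints 0
}
```

The clamp matters because the values are unsigned: without it, a stale inactive-file reading larger than usage would wrap around to a huge number.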
Fix Bug 1942608: during the call to ListImages, do not list an image if its manifest does not exist.
Skip reporting the error locating the manifest if the image is in the ImageBeingPulled list.

Signed-off-by: Qi Wang <qiwan@redhat.com>
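A rough sketch of the filtering behavior described in the commit above, under stated assumptions: `listImages`, `errManifestNotFound`, the `lookup` and `warn` callbacks, and the `beingPulled` map are all hypothetical stand-ins, not CRI-O's actual types or API. Returning other errors to the caller is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errManifestNotFound is a hypothetical sentinel for "error locating manifest".
var errManifestNotFound = errors.New("manifest not found")

// listImages returns the names whose lookup succeeds. Images whose manifest
// cannot be located are left out of the result; the error is reported through
// warn only when the image is not currently being pulled.
func listImages(names []string, lookup func(string) error,
	beingPulled map[string]bool, warn func(string)) ([]string, error) {
	var listed []string
	for _, name := range names {
		err := lookup(name)
		switch {
		case err == nil:
			listed = append(listed, name)
		case errors.Is(err, errManifestNotFound):
			if !beingPulled[name] {
				warn(fmt.Sprintf("skipping %s: %v", name, err))
			}
		default:
			// Assumption: any other error still fails the whole call.
			return nil, err
		}
	}
	return listed, nil
}

func main() {
	lookup := func(name string) error {
		if name == "fedora:33" {
			return nil
		}
		return errManifestNotFound
	}
	pulling := map[string]bool{"busybox:latest": true}
	warn := func(msg string) { fmt.Println("warning:", msg) }

	listed, err := listImages(
		[]string{"fedora:33", "busybox:latest", "broken:tag"}, lookup, pulling, warn)
	fmt.Println(listed, err)
}
```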
…pick-4470-to-release-1.21

[release-1.21] Properly implement metrics for Kata Containers when using CRI stats.
…pick-4726-to-release-1.21

[release-1.21] Bug 1942608: do not list the image with error locating manifest
Signed-off-by: Peter Hunt <pehunt@redhat.com>
If we fail to initially attach the namespaces, we need to clean up the ones we've already created.
Also, we are currently unconditionally recreating the namespaces when restoring. I believe that's from a rebase issue.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
[1.21] bump to containers/image 5.11.1
…pick-4758-to-release-1.21

[release-1.21] nsmgr: fix some leaks with GetNamespace
Signed-off-by: Peter Hunt <pehunt@redhat.com>
…pick-4796-to-release-1.21

[release-1.21] container server: fix silly typo
This is a partial revert of 975ffc6,
where we started using the internal log when the context is available.

It turns out that we missed a few pieces during the review, where we end
up using the internal log without having the interceptors set up.  This
was raised by @haircommander during the review, we checked a few parts
of the patch, but we still missed a few others.

We can only use the internal logging after
https://github.com/cri-o/cri-o/blob/d3dbaec060e33870e5cb5c3f7ec4207837804b00/cmd/crio/main.go#L222
happens.  From this moment, we start having different ways of logging in
the very same file and consistency becomes a problem then.

With the consistency in mind, I'd like to **only** use logrus for
debugging as part of this file.

Note: This was noticed when debugging
cri-o#4798

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
as well as mark the feature as experimental

Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
…pick-4787-to-release-1.21

[release-1.21] workloads: update how overrides are specified
We now partially revert cri-o#4650 to
still use the runtime internal context for its tasks.

Fixes: cri-o#4798

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
…pick-4803-to-release-1.21

[release-1.21] Use extra context for runtime VM
…pick-4804-to-release-1.21

[release-1.21] main: still rely on logrus (rather than using the internal log)
This allows us to predictably shut down the node and help ensure
clean shutdown.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
The internal seccomp profile (`RuntimeDefault`) should be ignored in the
same way as it was before using the new field. This aligns the
implementation with CRI-O releases before v1.21.0.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
…pick-4789-to-release-1.21

[release-1.21] Fix RuntimeDefault seccomp behavior if disabled
…pick-4766-to-release-1.21

[release-1.21] Add After=crio.service dependency to containers and conmon
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
…pick-4828-to-release-1.21

[release-1.21] Pin gocapability to v0.0.0-20180916011248-d98352740cb2
haircommander and others added 12 commits January 21, 2022 09:45
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
[1.21] prepend commands with taskset if InfraCtrCPUSet is configured
…pick-5503-to-release-1.21

[release-1.21] oci: always reap conmon zombies
previously, CRI-O assumed that the kubelet wouldn't bump the attempt number
if it returned an error after a series of retries.

in reality, the kubelet *is* doing that, which means that when a node is under heavy load,
this workaround makes the issue worse, not better.

Mitigate this by waiting until the timeout has elapsed before reporting that the resource is ready.
This causes the kubelet to keep believing that the request timed out, so it retries with the same attempt number.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
…pick-5602-to-release-1.21

[release-1.21]server: fix race with kubelet
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
[1.21] server: filter sysctls passed through CRI if host namespace
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Harshal Patil <harpatil@redhat.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 15, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

@harche: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. dco-signoff: no Indicates the PR's author has not DCO signed all their commits. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 15, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

@harche: PR needs rebase.

@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.

📝 Please follow instructions in the contributing guide to update your commits with the DCO

Full details of the Developer Certificate of Origin can be found at developercertificate.org.

The list of commits missing DCO signoff:

@harche
Contributor Author

harche commented Mar 15, 2022

/close

@openshift-ci openshift-ci bot closed this Mar 15, 2022
@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

@harche: Closed this PR.

In response to this:

/close

@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: harche
To complete the pull request process, please assign haircommander after the PR has been reviewed.
You can assign the PR to them by writing /assign @haircommander in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

@harche: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/kata-jenkins | d7c79aa | link | true | /test kata-containers |

@openshift-ci
Contributor

openshift-ci bot commented Mar 15, 2022

@harche: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/openshift-jenkins/critest_rhel | d7c79aa | link | true | /test critest_rhel |
| ci/openshift-jenkins/integration_rhel | d7c79aa | link | true | /test integration_rhel |
| ci/openshift-jenkins/e2e_features_rhel | d7c79aa | link | true | /test e2e_features_rhel |
| ci/openshift-jenkins/e2e_rhel | d7c79aa | link | true | /test e2e_rhel |
| ci/openshift-jenkins/e2e_fedora | d7c79aa | link | true | /test e2e_fedora |
| ci/openshift-jenkins/integration_crun_cgroupv2 | d7c79aa | link | false | /test integration_cgroupv2 |
| ci/openshift-jenkins/e2e_features_fedora | d7c79aa | link | true | /test e2e_features_fedora |
| ci/openshift-jenkins/critest_fedora | d7c79aa | link | true | /test critest_fedora |
| ci/openshift-jenkins/integration_crun | d7c79aa | link | true | /test integration_crun |
| ci/openshift-jenkins/integration_fedora | d7c79aa | link | true | /test integration_fedora |
| ci/openshift-jenkins/e2e_crun | d7c79aa | link | true | /test e2e_crun |
| ci/openshift-jenkins/e2e_crun_cgroupv2 | d7c79aa | link | false | /test e2e_cgroupv2 |
| ci/kata-jenkins | d7c79aa | link | true | /test kata-containers |

Full PR test history. Your PR dashboard.

Labels
dco-signoff: no Indicates the PR's author has not DCO signed all their commits. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.