[WIP] Rebase 1.30 #1942

Draft: wants to merge 2,168 commits into base: master

soltysh
Member

@soltysh soltysh commented Apr 9, 2024

Slightly updated version of #1920

k8s-ci-robot and others added 30 commits March 8, 2024 06:35
…test

Integration test for change in syncOrphanPod for managedBy jobs
Follow up fix to the job status update test
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
Require email_verified to be used when email is set as username via CEL
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
[Storage Version Migration] feat: implements Storage Version Migration
…b-unit

Job: Use the fake clock in TestTrackJobStatusAndRemoveFinalizers
The map is changed to an array so as to retain the order of the original array
propagated from the CRI runtime.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This commit modifies the following files:

- pkg/apis/core/types.go
- staging/src/k8s.io/api/core/v1/types.go

Other changes were auto-generated by running `make update`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
See <https://kep.k8s.io/3857>.

An example manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rro
spec:
  volumes:
    - name: mnt
      hostPath:
        # tmpfs is mounted on /mnt/tmpfs
        path: /mnt
  containers:
    - name: busybox
      image: busybox
      args: ["sleep", "infinity"]
      volumeMounts:
        # /mnt-rro/tmpfs is not writable
        - name: mnt
          mountPath: /mnt-rro
          readOnly: true
          mountPropagation: None
          recursiveReadOnly: IfPossible
        # /mnt-ro/tmpfs is writable
        - name: mnt
          mountPath: /mnt-ro
          readOnly: true
        # /mnt-rw/tmpfs is writable
        - name: mnt
          mountPath: /mnt-rw
```

Requirements:
- Feature gate "RecursiveReadOnlyMounts" to be enabled
- Linux kernel >= 5.12
- runc >= 1.1

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Usage:
```
make test-e2e-node \
  TEST_ARGS='--service-feature-gates=RecursiveReadOnlyMounts=true --kubelet-flags="--feature-gates=RecursiveReadOnlyMounts=true"' \
  FOCUS="Mount recursive read-only" SKIP=""
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Kevin Klues <kklues@nvidia.com>
KEP-3857: Recursive Read-only (RRO) mounts
Signed-off-by: Monis Khan <mok@microsoft.com>
Add dynamic reload support for authentication configuration
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
…fig_reload_metrics

Add metrics for authentication config reload
Mark StructuredAuthenticationConfiguration feature gate as beta
We stop releasing NPD tar files to gs://kubernetes-release. This PR
changes it to pull from github release notes by default. It still
supports overriding the defaults and pulling from a GCS bucket,
which is used by NPD CI tests.
stlaz and others added 25 commits April 25, 2024 20:47
We need this in order to be able to retrieve better reports from
PodSecurityViolation alerts.

UPSTREAM: <carry>: PSa metrics: unset ocp_namespace on non-platform namespaces
…aged is enabled

Previously, cpu load balancing was enabled in cri-o by manually changing the sched_domain of cpus in sysfs.
However, RHEL 9 dropped support for this knob, instead requiring it be changed in cgroups directly.

To disable cpu load balancing on cgroupv1, the specified cgroup must have cpuset.sched_load_balance set to 0, as must
all of that cgroup's parents and all of the cgroups that contain a subset of the cpus for which load balancing is disabled.

By default, all cpusets inherit the set from their parent and have sched_load_balance set to 1. Since we need to keep the cpus that need
load balancing disabled in the root cgroup, all slices will inherit the full cpuset.

Rather than rebalancing every cgroup whenever a new guaranteed cpuset cgroup is created, the approach this PR takes is to
disable load balancing for all slices. Since slices by definition don't have any processes in them, changing their load balancing setting won't
affect the actual scheduling decisions of the kernel. All it does is give CRI-O the opportunity to actually disable load balancing
for containers that request it.
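
A minimal Python sketch of the per-slice step described above, assuming a cgroup v1 cpuset hierarchy. The helper name `disable_load_balancing` and the `kubepods.slice` directory used in the demo are illustrative assumptions; the actual change lives in the kubelet's Go code and is not reproduced here.

```python
import os
import tempfile
from pathlib import Path

LOAD_BALANCE_FILE = "cpuset.sched_load_balance"  # cgroup v1 cpuset knob named above


def disable_load_balancing(slice_dir: str) -> None:
    """Write 0 to cpuset.sched_load_balance for one slice directory."""
    path = Path(slice_dir)
    path.mkdir(parents=True, exist_ok=True)  # tolerate an already existing slice
    (path / LOAD_BALANCE_FILE).write_text("0")


if __name__ == "__main__":
    # A temporary directory stands in for the cpuset hierarchy (assumption).
    with tempfile.TemporaryDirectory() as root:
        slice_dir = os.path.join(root, "kubepods.slice")
        disable_load_balancing(slice_dir)
        print(Path(slice_dir, LOAD_BALANCE_FILE).read_text())
```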

Signed-off-by: Peter Hunt <pehunt@redhat.com>

UPSTREAM: <carry>: kubelet/cm: disable cpu load balancing on slices when using static cpu manager policy

There are situations where cpu load balance disabling is desired when the kubelet is not in managed state.
Instead of using that condition, set the cpu load balancing parameter for new slices when the cpu policy is static

Signed-off-by: Peter Hunt <pehunt@redhat.com>

UPSTREAM: <carry>: cm: reorder setting of sched_load_balance for sandbox slice

If we call mgr.Apply() first, libcontainer's cpusetCopyIfNeeded()
will copy the parent cpuset and set load balancing to 1 by default.
This causes the kernel to set the cpus to not load balanced for a brief moment
which causes churn.

Instead, create the cgroup and set load balance first, then have Apply() copy the values into it.
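
The ordering argument can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: cgroups are plain dictionaries, a fresh cgroup starts with an empty cpuset and `sched_load_balance` of 1, and `apply` is a stand-in that only copies the parent cpuset when the child has none. It is not libcontainer's behavior in detail.

```python
def make_cgroup():
    # Assumed defaults for a freshly created cpuset cgroup in this toy model.
    return {"cpus": set(), "sched_load_balance": 1}


def apply(cgroups, name, parent_cpus, events):
    """Toy stand-in for Apply(): create if missing, then copy the parent cpuset."""
    cg = cgroups.setdefault(name, make_cgroup())
    if not cg["cpus"]:
        cg["cpus"] = set(parent_cpus)
    if cg["sched_load_balance"] == 1:
        # Record the transient state that the reordering is meant to avoid.
        events.append("cpus populated while sched_load_balance=1")


def apply_then_set(parent_cpus):
    cgroups, events = {}, []
    apply(cgroups, "sandbox.slice", parent_cpus, events)
    cgroups["sandbox.slice"]["sched_load_balance"] = 0
    return events


def set_then_apply(parent_cpus):
    cgroups, events = {}, []
    cgroups["sandbox.slice"] = make_cgroup()
    cgroups["sandbox.slice"]["sched_load_balance"] = 0
    apply(cgroups, "sandbox.slice", parent_cpus, events)
    return events
```

In this model, only `apply_then_set` records the transient state; `set_then_apply` never populates the cpuset while `sched_load_balance` is still 1.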

Signed-off-by: Peter Hunt <pehunt@redhat.com>

UPSTREAM: <carry>: kubelet/cm: use MkdirAll when creating cpuset to ignore file exists error

Signed-off-by: Peter Hunt <pehunt@redhat.com>
If it is useful we will combine this with the following carry:
20caad9: UPSTREAM: 115328: annotate early and late requests

UPSTREAM: <carry>: add conditional shutdown response header
…util/managedfields

Some of the code we use in openshift-tests was recently made internal
in kubernetes#115065. This patch
exposes the code we need there.
…rnetes.default.svc, don't wait for aggregated availability
…roups

that have kinds that are served by both CRDs
and external apiservers (eg openshift-apiserver)

this includes:
- authorization.openshift.io (rolebindingrestrictions served by a CRD)
- security.openshift.io (securitycontextconstraints served by a CRD)
- quota.openshift.io (clusterresourcequotas served by a CRD)

By merging all sources, we ensure that kinds served by a CRD will have
openapi discovery and spec available even when openshift-apiserver is
unavailable.
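
As a rough illustration of why merging all sources helps, the following Python sketch unions the kinds advertised per API group. The function name `merge_group_kinds` and the kind `hypotheticalkind` are made up for the example; the real patch operates on OpenAPI and discovery documents, not on plain dictionaries.

```python
from collections import defaultdict


def merge_group_kinds(sources):
    """Union the kinds each source advertises, per API group."""
    merged = defaultdict(set)
    for source in sources:
        for group, kinds in source.items():
            merged[group].update(kinds)
    return {group: sorted(kinds) for group, kinds in merged.items()}


if __name__ == "__main__":
    crd_source = {"security.openshift.io": ["securitycontextconstraints"]}
    # "hypotheticalkind" is a placeholder for a kind served by an external apiserver.
    external_source = {"security.openshift.io": ["hypotheticalkind"]}
    print(merge_group_kinds([crd_source, external_source]))
    # With the external apiserver unavailable, the CRD-served kind is still present.
    print(merge_group_kinds([crd_source, {}]))
```
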
…self-SARs that have user:check-access

Otherwise, the request will inherit any scopes that an access token might have
and the scopeAuthorizer will deny the access review if the scopes do not include
user:full
This commit renews openshift#327

What has changed compared to the original PR is:
- The retryClient interface has been adapted to storage.Interface.
- The isRetriableEtcdError method has been completely changed; it seems that previously the error we wanted to retry was not being retried. Even the unit tests were failing.

Overall, I still think this is not the correct fix. The proper fix should be added to the etcd client.

UPSTREAM: <carry>: retry etcd Unavailable errors

This is the second commit for the retry logic.
This commit adds unit tests and slightly improves the logging.
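
The general shape of such a retry wrapper can be sketched in Python as follows. The `Unavailable` exception, the `with_retry` helper, and the attempt count and delay are assumptions for illustration; the carry itself wraps `storage.Interface` in Go and decides retriability in `isRetriableEtcdError`.

```python
import time


class Unavailable(Exception):
    """Hypothetical stand-in for an etcd 'Unavailable' error."""


def is_retriable(err: Exception) -> bool:
    # Only the Unavailable class of errors is retried in this sketch.
    return isinstance(err, Unavailable)


def with_retry(op, attempts=3, delay=0.0, sleep=time.sleep):
    """Call op(), retrying only errors that is_retriable() accepts."""
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except Exception as err:
            # Give up on the last attempt or on any non-retriable error.
            if attempt == attempts or not is_retriable(err):
                raise
            sleep(delay)
```

Keeping the predicate separate from the loop makes it easy to unit test which errors are retried, which is where the commit message says the original version went wrong.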

Squash with the previous one during a rebase.
When a PerformanceProfile configures a node for cpu partitioning,
it also lets OVS use all the cpus available to burstable pods.
To be able to do that, OVS was moved to its own slice and that
slice needs to be re-added to cAdvisor for monitoring purposes.
Signed-off-by: Harshal Patil <harpatil@redhat.com>
Kubelet should advertise the shared cpus as extended resources.
This has the benefit of limiting the number of containers
that can request access to the shared cpus.

For more information see - openshift/enhancements#1396

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
This commit needs to be carried until we rebase onto Kube 1.31.

We have backported the library changes to 1.28, which means they can then be used in 1.29.

Upstream, they were only introduced in 1.30 which means they wouldn't be usable until 1.31.

This allows us to improve our API validation from OpenShift 4.16 onwards, instead of OpenShift 4.18 onwards.

UPSTREAM: <carry>: Set up CEL IP/CIDR library from 4.14 onwards

Carry until K8s 1.31 rebase.
Adding a new mutation plugin that handles the following:

1. In case of a `workload.openshift.io/enable-shared-cpus` request, it
   adds an annotation to hint the runtime about the request. The runtime
   is not aware of extended resources, hence we need the annotation.
2. It validates the pod's QoS class and returns an error if it's not a
   guaranteed QoS class.
3. It validates that no more than a single resource is being requested.
4. It validates that the pod is deployed in a namespace that has the mixedcpus
   workloads allowed annotation.
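
The four steps above can be sketched in Python against a simplified pod dictionary. Only the resource name `workload.openshift.io/enable-shared-cpus` comes from the commit message; the two annotation keys, the pod shape, and the reading of step 3 as "a quantity of at most one per container" are assumptions made for this example. The sketch also runs the validations before adding the hint annotation, which may differ from the plugin's actual order.

```python
SHARED_CPUS = "workload.openshift.io/enable-shared-cpus"  # from the commit message
RUNTIME_HINT = "example.invalid/shared-cpus-hint"          # hypothetical annotation key
NS_MIXEDCPUS_ALLOWED = "example.invalid/mixedcpus-allowed"  # hypothetical annotation key


def admit_shared_cpus(pod: dict, namespace_annotations: dict) -> dict:
    """Validate a shared-cpus request and add the runtime hint annotation."""
    requested = [c for c in pod["containers"]
                 if SHARED_CPUS in c.get("requests", {})]
    if not requested:
        return pod  # nothing to do for pods that do not ask for shared cpus
    if pod.get("qosClass") != "Guaranteed":
        raise ValueError("shared cpus require a Guaranteed QoS pod")
    if any(int(c["requests"][SHARED_CPUS]) > 1 for c in requested):
        raise ValueError("no more than a single shared-cpus resource may be requested")
    if NS_MIXEDCPUS_ALLOWED not in namespace_annotations:
        raise ValueError("namespace does not allow mixedcpus workloads")
    pod.setdefault("annotations", {})[RUNTIME_HINT] = "true"
    return pod
```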

For more information see - openshift/enhancements#1396

Signed-off-by: Talor Itzhak <titzhak@redhat.com>

UPSTREAM: <carry>: Update management webhook pod admission logic

Updating the logic for pod admission to allow a pod with workload partitioning annotations to be created and run in a namespace that has no workload allow annotations.

The pod will be stripped of its workload annotations and treated as if it were a normal pod; a warning annotation will be placed on the pod to note the behavior.
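
A small Python sketch of the strip-and-warn behavior described above. All three annotation keys are hypothetical placeholders, since the commit message does not name them, and the function only models annotations as a dictionary.

```python
WORKLOAD_PREFIX = "example.invalid/workload."        # hypothetical key prefix
NS_WORKLOAD_ALLOWED = "example.invalid/allowed"      # hypothetical namespace key
WORKLOAD_WARNING = "example.invalid/warning"         # hypothetical warning key


def admit_management_pod(pod_annotations: dict, namespace_annotations: dict) -> dict:
    """Return the annotations the pod keeps after admission."""
    if NS_WORKLOAD_ALLOWED in namespace_annotations:
        return dict(pod_annotations)  # namespace opts in, keep everything
    kept = {k: v for k, v in pod_annotations.items()
            if not k.startswith(WORKLOAD_PREFIX)}
    if len(kept) != len(pod_annotations):
        # Something was stripped, so leave a note on the pod instead of rejecting it.
        kept[WORKLOAD_WARNING] = "workload annotations removed: namespace does not allow them"
    return kept
```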

Signed-off-by: ehila <ehila@redhat.com>

UPSTREAM: <carry>: add support for cpu limits into management workloads

Added support to allow workload partitioning to use the CPU limits for a container. To allow the runtime to make better decisions around workload cpu quotas, we pass down the cpu limit as the cpulimit value in the annotation. CRI-O will take that information and calculate the quota per node. This should support situations where workloads might have different cpu period overrides assigned.
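
For illustration, the usual CFS relationship between a CPU limit and a quota can be written as below. This is a sketch, not CRI-O's code: the function name, the millicore unit for the limit, and the 100000 microsecond default period are assumptions. It shows why the quota has to be computed on the node, where the effective period is known.

```python
def cpu_quota_us(cpu_limit_millicores: int, period_us: int = 100_000) -> int:
    """Convert a CPU limit in millicores into a CFS quota for the given period."""
    if cpu_limit_millicores < 0 or period_us <= 0:
        raise ValueError("limit must be non-negative and period positive")
    # 1000 millicores correspond to one full period of CPU time.
    return cpu_limit_millicores * period_us // 1000
```

The same limit yields different quotas under different periods, for example `cpu_quota_us(500)` versus `cpu_quota_us(500, period_us=50_000)`.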

Updated kubelet for static pods and the admission webhook for regular pods to support cpu limits.

Updated unit test to reflect changes.

Signed-off-by: ehila <ehila@redhat.com>
…ject openshift feature gates into pkg/features

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>
This is a short-term fix. Once the cert rotation logic in library-go
is improved so that it no longer depends on this hack, we can
remove this carry patch.

squash with the previous PR during the rebase
openshift#1924

squash with the previous PRs during the rebase
openshift#1924
openshift#1929
…bjectValidator

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

UPSTREAM: <carry>: Fix incorrect type casting in admission validate_apiserver

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

UPSTREAM: <carry>: react to library-go changes
Signed-off-by: Vu Dinh <vudinh@outlook.com>
Signed-off-by: Vu Dinh <vudinh@outlook.com>
Temporarily remove LockToDefault for ValidatingAdmissionPolicy feature
gate so it can be disabled.
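
The effect of removing the lock can be shown with a toy feature gate in Python. The class below is an illustrative model only, not the Kubernetes feature gate implementation, and the default value of `True` used in the usage note is an assumption.

```python
class FeatureGate:
    """Toy feature gate with an optional lock on its default value."""

    def __init__(self, name: str, default: bool, lock_to_default: bool = False):
        self.name = name
        self.default = default
        self.lock_to_default = lock_to_default
        self.enabled = default

    def set(self, value: bool) -> None:
        # A locked gate rejects any value other than its default.
        if self.lock_to_default and value != self.default:
            raise ValueError(
                f"cannot set feature gate {self.name} to {value}, "
                f"feature is locked to {self.default}")
        self.enabled = value
```

With `FeatureGate("ValidatingAdmissionPolicy", True, lock_to_default=True)`, calling `set(False)` raises; without the lock, the same call disables the gate.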

Signed-off-by: Vu Dinh <vudinh@outlook.com>
@openshift-merge-robot removed the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) Apr 26, 2024
@openshift-ci-robot

@soltysh: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@soltysh
Member Author

soltysh commented Apr 26, 2024

/test verify

@soltysh
Member Author

soltysh commented Apr 26, 2024

/test images

Labels
- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- backports/unvalidated-commits: Indicates that not all commits come to merged upstream PRs.
- do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
- vendor-update: Touching vendor dir or related files