[release-4.10] Fast-Forward from main #1233

Merged
merged 53 commits into from
Apr 6, 2022

Conversation

alvaroaleman

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, use fixes #<issue_number>(, fixes #<issue_number>, ...) format, where issue_number might be a GitHub issue, or a Jira story:
Fixes #

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

sjenning and others added 30 commits March 17, 2022 14:44
This is completely expected during deletion; logging it at error level
makes it look like an issue, which it is not.
HO: Don't report NotFound for hostedcluster as error
The KAS needs the proxy settings to communicate with the cloud provider.
However, the egress transport it uses wraps another transport that
respects proxy settings, which is why we need to exempt the pod and service
CIDRs of the guest cluster so that Konnektivity does not break.

I also tried to stop using the egress config and use the
konnektivity-socks5-proxy, but that breaks SPDY connections (exec,
port-forward).

Ref https://issues.redhat.com/browse/HOSTEDCP-333
KAS: Set proxy, but exempt pod and service CIDR
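
As a rough illustration of the exemption described above, the sketch below appends pod and service CIDRs to an existing `NO_PROXY` value. It assumes the exemption is expressed through `NO_PROXY` (Go's standard proxy handling accepts CIDR notation there); the helper name `noProxyWithCIDRs` and the CIDR values are hypothetical and not taken from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// noProxyWithCIDRs (hypothetical helper) appends the given CIDRs to an
// existing NO_PROXY value, skipping empty entries and duplicates.
func noProxyWithCIDRs(existing string, cidrs ...string) string {
	seen := map[string]bool{}
	var out []string
	for _, entry := range append(strings.Split(existing, ","), cidrs...) {
		entry = strings.TrimSpace(entry)
		if entry == "" || seen[entry] {
			continue
		}
		seen[entry] = true
		out = append(out, entry)
	}
	return strings.Join(out, ",")
}

func main() {
	// Example CIDRs only; real values come from the guest cluster's network config.
	fmt.Println(noProxyWithCIDRs("localhost,.svc", "10.132.0.0/14", "172.31.0.0/16"))
}
```

The resulting string would be set as `NO_PROXY` on the container next to `HTTP_PROXY`/`HTTPS_PROXY`, so that cloud provider traffic uses the proxy while pod and service traffic does not.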
add external-dns flags to CI install make target
sync MaxConcurrentReconciles across all controllers
HostedCluster can optionally reference a configmap, in which case we copy
the configmap to the HostedControlPlane namespace (similar to SSHKey and
other fields).
When AdditionalTrustBundle is defined we create this ConfigMap to
align with the behavior of regular OCP clusters and enable
consumption of user-defined CA certs by the guest cluster.
When AdditionalTrustBundle is specified, we serialize the configmap and
pass it to the MCO bootstrap command via the default user-ca-bundle-config.yaml
location. This means the MCO bootstrap reads the file when it is included
(the code already ignores the case where the file doesn't exist, since
openshift/installer only conditionally creates the manifest).
This can be used to reference a ConfigMap that contains a user CA
bundle.
The CPO and ignition server need the user CA so the registryclient
can access a local registry with a self-signed cert
Adds a CLI option and corresponding volume to the operator pod.
This is needed so the operator can look up release image metadata
when the specified release image is locally mirrored.

Note the mount path/filename were chosen to align with the expected
defaults ref https://go.dev/src/crypto/x509/root_linux.go (and also
current OCP docs for cert injection using operators)
In its current state, the hosted cluster config operator overwrites any
changes made by the guest cluster admin to the registry configuration.
This prevents changes like enabling a route or increasing the number of
replicas.

This commit limits what we change to the things we need to change and leaves
everything else as is.
Currently, dump just drops a lot of files. This is useful for browsing
them in the CI job output, but terrible for downloading them for local
inspection, as downloading a lot of files is extremely slow, even if the
files aren't big.

This change makes us always create an archive of the dump to not require
extending every CI job to do this manually.
The version we currently use cannot compile anything and fails with
errors like this:

could not load export data: cannot import "math/bits" (unknown iexport format version 2), export data is newer version - update tool (compile)

Note that this doesn't mean staticcheck supports generics; it just means
it can be compiled with go 1.18.
Signed-off-by: David Vossel <davidvossel@gmail.com>
Registry configuration: reconcile only what we need to change
…s-v1

Unique OpenShift vxlan port for KubeVirt Platform
Update staticcheck to a version that works with go 1.18
This upgrades mkdocs/material to fix Netlify docs compilation breakages
resulting from mkdocs/mkdocs#2799.
These components watch both the management cluster (Machine scalable resources) and the guest cluster.
Originally we pinned the images to a single version that was used across all HostedClusters.
This PR lets us pick them from each particular payload, which has some benefits:

  • Each HostedCluster runs the component version that was tested with that particular kube/ocp version.
  • No additional work is needed to productise the images, as they come from the payload.

Since CAPI CRDs should be backward compatible, having different controller versions shouldn't cause an issue.
Once the CAPI image is in the payload we can do the same for it.
docs: Upgrade mkdocs/material to fix Netlify breakages
davidvossel and others added 20 commits March 28, 2022 16:59
Signed-off-by: David Vossel <davidvossel@gmail.com>
…rom-payload

read apiserver-network-proxy image from ocp payload
The single-hyphen flags do not work anymore due to
operator-framework/operator-lifecycle-manager#2362
Signed-off-by: David Vossel <davidvossel@gmail.com>
Before this commit, EIP tagging failures resulting from the EIP not
being found after the EIP was successfully created led to infra creation
failing overall because the tagging operation was not retried.

This commit adds retry logic to EIP tagging to account for the case when
EIP creation succeeds but tagging fails because the AWS tagging API doesn't
yet see the new EIP.
Retry EIP tagging failures during infra creation
…y-v1

AntiAffinity rules to spread KubeVirt VMs across mgmt nodes
Signed-off-by: David Vossel <davidvossel@gmail.com>
…ss-v1

Document KubeVirt Platform Ingress Setup
The olm cronjob had a priorityClass of openshift-user-critical, which has a
priority that is above all other controlplane components in the
management cluster. Downgrade it to the standard
hypershift-control-plane and add an e2e test that verifies that no pod
has a priority higher than the etcd priority.
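
The e2e check described above boils down to comparing each pod's resolved priority against the etcd priority. A minimal sketch, using a hypothetical `pod` struct in place of the Kubernetes API types and made-up priority numbers:

```go
package main

import "fmt"

// pod is a minimal stand-in for the fields of a Kubernetes pod that matter here.
type pod struct {
	Name     string
	Priority int32
}

// podsAbovePriority returns the names of pods whose priority exceeds limit.
func podsAbovePriority(pods []pod, limit int32) []string {
	var offenders []string
	for _, p := range pods {
		if p.Priority > limit {
			offenders = append(offenders, p.Name)
		}
	}
	return offenders
}

func main() {
	// Illustrative numbers only; real values come from the PriorityClass objects.
	const etcdPriority = 300
	pods := []pod{
		{Name: "etcd-0", Priority: 300},
		{Name: "kube-apiserver", Priority: 200},
		{Name: "olm-cronjob", Priority: 500},
	}
	fmt.Println(podsAbovePriority(pods, etcdPriority))
}
```

A test built this way would fail whenever the returned list is non-empty and name the offending pods in its failure message.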
Get autoscaler/machine-approver images from the payload
Before this commit, calls to `WaitForConditionsOnHostedControlPlane()` could
fail a test if an API lookup fails even though that lookup is recoverable and
retried automatically. This made the test flaky.

This commit fixes the code so that these retriable errors are logged but do
not fail the test.

This commit also moves a log message which was intended to emit during retries
but was instead placed at the exit point.
Before this commit, UWM was enabled by the e2e `setup` command, which
was used in the past but is no longer used. The UWM stack is thus wasting
resources on management clusters used for e2e runs.

This commit removes UWM from the monitoring setup for e2e tests.
Hypershift operator: Give a priority that is higher than any controlplane component
e2e: Don't fail test on transient recoverable API lookup
Fix priority class for olm cronjob and verify priorityclasses in e2e
e2e: Don't enable user workload monitoring on management clusters
@openshift-ci

openshift-ci bot commented Apr 5, 2022

@alvaroaleman: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

[release-4.10] Fast-Forward from main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from enxebre and sjenning April 5, 2022 13:33
@openshift-ci

openshift-ci bot commented Apr 5, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 5, 2022
@openshift-ci

openshift-ci bot commented Apr 5, 2022

@alvaroaleman: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sjenning

sjenning commented Apr 6, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 6, 2022
@openshift-merge-robot openshift-merge-robot merged commit c6ce37a into openshift:release-4.10 Apr 6, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
8 participants