Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 2077975: Merge upstream tag 'v3.4.19' into 4.8 #136

Closed
wants to merge 143 commits into from

Conversation

Elbehery
Copy link

This PR attempts to merge upstream v3.14.19 tag into openshift-4.8.

This PR is a result of the following steps

dbavatar and others added 30 commits September 16, 2019 11:49
In case of URLs that are synonyms, the current lexicographic sorting
and compare of the URLs fails with frustrating errors. Make sure to do
a full comparison between every set of PeerURLs before failing.

Fixes etcd-io#11013
Use golang.org/x/sys/unix for F_OFD_* constants.

This fixes the issue that F_OFD_GETLK was defined incorrectly,
resulting in bugs such as moby/moby#31182

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
[3.4 backport] pkg/fileutil: fix F_OFD_ constants
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
This fixes etcd being unable to send any message longer than 64 KB as
a notification over the websocket. This was because the older version
of grpc-websocket-proxy was used and WithMaxRespBodyBufferSize option
wasn't set.
etcdserver: Fix 64 KB websocket notification message limit
[Backport-3.4] etcdserver/api/etcdhttp: log successful etcd server side health check in debug level
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
There are situations where we don't wish to fsync but we do want to
write the data.

Typically this occurs in clusters where fsync latency (often the
result of firmware) transiently spikes.  For Kubernetes clusters this
causes (many) elections which have knock-on effects such that the API
server will transiently fail causing other components fail in turn.

By writing the data (buffered and asynchronously flushed, so in most
situations the write is fast) and avoiding the fsync we no longer
trigger this situation and opportunistically write out the data.

Anecdotally:
  Because the fsync is missing there is the argument that certain
  types of failure events will cause data corruption or loss, in
  testing this wasn't seen.  If this was to occur the expectation is
  the member can be readded to a cluster or worst-case restored from a
  robust persisted snapshot.

  The etcd members are deployed across isolated racks with different
  power feeds.  An instantaneous failure of all of them simultaneously
  is unlikely.

  Testing was usually of the form:
   * create (Kubernetes) etcd write-churn by creating replicasets of
     some 1000s of pods
   * break/fail the leader

  Failure testing included:
   * hard node power-off events
   * disk removal
   * orderly reboots/shutdown

  In all cases when the node recovered it was able to rejoin the
  cluster and synchronize.
When using --unsafe-no-fsync still write out the data
The integration jobs fail with timeouts slightly over 3s, increase
this marginally so false failures are less prevalent.
integration: relax leader timeout from 3s to 4s
etcdserver: fix incorrect metrics generated when clients cancel watches
As go 1.12.2 is what is tested in CI as well as recommended to be built
with 1.12.2 we should also pin to this in the go directive version.
[release-3.4]: Pin go version in go.mod to 1.12
Currently in CI the tests are only run with go v1.12, this adds also go
v1.15.11.

Excludes certain variants for v1.15.
This patch is needed due to go 1.15 erroring on:

"Setctty set but Ctty not valid in child".
@Elbehery
Copy link
Author

@Elbehery did you vendor it properly? I can see a build failure on a missing variable in:

go.etcd.io/etcd/embed
embed/serve.go:294:5: undefined: wsproxy.WithMaxRespBodyBufferSize

import is

"github.com/tmc/grpc-websocket-proxy/wsproxy"

Thanks a lot for reviewing this

So this file was changed by a cherry-pick of a previous downstream commit

I have merged the upstream tag into a branch checked out from 4.8. Resolving conflicts by taking upstream changes.

Then cherry picked all downstream commit, resolving conflict by taking downstream changes.

Then ran go mod tidy && go mod vendor

@Elbehery
Copy link
Author

Elbehery commented Jul 12, 2022

@tjungblu I just checked the file in my repo, so the import itself exists, but the function used from the lib does no longer exist :/ .. I think it is due to versions

see https://pkg.go.dev/github.com/matgabriel/grpc-websocket-proxy/wsproxy#WithMaxRespBodyBufferSize

@tjungblu
Copy link

tjungblu commented Jul 12, 2022

Resolving conflicts by taking upstream changes.

May I suggest that we do rebases the same way the workloads team with o/k and k/k does? as in checking in the conflicts and their resolution as a separate commit - I can not tell whether we're missing an important carry patch from just the diff here.

basically the script that Josef wrote does this, I think we should also use this here:
https://github.com/openshift/kubernetes/blob/master/openshift-hack/rebase.sh#L104-L124

@tjungblu I just checked the file in my repo, so the import itself exists, but the function used from the lib does no longer exist :/ .. I think it is due to versions

can that merge be correct then? I assume that the upstream release builds fine

@Elbehery
Copy link
Author

https://github.com/openshift/kubernetes/blob/master/openshift-hack/rebase.sh#L104-L124

I am gonna check the import upstream

But I dont understand, how to check-in conflicts ? :/

@tjungblu
Copy link

But I dont understand, how to check-in conflicts ? :/

git will add merge markers like

<<<<<<<<< HEAD

GITREF

just commit those as is, then resolve those conflicts in a separate commit. That makes reviewing this much easier.

@Elbehery
Copy link
Author

But I dont understand, how to check-in conflicts ? :/

git will add merge markers like

<<<<<<<<< HEAD

GITREF

just commit those as is, then resolve those conflicts in a separate commit. That makes reviewing this much easier.

Shall I redo everything again :O

@Elbehery
Copy link
Author

@tjungblu I fixed the problem with vendor, used the same commit hash used upstream 3.4 branch for lib github.com/tmc/grpc-websocket-proxy

@Elbehery
Copy link
Author

running make locally, got

./bin/etcd --version
etcd Version: 3.4.19
Git SHA: 4466f3563
Go Version: go1.16.15
Go OS/Arch: darwin/amd64
./bin/etcdctl version
etcdctl version: 3.4.19
API version: 3.4

@tjungblu
Copy link

you need to point it to the vendor folder, as the build does:

STEP 3/5: RUN ["/bin/bash","-c","set -o errexit; umask 0002; make build --warn-undefined-variables"]
GO111MODULE=on GO_BUILD_FLAGS="-v -mod vendor" ./build
build go.etcd.io/etcd: cannot load crypto/ed25[51](https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-> logs/pull/openshift_etcd/136/pull-ci-openshift-etcd-openshift-4.8-unit/1546895039962025984#1:build-log.txt%3A51)9: open /go/src/go.etcd.io/etcd/vendor/crypto/ed25519: no such file or directory

@Elbehery Elbehery changed the title Merge upstream tag 'v3.4.19' into 4.8 Bug 2077975: Merge upstream tag 'v3.4.19' into 4.8 Jul 12, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jul 12, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jul 12, 2022

@Elbehery: This pull request references Bugzilla bug 2077975, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.z) matches configured target release for branch (4.8.z)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2077501 is in the state CLOSED (ERRATA), which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2077501 targets the "4.9.z" release, which is one of the valid target releases: 4.9.0, 4.9.z
  • bug has dependents

Requesting review from QA contact:
/cc @geliu2016

In response to this:

Bug 2077975: Merge upstream tag 'v3.4.19' into 4.8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from geliu2016 July 12, 2022 16:45
Copy link

@geliu2016 geliu2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jul 13, 2022
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 13, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jul 13, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Elbehery, geliu2016
To complete the pull request process, please ask for approval from deads2k after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Elbehery
Copy link
Author

/retest-required

@openshift-ci
Copy link

openshift-ci bot commented Jul 13, 2022

@Elbehery: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws 4466f35 link true /test e2e-aws
ci/prow/unit 4466f35 link true /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@tjungblu
Copy link

tjungblu commented Oct 4, 2022

/close

let's continue this with #150

@openshift-ci openshift-ci bot closed this Oct 4, 2022
@openshift-ci
Copy link

openshift-ci bot commented Oct 4, 2022

@tjungblu: Closed this PR.

In response to this:

/close

let's continue this with #150

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link

openshift-ci bot commented Oct 4, 2022

@Elbehery: This pull request references Bugzilla bug 2077975. The bug has been updated to no longer refer to the pull request using the external bug tracker.

In response to this:

Bug 2077975: Merge upstream tag 'v3.4.19' into 4.8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Elbehery Elbehery deleted the 3.4.19->4.8 branch May 4, 2023 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.