Don't exclude .git from rsync #117827

Merged on Oct 24, 2023 (1 commit)

Conversation

BenTheElder
Member

@BenTheElder BenTheElder commented May 5, 2023

What type of PR is this?

What this PR does / why we need it:

See #117821

This would be something of a performance regression, but first I just want to see what breaks. Letting CI identify that with this PR.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 5, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 5, 2023
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 5, 2023
@BenTheElder
Member Author

Actually, without having measured it, the performance doesn't feel bad compared to build/run.sh before this change (which was already a bit slow).

Just syncing in .git fixes #117821 but I'm still not recalling what it was that depended on excluding it.

@BenTheElder
Member Author

locally this broke a few things with make verify, but they look pretty solvable (e.g. kubernetes/publishing-bot#345, we don't have pyyaml in the container but should probably consider not requiring it anyhow)

@@ -674,7 +674,6 @@ function kube::build::sync_to_container() {
   # necessary.
   kube::build::rsync \
     --delete \
-    --filter='H /.git' \
     --filter='- /_tmp/' \
Member Author

copy_output filters /_temp/ but not /_tmp/ ...

Member

We have a lot of things encoded in .gitignore - I wonder if we can figure out how to respect that instead of encoding many of the same things again?

Member Author

I'd like to focus on eliminating the rsync dance entirely.

... but I also don't think this will work very well unless rsync is taught to read .gitignore files directly; it seems pretty expensive to read .gitignore files recursively compared to the handful of exclusions here?
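
For readers unfamiliar with the filter rules in the hunk above, here is a minimal sketch of what each rule type does. `sync_filters` is a hypothetical helper, not the real `kube::build::rsync`; it only prints filter arguments so the two modes can be compared. The last rule uses rsync's per-directory merge syntax, which can read a file such as .gitignore in each directory. rsync and git pattern syntax are not identical, so that rule only approximates git's behaviour, and it says nothing about the cost concern raised above.

```shell
# Hypothetical helper (not the real kube::build::rsync): print one rsync
# filter argument per line for a given sync mode.
sync_filters() {
  # $1: "hide-git" for the pre-patch behaviour, anything else for post-patch.
  if [ "$1" = "hide-git" ]; then
    # 'H' hides /.git on the sending side only (equivalent to '-s /.git').
    printf '%s\n' "--filter=H /.git"
  fi
  # '-' excludes /_tmp/ on both sides: it is not sent, and an existing copy
  # on the receiver is protected from --delete.
  printf '%s\n' "--filter=- /_tmp/"
  # ':-' is a per-directory merge rule: rsync reads each directory's
  # .gitignore as a list of exclude patterns (approximate, see note above).
  printf '%s\n' "--filter=:- .gitignore"
}

sync_filters hide-git
echo "---"
sync_filters keep-git
```

The second listing simply lacks the `H /.git` line, which is the whole of this PR's change to the sync step.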

@BenTheElder
Member Author

hack/make-rules/../../hack/verify-golangci-lint.sh
hack/make-rules/../../hack/verify-openapi-spec.sh
hack/make-rules/../../hack/verify-shellcheck.sh
hack/make-rules/../../hack/verify-publishing-bot.py

publishing-bot is the pyyaml thing

shellcheck is trying to invoke docker since the shellcheck binary isn't available. I wrote a new version of this for registry.k8s.io that doesn't need docker, which we can backport here.

openapi-spec failed on:

+++ [0505 22:17:30] Placing binaries
+++ [0505 22:17:32] Setting GOMAXPROCS: 48

You can use 'hack/install-etcd.sh' to install a copy in third_party/.
etcd must be in your PATH

+++ [0505 22:17:32] Clean up complete
+++ exit code: 1
+++ error: 1
FAILED verify-openapi-spec.sh 33s

which seems like something we should just do automatically instead of asking the user to do it ...

golangci-lint failed on:

level=error msg="Unable to load custom analyzer logcheck:../_output/local/bin/logcheck.so,
plugin.Open("/go/src/k8s.io/kubernetes/_output/local/bin/logcheck.so"): realpath failed"
level=error msg="Running error: unknown linters: 'logcheck', run 'golangci-lint help linters' to see the list of supported linters"
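
On the etcd failure above, "do it automatically" could look roughly like the sketch below. `ensure_tool` is a hypothetical helper, and the `third_party/etcd` directory in the example call is an assumption about where `hack/install-etcd.sh` places the binary; neither is taken from the verify scripts themselves.

```shell
# Hypothetical sketch: run an installer only when a tool is missing from PATH.
# Usage: ensure_tool NAME INSTALL_CMD [EXTRA_PATH_DIR]
ensure_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0                       # already available, nothing to do
  fi
  echo "$1 not found in PATH, running: $2" >&2
  sh -c "$2" || return 1           # installer failed
  # Optionally prepend the directory the installer placed the binary in.
  if [ -n "${3:-}" ]; then
    PATH="$3:$PATH"
    export PATH
  fi
  command -v "$1" >/dev/null 2>&1  # succeed only if the tool now resolves
}

# Example call (the install directory is an assumption, adjust as needed):
# ensure_tool etcd hack/install-etcd.sh "$PWD/third_party/etcd"
```

The final `command -v` check means a silently broken installer still surfaces as a failure instead of a confusing error later in the verify run.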

@BenTheElder
Member Author

filed kubernetes/publishing-bot#345 for publishing-bot verify

@BenTheElder
Member Author

#117831 for golangci-lint

Member

@saschagrunert saschagrunert left a comment

Is there any other way to discover the root dir without relying on .git / GIT_DISCOVERY_ACROSS_FILESYSTEM?

@BenTheElder
Member Author

Is there any other way to discover the root dir without relying on .git / GIT_DISCOVERY_ACROSS_FILESYSTEM?

I don't know why that's relevant? But yes there's other ways to find the root dir.

See #117821, the issue is not GIT_DISCOVERY_ACROSS_FILESYSTEM
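
As an illustration of "other ways to find the root dir", two common approaches that never consult .git are sketched below. Both function names are hypothetical, and the `go.mod` marker is an assumption; any file that exists only at the repository root would work.

```shell
# Hypothetical sketch: find a source root without consulting .git.

# Option 1: walk upward from a starting directory until a marker file is
# found. The default marker (go.mod) is an assumption.
find_root_by_marker() (
  dir=$(cd "${1:-.}" && pwd -P) || exit 1
  marker=${2:-go.mod}
  while :; do
    if [ -e "$dir/$marker" ]; then
      printf '%s\n' "$dir"
      exit 0
    fi
    [ "$dir" = "/" ] && exit 1    # reached the filesystem root, give up
    dir=$(dirname "$dir")
  done
)

# Option 2: derive the root from a script's own location, assuming the
# script lives exactly one directory below the root.
root_from_script() (
  cd "$(dirname "$1")/.." && pwd -P
)
```

Option 2 is cheaper and deterministic but breaks if the script is moved; option 1 tolerates being called from any subdirectory.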

@dims dims removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 12, 2023
@BenTheElder
Member Author

/test all

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 16, 2023
@BenTheElder
Member Author

/sig release testing
/kind cleanup

Will continue to iterate on this intermittently, at least some of the issues are fixed now.

See #117821 for why we need this to move forward.

Also, at some point #112862

@k8s-ci-robot k8s-ci-robot added sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 16, 2023
@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 16, 2023
@BenTheElder
Member Author

BenTheElder commented Jun 16, 2023

Ah, it always worked in CI, the issue is when using build/run.sh locally (CI doesn't use run.sh to execute verify scripts), but that already doesn't work for some verify scripts currently.

Still need to fix:

hack/make-rules/../../hack/verify-golangci-lint.sh
hack/make-rules/../../hack/verify-openapi-spec.sh
hack/make-rules/../../hack/verify-shellcheck.sh
hack/make-rules/../../hack/verify-publishing-bot.py

but only when using build/run.sh make verify.

Whereas on master / 1ff1a26, without this patch, build/run.sh make verify fails all of these:

hack/make-rules/../../hack/verify-codegen.sh
hack/make-rules/../../hack/verify-generated-stable-metrics.sh
hack/make-rules/../../hack/verify-golangci-lint.sh
hack/make-rules/../../hack/verify-internal-modules.sh
hack/make-rules/../../hack/verify-mocks.sh
hack/make-rules/../../hack/verify-openapi-spec.sh
hack/make-rules/../../hack/verify-shellcheck.sh
hack/make-rules/../../hack/verify-spelling.sh
hack/make-rules/../../hack/verify-yamlfmt.sh
hack/make-rules/../../hack/verify-publishing-bot.py

So there's no regression here; it should be less broken. (Note: tested on Linux, but results should be the same on macOS.)

@BenTheElder BenTheElder changed the title WIP: don't exclude .git from rsync Don't exclude .git from rsync Jun 16, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2023
@BenTheElder
Member Author

cc @thockin this is related to the git worktree work.

@thockin
Member

thockin commented Jun 16, 2023

Alternately: git init && git add . && git commit -am "add"

@BenTheElder
Member Author

Alternately: git init && git add . && git commit -am "add"

We could, but I want to avoid load-bearing filtering in rsync anyhow because it's going to block migrating to a simple volume mount for #112862.

The remaining broken scripts are not broken by git, they're just broken in the build container for other reasons (assuming the host binary output path, trying to use docker themselves, etc)

@thockin
Member

thockin commented Jun 18, 2023 via email

@BenTheElder
Member Author

The .git dir can be huge - the point of rsync was to allow remote builds and make Macs not suck.

We stopped supporting remote builds O(years) ago, that just leaves macOS.

If you sync that, won't it be as bad as not using sync?

Not necessarily: you do one full read-through and then everything is in the Linux VM. Repeated I/O during builds across the VM boundary is still expensive (though I think it's fast enough these days that a hopefully-soon follow-up is to eliminate the sync entirely and just "bind mount").

@dims
Member

dims commented Oct 23, 2023

/approve
/lgtm

/hold Ben, please remove hold when you feel this is ready to land

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 23, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 23, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 3a639fdea38eed0c8e5ca373b740ebbeaed039d0

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, dims

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@thockin
Member

thockin commented Oct 23, 2023

What's the impact on build/test loops? I seem to recall trying this and it being bad, even on my Linux desktop

@BenTheElder
Member Author

BenTheElder commented Oct 23, 2023

I recall when I wrote this PR it wasn't bad [on linux] but I'd prefer to confirm hard numbers with the latest changes so ... let me report back later and leave the hold until then ...

@BenTheElder
Member Author

BenTheElder commented Oct 23, 2023

What's the impact on build/test loops? I seem to recall trying this and it being bad, even on my Linux desktop

If you're running an extremely trivial command like build/run.sh git status right after make clean, but with the docker images etc. already downloaded, you might notice a few extra seconds at most (tested on a large Linux VM). This scenario is pretty unlikely though.

If you're running subsequent commands this is ~free because you've already synced in .git and it's not really changing.
For any less trivial containerized build command, e.g. build/run.sh make WHAT=cmd/kube-apiserver, the variance in build times is higher than any measurable difference with/without this patch (~2m for a clean build, ~29s for calling it again as a no-op).

NOTE: build/run.sh git status won't even work without this patch anyhow ...

Ensuring we're not depending on a load-bearing "don't sync .git" is the first step towards eliminating rsync in favor of a source bind mount. We haven't supported remote rsync/docker hosts for years and mac+docker VFS has gotten faster.

Shipping this patch unbreaks a lot of containerized git interactions as well.
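
To make the "source bind mount" direction concrete, the sketch below prints (but does not run) a docker invocation that mounts the tree instead of copying it. `bind_mount_cmd` and the image name are placeholders, not part of build/run.sh; the in-container path is the one visible in the golangci-lint log earlier in this thread. With a bind mount the container sees the whole tree, `.git` included, which is why any behaviour that depends on `.git` being absent has to go first.

```shell
# Hypothetical sketch: print a docker command that bind-mounts the source
# tree at the in-container path instead of rsyncing it. Paths containing
# spaces are not quoted in the printed output; this is illustration only.
bind_mount_cmd() (
  src=$1
  image=$2
  shift 2
  printf 'docker run --rm -v %s:/go/src/k8s.io/kubernetes -w /go/src/k8s.io/kubernetes %s' "$src" "$image"
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
)

# The image name below is a placeholder, not the real build image.
bind_mount_cmd "$PWD" example/build-image:latest git status
```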

@thockin
Member

thockin commented Oct 23, 2023

LGTM

@BenTheElder
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 23, 2023
@k8s-ci-robot k8s-ci-robot merged commit 13ee40c into kubernetes:master Oct 24, 2023
13 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Oct 24, 2023
@BenTheElder BenTheElder deleted the git-fun branch October 24, 2023 17:47
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
5 participants