Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dependencycheck tool to address long running no-vendor-cycles test #93448

Merged
merged 2 commits into from Aug 1, 2020

Conversation

hasheddan
Copy link
Contributor

@hasheddan hasheddan commented Jul 25, 2020

What type of PR is this?

/kind cleanup

/kind flake

What this PR does / why we need it:

Adds a new tool, dependencycheck, to detect cyclical dependencies on main and staging repos in vendored packages. This is currently being handled by hack/verify-no-vendor-cycles.sh, which takes exceedingly long to run (3-4 minutes) for the task it is accomplishing. dependencycheck typically runs in a few seconds. In addition, when taking a look at the logs for verify-master and trying it out locally, I am not certain that this script is actually accomplishing its goal. For instance, if a package depended on k8s.io/apimachinery/pkg/runtime, it appears it is being printed as such, yet the script is checking for a match of k8s.io/kubernetes/vendor/k8s.io then k8s.io/apimachinery. I am guessing this may have to do with dependencies formerly being printed by their full path, but I am not sure. See more here: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-verify-master/1287005337945116672/build-log.txt

Which issue(s) this PR fixes:

Ref: #87807 #92937

Special notes for your reviewer:

I wasn't sure who to add to the OWNERS file here so I just picked some of the usual suspects.

/cc @BenTheElder @dims @liggitt

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 25, 2020
@hasheddan
Copy link
Contributor Author

/sig testing
/priority important-soon

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 25, 2020
@hasheddan
Copy link
Contributor Author

/test pull-kubernetes-files-remake

@hasheddan
Copy link
Contributor Author

/test pull-kubernetes-e2e-kind

@hasheddan
Copy link
Contributor Author

/retest
verify.no-vendor-cycles did not get run that time because of the spike in other tests

@liggitt liggitt self-assigned this Jul 26, 2020
@liggitt
Copy link
Member

liggitt commented Jul 26, 2020

Thanks for jumping on this, much appreciated. I need to double check whether the current impl detects module level transitive dependencies on k8s.io/kubernetes or staging repos even if the particular packages we use don't depend back on these repos. I think the source scanning technique would miss those. If we currently catch/prevent those, we shouldn't regress that. I'll queue this up to look at early next week.

@hasheddan
Copy link
Contributor Author

@liggitt thanks! I think that go list will only check package level transitive dependencies, but I'll test and make sure. Regardless of whether it does, I think an adjustment is needed here to make this check "correct" even if we are unable to improve its latency for the time-being.

@dims
Copy link
Member

dims commented Jul 27, 2020

/uncc

glad to see @liggitt on this!

@k8s-ci-robot k8s-ci-robot removed the request for review from dims July 27, 2020 10:46
@liggitt
Copy link
Member

liggitt commented Jul 27, 2020

I need to double check whether the current impl detects module level transitive dependencies on k8s.io/kubernetes or staging repos even if the particular packages we use don't depend back on these repos. I think the source scanning technique would miss those. If we currently catch/prevent those, we shouldn't regress that. I'll queue this up to look at early next week.

looks like we only checked packages found in vendor previously

if possible, I'd like to lean on go list to do the dependency resolution, with something like this:

go list -mod=vendor -test -deps -json ./vendor/... > deps.json

A go program to process that deps.json file seems better than bash

I would envision something like:

  1. Determine the list of staging repos
  2. Run go list to capture vendor dependencies
  3. Run dependencycheck deps.json k8s.io/kubernetes to fail if any package in deps.json depends on k8s.io/kubernetes/...
  4. Run dependencycheck deps.json k8s.io/client-go k8s.io/api k8s.io/apimachinery ... to fail if any package in deps.json depends on staging modules

As a follow-up, for the module-level cycle checks, we can separately use go mod graph to detect/protect against this, forbidding things that match go mod graph | grep " k8s.io/kubernetes", etc

@hasheddan
Copy link
Contributor Author

@liggitt do we need to explicitly run 3 & 4 separately? This check will execute quickly, so it is not a huge deal, but we are iterating twice through when it seems like one iteration checking for both would be faster.

@liggitt
Copy link
Member

liggitt commented Jul 27, 2020

it's a little easier to treat staging repos and k8s.io/kubernetes separately, but there's not a strong reason to. I expect the tool will take less than a second to run so I doubt calling it twice would matter

@RobertKielty
Copy link
Member

RobertKielty commented Jul 29, 2020

@hasheddan @liggitt What's the next step here?

I'm looking at verify-master failures and I just want to understand if there are more changes needed on this PR?

@hasheddan
Copy link
Contributor Author

@RobertKielty I need to refactor as has been suggested. I should have this up today. However, this is not currently causing the verify-master failures. That is due to pinning resource requests and limits, then typecheck exceeding the memory limit. Opening an issue now with artifacts.

@RobertKielty
Copy link
Member

Thanks @hasheddan

@hasheddan
Copy link
Contributor Author

@liggitt updated with module checks

cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved
cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved
hack/verify-no-vendor-cycles.sh Show resolved Hide resolved
hack/verify-no-vendor-cycles.sh Outdated Show resolved Hide resolved
hack/verify-no-vendor-cycles.sh Outdated Show resolved Hide resolved
hack/verify-no-vendor-cycles.sh Outdated Show resolved Hide resolved
cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved
cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved

var (
exclude = flag.String("exclude", `a^`, "skip packages pattern")
restrict = flag.String("restrict", `a^`, "restricted dependencies pattern")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave empty and require to be set

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify in the help this is a regex and maybe give an example?


var (
exclude = flag.String("exclude", `a^`, "skip packages pattern")
restrict = flag.String("restrict", `a^`, "restricted dependencies pattern")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify in the help this is a regex and maybe give an example?

cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved
cmd/dependencycheck/dependencycheck.go Outdated Show resolved Hide resolved
@hasheddan
Copy link
Contributor Author

@liggitt I believe all comments have been addressed and this is ready for another pass 👍

@hasheddan
Copy link
Contributor Author

/retest

1 similar comment
@hasheddan
Copy link
Contributor Author

/retest

@liggitt liggitt added this to the v1.19 milestone Jul 31, 2020
@liggitt
Copy link
Member

liggitt commented Jul 31, 2020

no-vendor-cycles 6s

down from 5m18s in a recent run 🎉

args := flag.Args()

if len(args) != 1 {
log.Fatalf("usage: dependencycheck <json-dep-file>")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, give an example of how to generate this file (e.g. go list -mod=vendor -test -deps -json ./vendor/...)

log.Fatalf("Error compiling excluded package regex: %v", err)
}
}
fmt.Println(excludePattern)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop debug logging?

exit 1
fi
# Check for any module that is not main or staging and depends on main or staging
go mod graph | grep -vE "^k8s.io\/(kubernetes|${staging_repos_pattern})" | grep -E "\sk8s.io\/(kubernetes|${staging_repos_pattern})" && exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not output a helpful error if it fails

suggest something like

bad_deps=$(go mod graph | grep -vE "^k8s.io\/(kubernetes|${staging_repos_pattern})" | grep -E "\sk8s.io\/(kubernetes|${staging_repos_pattern})" || true)
if [[ -n "${bad_deps}" ]]; then
  echo "Found disallowed dependencies that transitively depend on k8s.io/kubernetes or staging modules:"
  echo "${bad_deps}"
  exit 1
fi

go run cmd/dependencycheck/dependencycheck.go -restrict "^k8s\.io/kubernetes/" "${KUBE_TEMP}/deps.json"

# Check for any vendored package that depends on staging
go run cmd/dependencycheck/dependencycheck.go -restrict "^k8s\.io/(${staging_repos_pattern})(/|$)" "${KUBE_TEMP}/deps.json"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both of these need to exclude staging repos from the checks:

-exclude "^k8s\.io/(${staging_repos_pattern})(/|$)"

type goPackage struct {
Name string
ImportPath string
Deps []string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deps includes all transitive dependencies of this package, which is likely to be enormous

Imports + TestImports + XTestImports is likely to be more useful in identifying the actual problematic import that needs to be dealt with

suggest:

--- a/cmd/dependencycheck/dependencycheck.go
+++ b/cmd/dependencycheck/dependencycheck.go
@@ -34,9 +34,11 @@ var (
 )
 
 type goPackage struct {
-       Name       string
-       ImportPath string
-       Deps       []string
+       Name         string
+       ImportPath   string
+       Imports      []string
+       TestImports  []string
+       XTestImports []string
 }
 
 func main() {
@@ -86,7 +88,12 @@ func main() {
                        continue
                }
                importViolations := []string{}
-               for _, d := range p.Deps {
+
+               allImports := []string{}
+               allImports = append(allImports, p.Imports...)
+               allImports = append(allImports, p.TestImports...)
+               allImports = append(allImports, p.XTestImports...)
+               for _, d := range allImports {
                        if depsPattern.MatchString(d) {
                                importViolations = append(importViolations, d)
                        }

@justaugustus
Copy link
Member

Very cool, Dan!

dependencycheck verifies that no violating depdendencies exist in pkgs
passed via a JSON file generated from go list.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
verify-no-vendor-cycles currently routinely takes 3-4 minutes to run due
to the way that it checks for cyclical imports in vendored packages.
Moving to using the new dependencycheck tool makes this test much faster.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
@hasheddan
Copy link
Contributor Author

@liggitt updates complete 👍 However, this is going to fail on pull-kubernetes-verify due to #93623

@liggitt
Copy link
Member

liggitt commented Aug 1, 2020

/lgtm
/approve
/hold for publishing bot fix

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 1, 2020
@liggitt
Copy link
Member

liggitt commented Aug 1, 2020

/test pull-kubernetes-e2e-kind

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hasheddan, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 1, 2020
@liggitt
Copy link
Member

liggitt commented Aug 1, 2020

/retest
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 1, 2020
@k8s-ci-robot k8s-ci-robot merged commit e8f58be into kubernetes:master Aug 1, 2020
@BenTheElder
Copy link
Member

thanks @hasheddan @liggitt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants