gogitdiff: incremental build tool #21
Codecov Report
@@ Coverage Diff @@
## master #21 +/- ##
===========================================
- Coverage 63.63% 48.41% -15.23%
===========================================
Files 3 4 +1
Lines 506 694 +188
===========================================
+ Hits 322 336 +14
- Misses 169 342 +173
- Partials 15 16 +1
Continue to review full report at Codecov.
Nice. There seems to be a binary file committed btw currently:
utilities/ggd/lib/dag.go
Outdated
for _, imp := range imports {
	impPath := imp.Path()
	if strings.HasSuffix(impPath, "_test") {
		println(fmt.Sprintf(">> %+v", imp))
dafuq
@jeromefroe ended up using your dag calculation code from #17, it had minor issues but handled the
misc cleanup
Mostly looks good to me, just had a few nits and questions
utilities/ggd/README.md
Outdated
ggd - GoGitDiff
===============

This is a tool to compute the packages affected by the difference between two Git
nit: I found this first sentence kind of confusing. What do you think about simplifying to something like the following:
`ggd` is a tool to compute which packages have changed between two Git revisions and their respective dependencies.
How about:
`ggd` is a tool to compute Go packages affected by changes between two Git revisions. It does so by identifying the Go packages changed between two Git revisions, and computing all other packages which depend upon them.
done
visited := make(map[string]string)

// ensure we add the pkg to the graph
Ahh, good catch, yea definitely need to add the package here, as the steps below will only add the things which import it.
👍
Yay for tests :)
if !ok {
	// It's okay for Import to return an error as not all packages that can be found in
	// a package will necessarily be present. For example, packages imported only by test
	// files in vendored packages will not be installed. In the case of an error, Import
nit (self-nit?): "In the case of an error, Import always returns a non-nil *Package. In the case of an error it will only contain partial information." -> "In the case of an error, Import always returns a non-nil *Package that contains partial information."
sure thing.
done
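For readers following the comment wording above, here is a minimal sketch of the `go/build` behaviour it describes: on failure, `Import` still hands back a non-nil `*Package` holding partial information. The wrapper name `importPartial` and the import path `example.invalid/does/not/exist` are made up for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"go/build"
)

// importPartial is a hypothetical wrapper (not from the PR) around build.Import.
// An empty srcDir and mode 0 ask for a plain lookup using the default context.
func importPartial(path string) (*build.Package, error) {
	return build.Import(path, "", 0)
}

func main() {
	// The path below is deliberately bogus so that the lookup fails.
	pkg, err := importPartial("example.invalid/does/not/exist")
	if err != nil {
		// pkg is still non-nil here; only some of its fields are populated.
		fmt.Printf("import failed (%v); partial ImportPath=%q\n", err, pkg.ImportPath)
	}
}
```

This is why the calling code can keep using the returned package after checking the error instead of bailing out immediately.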
utilities/ggd/lib/dag.go
Outdated
	edges[to] = struct{}{}
}

// Closure returns all the transitive closure of all packages reachable
nit: s/returns all the/returns the
fixed
done
utilities/ggd/lib/git.go
Outdated
	return res, nil
}

// CwgIsDirty returns whether the current directory is dirty.
nit: `CwgIsDirty` -> `CWDIsDirty`?
sure thing
done
utilities/ggd/main.go
Outdated
debug("changed packages: %v", packageChanges)

baseFullChangeSet := make(lib.ImportSet)
if !*includePrechanges {
What's the intention here? Why compute the graph and changed packages with the base reference?
The original thought was we'd need to consider both DAGs to answer the question of all affected changes, but I'm not convinced that's right anymore. I need to draw some graphs to convince myself before I would be willing to delete the code. Will put down a TODO for now.
utilities/ggd/main.go
Outdated
	debug("affected packages (including transitive changes): %v", baseFullChangeSet)
}

currentSha1, err := sha1Fn()
Should we try to checkout the current SHA again before we return?
also, super nit: `currentSha1` -> `currentSHA1`
not sure what you mean, that we should retry or ensure we're on the right SHA1.
Re: retry - why? it'll only fail when the directory is dirty, at which point, retries buy you nothing.
Re: the right sha1 - we do that.
utilities/ggd/lib/dag.go
Outdated
if _, ok := visited[node]; ok {
	return
}
visited[node] = struct{}{}
I don't know if we can unconditionally add the package. For example, if we delete a package in our branch then it will be a changed package even though it no longer exists, and `go test` will complain that it can't find it.
That's a great point. Lemme add a test and ensure we handle the situation correctly.
Added `testcase7` for the case you're describing, here's the mini-diff: f5b1fa0
utilities/ggd/README.md
Outdated
===============

This is a tool to compute the packages affected by the difference between two Git
revisions. It's primary goal is to help speed up CI jobs by figuring out the packages
"It was designed to speed up CI jobs by determining which packages depend on packages that were changed in the target branch vs. the base branch" or something like that.
Also would be cool to link to DigitalOcean's blog post on the topic
Edit: nevermind you did link to it
utilities/ggd/README.md
Outdated
affected by git changes, and transitively looking through the Go Imports. It takes
inspiration from `gta`, and the work done at [DigitalOcean].

NB: debug mode (`-d`) allows users to inspect how the tool arrives at the decisions it does.
"allows users to inspect the tool's decision making process"
Fair.
done
utilities/ggd/lib/dag.go
Outdated
	"golang.org/x/tools/go/buildutil"
)

// ImportGraph is a map from a package -> all the packages that import it.
nit: -> a set of all the packages that import (depend?) on it
reworded.
utilities/ggd/lib/dag.go
Outdated
)

// ImportGraph is a map from a package -> all the packages that import it.
type ImportGraph map[string]map[string]struct{}
can you use the `ImportSet` as the value for this map?
sure thing
done
func (g ImportGraph) maybeAddEdge(
	ctx *build.Context, buildPkg *build.Package, visited map[string]string, pkg, path string,
) {
	if path == "C" {
I'm confused, what is this?
CGO .. For context, similar code in go build: https://github.com/golang/go/blob/ae2a2d12f6d8cde35637a13f384f6de524112768/src/go/build/build.go#L860
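To make the `"C"` check concrete: `import "C"` is the cgo pseudo-package, so it never corresponds to a real package and should not become a node in the graph. Below is a rough sketch of building the reverse import graph (package -> packages that import it) while skipping it. The helper `buildReverseGraph` and its input shape are assumptions for illustration; the PR's `maybeAddEdge` works on `go/build` packages instead.

```go
package main

import "fmt"

// ImportSet is a set of import paths.
type ImportSet map[string]struct{}

// ImportGraph maps a package to the set of packages that import it.
type ImportGraph map[string]ImportSet

// addEdge records that importer imports imported.
func (g ImportGraph) addEdge(imported, importer string) {
	if g[imported] == nil {
		g[imported] = make(ImportSet)
	}
	g[imported][importer] = struct{}{}
}

// buildReverseGraph is a hypothetical helper: imports maps each package
// to the import paths listed in its source files.
func buildReverseGraph(imports map[string][]string) ImportGraph {
	g := make(ImportGraph)
	for pkg, paths := range imports {
		// Make sure the package itself is a node, even if nothing imports it.
		if g[pkg] == nil {
			g[pkg] = make(ImportSet)
		}
		for _, path := range paths {
			if path == "C" {
				continue // cgo pseudo-package, not a real dependency
			}
			g.addEdge(path, pkg)
		}
	}
	return g
}

func main() {
	g := buildReverseGraph(map[string][]string{
		"repo/a": {"C", "repo/b"},
		"repo/b": nil,
	})
	_, hasC := g["C"]
	fmt.Println(len(g), hasC) // prints "2 false"
}
```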
utilities/ggd/lib/go.go
Outdated
//
// also filters any `vendor/` directory paths
func filterChanges(input string) bool {
	if strings.Contains(input, "/vendor/") {
Why filter the vendor? Can't we use the DAG to determine exactly what packages depend on them if we strip the vendor prefix?
We can, will make it optional.
I'm thinking of the case for the repos where we include vendor (internally) vs. the ones where we don't (OSS/GH). To your point, it'll be helpful for the former; I don't want to include it in the computation for the latter.
done
utilities/ggd/lib/dag.go
Outdated
func (g ImportGraph) Closure(paths ...string) (ImportSet, error) {
	// add all the paths we're starting at, as they are by definition
	// reachable.
	for _, p := range paths {
Instead of including the logic to determine if a package was deleted so we can add all packages here would it make sense to instead only add the package if it exists in the graph since otherwise we know it was deleted?
"only add the package if it exists in the graph since otherwise we know it was deleted"
I actually tried this first. Unfortunately, it doesn't work in certain cases.
Can repro yourself, steps:
$ cd $GOPATH/src/github.com/m3db/build-tools
$ git checkout prateek/ggd/demo-testdata-issue
$ cd ./utilities/ggd
$ ./test.sh
# testcase6 will fail
# can see output for the run in ./utilities/ggd/bin/testcase6.out
Hmm, that seems to fail because we unconditionally panic if we can't find a package, but what if we first strip the root packages which don't appear in the DAG before walking the graph? I'm thinking something like the following:
func (g ImportGraph) Closure(roots ...string) ImportSet {
	closure := make(ImportSet)
	for _, r := range roots {
		if _, ok := g[r]; !ok {
			log.Debugf("root package '%s' is not in the import graph, it has been removed", r)
			continue // skip removed roots so walk doesn't panic on them
		}
		g.walk(r, closure)
	}
	return closure
}

func (g ImportGraph) walk(node string, visited ImportSet) {
	if _, ok := visited[node]; ok {
		return
	}
	if _, ok := g[node]; !ok {
		panic(fmt.Sprintf("node (%s) doesn't exist in the graph", node))
	}
	visited[node] = struct{}{}
	for to := range g[node] {
		g.walk(to, visited)
	}
}
talked offline: if we detect changes in `*/testdata/*`, run everything; otherwise, do some version of what Jerome's suggesting above.
Done. There's one more edge case we didn't talk about: a package that's not imported by any other packages, and gets deleted (testcase7). It isn't present in the import graph either. I throw an error for this case and run tests on all non-filtered packages.
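A rough sketch of the behaviour described in this thread, under stated assumptions: `Closure` reports a root that is missing from the graph with a `DanglingDeleteError` (the type name appears later in the PR; its field and the caller `affected` are made up here), and the caller reacts by falling back to every package rather than trusting the partial result.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

type ImportSet map[string]struct{}

// ImportGraph maps a package to the set of packages that import it.
type ImportGraph map[string]ImportSet

// DanglingDeleteError signals a changed package that is absent from the graph.
// The Pkg field is an assumption made for this sketch.
type DanglingDeleteError struct{ Pkg string }

func (e DanglingDeleteError) Error() string {
	return fmt.Sprintf("package %q is not in the import graph", e.Pkg)
}

// Closure returns every package reachable from roots via "imported by" edges.
func (g ImportGraph) Closure(roots ...string) (ImportSet, error) {
	closure := make(ImportSet)
	for _, r := range roots {
		if _, ok := g[r]; !ok {
			return nil, DanglingDeleteError{Pkg: r}
		}
		g.walk(r, closure)
	}
	return closure, nil
}

func (g ImportGraph) walk(node string, visited ImportSet) {
	if _, ok := visited[node]; ok {
		return
	}
	visited[node] = struct{}{}
	for to := range g[node] {
		g.walk(to, visited)
	}
}

// affected is a hypothetical caller: on a dangling delete it plays safe
// and returns all packages so the whole test suite runs.
func affected(g ImportGraph, all []string, changed ...string) []string {
	closure, err := g.Closure(changed...)
	var dangling DanglingDeleteError
	if errors.As(err, &dangling) {
		return all
	}
	out := make([]string, 0, len(closure))
	for p := range closure {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	g := ImportGraph{"repo/b": {"repo/a": {}}, "repo/a": {}}
	all := []string{"repo/a", "repo/b", "repo/c"}
	fmt.Println(affected(g, all, "repo/b"))       // prints "[repo/a repo/b]"
	fmt.Println(affected(g, all, "repo/deleted")) // prints "[repo/a repo/b repo/c]"
}
```

Returning a typed error keeps the graph code free of policy; whether a dangling delete means "run everything" stays a decision for the caller.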
utilities/ggd/lib/go.go
Outdated
//
// also filters any `vendor/` directory paths
func filterChanges(input string, filterDirPatterns []string) bool {
	for _, ptrn := range filterDirPatterns {
Won't this mean we won't run tests if we just change vendor?
It's an option passed in from the CLI, so up to the user's discretion.
We'd use it in the following modes:
(a) CI in our internal repo, which has `vendor/` checked in: we would not filter `vendor/`
(b) CI in our OSS repos (where we do not check in `vendor/`): we would filter `vendor/`
Local development for both would mirror the choices made in the CI.
Wouldn't we then be relying on local CI to verify changes to vendor? I'm kinda of the opinion that if you don't vendor your dependencies in your repo you need to run tests in CI each time you pull them.
talked offline: we're going to change the default to include `vendor/`. Standard usage for us is going to be: run all tests on master, and whenever we see any glide.yaml/lock changes; otherwise rely on ggd.
done.
utilities/ggd/lib/dag.go
Outdated
}
if _, ok := g[node]; !ok {
	// NB(prateek): this happens in the case of un-used deleted packages. Look at testcase7.
	return DanglingDeleteError{node}
Why return an error here?
need to indicate to main this happened so we can act on it
Yea, I guess my question is why do we need to act on it? At least, it doesn't seem like an error to me if I remove a package that wasn't imported.
It's not, but I'm being paranoid and rerunning the entire test suite if this situation occurs.
I see, maybe worth just calling that out?
sure
done
utilities/ggd/main.go
Outdated
defaultCatchallPatterns = []string{
	"/testdata/", // go list ./... is unable to gauge impact of changes to this directory
}
defaultFilterPatterns = []string{"/vendor/"}
Should the default for this be empty?
No. Going to stick with this as the default because it's typically what people want when using the tool by hand. I prefer to optimise for that over what the CI invocation for this will be.
`go test` does this too so it's not unexpected behaviour.
Hmm, but if working locally and I update a vendored dependency wouldn't it be preferable to test the packages which depend on the changed vendor package?
Sure. But in that case, you've updated `glide.yaml`/`glide.lock` (added to catchall patterns) and we'd run all tests
or, they can alternatively run with `ggd -f=''` and get the same behaviour (w/o using glide changes as catch all)
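To illustrate how the two pattern lists discussed here could interact, the sketch below classifies changed files: any catch-all match means "run everything", a filter match drops the file, and the rest are kept for the DAG computation. The helper names `matchesAny` and `plan`, and the rule that catch-all wins over filtering, are assumptions; substring matching mirrors the `strings.Contains` check quoted earlier in the review.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesAny reports whether path contains any non-empty pattern as a substring.
func matchesAny(path string, patterns []string) bool {
	for _, p := range patterns {
		if p != "" && strings.Contains(path, p) {
			return true
		}
	}
	return false
}

// plan is a hypothetical helper. It returns runAll=true as soon as a changed
// file hits a catch-all pattern; otherwise it returns the files that survive
// the filter patterns.
func plan(changedFiles, catchall, filter []string) (runAll bool, kept []string) {
	for _, f := range changedFiles {
		if matchesAny(f, catchall) {
			return true, nil
		}
		if matchesAny(f, filter) {
			continue
		}
		kept = append(kept, f)
	}
	return false, kept
}

func main() {
	catchall := []string{"/testdata/", "glide.yaml", "glide.lock"}
	files := []string{"src/vendor/x/y.go", "lib/dag.go"}

	// Default-like filtering: the vendored file is dropped.
	fmt.Println(plan(files, catchall, []string{"/vendor/"}))
	// Empty filter (comparable to passing -f=''): the vendored file is kept.
	fmt.Println(plan(files, catchall, nil))
	// A dependency manifest change triggers the catch-all.
	fmt.Println(plan([]string{"glide.lock"}, catchall, []string{"/vendor/"}))
}
```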
utilities/ggd/lib/misc.go
Outdated
func ChangedPackages(changedFiles []string) []string {
	changedPkgsMap := make(map[string]struct{}, len(changedFiles))
	for _, f := range changedFiles {
		dir := filepath.Dir(f)
Is it ok to take the whole directory path? Was thinking we might need to potentially remove `GOPATH` if it's in the prefix so we can ensure we have the correct import path.
the changed files are relative paths already. Will add a comment to indicate.
Hmm, so then the packages in the import statements are relative as well? For example, if I do a git diff in statsdex the paths of files which have changed will be relative to the base of the repo (e.g. `services/collector/etc`), but if I were to import one of those packages they would use the full Git path (e.g. `<hostname>/infra/statsdex/services/collector/etc`).
"if I were to import one of those packages they would use the full Git path (e.g. <hostname>/infra/statsdex/services/collector/etc)"
I don't follow. Go imports are always `GOPATH` or `repo/vendor/` relative, what situation are they not going to be?
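One way to picture the question in this thread: `git diff` yields repo-relative file paths, while import statements use full import paths, so something has to bridge the two. The sketch below does that by joining each changed file's directory onto the repository's base import path. This is only an illustration; the function `changedPackages` and its `repoImportPath` parameter are assumptions, and the thread does not confirm that the PR resolves paths this way.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"sort"
)

// changedPackages maps repo-relative changed files to import paths.
// repoImportPath is an assumed input: the import path of the repository root,
// i.e. the repo's location under $GOPATH/src.
func changedPackages(repoImportPath string, changedFiles []string) []string {
	seen := make(map[string]struct{}, len(changedFiles))
	for _, f := range changedFiles {
		// Import paths always use forward slashes, whatever the OS separator is.
		dir := filepath.ToSlash(filepath.Dir(f))
		seen[path.Join(repoImportPath, dir)] = struct{}{}
	}
	pkgs := make([]string, 0, len(seen))
	for p := range seen {
		pkgs = append(pkgs, p)
	}
	sort.Strings(pkgs)
	return pkgs
}

func main() {
	pkgs := changedPackages("github.com/m3db/build-tools", []string{
		"utilities/ggd/lib/dag.go",
		"utilities/ggd/lib/git.go",
		"utilities/ggd/main.go",
	})
	for _, p := range pkgs {
		fmt.Println(p)
	}
}
```

Two files in the same directory collapse into a single package entry, which is the same deduplication the `changedPkgsMap` in the reviewed snippet performs.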
LGTM
Boo yah! Many thanks for the reviews @jeromefroe!
`ggd`, to compute packages affected by git changes.
NB: don't worry about test coverage, this tool uses e2e integration tests using bash to create a separate gopath/git repos (see `build-tools/utilities/ggd/test.sh` for more details).