refactor(convert): preload dependency dirs via go list#650
Conversation
Summary of ChangesHello @luoliwoshang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the package dependency loading mechanism to improve performance and determinism. Instead of performing individual lookups for each package import, it now preloads all dependency package directories into a cache using a single Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the dependency loading mechanism to improve performance by preloading package directories with a single go list command. While the changes are well-structured, a significant concern is the introduction of a global configuration variable BuildMod. This variable is not thread-safe and can lead to race conditions in concurrent environments, requiring it to be moved to instance-specific configuration for isolation and reliability. Additionally, there are suggestions to improve the robustness of the new go list output parsing and to address some minor redundancy.
| return &PkgDepLoader{ | ||
| // BuildMod is passed to `go list -mod=...` when preloading package directories. | ||
| // Keep it empty to use the go command default behavior. | ||
| var BuildMod string |
There was a problem hiding this comment.
The BuildMod variable is defined as a global string, which is not thread-safe. In a concurrent environment (e.g., if this library is used in a web service or a multi-threaded application), multiple goroutines calling NewPkgDepLoader will share and potentially overwrite this global state. This can lead to race conditions where one caller's configuration is used by another, potentially causing incorrect package loading or security issues if the -mod flag is used to enforce security policies (like readonly). Consider moving this configuration into the PkgDepLoader struct or passing it as a parameter to the constructor.
| for _, line := range strings.Split(string(data), "\n") { | ||
| pos := strings.Index(line, "=") | ||
| if pos != -1 { | ||
| pm.pkgs[line[:pos]] = line[pos+1:] | ||
| } | ||
| } |
There was a problem hiding this comment.
The current parsing logic using strings.Index can be fragile. For example, if a line from go list is "=foo", it will result in an empty key in the map, which is likely unintended.
Using strings.SplitN is a more robust and idiomatic way to parse key-value pairs. This ensures you correctly handle various edge cases.
| for _, line := range strings.Split(string(data), "\n") { | |
| pos := strings.Index(line, "=") | |
| if pos != -1 { | |
| pm.pkgs[line[:pos]] = line[pos+1:] | |
| } | |
| } | |
| for _, line := range strings.Split(string(data), "\n") { | |
| parts := strings.SplitN(line, "=", 2) | |
| if len(parts) == 2 && parts[0] != "" { | |
| pm.pkgs[parts[0]] = parts[1] | |
| } | |
| } |
| cmd.Dir = root | ||
| data, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| msg := strings.TrimSpace(string(bytes.TrimSpace(data))) |
There was a problem hiding this comment.
There is a redundant call to strings.TrimSpace here. The data byte slice is already trimmed with bytes.TrimSpace. Calling strings.TrimSpace on the resulting string is unnecessary. You can remove the outer strings.TrimSpace for slightly better performance and cleaner code.
| msg := strings.TrimSpace(string(bytes.TrimSpace(data))) | |
| msg := string(bytes.TrimSpace(data)) |
| cmd.Dir = root | ||
| data, err := cmd.CombinedOutput() | ||
| if err != nil { | ||
| msg := strings.TrimSpace(string(bytes.TrimSpace(data))) |
There was a problem hiding this comment.
bytes.TrimSpace(data) followed by strings.TrimSpace() is redundant - both trim the same whitespace characters. Simplify to:
| msg := strings.TrimSpace(string(bytes.TrimSpace(data))) | |
| msg := string(bytes.TrimSpace(data)) |
| func runGoCommand(root string, args ...string) ([]byte, error) { | ||
| cmd := exec.Command("go", args...) | ||
| cmd.Dir = root | ||
| data, err := cmd.CombinedOutput() |
There was a problem hiding this comment.
CombinedOutput() merges stdout and stderr. If go list -e emits any warnings/diagnostics to stderr (which it can), they will be mixed with the formatted output and could corrupt parsing (lines without = are silently skipped, but lines with = in a warning message could create bogus entries).
Consider using separate capture for stdout only:
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
err := cmd.Run()
// ... use stdout.Bytes() for parsing| args := []string{"list", "-deps", "-e", "-f={{.ImportPath}}={{.Dir}}"} | ||
| if BuildMod != "" { | ||
| args = append(args, "-mod", BuildMod) | ||
| } | ||
| args = append(args, "all") |
There was a problem hiding this comment.
Running go list -deps ... all loads the entire module dependency graph (potentially thousands of packages for large modules), but Import() is only called for a small, bounded set of dependencies from llcppg.cfg.
Consider passing only the known dependency paths as arguments instead:
args := []string{"list", "-e", "-f={{.ImportPath}}={{.Dir}}", "pkg1", "pkg2", "pkg3"}This would produce the same mapping but with significantly less overhead.
|
|
||
| func NewPkgDepLoader(mod *xgomod.Module, pkg *gogen.Package) *PkgDepLoader { | ||
| return &PkgDepLoader{ | ||
| // BuildMod is passed to `go list -mod=...` when preloading package directories. | ||
| // Keep it empty to use the go command default behavior. | ||
| var BuildMod string |
There was a problem hiding this comment.
This exported package-level mutable variable has a couple of concerns:
-
No validation: If set to an unexpected value, it passes directly to
exec.Command. Consider validating against an allowlist ("","mod","readonly","vendor"). -
Not concurrency-safe: Any concurrent callers of
NewPkgDepLoaderwould race on this shared state. Consider passing this as a parameter toNewPkgDepLoaderinstead.
|
Overall: Good refactor that centralizes dependency resolution into a single Key suggestions:
Test coverage looks good with the updated |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #650 +/- ##
==========================================
- Coverage 85.30% 85.16% -0.14%
==========================================
Files 28 28
Lines 2415 2447 +32
==========================================
+ Hits 2060 2084 +24
- Misses 311 315 +4
- Partials 44 48 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
NewPkgDepLoadervia:go list -deps -e -f={{.ImportPath}}={{.Dir}} allpkgPath -> pkgDirand use it directly inImport-modconfigurable throughconvert.BuildModwhen building the cacheNewPkgDepLoaderreturn(*PkgDepLoader, error)and propagate errors at call sitesBehavior change
Importnow relies on the preloaded go-list cache for package directory lookup.Motivation
Validation
go test ./cl/internal/convert -run TestImport -count=1go test ./cl/internal/convert -run TestNewPackageLinkWithoutLibCommand -count=1Related