cmd/go: [modules + integration] go mod buildrequires, list the build requirements of a set of unpacked modules #31300

Open
nim-nim opened this issue Apr 6, 2019 · 2 comments

Comments

@nim-nim

commented Apr 6, 2019

This report is part of a series, filed at the request of @mdempsky, focused on making Go modules integrator-friendly.

Please do not close or mark it as duplicate before making sure you’ve read and understood the general context. A lot of work went into identifying problem points precisely.

Needed feature

Go needs an official go mod buildrequires command that processes a set of unpacked Go modules and returns the list of other Go modules that need to be present for this set to be processed by the rest of the go toolchain.

This kind of analysis is required to populate build environments in CI/CD systems accurately, either to complete the set with other modules, or make sure it is self-hosting before a CI/CD run.

go mod buildrequires could be implemented as a go mod graph mode, via specific option flags, or as a separate subcommand.

Constraints

  • the output should be a machine-readable list of:
    • module path,
    • minimal version,
    • excluded versions
  • that basically means:
    • process the require, exclude and replace of all the go.mod files in the set,
    • remove modules already provided by the set,
    • consolidate constraints:
      • bump minimal versions to something compatible with the individual requirements of each go.mod
      • warn if a module in the set does not meet the constraints specified by other modules in the set
      • deduplicate exclusions
  • the feature should also be exposed as a function in the official Go API, returning a go object (map, list, struct…)
  • the input set may be defined as one or several lists of go.mod filesystem paths (as produced by go mod discover in issue #31299), one or several directory paths (similar to the directory paths defined in issue #31299), or a mix of both
  • the output should either be pre-filtered, or be easily filterable, without further go tooling invocations, by:
    • GOOS (only return results for this GOOS),
    • GOOS+GOARCH (only return results for this GOOS and GOARCH), or
    • complete build context (GOOS+specific build flag set)
  • the output should either be pre-filtered, or be easily filterable, without further go tooling invocations, by direct and indirect requirements
  • the output should either be pre-filtered, or be easily filterable, without further go tooling invocations, by:
    • code build requirements (modules needed to build the code)
    • compiler test requirements (modules needed to run tests, that only require the compiler)
    • integration test requirements (modules needed to run tests, that require more than just the compiler to run, and may not be satisfied by the average CI/CD system):
      • internet access,
      • root access
      • some specific software instance
      • some credentials, etc
  • the command should also take user-provided versioning info as input, either to fill in blanks when info files are not present, or to override them (both strategies could arguably be valid)
  • the command should work in secure no-internet-download mode. In that mode it should probably restrict its processing to direct dependencies and the dependencies available in configured goproxy sources (#31304)
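
To make the consolidation step above more concrete, here is a minimal sketch of what it could look like. Everything in it is an assumption made for illustration: the Module and Requirement types, the BuildRequires function and the example.com paths are hypothetical and are not part of the go toolchain or of any existing API. The version comparison is deliberately naive (plain vMAJOR.MINOR.PATCH only), replace directives and build-context filtering are ignored, and a real implementation would need proper semver handling.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Module is a hypothetical, simplified view of one go.mod file in the input set.
type Module struct {
	Path     string              // module path
	Version  string              // version of this unpacked module
	Requires map[string]string   // required module path -> minimal version
	Excludes map[string][]string // module path -> excluded versions
}

// Requirement is one entry of the hypothetical buildrequires output.
type Requirement struct {
	Path       string
	MinVersion string
	Excluded   []string
}

// older reports whether version a sorts before version b.
// Naive on purpose: it only understands plain vMAJOR.MINOR.PATCH strings.
func older(a, b string) bool {
	pa := strings.Split(strings.TrimPrefix(a, "v"), ".")
	pb := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < len(pa) && i < len(pb); i++ {
		x, _ := strconv.Atoi(pa[i])
		y, _ := strconv.Atoi(pb[i])
		if x != y {
			return x < y
		}
	}
	return len(pa) < len(pb)
}

// BuildRequires consolidates the requirements of every module in the set.
func BuildRequires(set []Module) (reqs []Requirement, warnings []string) {
	provided := map[string]string{}
	for _, m := range set {
		provided[m.Path] = m.Version
	}
	minimal := map[string]string{}
	excluded := map[string]map[string]bool{}
	for _, m := range set {
		for path, v := range m.Requires {
			if have, ok := provided[path]; ok {
				// Already provided by the set: drop it, but warn if it is too old.
				if older(have, v) {
					warnings = append(warnings, fmt.Sprintf(
						"%s requires %s %s, but the set provides %s", m.Path, path, v, have))
				}
				continue
			}
			// Bump the minimal version to the highest one requested.
			if cur, ok := minimal[path]; !ok || older(cur, v) {
				minimal[path] = v
			}
		}
		for path, versions := range m.Excludes {
			if excluded[path] == nil {
				excluded[path] = map[string]bool{}
			}
			for _, v := range versions {
				excluded[path][v] = true // set semantics deduplicate exclusions
			}
		}
	}
	for path, v := range minimal {
		r := Requirement{Path: path, MinVersion: v}
		for e := range excluded[path] {
			r.Excluded = append(r.Excluded, e)
		}
		sort.Strings(r.Excluded)
		reqs = append(reqs, r)
	}
	sort.Slice(reqs, func(i, j int) bool { return reqs[i].Path < reqs[j].Path })
	return reqs, warnings
}

func main() {
	set := []Module{
		{Path: "example.com/a", Version: "v1.0.0",
			Requires: map[string]string{"example.com/b": "v1.2.0", "example.com/x": "v0.3.0"},
			Excludes: map[string][]string{"example.com/x": {"v0.3.1"}}},
		{Path: "example.com/b", Version: "v1.1.0",
			Requires: map[string]string{"example.com/x": "v0.4.0"},
			Excludes: map[string][]string{"example.com/x": {"v0.3.1"}}},
	}
	reqs, warnings := BuildRequires(set)
	for _, r := range reqs {
		fmt.Println(r.Path, r.MinVersion, r.Excluded)
	}
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

In this toy set, example.com/b is dropped from the output because the set provides it, and a warning is raised because the provided version is older than what example.com/a asks for. Only example.com/x remains as something the build environment has to supply.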

¹ The experimental code is probably horrible. That's not the point. Most of its functions are generic and should have been provided by generic Go tooling in the first place.

Motivation

  • the Go modules design allows the splitting of projects into multiple modules and nested submodules. Therefore, the CI/CD integration unit for Go is now a source state that may contain a variable number of coordinated Go modules. The go toolchain should not assume that the inputs/outputs of a CI/CD run are a single module declared in a single go.mod file.

  • robust CI/CD systems will cut internet access during a run to ensure the reproducibility and security of run results. A CI/CD build environment needs to be populated with all the code the run needs, before the run starts. Later go get calls won’t work.

  • populating a CI/CD build environment with more code than the strict minimum the run will need gets prohibitively expensive in run time for busy build farms with a huge list of jobs to run. Therefore the build requirement list needs to be as cut down as possible, allowing the removal of:

    • the requirements unneeded on a particular GOOS/GOARCH,
    • the requirements of integration tests (if running them is not part of the scheduled CI/CD job)
    • the requirements of plain tests (ditto)
    • the requirements of example code (anything with example in the file or directory name)
  • missing modules, identified by the go mod buildrequires call, will typically be populated from the organisation baseline. Because parts of this baseline can be shared between organisation projects, it won't be mirrored in each project VCS in a vendor directory.

  • missing module population will make use of recent CI/CD improvements, driven by the needs of Rust, Go and Python ³

  • adding new modules to the organisation baseline is a lot of work². It requires

    • assigning a curating team,
    • sifting between forks to identify the actual current project upstream,
    • sifting between VCS mirrors to find the root VCS,
    • legal analysis,
    • test result analysis,
    • writing the corresponding recipe for the CI/CD system
    • etc.
  • therefore, there was violent disbelief and rejection among consulted integrators⁴ of any Go module setup that forced them to process new modules just because the corresponding imports exist in module parts they have no use for:

    • dependencies of GOOS/GOARCH-specific code, when this GOOS/GOARCH is not part of the organisation targets
    • dependencies of upstream integration tests, that have no chance to ever run in our CI/CD setup, because they use elements not available in it (typically, direct internet calls or root access)
    • dependencies of example code, that will never be used in production, and typically does not compile because its upstream is not keeping it up to date
  • their GOPATH/vendor experience of third-party code has made them extremely sensitive to this kind of spurious import, and to the amount of work it represents.
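
The trimming described in this section could be done on the consumer side, without further go tooling invocations, if each output entry carried enough metadata. The following is only a sketch under that assumption: the Entry type, the Kind categories, the Filter function and the example.com paths are hypothetical names chosen for illustration, not an existing or proposed API.

```go
package main

import "fmt"

// Kind classifies why a module is needed (hypothetical categories).
type Kind int

const (
	Build           Kind = iota // needed to build the code
	Test                        // needed by tests that only require the compiler
	IntegrationTest             // needed by tests that require more than the compiler
	Example                     // needed by example code only
)

// Entry is a hypothetical output record annotated for filtering.
type Entry struct {
	Path   string
	GOOS   string // empty means "all operating systems"
	GOARCH string // empty means "all architectures"
	Kind   Kind
}

// Filter keeps the entries that apply to goos/goarch and whose kind is in keep.
func Filter(entries []Entry, goos, goarch string, keep ...Kind) []Entry {
	wanted := map[Kind]bool{}
	for _, k := range keep {
		wanted[k] = true
	}
	var out []Entry
	for _, e := range entries {
		if e.GOOS != "" && e.GOOS != goos {
			continue // only needed on another operating system
		}
		if e.GOARCH != "" && e.GOARCH != goarch {
			continue // only needed on another architecture
		}
		if !wanted[e.Kind] {
			continue // requirement category not part of this CI/CD job
		}
		out = append(out, e)
	}
	return out
}

func main() {
	entries := []Entry{
		{Path: "example.com/core", Kind: Build},
		{Path: "example.com/winonly", GOOS: "windows", Kind: Build},
		{Path: "example.com/testutil", Kind: Test},
		{Path: "example.com/needsroot", Kind: IntegrationTest},
		{Path: "example.com/demo", Kind: Example},
	}
	// A linux/amd64 job that builds and runs plain tests only.
	for _, e := range Filter(entries, "linux", "amd64", Build, Test) {
		fmt.Println(e.Path)
	}
}
```

With such annotations, an integrator targeting only linux/amd64 would never have to import the modules that are needed solely by Windows-specific code, integration tests or examples.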

² Initial import represents the bulk of the work, keeping the module updated once imported is a lot more reasonable.

³

⁴ Sometimes, interrupting before the end of the presentation of Go module changes

@thepudds changed the title from "[modules + integration] go mod buildrequires, unpacked module set build requirements" to "cmd/go: [modules + integration] go mod buildrequires, unpacked module set build requirements" Apr 6, 2019

@thepudds added the modules label Apr 6, 2019

@nim-nim changed the title from "cmd/go: [modules + integration] go mod buildrequires, unpacked module set build requirements" to "cmd/go: [modules + integration] go mod buildrequires, list the build requirements of a set of unpacked modules" Apr 7, 2019

@rsc

Contributor

commented Apr 11, 2019

In general, you are asking for a whole bunch of low-level tools that depend on specific details of today's implementation. We do not want to expose any of those, because they may well not apply to tomorrow's implementation. We should have a discussion about what Linux distributions should do as far as modules are concerned. Posting a sequence of issues asking for specific low-level commands assumes a specific solution to that general problem and puts the cart before the horse.

@nim-nim

Author

commented Apr 12, 2019

@rsc The intent is precisely to isolate Linux distributions from Go Modules low-level implementation details, to allow this implementation to evolve freely, without breaking distribution tools every release.

The command set basically describes the CLI API we use to interface with language (python, perl, java, ruby, rust, js…) and non-language (fontconfig, ctan, icon…) component systems. Unfortunately, we do not have a nice comprehensive interface design spec. Interfacing grew organically over time, as component systems gained new capabilities on both the distribution and language sides. It is also relatively unusual to onboard a full-fledged component system like Go modules in a single step (most languages construct their component systems progressively).

Also, I took the time to check each listed part was actually needed and compatible with the Go module design, and to translate rpm concepts to the Go module equivalent. But there's absolutely nothing in there specific to the current Go module implementation (nor specific to Go nor rpm in general).

So unless Go starts doing very weird and unusual things, this command set should be sufficient for its present and future needs. And if it starts doing radically different things, the command interface can always be extended (as long as those new things are compatible with distribution objectives and core design decisions).

If the command set is fully implemented, we won't have to dig deep inside the Go Module code to implement custom replacements, and you can evolve this code at will without breaking Go integration distribution-side. If it's not fully implemented we’ll request the Go API parts needed to implement it ourselves.

To illustrate, this recent example showcases how it all works out for Rust (to take something roughly similar to Go). The

/usr/bin/cargo-inspector -BR Cargo.toml
/usr/bin/cargo-inspector -TR Cargo.toml

lines in the middle are equivalent to the go mod buildrequires requested here, with a light wrapping to translate the cargo idea of component names and semvers to rpm version syntax. There are two calls because the cargo tooling uses a different call for test and non-test dependencies.
