Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builder: port chroot building to Go #102

Merged
merged 1 commit into from
Jan 26, 2018

Conversation

cmarcelo
Copy link
Contributor

This rewrites bundle-chroot-builder.py in Go to be part of Mixer
code. Mixer is the only user of that software, and both Mixer and
b-c-b.py the same configuration file, with overlapping fields.

Main differences from bundle-chroot-builder.py:

  • New bundleset type was added, that cares about collecting as much
    information as possible from the bundles themselves. This type and
    related functions also sets us up for success when upcoming changes
    to how bundles are specified happen. There is no assumption all
    bundles are in the same directory.

  • We are not using m4, instead a bundleset takes care of parsing. If
    format of individual bundle files change. The upside is that we can
    give nicer error messages, specially for the circular case.

  • Read the configuration file directly (with go-ini) to peek at values
    that Mixer didn't read before. Done that to avoid conflicting with
    existing patch in-flight that parses configuration.

  • Some individual steps were reordered for code clarity. E.g.: since
    we have bundleset, we can upfront generate all the *-include files.

  • Fixed the output for versions file. Due to the way yum list output
    works, parsing it is not very friendly. Comments around the code
    tells the story.

  • Removed the network testing step. It wasn't covering every case in
    the Python version, so I'm leaning to let the failure come from
    yum/dnf itself. I'm usually in favor of such early tests, but in
    this case the price of parsing yet another config file didn't felt
    worth.

  • Removed the "yum clean all" step from the bootstrap. There isn't any
    cache at that point, and the next yum call will bootstrap the
    necessary files for yum to operate.

  • Removed generation of files-* files (and the pkgmap-* files used to
    generate them). I couldn't find any tool or team making use of this
    information. Those (or their content) might be relevant in future
    changes to use a single chroot, but we should add when we need them.

  • Added more detailed commentary to individual steps, collecting
    information from the developers of bcb and related software.

  • The port still don't parallelize the work into multiple
    goroutines. I plan to do this in a similar way than what was done in
    CreateFullfiles, but in a separated patch.

Fixes #42.

Signed-off-by: Caio Marcelo de Oliveira Filho caio.oliveira@intel.com

@cmarcelo
Copy link
Contributor Author

I've tested this by generating the chroots with a few bundles with --new-chroots and without, and comparing the md5sum of all the files. The only differences (all expected) were the versionstamp, versions file (this port fixes its output), and the presence of files-* files (not part of the port).

I haven't tested it in the "build clear linux itself" mode yet.

@cmarcelo
Copy link
Contributor Author

@phmccarty @tmarcu could you take a look?

@cmarcelo
Copy link
Contributor Author

@kevincwells could you take a look of how bundleset is being used? It let us pass files from different directories, so I think will play nice with the upcoming work you are doing. Instead of doing the ReadDir in the bundles directory, we'll call your functions to decide whether bundle.

Note that I've seen GetIncludedBundles, but my parsing was doing more than that (taking the packages), so I've opted out of it. I haven't looked how bundleset could benefit the code of AddBundles yet.

@kevincwells
Copy link
Contributor

@cmarcelo I finished up my bundle refactor and pushed it today (#107). It's designed to go as-is, generating the mix-bundles list on-the-fly and assuming this BCB rewrite doesn't exist.

I haven't had the time to look at your implementation yet, but I will tomorrow. I'm betting it'll be relatively easy to merge our two changes, so the mix-bundles dir is removed completely.


}

func parseBundle(contents []byte) (includes []string, packages []string, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very close to what I want for a mixer bundle lint or mixer bundle check command, as per #53. I would love to reuse this there.

The only addition I want for #53 is checks that the # [TITLE]: <title> and # [DESCRIPTION]: <description> fields are not empty. Maybe others too, but those for sure. I know your code doesn't care about those fields, though.

Copy link
Contributor

@tmarcu tmarcu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will run this on the larger server for a bigger test.

@@ -42,6 +42,10 @@ const Version = "3.2.1"
// swupd-server (package swupd) when possible. This is an experimental feature.
var UseNewSwupdServer = false

// UseNewChroots controls whether to use the new implementation of
// building the chroots. This is an experimental feature.
var UseNewChroots = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/UseNewChroots/UseNewChrootBuilder/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

func (b *Builder) buildBundleChroots(set *bundleSet) error {
var err error
stateDir := b.Statedir
chrootDir := filepath.Join(stateDir, "image")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to have b.ChrootDir and whatever else is needed setup from the beginning wouldn't it? Out of scope for this PR but reading through made me think of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

// bootstraping or cleaning up.
outputDir := filepath.Join(stateDir, "www")

if stateDir == "" || stateDir == "/" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is "/" disallowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just some paranoid version of me wanting to prevent some invalid input. Removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of sanity check I would want configuration reading to deal with, as marked in another TODO.

if err != nil {
return err
}
// TODO: Move this outside this function. chroot building doesn't use LAST_VER.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move it out now and call it only after building finishes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved outside, but added a TODO to consider moving it to the bundle update logic.

Not doing that other move now to contain the amount of change (don't want to touch "build update" in this patch).

version = b.Clearver
}
// TODO: This validation should happen when reading the configuration.
if version == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add this in the else case above and remove from here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


fmt.Printf("Preparing new %s\n", chrootVersionDir)
// TODO: Fix this output when we take bundles from multiple places.
fmt.Printf(" based on bundles from: %s\n", b.Bundledir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should read from .mixer/mixbundles right @kevincwells ? (once the refactor PR is in)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this print and added a print to the place we read the bundles directory. This will make the fix suggested in TODO more straightforward later.

I think the idea when Kevin changes get in is that this would not output the transient bundles directory, but some information about the original location of the bundles ("using these and these local bundles, and these and these remove bundles").

return err
}
for name, bundle := range set.Bundles {
// TODO: Should we embed this information in groups.ini? (Maybe rename it to bundles.ini)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the groups.ini should be removed in the future and replaced with a bundles toml similar, which hosts all the information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.


fmt.Println("Creating chroots for bundles")
// TODO: Use goroutines.
for _, bundle := range set.Bundles {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can likely just throw each bundle install into a goroutine without any complicated waitgroup limiting, there's not much processing here just IO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rather do the parallel thing in a separate patch on top if it is OK.

I suspect that even if the processes will be I/O bound, we might get some slowness from a lot of different I/O competing (may reduce the locality). Maybe doesn't make a difference, but think it is worth testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, let's look at it sooner than later then.

_ = f.Close()
}()

// TODO: If this is a mix, NAME and ID should probably change too. Create a section in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good opportunity to update the os-release file to be more descriptive if it's on a mix, updating the name & adding a UpstreamVersion field perhaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but think we can do that in a separate change. Created #113 and assigned to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok +1


// run executes args storing its output in memory. If the command succeeds returns that output, if
// it fails, return err that contains both the out and err streams from the execution.
func run(args ...string) (*bytes.Buffer, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in helpers.go where RunCommand and RunCommandSilent are? Or update one/both of them to do what you need here? Don't want 3 functions doing essentially the same thing in different files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely be de-duplicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Make total sense, they just weren't in the codebase when the PR was made. :)

Created RunCommandOutput, which does what the run here does, and implemented RunCommandSilent using that. Added two tests for RunCommandOutput checking the output and error message have what we expect.

Copy link
Contributor

@matthewrsj matthewrsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason builder/chroots.go isn't loading for me right now. Submitting this review now so I don't lose it.

These are all optional suggestions.

chrootcmd.Stderr = os.Stderr
err := chrootcmd.Run()
if err != nil {
helpers.PrintError(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we let the caller of BuildChroots print the error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Name string
Filename string

DirectIncludes []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just Includes since you don't have an AllIncludes here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about someone reading code that says "bundle.Includes", and there could be ambiguity if this were all the includes or just the direct ones. Since this notion of Direct/All already appears for package I just reused.

// Error cases.
{Contents: []byte(`include(`), ExpectedError: true},
{Contents: []byte(`()`), ExpectedError: true},
{Contents: []byte(`incl` + `ud(`), ExpectedError: true}, // Hide the misspelling from linter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just say Include instead of trying to fool the linter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


for _, tt := range tests {
includes, packages, err := parseBundle(tt.Contents)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do

if (err != nil) != tt.ExpectedError {
        if tt.ExpectedError {
                t.Errorf("unexpected success when parsing bundle...", ...)
        } else {
                t.Errorf("unexpected error when parsing bundle...", ...)
        }
}

it just seems a little easier to read to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Renamed ExpectedError to ShouldFail to make it more evident it is a boolean and added a local variable failed just to make the if condition more readable.

t.Errorf("unexpected success when parsing bundle\nCONTENTS:\n%s\nPARSED INCLUDES: %s\nPARSED PACKAGES:\n%s", tt.Contents, includes, packages)
continue
}
sameIncludes := reflect.DeepEqual(includes, tt.ExpectedIncludes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplify logic and just do

if !reflect.DeepEqual(includes, tt.ExpectedIncludes) {
        t.Errorf("expected includes: %v\nresult includes: %v", tt.ExpectedIncludes, includes)
}

And then do the same for packages. To me these seem like two distinct errors, and don't need to be combined like you do below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

FilesMap{
"a": Lines("A A A A"),
"b": Lines("include(a) A A A A"),
"c": Lines("include(b) A A A A include(a)"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we calling this valid? I guess I don't see why not...

include(b)
A
A
A
A
include(a)

Copy link
Contributor

@tmarcu tmarcu Jan 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Includes could be throw in anywhere, we just like to keep them organized(sorted).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make it invalid later, but at the moment, we play by m4 rules (the program that parses the bundles).


var set *bundleSet
set, err = newBundleSet(bundleFiles)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simplify this as I mentioned in an above comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, also renaming to shouldFail.


func (b *Builder) buildBundleChroots(set *bundleSet) error {
var err error
stateDir := b.Statedir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you create a new variable here? I don't think it is very expensive to just access the struct member every time you want to use it and stateDir only saves you two characters over b.Statedir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
}

// TODO: To not touch config code that is in flux at the moment, reparsing it here to grab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/To/Do/ ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tone.


// Write INI files. These are used to communicate to the next step of mixing (build update).
var serverINI bytes.Buffer
fmt.Fprintf(&serverINI, `[Server]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good use case for a template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but since we are exploring how to use information directly from builder.conf, I have the impression it will go away, so won't take the trouble. If it does stay, we templatize it.

Created #114 to keep track of server.ini's fate.


// run executes args storing its output in memory. If the command succeeds returns that output, if
// it fails, return err that contains both the out and err streams from the execution.
func run(args ...string) (*bytes.Buffer, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely be de-duplicated.

"-y",
"--releasever=" + b.Clearver,
}
// When in Fedora, call dnf instead of yum.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was in the original version, but we should consider just deprecating yum. Clear Linux is headed in that direction already and dnf is available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better question for @bryteise

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Myself and others have been using dnf without problem for a while AFAIK, let's poke the list and see what people think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 @tmarcu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, but for the sake of this PR, I'll stick to original behavior on this. We later can test and nuke yum usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recorded in issue #115.

return err
}
// TODO: This seems to be the only thing that makes two consecutive chroots of the same
// version to be different. Use SOURCE_DATE_EPOCH if available?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SOURCE_DATE_EPOCH would be a good change as it is used in Clear Linux package builds. This was also discussed as a change for swupd-server.

}

fmt.Printf("Creating %s chroot\n", bundle.Name)
_, err = run("cp", "-a", "--preserve=all", osCoreDir, filepath.Join(chrootVersionDir, bundle.Name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a way to do this without shelling out to cp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick research shows you would have to implement your own recursive copy... never mind.

This rewrites bundle-chroot-builder.py in Go to be part of Mixer
code. Mixer is the only user of that software, and both Mixer and
bundle-chroot-builder.py the same configuration file, with overlapping
fields.

Main differences from bundle-chroot-builder.py:

- New bundleset type was added, that cares about collecting as much
  information as possible from the bundles themselves. This type and
  related functions also sets us up for success when upcoming changes
  to how bundles are specified happen. There is no assumption all
  bundles are in the same directory.

- We are not using m4, instead a bundleset takes care of parsing. If
  format of individual bundle files change. The upside is that we can
  give nicer error messages, specially for the circular case.

- Read the configuration file directly (with go-ini) to peek at values
  that Mixer didn't read before. Done that to avoid conflicting with
  existing patch in-flight that parses configuration.

- Some individual steps were reordered for code clarity. E.g.: since
  we have bundleset, we can upfront generate all the *-include files.

- Fixed the output for versions file. Due to the way yum list output
  works, parsing it is not very friendly. Comments around the code
  tells the story.

- Removed the network testing step. It wasn't covering every case in
  the Python version, so I'm leaning to let the failure come from
  yum/dnf itself. I'm usually in favor of such early tests, but in
  this case the price of parsing yet another config file didn't felt
  worth.

- Removed the "yum clean all" step from the bootstrap. There isn't any
  cache at that point, and the next yum call will bootstrap the
  necessary files for yum to operate.

- Removed generation of files-* files (and the pkgmap-* files used to
  generate them). I couldn't find any tool or team making use of this
  information. Those (or their content) might be relevant in future
  changes to use a single chroot, but we should add when we need them.

- Added more detailed commentary to individual steps, collecting
  information from the developers of bcb and related software.

- The port still don't parallelize the work into multiple
  goroutines. I plan to do this in a similar way than what was done in
  CreateFullfiles, but in a separated patch.

Fixes clearlinux#42.

Signed-off-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
@cmarcelo
Copy link
Contributor Author

@tmarcu @matthewrsj new version is up.

Will run this on the larger server for a bigger test.

So, did your tests work?

@tmarcu
Copy link
Contributor

tmarcu commented Jan 26, 2018

@cmarcelo the boot prompt in my booted VM says it worked 👍

@tmarcu tmarcu merged commit ee63b72 into clearlinux:master Jan 26, 2018
@cmarcelo cmarcelo deleted the new-build-chroots branch January 30, 2018 23:47
@kevincwells kevincwells mentioned this pull request Feb 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants