Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull in latest snapd trunk and add additional optional parameter as nil to gadgetLayoutVolume #136

Merged
merged 5 commits into from Sep 18, 2023

Conversation

sil2100
Copy link
Collaborator

@sil2100 sil2100 commented Aug 2, 2023

This should fix snapd integration test suite, which got broken after snapd trunk changed the API of gadget.LayoutVolume. It's a bit of a shame that because of such API changes we need to rebuild with snapd trunk.

Copy link
Member

@alfonsosanchezbeato alfonsosanchezbeato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@sergiocazzolato sergiocazzolato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@sil2100
Copy link
Collaborator Author

sil2100 commented Aug 2, 2023

Ok, the test failures now seem to be pointing at a real issue - possibly testcase needs to be readjusted. I think something changed in snapd.

@sil2100
Copy link
Collaborator Author

sil2100 commented Aug 21, 2023

I'm a bit worried about the gob tests that are failing right now. Apparently the latest changes cause issues with state machine resume...

@zhsj
Copy link
Contributor

zhsj commented Aug 29, 2023

The gob tests fail because type kcmdline.ArgumentPattern has no exported fields.

This is introduced in snapcore/snapd#12976

@mvo5
Copy link
Contributor

mvo5 commented Sep 5, 2023

Are there some more details on the "gob" test failure? We have a fix in snapcore/snapd#13144 but it would be nice to understand what exactly fails on your side so that we can add a test in snapd to prevent this in the future.

@zhsj
Copy link
Contributor

zhsj commented Sep 5, 2023

Are there some more details on the "gob" test failure? We have a fix in snapcore/snapd#13144 but it would be nice to understand what exactly fails on your side so that we can add a test in snapd to prevent this in the future.

The CI shows

2023-08-18T13:41:48.3535762Z goroutine 3728 [running]:
2023-08-18T13:41:48.3535858Z runtime/debug.Stack()
2023-08-18T13:41:48.3536022Z    /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/debug/stack.go:24 +0x72
2023-08-18T13:41:48.3536127Z runtime/debug.PrintStack()
2023-08-18T13:41:48.3536360Z    /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/debug/stack.go:16 +0x25
2023-08-18T13:41:48.3536768Z github.com/canonical/ubuntu-image/internal/helper.(*Asserter).AssertErrNil(0xc00001de00, {0x1665620, 0xc00023f680}, 0x0)
2023-08-18T13:41:48.3537050Z    /home/runner/work/ubuntu-image/ubuntu-image/internal/helper/asserter.go:19 +0x4e
2023-08-18T13:41:48.3537460Z github.com/canonical/ubuntu-image/internal/statemachine.TestUntilThru.func1(0xc0005f8000)
2023-08-18T13:41:48.3537780Z    /home/runner/work/ubuntu-image/ubuntu-image/internal/statemachine/state_machine_test.go:321 +0x92
2023-08-18T13:41:48.3537905Z testing.tRunner(0xc0005f8000, 0xc005fc9020)
2023-08-18T13:41:48.3538071Z    /opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1576 +0x217
2023-08-18T13:41:48.3538173Z created by testing.(*T).Run
2023-08-18T13:41:48.3538337Z    /opt/hostedtoolcache/go/1.20.7/x64/src/testing/testing.go:1629 +0x806
2023-08-18T13:41:48.3538540Z     asserter.go:23: Did not expect an error, got failed to parse metadata file: unexpected EOF

The error happens in readMetadata,

func (stateMachine *StateMachine) readMetadata() error {
// handle the resume case
if stateMachine.stateMachineFlags.Resume {
// open the ubuntu-image.gob file and determine the state
var partialStateMachine = new(StateMachine)
gobfilePath := filepath.Join(stateMachine.stateMachineFlags.WorkDir, "ubuntu-image.gob")
gobfile, err := os.Open(gobfilePath)
if err != nil {
return fmt.Errorf("error reading metadata file: %s", err.Error())
}
defer gobfile.Close()
dec := gob.NewDecoder(gobfile)
err = dec.Decode(&partialStateMachine)
if err != nil {
return fmt.Errorf("failed to parse metadata file: %s", err.Error())
}

Actually the error should happen early, when encode the data. But the error is ignored, so I send #138, then it will tells that writeMetadata also fails.

func (stateMachine *StateMachine) writeMetadata() error {
gobfilePath := filepath.Join(stateMachine.stateMachineFlags.WorkDir, "ubuntu-image.gob")
gobfile, err := os.OpenFile(gobfilePath, os.O_CREATE|os.O_WRONLY, 0644)
if err != nil && !os.IsExist(err) {
return fmt.Errorf("error opening metadata file for writing: %s", gobfilePath)
}
defer gobfile.Close()
enc := gob.NewEncoder(gobfile)
// no need to check errors, as it will panic if there is one
enc.Encode(stateMachine)
return nil

ubuntu-image use gadget.Info in its StateMachine struct.

type StateMachine struct {
cleanWorkDir bool // whether or not to clean up the workDir
CurrentStep string // tracks the current progress of the state machine
StepsTaken int // counts the number of steps taken
YamlFilePath string // the location for the yaml file
IsSeeded bool // core 20 images are seeded
rootfsVolName string // volume on which the rootfs is located
rootfsPartNum int // rootfs partition number
SectorSize quantity.Size // parsed (converted) sector size
RootfsSize quantity.Size
tempDirs temporaryDirectories
// The flags that were passed in on the command line
commonFlags *commands.CommonOpts
stateMachineFlags *commands.StateMachineOpts
states []stateFunc // the state functions
// used to access image type specific variables from state functions
parent SmInterface
// imported from snapd, the info parsed from gadget.yaml
GadgetInfo *gadget.Info
// image sizes for parsing the --image-size flags
ImageSizes map[string]quantity.Size
VolumeOrder []string
// names of images for each volume
VolumeNames map[string]string
}

And it wants to use encoding/gob to serialize that struct. So if snapd wants to add a test, then it could try to use encoding/gob on the gadget.Info struct.

@valentindavid
Copy link
Contributor

Is there a way we could make the code fail to compile if it is not serializable rather than getting a nil? It feels like we would need to write test with lots of different input so that we test all possible types. And that is not nice.

@valentindavid
Copy link
Contributor

valentindavid commented Sep 5, 2023

Maybe we should implement GobEncoder and GobDecoder on gadget.Info to use some serialization that is a bit more strict on the types.

@sil2100
Copy link
Collaborator Author

sil2100 commented Sep 6, 2023

hm, sadly I'm seeing issues in our CI with the latest snapd version.

runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc0245fc718 stack=[0xc0245fc000, 0xc0445fc000]
fatal error: stack overflow

It didn't happen with the previous version, and it's not even in the pause/resume tests but instead in the UC20 tests. I can reproduce it locally as well. I'm trying to investigate, but would appreciate some help.

@sil2100
Copy link
Collaborator Author

sil2100 commented Sep 18, 2023

Merging this, as it seems to make everything and everyone happy.

@sil2100 sil2100 merged commit c1fcb3d into main Sep 18, 2023
10 checks passed
@sil2100 sil2100 deleted the snapd-breakage-refresh branch September 18, 2023 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants