Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(checks): require active workflow to proceed with osie/hook #220

Closed

Conversation

olidacombe
Copy link

Description

Require active workflow to proceed with serving osie/hook.

Why is this needed

Seems this was intended when #35 led to a merge. It has since fallen out of the codebase 🤷?

Also from this walkthrough

How Has This Been Tested?

I tested this manually. When no workflow was associated:

boots_1                  | {"level":"info","ts":1635256325.1207874,"caller":"job/job.go:162","msg":"fetching workflows for hardware","service":"github.com/tinkerbell/boots","pkg":"http","hardwareID":"ce7a95d7-5ffb-1fad-eb49-2726e9fb6202"}
boots_1                  | {"level":"info","ts":1635256325.1227338,"caller":"boots/tftp.go:83","msg":"","service":"github.com/tinkerbell/boots","pkg":"main","client":"192.168.200.130","event":"open","filename":"ipxe.efi","error":"retrieved job is empty: error while fetching the workflow: rpc error: code = Internal desc = grpc: error while marshaling: proto: Marshal called with nil","errorVerbose":"rpc error: code = Internal desc = grpc: error while marshaling: proto: Marshal called with nil\nerror while fetching the workflow\ngithub.com/tinkerbell/boots/packet.(*client).GetWorkflowsFromTink\n\t/Users/odacombe/exp/tinkerbell/boots/packet/endpoints.go:54\ngithub.com/tinkerbell/boots/job.HasActiveWorkflow\n\t/Users/odacombe/exp/tinkerbell/boots/job/job.go:77\ngithub.com/tinkerbell/boots/job.CreateFromIP\n\t/Users/odacombe/exp/tinkerbell/boots/job/job.go:163\nmain.tftpHandler.ReadFile\n\t/Users/odacombe/exp/tinkerbell/boots/cmd/boots/tftp.go:81\ngithub.com/tinkerbell/tftp-go.(*session).serveRRQ\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:228\ngithub.com/tinkerbell/tftp-go.(*session).serve\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:158\ngithub.com/tinkerbell/tftp-go.serve\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:95\ngithub.com/tinkerbell/tftp-go.Serve.func1\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/server.go:164\nruntime.goexit\n\t/usr/local/Cellar/go/1.17/libexec/src/runtime/asm_amd64.s:1581\nretrieved job is empty"}
boots_1                  | {"level":"info","ts":1635256325.1228704,"caller":"boots/tftp.go:157","msg":"","service":"github.com/tinkerbell/boots","pkg":"main","client":"192.168.200.130","event":"open","filename":"ipxe.efi","error":"access_violation: permission denied","errorVerbose":"permission denied\naccess_violation\nmain.serveFakeReader\n\t/Users/odacombe/exp/tinkerbell/boots/cmd/boots/tftp.go:157\nmain.tftpHandler.ReadFile\n\t/Users/odacombe/exp/tinkerbell/boots/cmd/boots/tftp.go:87\ngithub.com/tinkerbell/tftp-go.(*session).serveRRQ\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:228\ngithub.com/tinkerbell/tftp-go.(*session).serve\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:158\ngithub.com/tinkerbell/tftp-go.serve\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/handler.go:95\ngithub.com/tinkerbell/tftp-go.Serve.func1\n\t/Users/odacombe/go/pkg/mod/github.com/tinkerbell/tftp-go@v0.0.0-20200825172122-d9200358b6cd/server.go:164\nruntime.goexit\n\t/usr/local/Cellar/go/1.17/libexec/src/runtime/asm_amd64.s:1581"}

And when I associated a workflow to the same hardware:

boots_1                  | {"level":"info","ts":1635256594.6693184,"caller":"job/job.go:162","msg":"fetching workflows for hardware","service":"github.com/tinkerbell/boots","pkg":"http","hardwareID":"ce7a95d7-5ffb-1fad-eb49-2726e9fb6202"}
boots_1                  | {"level":"info","ts":1635256594.6725316,"caller":"job/job.go:83","msg":"found active workflow for hardware","service":"github.com/tinkerbell/boots","pkg":"http","workflowID":"0f203e84-3664-11ec-9369-0242ac1a0005"}

How are existing users impacted? What migration steps/scripts do we need?

Anyone expecting the existing behaviour would be impacted, and would need to create workflows before boot in order to enjoy an osie / hook boot.

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

@mmlb
Copy link
Contributor

mmlb commented Oct 26, 2021

Hi @olidacombe thanks for opening this up. This is actually intended behavior as we want the machine booted and connected to tink server waiting for a workflow to appear. This way we don't have to pay the machine boot time penalty as part of the workflow execution. It would probably be best to call this out in the docs to avoid confusion.

@mmlb
Copy link
Contributor

mmlb commented Oct 26, 2021

Going to close this off for ^ reason. Would you be interested in a PR to github.com/tinkerbell/docs?

@mmlb mmlb closed this Oct 26, 2021
@olidacombe
Copy link
Author

Hey @mmlb thanks for clearing that up. We do have a use-case for this, whereby we always try to boot from network first, and only continue to a workflow if there is one pending. I get that pbnj or similar could trigger a once-only pxe boot when necessary, but I'm not sure how ready for public consumption that project is.

Would you consider offering this behaviour behind an option or environment variable?

@mmlb
Copy link
Contributor

mmlb commented Oct 26, 2021

Gotcha, yep whats one more env var ;). I'd definitely love for you to give a pbnj a try though too. Its not used much within tinkerbell oss stuff but very much internally at Equinix, so production ready but likely still has some rough edges from single user. Getting rid of nic from the default boot order is also nice to speed up reboots and such.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants