Request: Propagate Firecracker Task Driver errors to Nomad UI #19

Closed
wimax-grapl opened this issue Apr 4, 2022 · 5 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@wimax-grapl

I have a task that fails to start with the following, not very useful, message:

rpc error: code = Unknown desc = task with ID "8ee3098b-7420-cb04-2892-fedaa3c730ba/tenant-plugin/339ec6bd" failed


However, the Nomad agent logs contain a much more intelligible error:

failure when invoking CNI: failed to load CNI configuration from dir "/etc/cni/conf.d" for network "default": no net configurations found in /etc/cni/conf.d

    2022-04-04T13:23:32.274-0400 [INFO]  client.driver_mgr.firecracker-task-driver: starting firecracker task: driver=firecracker-task-driver driver_cfg="{KernelImage: BootOptions: BootDisk: Disks:[] Network:default Nic:{Ip: Gateway: Interface: Nameservers:[]} Vcpus:1 Cputype: Mem:128 Firecracker:/usr/bin/firecracker Log: DisableHt:false}" @module=firecracker-task-driver timestamp=2022-04-04T13:23:32.274-0400
    2022-04-04T13:23:32.274-0400 [INFO]  client.driver_mgr.firecracker-task-driver: Starting firecracker: driver=firecracker-task-driver driver_initialize_container="&{/usr/bin/firecracker /tmp/NomadClient1700322499/3aee425c-e789-5c1c-e029-d552efbf942c/tenant-plugin/vmlinux  console=ttyS0 reboot=k panic=1 pci=off nomodules /tmp/NomadClient1700322499/3aee425c-e789-5c1c-e029-d552efbf942c/tenant-plugin/rootfs.ext4  [] default {   []} []    false 1  300    false false [] <nil> 0xc384c0}+" @module=firecracker-task-driver timestamp=2022-04-04T13:23:32.274-0400
    2022-04-04T13:23:32.275-0400 [INFO]  client.driver_mgr.firecracker-task-driver: Error starting firecracker vm: driver=firecracker-task-driver @module=firecracker-task-driver driver_cfg="Failed to start machine: failure when invoking CNI: failed to load CNI configuration from dir \"/etc/cni/conf.d\" for network \"default\": no net configurations found in /etc/cni/conf.d" timestamp=2022-04-04T13:23:32.275-0400
    2022-04-04T13:23:32.275-0400 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=3aee425c-e789-5c1c-e029-d552efbf942c task=tenant-plugin error="rpc error: code = Unknown desc = task with ID \"3aee425c-e789-5c1c-e029-d552efbf942c/tenant-plugin/0e1713e6\" failed"
    2022-04-04T13:23:32.275-0400 [INFO]  client.alloc_runner.task_runner: not restarting task: alloc_id=3aee425c-e789-5c1c-e029-d552efbf942c task=tenant-plugin reason="Error was unrecoverable"

I was wondering if it'd be possible to propagate that error up to the UI? Thanks!

@wimax-grapl
Author

(You'll note that the alloc_id is different: I accidentally captured a retry, but the same error shows up for 8ee3098b.)

@cneira cneira added bug Something isn't working help wanted Extra attention is needed good first issue Good for newcomers labels Apr 5, 2022
@wimax-grapl wimax-grapl changed the title Request: Propagate Firecracker errors to Nomad UI Request: Propagate Firecracker Task Driver errors to Nomad UI Apr 6, 2022
@ValentaTomas
Contributor

ValentaTomas commented May 4, 2022

I would also appreciate this. I'm making some custom changes to the task driver, and even just letting the errors propagate as they are was really helpful.

Do you think just propagating the error here:
https://github.com/cneira/firecracker-task-driver/blob/master/driver/driver.go#L297
https://github.com/cneira/firecracker-task-driver/blob/master/driver/driver.go#L258
would be alright?

@wimax-grapl
Author

wimax-grapl commented May 4, 2022

I'm not the author of this plugin, but based on how other officially supported Nomad drivers work, I think it'd be totally reasonable.

The vast majority of the StartTask return statements include the err in, say, the Docker driver:
https://github.com/hashicorp/nomad/blob/52faa167dd0e18685440de5b6613f397b5fa0aa8/drivers/docker/driver.go#L300

@ValentaTomas
Contributor

I made these changes in #21.

@ValentaTomas
Contributor

@cneira I think we can close this issue.

@cneira cneira closed this as completed May 31, 2022