test failure: bulk_sp_get_one_sp_powered_off #951

@davepacheco

Found in https://github.com/oxidecomputer/omicron/runs/6062888362?check_suite_focus=true#step:12:778 under #942:

     Running `/Users/runner/work/omicron/omicron/target/debug/deps/test_all-f0bf36f3c61b9c15`

running 6 tests
test integration_tests::bulk_state_get::bulk_sp_get_one_sp_powered_off ... FAILED
test integration_tests::bulk_state_get::bulk_sp_get_all_online ... ok
test integration_tests::commands::test_gateway_openapi_sled ... ok
test integration_tests::bulk_state_get::bulk_sp_get_one_sp_unresponsive ... ok
test integration_tests::serial_console::serial_console_detach ... ok
test integration_tests::serial_console::serial_console_communication ... ok

failures:

---- integration_tests::bulk_state_get::bulk_sp_get_one_sp_powered_off stdout ----
log file: "/tmp/omicron_tmp/test_all-f0bf36f3c61b9c15-bulk_sp_get_all_online.5369.0.log"
note: configured to log to "/tmp/omicron_tmp/test_all-f0bf36f3c61b9c15-bulk_sp_get_all_online.5369.0.log"
thread 'integration_tests::bulk_state_get::bulk_sp_get_one_sp_powered_off' panicked at 'assertion failed: `(left == right)`
  left: `204`,
 right: `500`', /Users/runner/.cargo/git/checkouts/dropshot-a4a923d29dccc492/da09c39/dropshot/src/test_util.rs:220:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    integration_tests::bulk_state_get::bulk_sp_get_one_sp_powered_off

test result: FAILED. 5 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.00s

The outputs were successfully uploaded:
https://github.com/oxidecomputer/omicron/runs/6062888362?check_suite_focus=true#step:13:28

Here's the end of the output file:

dap@zathras tmp $ bunyan test_all-f0bf36f3c61b9c15-bulk_sp_get_all_online.5369.0.log
[2022-04-18T13:39:07.482916Z]  INFO: bulk_sp_get_all_online/5369 on Mac-1650287278361.local: setting up simualted sidecar (slot="sidecar 0")
[2022-04-18T13:39:07.483807Z]  INFO: bulk_sp_get_all_online/5369 on Mac-1650287278361.local: setting up simualted gimlet (slot="gimlet 0")
[2022-04-18T13:39:07.486855Z]  INFO: bulk_sp_get_all_online/5369 on Mac-1650287278361.local: setting up simualted gimlet (slot="gimlet 1")
[2022-04-18T13:39:07.489108Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/5369 on Mac-1650287278361.local: setting up gateway server
[2022-04-18T13:39:07.490474Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/5369 on Mac-1650287278361.local: successfully registered DTrace USDT probes
[2022-04-18T13:39:07.491667Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/SpCommunicator/5369 on Mac-1650287278361.local: started SP communicator
[2022-04-18T13:39:07.493156Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/ignition, method=GET)
[2022-04-18T13:39:07.495031Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/ignition/{type}/{slot}, method=GET)
[2022-04-18T13:39:07.496276Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp, method=GET)
[2022-04-18T13:39:07.497352Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}, method=GET)
[2022-04-18T13:39:07.498426Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/component, method=GET)
[2022-04-18T13:39:07.499555Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/component/{component}, method=GET)
[2022-04-18T13:39:07.501284Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/component/{component}/power_off, method=POST)
[2022-04-18T13:39:07.503046Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/component/{component}/power_on, method=POST)
[2022-04-18T13:39:07.504299Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, method=GET)
    path: /sp/{type}/{slot}/component/{component}/serial_console/attach
[2022-04-18T13:39:07.505363Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, method=POST)
    path: /sp/{type}/{slot}/component/{component}/serial_console/detach
[2022-04-18T13:39:07.506144Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/component/{component}/update, method=POST)
[2022-04-18T13:39:07.506876Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/power_off, method=POST)
[2022-04-18T13:39:07.507623Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: registered endpoint (local_addr=127.0.0.1:49412, path=/sp/{type}/{slot}/power_on, method=POST)
[2022-04-18T13:39:07.508353Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: listening (local_addr=127.0.0.1:49412)
[2022-04-18T13:39:07.508965Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: successfully registered DTrace USDT probes (local_addr=127.0.0.1:49412)
[2022-04-18T13:39:07.509512Z]  INFO: bulk_sp_get_all_online/client test context/5369 on Mac-1650287278361.local: client request (body=Body(Empty), method=POST)
    uri: http://127.0.0.1:49412http://127.0.0.1:49412/sp/sled/0/power_off
[2022-04-18T13:39:07.510175Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: accepted connection (local_addr=127.0.0.1:49412, remote_addr=127.0.0.1:49413)
[2022-04-18T13:39:07.510777Z] TRACE: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: incoming request (req_id=4e11f62a-f387-4037-a572-f4fe0973268e, uri=http://127.0.0.1:49412/sp/sled/0/power_off, method=POST, remote_addr=127.0.0.1:49413, local_addr=127.0.0.1:49412)
[2022-04-18T13:39:07.996817Z] DEBUG: 2d6d80f8-5967-4973-9df4-c457e91618fa/SpCommunicator/5369 on Mac-1650287278361.local: sending Request { version: 1, request_id: 0, kind: IgnitionCommand { target: 1, command: PowerOff } } to SP SpSocket { socket: PollEvented { io: Some(UdpSocket { addr: 127.0.0.1:56453, fd: 36 }) }, addr: 127.0.0.1:65481, port: SwitchPort(0) }
[2022-04-18T13:39:07.997286Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: request completed (req_id=4e11f62a-f387-4037-a572-f4fe0973268e, uri=http://127.0.0.1:49412/sp/sled/0/power_off, method=POST, remote_addr=127.0.0.1:49413, local_addr=127.0.0.1:49412, error_message_external="Internal Server Error", error_message_internal="timeout elapsed", response_code=500)
[2022-04-18T13:39:07.998064Z] DEBUG: bulk_sp_get_all_online/5369 on Mac-1650287278361.local: received ignition command PowerOff for target 1; sending ack (slot="sidecar 0")
[2022-04-18T13:39:07.998467Z]  INFO: bulk_sp_get_all_online/client test context/5369 on Mac-1650287278361.local: client received response (status=500)

It looks like a timeout. If we grep for only the messages for this request-id:

[2022-04-18T13:39:07.510777Z] TRACE: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: incoming request (req_id=4e11f62a-f387-4037-a572-f4fe0973268e, uri=http://127.0.0.1:49412/sp/sled/0/power_off, method=POST, remote_addr=127.0.0.1:49413, local_addr=127.0.0.1:49412)
[2022-04-18T13:39:07.997286Z]  INFO: 2d6d80f8-5967-4973-9df4-c457e91618fa/dropshot/5369 on Mac-1650287278361.local: request completed (req_id=4e11f62a-f387-4037-a572-f4fe0973268e, uri=http://127.0.0.1:49412/sp/sled/0/power_off, method=POST, remote_addr=127.0.0.1:49413, local_addr=127.0.0.1:49412, error_message_external="Internal Server Error", error_message_internal="timeout elapsed", response_code=500)

It looks like the timeout was at most 487ms (the gap between the "incoming request" and "request completed" messages). It may be that this needs to be bumped up, at least for the test suite?
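For reference, the 487ms upper bound comes from subtracting the two timestamps in the request log lines. A minimal sketch of that arithmetic follows; `micros_since_midnight` and `elapsed_micros` are hypothetical helpers written for this illustration, not anything in omicron or dropshot, and they only handle the `HH:MM:SS.ffffff` time-of-day portion of the bunyan timestamps.

```rust
/// Parse the time-of-day part of a log timestamp ("HH:MM:SS.ffffff")
/// into microseconds since midnight. Hypothetical helper for illustration.
fn micros_since_midnight(time: &str) -> Option<u64> {
    let (hms, frac) = time.split_once('.')?;
    let mut parts = hms.split(':');
    let h: u64 = parts.next()?.parse().ok()?;
    let m: u64 = parts.next()?.parse().ok()?;
    let s: u64 = parts.next()?.parse().ok()?;
    // Require exactly three fields and a six-digit fractional part.
    if parts.next().is_some() || frac.len() != 6 {
        return None;
    }
    let us: u64 = frac.parse().ok()?;
    Some(((h * 60 + m) * 60 + s) * 1_000_000 + us)
}

/// Microseconds between two same-day timestamps, or None if either
/// fails to parse or `end` is earlier than `start`.
fn elapsed_micros(start: &str, end: &str) -> Option<u64> {
    micros_since_midnight(end)?.checked_sub(micros_since_midnight(start)?)
}

fn main() {
    // "incoming request" and "request completed" timestamps from the log.
    let elapsed = elapsed_micros("13:39:07.510777", "13:39:07.997286").unwrap();
    println!("request took {} us", elapsed);
}
```

That gives 486509 microseconds, which rounds to the 487ms quoted above. Since the request also had to be routed and the response written in that window, the actual timeout value is presumably somewhat smaller.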

It'd also be helpful if the internal message said what the timeout was for and how long it waited.
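As a rough sketch of what a more descriptive message could look like, assuming nothing about how the gateway actually represents this error: the `TimeoutError` type, its fields, and the 450 ms value below are all hypothetical and chosen only for illustration.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical error type that records what was being waited on and
/// for how long, instead of a bare "timeout elapsed".
#[derive(Debug)]
struct TimeoutError {
    /// Human-readable description of the operation that timed out.
    operation: String,
    /// How long the caller waited before giving up.
    waited: Duration,
}

impl fmt::Display for TimeoutError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "timed out after {} ms waiting for {}",
            self.waited.as_millis(),
            self.operation
        )
    }
}

impl std::error::Error for TimeoutError {}

fn main() {
    // The operation text and duration are made-up example values.
    let err = TimeoutError {
        operation: "ignition PowerOff ack from SP".to_string(),
        waited: Duration::from_millis(450),
    };
    println!("{}", err);
}
```

With something along these lines, `error_message_internal` would show both the operation and the duration, which would make it easier to tell from the log alone whether the limit is simply too short for CI.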

@jgallagher does it make sense for you to take a look at this one?
