Can't load replay schedule for max_steps failures #85

Open
chc4 opened this issue Nov 16, 2022 · 2 comments

Comments

@chc4

chc4 commented Nov 16, 2022

I have a shuttle test for a crate I wrote. It occasionally deadlocks, which shuttle reports as "exceeded max_steps bound". Shuttle then prints a (very large) failing schedule that I am supposed to pass to replay to reproduce the issue.

There are two problems:

  1. replay_from_file can't load the emitted schedule string; it always panics with "invalid schedule".
  2. Lowering the step bound via a custom Config.max_steps, so that the schedule is small enough to pass directly as an argument to replay, ends with shuttle failing with "expected context switch but next schedule step is random choice".

Unfortunately, this makes shuttle kind of useless for fixing this bug: I can't reproduce the reported deadlock, so I can't debug it under gdb or something similar to get a stack trace of the stuck thread.

@jamesbornholt
Member

jamesbornholt commented Nov 18, 2022

Yeah, I've had a similar experience with large failures being very unwieldy and failing to deserialize when replaying, but never had a chance to track it down.

Have you tried configuring the test to persist failures directly to a file, and then replaying from that? Something like this:

// Run up to 1000 randomly scheduled iterations of the test.
let scheduler = shuttle::scheduler::RandomScheduler::new(1000);
// Persist a failing schedule to a file rather than printing it.
let mut config = shuttle::Config::default();
config.failure_persistence = shuttle::FailurePersistence::File(None);
let runner = shuttle::Runner::new(scheduler, config);
runner.run(|| my_test());

That might at least behave a little better.


Your second problem sounds like a potential non-determinism issue, but without being able to reliably replay the original failure it's tricky to be sure.

@chc4
Author

chc4 commented Nov 18, 2022

I think you're right that the second issue was just non-determinism: at that commit I was accidentally using the real rand crate instead of shuttle::rand somewhere else in the code.
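
To illustrate why this matters for replay, here is a minimal sketch that uses only the standard library. A recorded schedule can only be replayed if every random choice the test makes comes from a source that is reproducible. `SeededRng` and `run_trace` are hypothetical names invented for this illustration and are not part of shuttle; the seeded generator merely stands in for scheduler-controlled randomness such as shuttle::rand.

```rust
/// Tiny deterministic PRNG (xorshift64). Hypothetical stand-in for a
/// scheduler-controlled randomness source; this is NOT shuttle's implementation.
struct SeededRng(u64);

impl SeededRng {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        SeededRng(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Records which of two hypothetical branches a "test" takes at each step.
fn run_trace(rng: &mut SeededRng, steps: usize) -> Vec<bool> {
    (0..steps).map(|_| rng.next_u64() % 2 == 0).collect()
}

fn main() {
    // Same seed, same sequence of choices: the second run replays the first.
    let original = run_trace(&mut SeededRng::new(42), 16);
    let replayed = run_trace(&mut SeededRng::new(42), 16);
    assert_eq!(original, replayed);
    println!("replay matched the original trace: {}", original == replayed);
}
```

A value drawn from a source outside this control (for example, the real rand crate's thread-local generator) is not reproduced on replay, so the replayed execution can take a different branch than the recorded schedule expects. That would plausibly explain a mismatch such as the one in the second problem.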
