Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed' #3619

Closed
abdelmonem17 opened this issue Mar 18, 2021 · 16 comments · Fixed by #4044
Closed
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-time Module: tokio/time

Comments

@abdelmonem17
Copy link

hello everyone I am using project called toshi from crate.io and it uses tokio which works well under windows 10 but when I compiled and run it under VM 'Ubuntu 20.04.2 LTS' always after few seconds the program stop responses and gives me a panic like the following

thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /home/my-user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.2.0/src/time/driver/mod.rs:260:9

this error also exist in the old version that uses old version of tokio and the output of the panic is:

thread 'tokio-runtime-worker' panicked at 'elapsed=71666; when=71665', /home/my-user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.25/src/time/wheel/mod.rs:223:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

@abdelmonem17 abdelmonem17 added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Mar 18, 2021
@Darksonn Darksonn added the M-time Module: tokio/time label Mar 18, 2021
@carllerche
Copy link
Member

Hmm, it looks like this is due to Instant::now() returning past values, which should not be possible. Looking at the code, I don't see any other possible reason short of memory corruption which seems unlikely. Can you see if you can get your VM to produce an Instant::now() that decrements? It may require involving multiple threads.

@taiki-e
Copy link
Member

taiki-e commented Mar 19, 2021

I guess this is the same issue as rust-lang/rust#60475.

  • On Windows, timer is not monotonic, so std forces Instant::now to be monotonic.
  • On Linux, timer is monotonic, so std does not force Instant::now to be monotonic.
  • VM is probably implemented to use the host OS's timer. So Linux's timer on the VM is the same as the Windows's timer, and it is not monotonic. (Or it's not monotonic due to a bug.) But since the OS is Linux, std does not force Instant::now to be monotonic. As a result, Instant::now may returns past values.

@carllerche
Copy link
Member

Yeah, seems likely. I guess we should just accept Instant::now() can be buggy and defend against it.

@abdelmonem17
Copy link
Author

@carllerche , @taiki-e is there a workaround to fix this issue ?

@Darksonn
Copy link
Contributor

You could try to remove the assert that panicked, but it might fail in other less obvious ways. The code wasn't written to behave properly when time goes backwards.

@abdelmonem17
Copy link
Author

@Darksonn I download the crate and remove the line but I have a panic because it fails that the forked library is different runtime than tokio.

@Darksonn
Copy link
Contributor

You need to use [patch.crates-io] in order for the library to recognize your fork as the right version of Tokio.

@abdelmonem17
Copy link
Author

@Darksonn thank you for your help.

@taiki-e
Copy link
Member

taiki-e commented May 20, 2021

related rust-lang/rust#84448

@jgrund
Copy link

jgrund commented May 21, 2021

FWIW, I'm seeing this happen across multiple tokio instances quite frequently.

My host is CentOS 7.9.2009 and my guests are CentOS 7.9.2009 running in VirtualBox.

My current_clocksource on both the host and the guests are tsc.

[root@node1 ~]# clush -ab cat /var/log/messages | grep panic
May 22 11:31:12 node3 emf-device-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 10:01:57 node4 emf-snapshot-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 00:55:28 node5 emf-ostpool-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 07:31:33 node5 emf-ntp-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 05:31:12 node6 emf-host-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 05:52:47 node6 emf-network-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 06:19:52 node6 emf-device-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9
May 22 05:15:36 node8 emf-corosync-agent: thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.0/src/time/driver/mod.rs:267:9

@carllerche
Copy link
Member

carllerche commented May 25, 2021

@jgrund do you want to submit a PR that avoids the panic and instead ignores Instant::now() values that are in the past?

Feel free to ping someone in #tokio-dev on Discord (https://discord.gg/tokio) if you need guidance.

@jgrund
Copy link

jgrund commented Jun 15, 2021

@carllerche It turned out this was actually useful in my case. It led me to update some kernel parameters in my vm which stopped the clock from warping (which was effecting other processes as well).

@carllerche
Copy link
Member

@jgrund Do you want to submit a PR that improves the assertion message? Maybe explain the problem + include links (e.g. rust-lang/rust#84448) and help on how to fix it?

@NeoLegends
Copy link
Contributor

Is this something where the std should be patched to also defend against non-monotonic clocks on Linux?

@Darksonn
Copy link
Contributor

Darksonn commented Sep 2, 2021

It is a bug in std that this can happen, yes.

@the8472
Copy link

the8472 commented Oct 30, 2021

It is a bug in std that this can happen, yes.

Well technically it is a kernel or hypervisor bug and should be reported upstream to whatever vendor is causing this. std relies on CLOCK_MONOTONIC guarantees from the OS and the OS may either perform the monotonization itself or rely on the hypervisor to signal clock capabilities correctly.

@carllerche It turned out this was actually useful in my case. It led me to update some kernel parameters in my vm which stopped the clock from warping (which was effecting other processes as well).

Yeah, picking an unreliable clock source manually could also cause this. By default the kernel should pick a reliable one. If it isn't that should be reported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-time Module: tokio/time
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants