Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] overflow when ttl is large #52

Closed
bddap opened this issue Dec 2, 2021 · 7 comments · Fixed by #56
Closed

[BUG] overflow when ttl is large #52

bddap opened this issue Dec 2, 2021 · 7 comments · Fixed by #56
Assignees
Labels
bug Something isn't working good first issue Good for newcomers
Milestone

Comments

@bddap
Copy link

bddap commented Dec 2, 2021

The following panics:

#[tokio::test]
async fn large_ttl() {
    let cache: moka::future::Cache<(), ()> = moka::future::CacheBuilder::new(usize::MAX)
        .time_to_live(Duration::MAX)
        .build();
    cache.insert((), ()).await;
    cache.sync();
}
panicked at 'overflow when adding duration to instant',
.cargo/registry/src/github.com-1ecc6299db9ec823/quanta-0.9.3/src/instant.rs:142:14
@tatsuya6502 tatsuya6502 self-assigned this Dec 2, 2021
@tatsuya6502 tatsuya6502 added the bug Something isn't working label Dec 2, 2021
@barkanido
Copy link
Contributor

Is this a good first issues for newcomers? I would like to try an contribute here

@tatsuya6502
Copy link
Member

tatsuya6502 commented Dec 12, 2021

@barkanido — Sorry for not getting back to you sooner. Yes. I believe this is a good first issue.

Here are my initial thoughts. If you have better idea, please let me know.

I actually think Moka should panic when a Duration is too big to handle. Moka should not support such a big duration because giving Duration::MAX to time_to_live might be a programming error as it is about 584 billion yeas (doc). I think supporting up to 1,000 years will be far longer than required.

However, I think there are some rooms to improve:

  • Release build: I have not tried by myself but if you run the above code with --release flag, Moka will not panic but silently overflow. (So the time will wrap from future to past)
    • I think it should panic in release build too. We could use checked_add instead of + to ensure that overflow will be always detected.
  • Better panic message: We could make the panic message more informative (e.g. telling panicing is an intended behavior) than the current message: "panicked at 'overflow when adding duration to instant'".
  • Document panic (optional): It might be good if we mention about panic in the API doc.

Thanks!

@barkanido
Copy link
Contributor

barkanido commented Dec 12, 2021

sounds reasonable to me.
I now have a passing test:

    #[tokio::test]
    #[should_panic(expected = "overflow when adding duration to instant")]
    async fn ttl_overflow() {
        // https://github.com/moka-rs/moka/issues/52
        let cache:Cache<(),i32> = CacheBuilder::new(usize::MAX)
            .time_to_live(Duration::MAX)
            .build();
        cache.insert((), 42).await;
        cache.sync();
    }

thoughts:

  • Do you think it is better to fail early in the build phase to totally prevent this? Or merely documenting this is a good enough user experience?
  • Is there a similar cache in the sync cache?

[edit]

I think that the problem also happens in both sync and unsync versions as well. however, testing this depends on a real clock and I think that since everything is so fast (less than a nanosecond) the addition does not overflow. Indeed, this test (for sync cache) also fails:

    #[test]
    fn ttl_overflow() {
        let cache = CacheBuilder::new(usize::MAX)
            .time_to_live(Duration::new(u64::MAX, 1_000_000_000 - 1))
            .build();
        cache.insert((), 42);
        sleep(Duration::from_millis(1000)); // funny changing this to 1 passes the test
        let value = cache.get(&());
        assert_eq!(value.unwrap(), 42);
    }

so maybe is it wiser to unify all additions to quanta::Instant checked somehow?

@bddap bddap closed this as completed Dec 12, 2021
@bddap bddap reopened this Dec 12, 2021
@tatsuya6502 tatsuya6502 assigned barkanido and unassigned tatsuya6502 Dec 14, 2021
@tatsuya6502 tatsuya6502 added the good first issue Good for newcomers label Dec 14, 2021
@tatsuya6502
Copy link
Member

tatsuya6502 commented Dec 14, 2021

@barkanido — Thanks for working on it.

Do you think it is better to fail early in the build phase to totally prevent this? Or merely documenting this is a good enough user experience?

Yeah. Probably it is better to fail in build phase. e.g. Make CacheBuilder's time_to_live and time_to_idle to panic when the given Duration is longer than 1,000 years.

Then maybe we do not have to use checked_add anymore? Hmm. I am not sure.

I am not sure because Instant is not a real-time clock (wall clock), but a monotonically increasing clock from some unspecified point in the past. I think, on many major platforms including Linux, Instants represent nanoseconds that the system has been running since it was booted. So they will not be very big and will never overflow when adding a Duration <= 1,000 years. But this is not guaranteed on all platforms that Rust might support. So we would better to use checked_add?


however, testing this depends on a real-time clock and I think that since everything is so fast (less than a nanosecond) the addition does not overflow.

Oh. Sorry I did not tell, but please use a mocked clock from Quanta for testing (doc). Moka has support for it. See time_to_live test here for details.


sleep(Duration::from_millis(1000)); // funny changing this to 1 passes the test

It is because the eviction/expiration process will run at 500 milliseconds after the cache creation. Then it will run every 300 milliseconds. (code)

For testing, you want to stop the repeating eviction/expiration process by calling reconfigure_for_testing. Then call sync() twice maybe once after inserting a cache entry. See the time_to_live test above for details.


so maybe is it wiser to unify all additions to quanta::Instant checked somehow?

I think so. You can create a function in moka::common module or somewhere else. The function will do checked_add and panic! when fail. Then search the source code for ts + *ttl and ts + *tti (I believe there are four of them), and change them to call the function.

@tatsuya6502
Copy link
Member

Then call sync() twice maybe once after inserting a cache entry.

It might help to read the Implementation Details section in the doc to understand how the expiration is handled.

Here is a summary:

  1. When an entry (key-value) is inserted to the hash map, a recording of a write will be pushed to the write-log inter-thread channel.
  2. When sync is called,
    1. the recording will be popped from the channel, and entries will be created for the key on the write-order queue and access-order queue.
    2. then, the entries in the write-order and access-order queues will be checked if they has been expired. The overflow can occur at the check.

@barkanido
Copy link
Contributor

Thanks, @tatsuya6502 .

I have implemented it type-driven by wrapping quanta::Instant in a wrapper type that only allows for checked additions. This both guarantees current implementations work on all platforms (as per your concerns) and also guards against future unchecked additions. currently, this forces the caller to handle the overflow issue (handling None), hence the panic code is implemented at the call site. A matter of style. Tell me if you want that to be changed.

About the max allowed value, Redis allows max int of seconds (which is 136 years) and Aerospike allows for a max of 10 years (just as a reference). So to me, 1000 years is a bit too much as a Durationbut it certainly fits in a Duration (only 35 bits):

    let year_secs: u64 = 365 * 24 * 3600;
    let d3 = Duration::from_secs(1000 * year_secs);

@tatsuya6502
Copy link
Member

tatsuya6502 commented Dec 16, 2021

About the max allowed value, Redis allows max int of seconds (which is 136 years) and Aerospike allows for a max of 10 years (just as a reference).

Thank you for the information. Good to know.

@tatsuya6502 tatsuya6502 added this to the v0.6.2 milestone Dec 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants