New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow EmbeddedChannel users to control the scheduler time #12433
Milestone
Comments
sounds good... Do you want to contribute something ? I think this is very similar that what SwiftNIO does: https://github.com/apple/swift-nio/blob/main/Sources/NIOEmbedded/Embedded.swift#L146 |
Yep, I'll try implementing it. The swift-nio thing indeed sounds similar |
normanmaurer
added a commit
that referenced
this issue
Jun 13, 2022
Motivation: Tests using EmbeddedEventLoop can run faster if they can "advance time" so that scheduled tasks (e.g. timeouts) run earlier. Additionally, "freeze time" functionality can improve reliability of such tests. Modification: - Introduce a protected method `AbstractScheduledEventExecutor.getCurrentTimeNanos` that replaces the previous static `nanoTime` method (now deprecated). Replace usages of `nanoTime` with the new method. - Override `getCurrentTimeNanos` with the new time control (freeze, unfreeze, advanceBy) features in `EmbeddedEventLoop`. - Add a microbenchmark that tests one of the sites that seemed most likely to see negative performance impact by the change (`ScheduledFutureTask.delayNanos`). Result: Fixes #12433. Local runs of the `ScheduleFutureTaskBenchmark` microbenchmark shows no evidence for performance impact (within error bounds of each other): ``` before: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 132.437 ± 15.116 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 694.475 ± 8.184 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 88.037 ± 4.013 ops/s after: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 149.629 ± 7.514 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 688.954 ± 7.831 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 85.426 ± 1.104 ops/s ``` The new `ScheduleFutureTaskDeadlineBenchmark` shows some performance degradation: ``` before: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 60726336.795 ± 280054.533 ops/s after: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 56948231.480 ± 188264.092 ops/s ``` The difference is small, but it's there, so I investigated this further using jitwatch. Looking at the generated assembly, the call to `getCurrentTimeNanos` is devirtualized and inlined in the absence of `EmbeddedEventLoop`, so the code is mostly identical. However there is the added getfield and checkcast for the executor, which probably explains the discrepancy. In my opinion this is acceptable, because the performance impact is not severe, this use is likely the worst case (virtual call through `scheduledExecutorService()`), and it is never as hot as in this benchmark. Note that if an `EmbeddedEventLoop` is present in the application, the performance impact is likely substantially higher, because this would necessitate a virtual call. However this is not an issue for production applications, and the affected code is still not very hot. Co-authored-by: Norman Maurer <norman_maurer@apple.com>
normanmaurer
added a commit
that referenced
this issue
Jun 13, 2022
Motivation: Tests using EmbeddedEventLoop can run faster if they can "advance time" so that scheduled tasks (e.g. timeouts) run earlier. Additionally, "freeze time" functionality can improve reliability of such tests. Modification: - Introduce a protected method `AbstractScheduledEventExecutor.getCurrentTimeNanos` that replaces the previous static `nanoTime` method (now deprecated). Replace usages of `nanoTime` with the new method. - Override `getCurrentTimeNanos` with the new time control (freeze, unfreeze, advanceBy) features in `EmbeddedEventLoop`. - Add a microbenchmark that tests one of the sites that seemed most likely to see negative performance impact by the change (`ScheduledFutureTask.delayNanos`). Result: Fixes #12433. Local runs of the `ScheduleFutureTaskBenchmark` microbenchmark shows no evidence for performance impact (within error bounds of each other): ``` before: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 132.437 ± 15.116 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 694.475 ± 8.184 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 88.037 ± 4.013 ops/s after: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 149.629 ± 7.514 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 688.954 ± 7.831 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 85.426 ± 1.104 ops/s ``` The new `ScheduleFutureTaskDeadlineBenchmark` shows some performance degradation: ``` before: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 60726336.795 ± 280054.533 ops/s after: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 56948231.480 ± 188264.092 ops/s ``` The difference is small, but it's there, so I investigated this further using jitwatch. Looking at the generated assembly, the call to `getCurrentTimeNanos` is devirtualized and inlined in the absence of `EmbeddedEventLoop`, so the code is mostly identical. However there is the added getfield and checkcast for the executor, which probably explains the discrepancy. In my opinion this is acceptable, because the performance impact is not severe, this use is likely the worst case (virtual call through `scheduledExecutorService()`), and it is never as hot as in this benchmark. Note that if an `EmbeddedEventLoop` is present in the application, the performance impact is likely substantially higher, because this would necessitate a virtual call. However this is not an issue for production applications, and the affected code is still not very hot. Co-authored-by: Norman Maurer <norman_maurer@apple.com>
normanmaurer
added a commit
that referenced
this issue
Jun 13, 2022
Motivation: Tests using EmbeddedEventLoop can run faster if they can "advance time" so that scheduled tasks (e.g. timeouts) run earlier. Additionally, "freeze time" functionality can improve reliability of such tests. Modification: - Introduce a protected method `AbstractScheduledEventExecutor.getCurrentTimeNanos` that replaces the previous static `nanoTime` method (now deprecated). Replace usages of `nanoTime` with the new method. - Override `getCurrentTimeNanos` with the new time control (freeze, unfreeze, advanceBy) features in `EmbeddedEventLoop`. - Add a microbenchmark that tests one of the sites that seemed most likely to see negative performance impact by the change (`ScheduledFutureTask.delayNanos`). Result: Fixes #12433. Local runs of the `ScheduleFutureTaskBenchmark` microbenchmark shows no evidence for performance impact (within error bounds of each other): ``` before: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 132.437 ± 15.116 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 694.475 ± 8.184 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 88.037 ± 4.013 ops/s after: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 149.629 ± 7.514 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 688.954 ± 7.831 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 85.426 ± 1.104 ops/s ``` The new `ScheduleFutureTaskDeadlineBenchmark` shows some performance degradation: ``` before: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 60726336.795 ± 280054.533 ops/s after: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 56948231.480 ± 188264.092 ops/s ``` The difference is small, but it's there, so I investigated this further using jitwatch. Looking at the generated assembly, the call to `getCurrentTimeNanos` is devirtualized and inlined in the absence of `EmbeddedEventLoop`, so the code is mostly identical. However there is the added getfield and checkcast for the executor, which probably explains the discrepancy. In my opinion this is acceptable, because the performance impact is not severe, this use is likely the worst case (virtual call through `scheduledExecutorService()`), and it is never as hot as in this benchmark. Note that if an `EmbeddedEventLoop` is present in the application, the performance impact is likely substantially higher, because this would necessitate a virtual call. However this is not an issue for production applications, and the affected code is still not very hot. Co-authored-by: Norman Maurer <norman_maurer@apple.com> Co-authored-by: Jonas Konrad <me@yawk.at>
raidyue
pushed a commit
to raidyue/netty
that referenced
this issue
Jul 8, 2022
Motivation: Tests using EmbeddedEventLoop can run faster if they can "advance time" so that scheduled tasks (e.g. timeouts) run earlier. Additionally, "freeze time" functionality can improve reliability of such tests. Modification: - Introduce a protected method `AbstractScheduledEventExecutor.getCurrentTimeNanos` that replaces the previous static `nanoTime` method (now deprecated). Replace usages of `nanoTime` with the new method. - Override `getCurrentTimeNanos` with the new time control (freeze, unfreeze, advanceBy) features in `EmbeddedEventLoop`. - Add a microbenchmark that tests one of the sites that seemed most likely to see negative performance impact by the change (`ScheduledFutureTask.delayNanos`). Result: Fixes netty#12433. Local runs of the `ScheduleFutureTaskBenchmark` microbenchmark shows no evidence for performance impact (within error bounds of each other): ``` before: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 132.437 ± 15.116 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 694.475 ± 8.184 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 88.037 ± 4.013 ops/s after: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 149.629 ± 7.514 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 688.954 ± 7.831 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 85.426 ± 1.104 ops/s ``` The new `ScheduleFutureTaskDeadlineBenchmark` shows some performance degradation: ``` before: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 60726336.795 ± 280054.533 ops/s after: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 56948231.480 ± 188264.092 ops/s ``` The difference is small, but it's there, so I investigated this further using jitwatch. Looking at the generated assembly, the call to `getCurrentTimeNanos` is devirtualized and inlined in the absence of `EmbeddedEventLoop`, so the code is mostly identical. However there is the added getfield and checkcast for the executor, which probably explains the discrepancy. In my opinion this is acceptable, because the performance impact is not severe, this use is likely the worst case (virtual call through `scheduledExecutorService()`), and it is never as hot as in this benchmark. Note that if an `EmbeddedEventLoop` is present in the application, the performance impact is likely substantially higher, because this would necessitate a virtual call. However this is not an issue for production applications, and the affected code is still not very hot. Co-authored-by: Norman Maurer <norman_maurer@apple.com>
franz1981
pushed a commit
to franz1981/netty
that referenced
this issue
Aug 22, 2022
Motivation: Tests using EmbeddedEventLoop can run faster if they can "advance time" so that scheduled tasks (e.g. timeouts) run earlier. Additionally, "freeze time" functionality can improve reliability of such tests. Modification: - Introduce a protected method `AbstractScheduledEventExecutor.getCurrentTimeNanos` that replaces the previous static `nanoTime` method (now deprecated). Replace usages of `nanoTime` with the new method. - Override `getCurrentTimeNanos` with the new time control (freeze, unfreeze, advanceBy) features in `EmbeddedEventLoop`. - Add a microbenchmark that tests one of the sites that seemed most likely to see negative performance impact by the change (`ScheduledFutureTask.delayNanos`). Result: Fixes netty#12433. Local runs of the `ScheduleFutureTaskBenchmark` microbenchmark shows no evidence for performance impact (within error bounds of each other): ``` before: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 132.437 ± 15.116 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 694.475 ± 8.184 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 88.037 ± 4.013 ops/s after: Benchmark (num) Mode Cnt Score Error Units ScheduleFutureTaskBenchmark.scheduleCancelLotsOutsideLoop 100000 thrpt 20 149.629 ± 7.514 ops/s ScheduleFutureTaskBenchmark.scheduleLots 100000 thrpt 20 688.954 ± 7.831 ops/s ScheduleFutureTaskBenchmark.scheduleLotsOutsideLoop 100000 thrpt 20 85.426 ± 1.104 ops/s ``` The new `ScheduleFutureTaskDeadlineBenchmark` shows some performance degradation: ``` before: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 60726336.795 ± 280054.533 ops/s after: Benchmark Mode Cnt Score Error Units ScheduleFutureTaskDeadlineBenchmark.requestDeadline thrpt 20 56948231.480 ± 188264.092 ops/s ``` The difference is small, but it's there, so I investigated this further using jitwatch. Looking at the generated assembly, the call to `getCurrentTimeNanos` is devirtualized and inlined in the absence of `EmbeddedEventLoop`, so the code is mostly identical. However there is the added getfield and checkcast for the executor, which probably explains the discrepancy. In my opinion this is acceptable, because the performance impact is not severe, this use is likely the worst case (virtual call through `scheduledExecutorService()`), and it is never as hot as in this benchmark. Note that if an `EmbeddedEventLoop` is present in the application, the performance impact is likely substantially higher, because this would necessitate a virtual call. However this is not an issue for production applications, and the affected code is still not very hot. Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It would be useful if EmbeddedChannel provided a way to manage the scheduler time (e.g. freeze or advance the nanoTime used internally). This would allow race-free testing of time-based algorithms.
I want to try working on this, this issue is to make sure it's not a stupid idea that will be rejected :)
Constraints are:
Expected behavior
Tests using EmbeddedChannel should be able to (1) stop scheduled tasks from executing too early even if they're scheduled very tightly, and (2) execute future tasks earlier than their nominal deadline.
Actual behavior
Tests have to use Thread.sleep to wait for scheduled tasks to reach their deadline.
Netty version
4.1.77.Final
The text was updated successfully, but these errors were encountered: