Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(console): fix calculation of busy time during poll #405

Merged
merged 3 commits into from
Mar 30, 2023

Conversation

hds
Copy link
Collaborator

@hds hds commented Mar 30, 2023

The Console API specifies sending task busy duration only for completed
polls, it doesn't include the time spent in the current poll if the task
is active.

Tokio Console then calculates the busy time including the time spent in
the current poll - based on the last poll start and poll end times sent
by the Console Subscriber.

However, there was an error in the logic which determined when a task is
being polled for the purpose of calculating the busy time. The logic
only considered the first poll, when there was no recorded end poll time
at all.

This change adapts the logic so that it also considers the case where
the last recorded poll start is later than the last recorded poll end.
This indicates that the task is being polled.

PR Notes

You can see the current behavior by creating a task which blocks for a
long time (say std::thread::sleep for 5 seconds) in a loop, yielding
in between. Setting the console subscriber interval to 100ms makes
it easier to see.

In console we see the Total Time going up gradually, but the
Busy time stays the same for 5 seconds and then jump up. The
busy percentage jumps up and down as the busy time is incorrect.

Here's a GIF!

long-sleep-busy-jump

After the fix, both times move smoothly and the busy percentage
remains constant as it should.

Another GIF!

long-sleep-fixed

The Console API specifies sending task busy duration only for completed
polls, it doesn't include the time spent in the current poll if the task
is active.

Tokio Console then calculates the busy time including the time spent in
the current poll - based on the last poll start and poll end times sent
by the Console Subscriber.

However, there was an error in the logic which determined when a task is
being polled for the purpose of calculating the busy time. The logic
only considered the first poll, when there was no recorded end poll time
at all.

This change adapts the logic so that it also considers the case where
the last recorded poll start is later than the last recorded poll end.
This indicates that the task is being polled.
@hds hds requested a review from a team as a code owner March 30, 2023 15:54
Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fix, good catch!

i left some style suggestions that might be a little clearer, although i'm kind of on the fence about it. let me know if you think my suggestion is easier to understand --- i'm happy to merge the current code, but i thought i'd mention it.

Comment on lines 331 to 332
(Some(started), Some(ended)) if started > ended => busy_time_in_poll(started),
(Some(started), _) => busy_time_in_poll(started),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, i think we could combine these two match arms, since Option<T> also implements PartialOrd when T: PartialOrd, and None compares less than Some...so we could write

Suggested change
(Some(started), Some(ended)) if started > ended => busy_time_in_poll(started),
(Some(started), _) => busy_time_in_poll(started),
(Some(started), ended) if Some(started) > ended => busy_time_in_poll(started),

but on the other hand, I think having two match arms might be more understandable, so I'm not sure if this change would actually be better...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, thinking about it, this would allow us to get rid of the busy_time_in_poll closure, so this whole function could be

    pub(crate) fn busy(&self, since: SystemTime) -> Duration {
        if self.started > self.ended {
            debug_assert!(
                self.started.is_some(),
                "if `started` is greater than `ended`, it should always be `Some`",
            );
            if let Some(started) = started {
                // the task is currently being polled
                let current_time_in_poll = since.duration_since(started).unwrap_or_default();
                return self.stats.busy + current_time_in_poll;
            }
        }

        self.stats.busy
    }

or something...do you think that's substantially clearer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the debug_assert! could also just be an expect:

    pub(crate) fn busy(&self, since: SystemTime) -> Duration {
        if self.started > self.ended {
            // the task is currently being polled
            let started = self.started
                .expect("if `started` is greater than `ended`, it should always be `Some`");
            let current_time_in_poll = since.duration_since(started).unwrap_or_default();
            return self.stats.busy + current_time_in_poll;
        }

        self.stats.busy
    }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this variant on your second option? It's a little weird to safely unwrap the started time and then use the Option, but it seemed a bit cleaner and doesn't require the .expect or the debug_assert!.

    pub(crate) fn busy(&self, since: SystemTime) -> Duration {
        if let Some(started) = self.stats.last_poll_started {
            if self.stats.last_poll_started > self.stats.last_poll_ended {
                // in this case the task is being polled at the moment
                let current_time_in_poll = since.duration_since(started).unwrap_or_default();
                return self.stats.busy + current_time_in_poll;
            }
        }
        self.stats.busy
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yeah, that's nicer. good call.

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove the commented out code? otherwise, looks great!

tokio-console/src/state/tasks.rs Outdated Show resolved Hide resolved
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
@hawkw hawkw merged commit 6fa2185 into tokio-rs:main Mar 30, 2023
11 checks passed
@hds hds deleted the busy-time-in-poll branch March 31, 2023 08:45
hawkw added a commit that referenced this pull request Sep 29, 2023
The Console API specifies sending task busy duration only for completed
polls, it doesn't include the time spent in the current poll if the task
is active.

Tokio Console then calculates the busy time including the time spent in
the current poll - based on the last poll start and poll end times sent
by the Console Subscriber.

However, there was an error in the logic which determined when a task is
being polled for the purpose of calculating the busy time. The logic
only considered the first poll, when there was no recorded end poll time
at all.

This change adapts the logic so that it also considers the case where
the last recorded poll start is later than the last recorded poll end.
This indicates that the task is being polled.

Co-authored-by: Eliza Weisman <eliza@buoyant.io>
hawkw added a commit that referenced this pull request Sep 29, 2023
# Changelog

All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## tokio-console-v0.1.10 - (2023-09-29)

[7d009f8](https://github.com/tokio-rs/console/commit/7d009f87120ce0c89f5f9c5311f05b6756ca770f)...[7d009f8](https://github.com/tokio-rs/console/commit/7d009f87120ce0c89f5f9c5311f05b6756ca770f)

### <a id = "tokio-console-v0.1.10-breaking"></a>Breaking Changes
- **Update Tonic and Prost dependencies ([#364](#364 ([f9b8e03](https://github.com/tokio-rs/console/commit/f9b8e03bd7ee1d0edb441c94a93a350d5b06ed3b))<br />This commit updates the public dependencies `prost` and `tonic` to
semver-incompatible versions (v0.11.0 and v0.8.0, respectively). This is
a breaking change for users who are integrating the `console-api` protos
with their own `tonic` servers or clients.
- **Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464 ([96e62c8](https://github.com/tokio-rs/console/commit/96e62c83ef959569bb062dc8fee98fa2b2461e8d))<br />This is a breaking change for users of `console-api` and
`console-subscriber`, as it changes the public `tonic` dependency to a
semver-incompatible version. This breaks compatibility with `tonic`
0.9.x and `prost` 0.11.x.

### Added

- [**breaking**](#tokio-console-v0.1.10-breaking) Update Tonic and Prost dependencies ([#364](#364)) ([f9b8e03](f9b8e03))
- Only suggest opening issues for panics ([#365](#365)) ([da2a89c](da2a89c))
- Init error handling before subcmds ([#365](#365)) ([ec66eda](ec66eda))
- Filter out boring frames in backtraces ([#365](#365)) ([95a5e54](95a5e54))
- Include config options in autogenerated issues ([#365](#365)) ([3244a1f](3244a1f))
- Reduce decimal digits in UI ([#402](#402)) ([c13085e](c13085e))
- Use tokio task ids in task views ([#403](#403)) ([f5b06d2](f5b06d2))
- Add support for Unix domain sockets ([#388](#388)) ([a944dbc](a944dbc), closes [#296](#296))
- Add scheduled time per task ([#406](#406)) ([f280df9](f280df9))
- Add task scheduled times histogram ([#409](#409)) ([d92a399](d92a399))
- Update `tonic` to 0.9 ([#420](#420)) ([48af1ee](48af1ee))
- Update MSRV to Rust 1.60.0 ([b18ee47](b18ee47))
- Migrate to `ratatui` and update `crossterm` ([#425](#425)) ([b209dd6](b209dd6))
- Help view modal ([#432](#432)) ([359a4e7](359a4e7))
- Add way to inspect details of task from resource view ([#449](#449)) ([132ed4e](132ed4e), closes [#448](#448))
- Add warning for tasks that never yield ([#439](#439)) ([d05fa9e](d05fa9e))
- [**breaking**](#tokio-console-v0.1.10-breaking) Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464)) ([96e62c8](96e62c8))

### Documented

- Update screenshots in README ([#419](#419)) ([e9bcd67](e9bcd67))
- Revert "update screenshots in README ([#419](#419))" ([993a3d9](993a3d9))
- Update screenshots in README ([#421](#421)) ([8a27f96](8a27f96))
- Add column descriptions for all tables ([#431](#431)) ([e3cf82b](e3cf82b))
- Update MSRV version docs to 1.64 ([#467](#467)) ([94a5a51](94a5a51))

### Fixed

- Fix ascii-only flipped input ([#377](#377)) ([652ac34](652ac34))
- Declare `tokio-console` bin as `default-run` ([#379](#379)) ([9ce60ec](9ce60ec))
- Make `retain_for` default to 6s if not specfied ([#383](#383)) ([0a6012b](0a6012b), fixes [#382](#382))
- Enable view-switching keystrokes on details views ([#387](#387)) ([f417d7a](f417d7a))
- Fix `ViewOptions` default lang' ([#394](#394)) ([a1cf1b8](a1cf1b8), fixes [#393](#393))
- Remove `tracing-subscriber` 0.2 from dependencies ([#404](#404)) ([768534a](768534a))
- Fix calculation of busy time during poll ([#405](#405)) ([e2c536a](e2c536a))
- Remove histogram minimum count ([#424](#424)) ([02cf8a6](02cf8a6))
- Remove trailing space from task/resource location ([#443](#443)) ([90e5918](90e5918))
- Make long locations readable ([#441](#441)) ([9428d7f](9428d7f), closes [#411](#411))
- Fix task detail view Id to display remote tokio::task::Id ([#455](#455)) ([70c3952](70c3952))

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants