Clear task stream based on recent behavior #3200

Merged
merged 5 commits into from
Nov 14, 2019


mrocklin
Member

@mrocklin mrocklin commented Nov 6, 2019

Alternative to #3190 cc @dickreuter

I tried to briefly implement my proposed solution to clearing the task stream based on recent behavior. It only runs the check if we haven't seen an update in a suitable amount of time (which is nice for performance), and then bases the decision on the current timespan of the rectangles in the plot. In principle this works fairly nicely.
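A minimal sketch of the idea described above, not the code in this PR: the check only runs after an idle period, and the decision compares the gap before the new rectangles with the timespan already plotted. The function name, its parameters, and the `gap_factor` default are all hypothetical, and every time value is assumed to be in the same unit.

```python
def should_clear(now, last_update, clear_interval,
                 old_start, old_end, new_start, gap_factor=2.0):
    """Decide whether to clear the plot before drawing new rectangles.

    All names here are illustrative assumptions, not dask's API.
    Every argument is assumed to use the same time unit.
    """
    # Cheap early exit: skip the check entirely if the plot was
    # updated recently.
    if now - last_update <= clear_interval:
        return False

    old_span = old_end - old_start   # timespan currently on the plot
    gap = new_start - old_end        # whitespace the new data would add

    # Clear when the new whitespace dwarfs what is already shown.
    return gap > gap_factor * old_span
```

For example, `should_clear(100.0, 80.0, 10.0, 0.0, 5.0, 30.0)` is true under these assumptions, because the plot has been idle for 20 units and the gap of 25 exceeds twice the old span of 5.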

However, it has a failure case when the task stream plot grows larger and larger over time. Once large gaps start appearing in the stream, it gets harder and harder to clear things out. We should probably add another check that estimates how much whitespace is currently in the plot, perhaps some measure of the sum of the durations / workers over the total timespan.
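One way to read that whitespace measure, as a sketch under stated assumptions: divide the summed durations by the number of workers, then by the total timespan. A value near 1 means the plot is densely filled and a value near 0 means it is mostly whitespace. The function name is hypothetical, `starts` is assumed to hold the left edge of each rectangle, and all times are assumed to share one unit.

```python
def stream_density(starts, durations, n_workers):
    """Fraction of the plotted worker-time that is covered by tasks.

    Illustrative helper, not dask's API. `starts` are assumed to be
    left edges, in the same time unit as `durations`.
    """
    if not starts or n_workers <= 0:
        return 0.0

    begin = min(starts)
    end = max(s + d for s, d in zip(starts, durations))
    span = end - begin
    if span <= 0:
        return 0.0

    # Sum of durations, per worker, relative to the total timespan.
    return sum(durations) / n_workers / span
```

With one worker, two back-to-back tasks (`starts=[0, 5]`, `durations=[5, 5]`) give a density of 1.0, while the same two tasks separated by a long gap (`starts=[0, 95]`) give 0.1.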

@dickreuter I mostly wanted to share this to communicate what I was trying to say in the issue earlier. Maybe it helps make my original intent clearer.

[Screenshot: Screen Shot 2019-11-05 at 3 56 47 PM]

@dickreuter

Thanks for this. Yes, I agree: your proposal basically does not clear out old runs if there are gaps. That's why I used `(gap_to_previous > (avg_duration_of_visualized * self.clear_multiplier) > self.clear_interval*1000)`, which yielded better results for my cases. But I guess both solutions are better than what we currently have.
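For readers unfamiliar with Python's chained comparisons, the condition quoted above reads as `a > b and b > c`. The sketch below restates it as a standalone function. The function name is hypothetical, and the units are an assumption inferred from the `*1000` factor: the gap and average duration in milliseconds, `clear_interval` in seconds.

```python
def should_clear_by_gap(gap_to_previous, avg_duration_of_visualized,
                        clear_multiplier, clear_interval):
    """Standalone restatement of the chained comparison quoted above.

    Assumed units (not confirmed by the source): gap and average
    duration in milliseconds, clear_interval in seconds.
    """
    scaled = avg_duration_of_visualized * clear_multiplier

    # Chained comparison: true only if BOTH hold:
    #   gap_to_previous > scaled
    #   scaled > clear_interval * 1000
    return gap_to_previous > scaled > clear_interval * 1000
```

Note that the second half of the chain compares the scaled average duration, not the gap, against the interval, so a long gap after a run of very short tasks does not trigger a clear under this condition.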

@mrocklin
Member Author

Merging this in shortly if there are no objections.

@mrocklin
Member Author

It looks like I was a factor of 1000 off in the density computation due to the s/ms difference.

Also, while diving in with pdb, it looks like "start" should really be renamed "middle". A lot of the logic here is off by half of the duration value.
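To illustrate the half-duration offset: if the stored "start" value is really the horizontal center of each rectangle (Bokeh's `rect` glyph is positioned by its center and width), the true edges sit half a duration to either side. The helpers below are a hypothetical sketch of that correction, not code from this PR.

```python
def rect_edges(middle, duration):
    """Return (left, right) edges for a rectangle given its center."""
    half = duration / 2
    return middle - half, middle + half


def timespan(middles, durations):
    """Overall (begin, end) of the plot, computed from true edges.

    Assumes `middles` holds rectangle centers, which is what the
    column named "start" appears to contain per the comment above.
    """
    edges = [rect_edges(m, d) for m, d in zip(middles, durations)]
    begin = min(left for left, _ in edges)
    end = max(right for _, right in edges)
    return begin, end
```

For centers `[10, 20]` with durations `[4, 2]`, treating the values as left edges would give a span of 10 to 22, whereas the edge-corrected span is 8.0 to 21.0.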

@mrocklin
Member Author

We're nearing a release. I'm going to go ahead and merge this in. We can refine it in the future if necessary.

@mrocklin mrocklin merged commit 886189a into dask:master Nov 14, 2019
@mrocklin mrocklin deleted the task-stream-reset branch November 14, 2019 22:23