Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change cull_idle_timeout_minimum to 1 from 300 #2910

Merged
merged 3 commits into from
Nov 1, 2017

Conversation

mheilman
Copy link
Contributor

@mheilman mheilman commented Oct 9, 2017

I ran into a situation where it'd be nice to be able to set cull_idle_timeout below 5 minutes, but it looks like that's not possible currently.

This is related to #2215

How would people feel about changing the minimum to 1 second? I didn't see any comments suggesting this would be a bad idea in #2215, but I'm probably not aware of all the uses of this feature.

@parente
Copy link
Member

parente commented Oct 10, 2017

I think we decided on five minutes so that the culler didn't run too frequently. If the cull_idle_timeout minimum is reduced, cull_interval_default and and cull_interval need to be adjusted accordingly. At face value, it seems potentially harmful to allow the cull callback to run every second.

@takluyver
Copy link
Member

What's the situation where you want a shorter timeout? It sounds like it might be quite different from the use cases we were thinking of.

@mheilman
Copy link
Contributor Author

What's the situation where you want a shorter timeout?

Basically, I have a server that multiple people might connect to, and I don't want to have a bunch of kernels sitting around idle taking up resources if people disconnect. 5 minutes seems like a pretty long timeout. I'd like to set it to a minute or so (same with cull interval), but I figured that if I was going to propose changing 300 to 60 or so, then I might as well propose changing the minimum to 1 second.

If the cull_idle_timeout minimum is reduced, cull_interval_default and and cull_interval need to be adjusted accordingly.

Hmm... Perhaps I'm misreading the code. Isn't the default timeout infinity (represented by 0)? Why would other defaults need to be changed?

Thanks for taking a look at this.

@takluyver
Copy link
Member

I think the timeout is based on kernel activity, so if you stop to read a notebook for a minute, the kernel would be killed and you'd lose all your variables. That seems like it would be annoying. If it's really what you want, feel free to decrease the minimum, but let's leave it at some vaguely sensible value like 60 - I can't see a 1 second timeout ever making sense.

@mheilman
Copy link
Contributor Author

IIUC, kernels only get culled if all of the following conditions are satisfied (code):

  • they've been idle for longer than the timeout
  • they're not currently busy
  • they don't have any active connections

As long as you have your browser open to a notebook, the notebook app will keep the kernel running, even if the kernel is idle for an hour.

I tried the following two commands with my PR version. The first works just like a regular notebook in practice (i.e., the kernel keeps running even if I don't execute code for a while). The latter command results in kernels getting culled unless they're executing code every few seconds.

jupyter notebook --MappingKernelManager.cull_idle_timeout=5 --MappingKernelManager.cull_interval=5
jupyter notebook --MappingKernelManager.cull_idle_timeout=5 --MappingKernelManager.cull_connected=True --MappingKernelManager.cull_interval=5

@takluyver
Copy link
Member

Ah, OK, I'd missed that it tracked the connection as well. In that case, I think it's fine to have a very low minimum.

@takluyver
Copy link
Member

Though it might be worth putting a note in the docstring that with a very short timeout, users with connection problems may also find that their kernel gets killed.

@mheilman
Copy link
Contributor Author

Ah, good point. How's the note in f8ef573 look?

@takluyver
Copy link
Member

That looks good, thanks.

@mheilman mheilman changed the title change cull_idle_timeout_minimum to 1 from 500 change cull_idle_timeout_minimum to 1 from 300 Oct 19, 2017
lower disable culling. The minimum timeout is 300 seconds (5 minutes). Positive values less than the minimum value
will be set to the minimum."""
lower disable culling. The minimum timeout is 1 second. Positive values less than the minimum value
will be set to the minimum. Very short timeouts may result in kernels being culled for users with poor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: as this is an integer option, there are no longer meaningful positive values less than the minimum. I'd drop that sentence.

@mheilman
Copy link
Contributor Author

Looking at this again, I think that if the minimum is 1, and since it's an integer as @takluyver points, out, then cull_idle_timeout_minimum isn't actually necessary because cull_idle_timeout is only used when it's > 0. Am I missing something?

@takluyver
Copy link
Member

I think you're right, but I'd like another pair of eyes on the code. @kevin-bates implemented the culling in #2215, and @Carreau and @minrk asked for the minimum in review. What do you think of getting rid of the minimum on the timeout, as being connected to a kernel now keeps it alive?

@kevin-bates
Copy link
Member

The minimum was important at the time because the original implementation didn't discriminate against connect (or busy) status. Now that those distinctions exist, I think the timeout minimum can be lifted.

However, I tend to agree with @parente that an interval minimum is still warranted. If there are enough active kernels such that checking all for idle timeout exceeds the culling interval, then it seems as though the server could get into a pathological state of only culling (if I'm understanding how tornado works).

What about lifting the timeout minimum, but introduce an interval minimum for the sole purpose of saturation prevention? Yes, the interval minimum would then act as a logical timeout value. However, this would also help satisfy the connection issues condition, which could be considered another pathological state example.

I guess I'm thinking we should impose at least a 10 second interval minimum.

@mheilman
Copy link
Contributor Author

If there are enough active kernels such that checking all for idle timeout exceeds the culling interval

Interesting point. Does checking for idleness take that long? I was under the assumption that the idleness check would take milliseconds, such that the overhead wouldn't be a big issue in practice.

@kevin-bates
Copy link
Member

kevin-bates commented Oct 20, 2017

The typical sweep will be on the order of milliseconds. However, when there are timeouts to deal with, then the sweep duration could exceed an unconstrained interval since the kernel shutdown itself could take seconds (e.g., if the kernel doesn't immediately detect the shutdown request via the ZMQ socket - after which the server will begin its kernel signalling sequence). I hadn't actually thought about this case until now and probably argues for an even longer min interval than 10 seconds although I also lean toward Give 'em enough rope....

@takluyver takluyver added this to the 5.3 milestone Oct 31, 2017
@takluyver
Copy link
Member

Thanks @kevin-bates . That makes sense, and I think it can be done as a separate PR.

@Carreau @minrk - I'll merge this tomorrow unless you want more time to review it.

@takluyver takluyver merged commit bfe012e into jupyter:master Nov 1, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants