Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocking exhausts Native threads #845

Closed
DamianReeves opened this issue May 11, 2019 · 6 comments · Fixed by #889
Closed

Blocking exhausts Native threads #845

DamianReeves opened this issue May 11, 2019 · 6 comments · Fixed by #889
Labels
enhancement New feature or request good first issue Good for newcomers
Milestone

Comments

@DamianReeves
Copy link

We have ran into an issue while load testing an implementation of our system that uses ZIO 1.0-RC4 where at the end of a long run we start failing because of OutOfMemoryErrors, which seem to report happening when trying to create a thread in the blocking thread pool.

Analysis seems to point to the blocking pool not reusing threads and us getting into a situation where the observed behavior of the Blocking pool seems to be equivalent to calling NewThread while under load.

I suggest 2 courses of action:

  1. Provide a built-in way for users to create a Blocking instance that is just like the Blocking.Live instance, but with the ability to adjust the keepAliveTime, maximumPoolSize, and corePoolSize these are all options a user might want to drive from configuration for example.
  2. Look into the default configuration of Blocking. Consider whether it actually does make sense to set the maxPoolSize to Int.MaxValue
@jdegoes
Copy link
Member

jdegoes commented May 11, 2019

We would like to provide a pleasant, safe out of the box experience that voids these types of issues. It seems like tweaking defaults to the blocking thread pool is warranted... /cc @Kaishh

@jdegoes jdegoes added this to the 1.0.0 milestone May 11, 2019
@jdegoes jdegoes added the enhancement New feature or request label May 11, 2019
@neko-kai
Copy link
Member

Note, default Blocking.Live does reuse threads – corePoolSize is the parameter that regulates thread reuse, not maxPoolSize – a finite max parameter with SynchronousQueue would just block the caller until a thread is available which is a bit different, there is a test for working caching too (although currently disabled, I just it ran it locally to confirm – https://github.com/scalaz/scalaz-zio/blob/6c7506d380a01a2a2ef6ff8227f4d891a504a917/core/jvm/src/test/scala/scalaz/zio/RTSSpec.scala#L1290)
I think just bumping keepAliveTime from 1 sec to 60 sec, like monix does, may ensure more frequent thread caching.
If you could try to make an example that results in the same behavior that would greatly help pinpoint what exactly is happening. It may be that somehow the threads are not returned back to the pool after the call finishes (or your parallelism level may be so high as to immediately exhaust OS limit, you've mentioned using a Queue to limit workers, but maybe you could try a Semaphore?)

@neko-kai
Copy link
Member

neko-kai commented May 13, 2019

So, the real cause of this error was generation of a new scheduled thread pool on new Clock.Live that was mistakenly used instead of a global Clock.Live:

Damian Reeves @DamianReeves May 13 20:46
So we see reported in the profiler a bunch of zio-timer-1-* threads all in the parked state
processingTimeout is 100s
So if I look at the number of Live Object instances I see almost 5000
java.util.concurrent.ScheduledThreadPoolExecutor instances in the Java VisualVM

Damian Reeves @DamianReeves 21:43
OMG!
This is stupid user error

Damian Reeves @DamianReeves 21:43
Yes
I have a call that did had this in it:
.provide(new Clock.Live())
That was directly in my processing flow

Damian Reeves @DamianReeves 21:44
I feel so silly

What we could do to prevent the same mistake is to make the Live default mixins all use globals instead of being generative. Don't know how I feel about that, that may make sense, but may be a step back. I remember Blocking.Live redirected to a global some time ago and that annoyed me – but currently it creates a new pool every time, although I may be misremembering completely.

@neko-kai
Copy link
Member

Actually, the problem is more plainly stated as: Clock.Live, Scheduler.Live and Blocking.Live constructors are all not pure and this is the main cause of trip-up, had they been wrapped in IO, that would make it obvious that a new thread pool allocation happens. We can wrap all these constructors, although to allow cake pattern to work they'd probably need to be in CPS form...

jdegoes added a commit to jdegoes/zio that referenced this issue May 19, 2019
@jdegoes
Copy link
Member

jdegoes commented May 19, 2019

Oops, I forgot Blocking.Live. Re-opening.

@jdegoes
Copy link
Member

jdegoes commented May 24, 2019

Fixed for Blocking.Live as well. Now repeated construction of modules does not allocate new pools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants