Replaced global ContextShift/Timer instances on JVM #547
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #547      +/-   ##
==========================================
- Coverage    89.9%    89.4%    -0.51%
==========================================
  Files          72       73        +1
  Lines        2090     2104       +14
  Branches      127      125        -2
==========================================
+ Hits         1879     1881        +2
- Misses        211      223       +12
def newThread(r: Runnable): Thread = {
  val back = new Thread(r)
  back.setName(s"ioapp-compute-${ctr.getAndIncrement()}")
  back.setDaemon(true)
  back
}
I continue to dislike daemon threads for important work. IOApp.WithContext
is the existing compromise, though it's inconvenient enough I'm not so vigilant about using it.
I don't know how to avoid the threads and shut down the context properly without a breaking change. This comment may be more of a 3.0 discussion.
I went back and forth on this. I think it's okay for these to be daemon specifically because we have other control flow mechanisms which guarantee everything gets wound down before exiting under normal circumstances. In the event that something is exiting without tracing through the control flow graph all the way to the end of IOApp
, then something horrible has happened (like OOM), and we want the threads to just die.
TLDR, I think that threads blocking shutdown is a feature that isn't relevant or necessary in applications constructed with explicit monadic control flow.
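To make the daemon-thread behavior concrete, here is a minimal Java sketch of a thread factory in the spirit of the Scala snippet under review (the class name and the no-op runnable are illustrative): named, daemon compute threads, which by definition never keep the JVM alive on their own.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class DaemonFactoryDemo {
    public static void main(String[] args) {
        AtomicInteger ctr = new AtomicInteger();
        // Named daemon threads, mirroring the quoted newThread implementation:
        // the compute pool cannot block JVM shutdown.
        ThreadFactory factory = r -> {
            Thread back = new Thread(r);
            back.setName("ioapp-compute-" + ctr.getAndIncrement());
            back.setDaemon(true);
            return back;
        };
        Thread t = factory.newThread(() -> {});
        System.out.println(t.isDaemon());  // true
        System.out.println(t.getName());   // ioapp-compute-0
    }
}
```

The JVM exits once all non-daemon threads finish, so under normal circumstances the monadic control flow winds everything down first, and in catastrophic cases the threads simply die with the process.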
And I suppose we can say that any service that relies on its thread pools' graceful shutdown needs to be run on a separate, non-daemon pool.
Exactly.
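As a sketch of the "separate, non-daemon pool" suggestion above (class name and task are illustrative): `Executors.defaultThreadFactory()` creates non-daemon threads, so such a pool keeps the JVM alive until it is explicitly shut down, giving in-flight work a chance to finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GracefulPoolDemo {
    public static void main(String[] args) {
        // The default thread factory creates non-daemon threads: the JVM
        // will not exit while this pool is alive, so graceful shutdown of
        // services running on it is the application's responsibility.
        ExecutorService servicePool = Executors.newFixedThreadPool(2);
        servicePool.submit(() -> System.out.println("work finished"));
        servicePool.shutdown();  // without this, the JVM would hang at exit
        System.out.println(servicePool.isShutdown());
    }
}
```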
case t: Throwable =>
  // under most circumstances, this will work even with fatal errors
  t.printStackTrace()
  System.exit(-1)
Nit: why -1? That results in an exit status of 255, which some sources treat as an "exit status out of range". I think 1 is more reasonable.
Will change.
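For reference on the nit above, POSIX keeps only the low 8 bits of a process exit status, which is why `System.exit(-1)` surfaces to the shell as 255. A tiny Java illustration of the arithmetic (class name is illustrative):

```java
public class ExitStatusDemo {
    public static void main(String[] args) {
        // A POSIX exit status is an unsigned 8-bit value: only the low
        // byte of the argument to exit() survives, so -1 wraps to 255.
        int observed = -1 & 0xFF;
        System.out.println(observed);  // 255
    }
}
```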
try {
  r.run()
} catch {
  case NonFatal(t) =>
One false positive an http4s user ran into recently: rendering a very large String causes the charset encoder to allocate an array greater than Int.MaxValue, which throws an OutOfMemoryError("Requested array size exceeds VM limit"), but everything (except that response) is fine. It's irritating that the JVM surfaces this non-fatal condition as the most inarguably fatal of error types.
I think this approach is more correct than not, but I'm interested to hear if there are other horror stories of non-fatal errors that aren't caught by this. The System.exit raises the cost of false negatives.
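For readers unfamiliar with the classifier being discussed: Scala's `scala.util.control.NonFatal` treats VirtualMachineError, ThreadDeath, InterruptedException, LinkageError, and ControlThrowable as fatal and everything else as recoverable. A rough Java analogue (omitting ControlThrowable, which has no Java counterpart; class and method names are illustrative) shows why the http4s OOM above gets classified as fatal:

```java
public class NonFatalDemo {
    // Approximates Scala's NonFatal extractor: these types are considered
    // fatal; any other Throwable is treated as recoverable.
    static boolean nonFatal(Throwable t) {
        return !(t instanceof VirtualMachineError
              || t instanceof ThreadDeath
              || t instanceof InterruptedException
              || t instanceof LinkageError);
    }

    public static void main(String[] args) {
        // The "false positive" from the comment: a recoverable condition
        // reported as OutOfMemoryError, which is a VirtualMachineError.
        System.out.println(nonFatal(
            new OutOfMemoryError("Requested array size exceeds VM limit")));  // false
        System.out.println(nonFatal(new RuntimeException("boom")));           // true
    }
}
```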
That's a fair point, and that's a good example of an OOM which is legitimately recoverable. Another false positive, as a note, could be AssertionError. It's another problem that I went back and forth on.
I think the thing is that, the moment some sort of truly fatal error happens, I really do want to know it and explode spectacularly. The worst feeling in the world is knowing that your system probably went OOM (or had a link error, or such), but it was swallowed and now your only way to figure it out is to dump the running system and look for orphans.
// we can initialize this eagerly because the enclosing object is lazy
val ioAppGlobal: ExecutionContext = {
  // lower-bound of 2 to prevent pathological deadlocks on virtual machines
  val bound = math.max(2, Runtime.getRuntime().availableProcessors())
👍
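The lower bound being endorsed here is a one-liner, shown below in Java for illustration (class name is illustrative): on a single-vCPU virtual machine, a one-thread pool can deadlock the moment one task waits on another, so the pool size is clamped to at least 2.

```java
public class BoundDemo {
    public static void main(String[] args) {
        // Clamp the pool size: never fewer than 2 threads, even when
        // availableProcessors() reports 1 (common on small VMs), because a
        // single-threaded pool deadlocks if a task waits on another task.
        int bound = Math.max(2, Runtime.getRuntime().availableProcessors());
        System.out.println(bound >= 2);  // true
    }
}
```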
// lower-bound of 2 to prevent pathological deadlocks on virtual machines
val bound = math.max(2, Runtime.getRuntime().availableProcessors())

val executor = Executors.newFixedThreadPool(bound, new ThreadFactory {
Is the basic argument against ForkJoin that most people aren't doing divide-and-conquer tasks, and therefore we're all paying for the overhead of the layered queues without reaping their rewards? I intuitively agree, but the Scala standard library chose otherwise, and the ForkJoinPool javadoc claims:
efficient processing ... when many small tasks are submitted to the pool from external clients
and
Especially when setting asyncMode to true in constructors, ForkJoinPools may also be appropriate for use with event-style tasks that are never joined.
In theory, I agree with this change. Empirically, I've swapped them out on real world workloads and not seen much difference either way. I'm curious if you have apps where a FJ pool is noticeably detrimental to performance. (Other points, like catching fatals and a minimum of one, are well taken and discussed elsewhere. I'm strictly speaking performance here.)
I imagine most cats-effect users implement blocking with evalOn and a second pool, but those who don't, and whose blocking libraries save them by assuming a blocking-integrated executor, will be in for an unpleasant surprise.
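The "unpleasant surprise" can be made concrete: on a fixed-size pool there is no BlockContext-style compensation, so blocking tasks simply occupy the whole pool and starve compute work behind them. A deterministic Java sketch (class name and latch choreography are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StarvationDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        // Two blocking tasks occupy both threads of the pool...
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> { release.await(); return null; });
        }
        // ...so a compute task submitted afterwards sits in the queue.
        Future<String> compute = pool.submit(() -> "done");
        System.out.println(compute.isDone());  // false: starved behind the blockers
        release.countDown();                   // unblock the pool
        System.out.println(compute.get());     // done
        pool.shutdown();
    }
}
```

A ForkJoinPool with managed blocking would spawn compensation threads here; a fixed pool, by design, will not, which is exactly why blocking belongs on a separate executor.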
I'm curious if you have apps where a FJ pool is noticeably detrimental to performance.
Very yes. There isn't a good tldr response to this comment, so I'll write something up in a gist instead.
Oh actually, #509 contains a halfway-decent description. I can flesh it out more if you like. It's a lot less about the layered queues and more about the spawning semantics being optimized for bad thread pool usage. ForkJoin is a fantastic general-purpose pool for Java. It's not a terrible choice for Future, either, since people tend to abuse Future for blocking side effects and such. Users of cats-effect, though, tend to be a lot more scrupulous about their pools, since the API is very carefully (and very effectively) designed to carrot them into doing so.
When you use ForkJoin to exclusively perform CPU-bound tasks that are throughput-optimized, it will behave horrifically poorly and you'll end up with a zillion threads.
Do you tend to see this with longer-running CPU-intensive tasks, rather than many small ones?
Very much so, yes. Many small ones are generally okay because you're yielding back to the pool frequently. Heavily fairness-optimized workflows also don't see this issue. I'm relatively certain, based on the symptoms, that fork-join's heuristics are simply mistakenly identifying throughput-optimized workflows as being blocking IO, and thus spinning up new threads to avoid starvation.
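The thread-spawning behavior described above is observable directly through ForkJoinPool's managed-blocking machinery: when workers announce they are blocked, the pool spawns compensation threads to preserve its target parallelism. A Java sketch under illustrative names (the sleep is a pragmatic wait for the pool to react; timing-sensitive, so treat it as a demonstration rather than a test of guarantees):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;

public class FjCompensationDemo {
    public static void main(String[] args) throws Exception {
        ForkJoinPool fjp = new ForkJoinPool(2);  // target parallelism of 2
        CountDownLatch release = new CountDownLatch(1);
        ForkJoinPool.ManagedBlocker blocker = new ForkJoinPool.ManagedBlocker() {
            public boolean block() throws InterruptedException {
                release.await();
                return true;
            }
            public boolean isReleasable() {
                return release.getCount() == 0;
            }
        };
        // Four tasks that announce they are blocking: the pool compensates
        // by spawning extra worker threads to keep parallelism at 2.
        for (int i = 0; i < 4; i++) {
            fjp.submit(() -> {
                try {
                    ForkJoinPool.managedBlock(blocker);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        Thread.sleep(500);  // give the pool time to spawn compensators
        System.out.println(fjp.getPoolSize() > 2);  // pool grew past its parallelism
        release.countDown();
        fjp.shutdown();
    }
}
```

The heuristic spawning the commenter describes is this same mechanism firing without an explicit ManagedBlocker: the pool infers blocking from worker behavior, and throughput-optimized CPU-bound work can trip that inference.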
This pulls IOApp off the global ExecutionContext and onto a custom one that has better properties, including proper handling of fatal errors (e.g. OutOfMemoryError).

Related to #347. I'd like to hear from @rossabaker (and ideally @alexandru) as to their thoughts on the design tradeoffs here, and whether I should expand the impact at all.
I want to get this in 2.0 because, while it isn't breaking in the traditional sense, it's most definitely a very subtle compatibility shift and I don't want to catch people off guard.
Fixes #509
Fixes #515