We had a node down due to ulimit problems. A client got work to save and then OOMed because of the large amount of data queued up in the LinkedBlockingQueue in SimpleHostConnectionPool. I had assumed it was because the node had been down a long time, but support said they bounced the client and it OOMed again almost immediately.
It looks like Cassandra was accepting connections but not processing them because of the 'Too many open files' ulimit error. Each entry in the blocking queue then holds a TFramedTransport -> TByteArrayOutputStream with a 2MB byte array. It is possible I have simply configured too many connections for my JVM heap size in these failure situations.
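To sanity-check that theory, here is a minimal back-of-the-envelope sketch of the worst-case heap retained by the pool. The host count, connections-per-host, and the 2MB buffer size are assumptions for illustration (only the 2MB figure comes from the report above), and `worstCaseBytes` is a hypothetical helper, not anything in Hector:

```java
// Rough estimate of heap pinned by queued connections, each of which
// retains a TFramedTransport whose internal byte array has grown to ~2MB.
public class PoolHeapEstimate {

    // hosts * connectionsPerHost queued transports, each holding bufferBytes.
    // All parameter values below are illustrative assumptions.
    static long worstCaseBytes(int hosts, int connectionsPerHost, long bufferBytes) {
        return (long) hosts * connectionsPerHost * bufferBytes;
    }

    public static void main(String[] args) {
        long bufferBytes = 2L * 1024 * 1024; // ~2MB per transport, per the report
        long total = worstCaseBytes(10, 50, bufferBytes);
        // 10 hosts * 50 connections * 2MB = 1000 MB of retained buffers
        System.out.println(total / (1024 * 1024) + " MB");
    }
}
```

With even modest pool settings the retained buffers can approach or exceed a typical client heap, which would explain the immediate re-OOM after the bounce: the queue refills and the buffers regrow as soon as writes resume.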