We don't encode error messages properly for CQL binary protocol #43

tgrabiec · 2015-07-24T10:58:29Z

This makes cassandra-stress error out like this:

ERROR 20:31:50 Exception in response
io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.messages.ErrorMessage$WrappedException: java.lang.IndexOutOfBoundsException: readerIndex(53) + length(2) exceeds writerIndex(53): SlicedByteBuf(ridx: 53, widx: 53, cap: 53/53, unwrapped: UnpooledUnsafeDirectByteBuf(ridx: 62, widx: 62, cap: 64))
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
Caused by: org.apache.cassandra.transport.messages.ErrorMessage$WrappedException: java.lang.IndexOutOfBoundsException: readerIndex(53) + length(2) exceeds writerIndex(53): SlicedByteBuf(ridx: 53, widx: 53, cap: 53/53, unwrapped: UnpooledUnsafeDirectByteBuf(ridx: 62, widx: 62, cap: 64))
    at org.apache.cassandra.transport.messages.ErrorMessage.wrap(ErrorMessage.java:256) ~[main/:na]
    at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:273) ~[main/:na]
    at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:235) ~[main/:na]
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    ... 17 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: readerIndex(53) + length(2) exceeds writerIndex(53): SlicedByteBuf(ridx: 53, widx: 53, cap: 53/53, unwrapped: UnpooledUnsafeDirectByteBuf(ridx: 62, widx: 62, cap: 64))
    at io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1175) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.buffer.AbstractByteBuf.readShort(AbstractByteBuf.java:589) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.buffer.AbstractByteBuf.readUnsignedShort(AbstractByteBuf.java:597) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
    at org.apache.cassandra.transport.CBUtil.readConsistencyLevel(CBUtil.java:186) ~[main/:na]
    at org.apache.cassandra.transport.messages.ErrorMessage$1.decode(ErrorMessage.java:80) ~[main/:na]
    at org.apache.cassandra.transport.messages.ErrorMessage$1.decode(ErrorMessage.java:43) ~[main/:na]
    at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:247) ~[main/:na]
    ... 19 common frames omitted

That's because the client expects to read the consistency level field of a READ_TIMEOUT error message, but we don't write it in transport/server.cc.
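For illustration, here is a minimal sketch of the READ_TIMEOUT body layout as I understand it from the CQL binary protocol v3 specification: the error code and message string, followed by consistency level, received, blockfor and data_present. The helper names are hypothetical and this is not the code in transport/server.cc; it only shows why a client that tries to read the consistency level runs out of bytes when the server stops after the message string.

```python
import struct

READ_TIMEOUT = 0x1200   # error code, per the protocol spec
QUORUM = 0x0004         # consistency level code, per the protocol spec


def encode_string(text):
    """Encode a protocol [string]: 2-byte length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def encode_read_timeout(message, consistency, received, block_for, data_present):
    """Hypothetical helper: build the body of a READ_TIMEOUT ERROR message."""
    body = struct.pack(">i", READ_TIMEOUT) + encode_string(message)
    # READ_TIMEOUT-specific fields, in wire order:
    # <cl:short> <received:int> <blockfor:int> <data_present:byte>
    body += struct.pack(">HiiB", consistency, received, block_for, int(data_present))
    return body


def decode_read_timeout(body):
    """Hypothetical helper: decode the fields a client expects to find."""
    code, length = struct.unpack_from(">iH", body, 0)
    offset = 6
    message = body[offset:offset + length].decode("utf-8")
    offset += length
    # Raises struct.error if the body ends right after the message string.
    cl, received, block_for, present = struct.unpack_from(">HiiB", body, offset)
    return code, message, cl, received, block_for, bool(present)
```

Decoding a body that contains only the code and the message fails at the consistency level read, which mirrors the IndexOutOfBoundsException in readConsistencyLevel above.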

The relevant code in Origin is in org.apache.cassandra.transport.messages.ErrorMessage.

penberg · 2015-07-27T13:29:02Z

Yeah, I fixed one of these in commit 918b30b. We probably should just go through the CQL binary protocol specification and fix up all of them in one sweep.

dorlaor · 2015-07-27T13:56:53Z

This is probably related too: #48

penberg · 2015-07-27T14:17:10Z

@dorlaor Seems unrelated. We don't set TRUNCATE_ERROR in our code so it's probably something else.

penberg · 2015-07-28T06:21:52Z

I sent patches to fix up READ_TIMEOUT and UNAVAILABLE errors. AFAICT, the only remaining encoding issue is with WRITE_TIMEOUT. We don't return it to clients yet but we should fix it before closing this issue.
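As a rough sketch of what the UNAVAILABLE body should contain, assuming the layout in the CQL binary protocol v3 specification (consistency level as a short, then required and alive as ints): the function name is hypothetical and this is not the patch itself.

```python
import struct

UNAVAILABLE = 0x1000  # error code, per the protocol spec


def encode_unavailable(message, consistency, required, alive):
    """Hypothetical helper: build the body of an UNAVAILABLE ERROR message."""
    data = message.encode("utf-8")
    # Common prefix: <code:int> <message:string>
    body = struct.pack(">iH", UNAVAILABLE, len(data)) + data
    # UNAVAILABLE-specific fields: <cl:short> <required:int> <alive:int>
    body += struct.pack(">Hii", consistency, required, alive)
    return body
```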

penberg · 2015-07-28T06:24:32Z

Oh, there's also UNPREPARED we need to fix.
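For reference, a sketch of the UNPREPARED body, assuming the CQL binary protocol v3 layout in which the message string is followed by the unknown statement id as [short bytes] (a 2-byte length, then the raw bytes). The function name is hypothetical.

```python
import struct

UNPREPARED = 0x2500  # error code, per the protocol spec


def encode_unprepared(message, statement_id):
    """Hypothetical helper: build the body of an UNPREPARED ERROR message."""
    data = message.encode("utf-8")
    # Common prefix: <code:int> <message:string>
    body = struct.pack(">iH", UNPREPARED, len(data)) + data
    # UNPREPARED-specific field: <id:short bytes>
    body += struct.pack(">H", len(statement_id)) + statement_id
    return body
```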

penberg · 2015-07-28T10:05:57Z

UNPREPARED is fixed as of commit dcbf8d5.

WRITE_TIMEOUT needs to know the write type so I'll leave it to someone who actually knows what they're doing in clustering code.
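To show why the write type is needed, here is a sketch of the WRITE_TIMEOUT body, assuming the CQL binary protocol v3 layout: consistency level, received and blockfor, followed by the write type as a [string]. The set of write type names is the one I recall from the v3 specification; the helper names are hypothetical.

```python
import struct

WRITE_TIMEOUT = 0x1100  # error code, per the protocol spec
# Write type names listed in the v3 spec (assumption: later versions add more).
WRITE_TYPES = {"SIMPLE", "BATCH", "UNLOGGED_BATCH", "COUNTER", "BATCH_LOG"}


def encode_string(text):
    """Encode a protocol [string]: 2-byte length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def encode_write_timeout(message, consistency, received, block_for, write_type):
    """Hypothetical helper: build the body of a WRITE_TIMEOUT ERROR message."""
    if write_type not in WRITE_TYPES:
        raise ValueError("unknown write type: " + write_type)
    body = struct.pack(">i", WRITE_TIMEOUT) + encode_string(message)
    # WRITE_TIMEOUT-specific fields:
    # <cl:short> <received:int> <blockfor:int> <writeType:string>
    body += struct.pack(">Hii", consistency, received, block_for)
    body += encode_string(write_type)
    return body
```

The write type has to come from whichever layer performed the write, which is why the encoder alone cannot fill it in.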

penberg · 2015-08-12T12:03:04Z

WRITE_TIMEOUT was fixed in commit 0b3d2de. @slivne, please close the issue.

slivne mentioned this issue Jul 25, 2015

Incorrcect error code return when consistency_level can not be met #32

Closed

slivne added the bug label Aug 5, 2015

slivne added this to the Beta milestone Aug 5, 2015

slivne assigned gleb-cloudius Aug 5, 2015

slivne added the ready label Aug 9, 2015

slivne mentioned this issue Aug 10, 2015

SIGSEGV during commitlog write when low on memory (fragmentation problem) #108

Closed

slivne closed this as completed Aug 12, 2015

slivne removed the ready label Aug 12, 2015

This was referenced Aug 12, 2015

CQL BATCH fails on DTEST batch_test.py .acknowledged_by_batchlog_not_set_when_batchlog_write_fails_test #140

Closed

CQL BATCH fails on dtest batch_test.py acknowledged_by_batchlog_set_when_batchlog_write_succeeds_test #141

Closed

tgrabiec unassigned gleb-cloudius Sep 22, 2015

nirmaayan mentioned this issue Aug 1, 2016

Scylla Crash When Running Write Load #1510

Closed

jinxing64 mentioned this issue Sep 27, 2016

Failure during compaction. #1706

Closed

GpStore mentioned this issue Nov 10, 2016

Scylla crashes during query #1831

Closed

slivne mentioned this issue Jul 3, 2017

Row Cache continuity information is not used for queries of partitions that do not exist (bloom filters are still checked) #2544

Closed

frank8989 mentioned this issue Mar 13, 2018

Reactor stalled for 2029 ms on shard for scylla 2.0.0 #3240

Closed
