Introduce cassandra.skip-partition-check #1
Conversation
42bb382
to
a9d490f
Compare
NativeCassandraSession#getPartitions is very slow if there are many partitions used in WHERE clause. In our use, the partitions usually exist, so introduce cassandra.skip-parition-check property to reduce planning time.
a9d490f
to
d7ce5e8
Compare
@joker1007 @YAMASHITAHiroki Please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
I think that this feature is suitable for current our usage.
public static String getColumnValueForCql(Object object, CassandraType cassandraType) | ||
{ | ||
switch (cassandraType) { | ||
case ASCII: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When cassandraType
was ASCII
or VARCHAR
, I thought that an exception would occur in d7ce5e8#diff-bd0665bf8831e4701f735d3b2fda47d8R490.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've fixed it in c7136c6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that almost things have no probrem.
And, I have two questions.
if (isComposite) { | ||
buffer.putShort((short) slice.length()); | ||
buffer.put(slice.getBytes()); | ||
buffer.put((byte) 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it delimiter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a delimiter. This logic comes from NativeCassandraSession.java#L390-L400. Cassandra builds partition keys using CFMetaData#serializePartitionKey, so we should build ones in the same way to get the list of hosts which have data in the partitions. The list is used if node-scheduler.network-topology
is flat
and Cassandra is running on Presto workers.
stringBuilder.append(" = "); | ||
stringBuilder.append(CassandraType.getColumnValueForCql(value, cassandraType)); | ||
} | ||
buffer.flip(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the buffer does flip
?
I guess CassandraPartition
requirements. but I want to know exact reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I convert ByteBuffer to byte array to create a CassandraPartition
in the same way of NativeCassandraSession#getPartitions
. As I mention in https://github.com/reproio/presto/pull/1/files#r229854855, we should build partition keys in the same way as Cassandra does (cf. CFMetaData#serializePartitionKey).
java> ByteBuffer buf = ByteBuffer.allocate(10)
java.nio.ByteBuffer buf = java.nio.HeapByteBuffer[pos=0 lim=10 cap=10]
java> buf.putInt(1)
java.nio.ByteBuffer res6 = java.nio.HeapByteBuffer[pos=4 lim=10 cap=10]
java> buf.array()
byte[] res7 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
java> buf.limit()
java.lang.Integer res8 = 10
java> buf.flip()
java.nio.ByteBuffer res9 = java.nio.HeapByteBuffer[pos=0 lim=4 cap=10]
java> byte[] key = new byte[buf.limit()]
byte[] key = [0, 0, 0, 0]
java> buf.get(key)
java.nio.ByteBuffer res11 = java.nio.HeapByteBuffer[pos=4 lim=4 cap=10]
java> key
byte[] key = [0, 0, 0, 1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. Thanks!!
Previously, all functions (CallExpressions) would be printed as '$function_name$(ARG1, ARG2, ..., ARGN)' but now some expressions will print in human format eg, '(#1) + (BIGINT 5)'. Additional changes were made to pass a `FunctionManager` instance to `RowExpressionFormatter` instead of instantiating one in the class.
NativeCassandraSession#getPartitions
is very slow if there are many partitions used in WHERE clause.In our use, the partitions usually exist, so skipping partition check will reduce planning time and wall time.
In my experiment, "cassandra.skip-partition-check" reduced analysis time of a certain query from 22829 msec to 1060 msec.