-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed ClassCastException during Multi-Stage Queries on real-time segments #16607
base: master
Are you sure you want to change the base?
Fixed ClassCastException during Multi-Stage Queries on real-time segments #16607
Conversation
…hat include columns of HLLSketch type resulted in a ClassCastException.
if (val == null) { | ||
return ByteArrays.EMPTY_ARRAY; | ||
} else if (val instanceof byte[]) { | ||
return (byte[]) val; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is incorrect because there might exist a serde where the deserialized value is byte[], however, the serialized value can be something else.
For example, different compression techniques can transform a deserialized byte[] array to a different serialized byte[] array. This would be unique to the complex type, so it shouldn't be a universal change. Instead, the following changes can be made:
- If this only occurs in the HllSketchBuild serde, you can make this change specific to the HllSketchBuild serde.
- Else, if this occurs with all the complex types while running MSQ tasks on complex metrics, you can make the changes in MSQ-specific classes (after confirming where the byte[] array is coming from).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out! It looks like this issue might not be limited to just the HLLSketch. None of the ObjectStrategies seem to fit this scenario, as they all expect specific objects for the toBytes() method. Using it with a byte array directly would lead to the CCE. I'll look into why selector.getObject is returning a byte array for a real-time segment maybe it could be fixed there as well. It is not appears in MSQ classes directly tho, it shows up at ComplexFieldWriter.writeTo()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np! Since the issue is bubbling up from the column value selectors, you can check who creates those, and why they are returning a byte array instead of the desired object type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some research and it looks like it's normal for selector.getObject to return a byte[] for real-time segments. This happens because it uses RowBasedColumnValueSelector instead of ObjectColumnSelector. It doesn't go through GenericIndexed.copyBufferAndGet() and strategy.fromByteBuffer(), and just grabs the byte[] directly from the ByteBuffer using rowSupplier.get() (at least in case of ComplexType from RealTime segment MSQ).
Right now, we don't have an ObjectStrategy that needs a byte[] as input, so initial setup would likely cause CCE failures. But you're right, we might need to handle this in the future. So, I've moved this check to the ObjectStrategy as a defaul void, making it easy to override if needed.
@@ -116,7 +116,7 @@ | |||
public byte[] toBytes(@Nullable Object val) | |||
{ | |||
if (val != null) { | |||
byte[] bytes = getObjectStrategy().toBytes(val); | |||
byte[] bytes = getObjectStrategy().objectToBytes(val); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note
ComplexMetricSerde.getObjectStrategy
…ith-hllsketch-in-realtime
Thanks for the changes @nozjkoitop. I'll review them shortly. |
@LakshSingla Could I remind you to have a look on that please? |
Sorry for the delay @nozjkoitop and thanks for the update 🚀 The new approach is a lot cleaner than the previous one, but I still wanna see if there's a way to restrict the change to MSQ, since I suspect that there's a faulty selector at play. The MSQ test case would help in debugging this as well. |
…ith-hllsketch-in-realtime
I've added new test-case with hyperUnique column in select from realtime segment, please have a look |
Fixed issue when msq queries throw ClassCastException with
"includeSegmentSource": "REALTIME"
and supervisor running.After some investigation was found that selector's getObject() for real-time segments was returning a byte array representation of HllSketch but still was trying to convert it to a byte[]. While HllSketchHolderObjectStrategy expects to have HllSketchHolder instance in toBytes() leading to the ClassCastException.
This PR has: