-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opinionated changes to internode tracking #1121
Conversation
sbtourist
commented
May 15, 2024
- Removed message count as bytes should really be the most important/accurate.
- Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
- Added internode sensors to custom params in the same place where we add the other sensors.
- Added internode sensors for request.
* Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request.
8aa88ca
to
1bba5f7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few comments, but in general I'm convinced about the benefits of this approach:
- Consistency with other sensors
- Simpler implementation when compared to the MessagingService sink approach).
However, the trade-offs are:
- Approximation of message size which I think is no big deal
- Extra work for size calculation <-- I'm not sure about this - heart says its fine, brain says this is on the critical path of messaging but we can test for regression in performance.
Message.Builder<ReadResponse> reply = message.responseWithBuilder(response); | ||
addReadBytesSensorToResponse(reply, context); | ||
int size = reply.currentSize(MessagingService.current_version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, the sensor custom headers themselves as not added so just confirming you intentionally want to exclude them? On one hand, they are not part of the user data, but at the end of the day they are bytes consuming network BW so I would've thought we should include them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is intentional. Rationale is they should be a pale share of the overall message size in prod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it, but bc it doesn't cost us anything to include them (just calculating the size right before reply.build()
I think why not just include all sensors do - it is as accurate of a measure of network bytes as it gets. Alternatively, we could just keep the payload and forget the headers altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bc it doesn't cost us anything to include them (just calculating the size right before reply.build() I think why not just include all sensors
It's a chicken and egg problem :) How do you include the internode bytes in the header if you haven't computed them yet?
Alternatively, we could just keep the payload and forget the headers altogether.
I actually like this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a chicken and egg problem :) How do you include the internode bytes in the header if you haven't computed them yet?
I'm just reference to the other UCU sensors (WRITE/READ/INDEX BYTES
) can be included. The minimal bytes footprint to exclude is the one that actually captures INTERNODE BYTES
But payload only works for me too
return doBuild(hasId ? id : nextId()); | ||
} | ||
|
||
public int currentSize(int version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically we build the message twice now - once we calculate the size, and another when want to build the actual response, is this ok performance wise? This was one of the reasons to use register to the outbound sink (tradeoff being complexity of the overall approach, just making sure you think the redundant message build is worth it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point taken. The most expensive computation is probably the payload size, what if we cache it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how caching the payload size would help. It seems you call it only once? I was also thinking building the message to calculate the size has a memory hit.
The size calculation of the message and including it in the headers is a chicken and egg problem. tbh don't see an effective way around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you call it only once?
Once per message, but we build the message twice. The reason we cache it in the message is exactly the fact that it's the most expensive size computation.
I was also thinking building the message to calculate the size has a memory hit.
Building the Message
is just a bunch of objects assignment, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once per message, but we build the message twice. The reason we cache it in the message is exactly the fact that it's the most expensive size computation.
Exactly - that's why I wasn't sure how caching would help, maybe I misunderstood your suggestion.
Building the Message is just a bunch of objects assignment, isn't it?
Fair point. No more concerns here.
RequestSensors sensors = new RequestSensors(keyspace, ImmutableSet.of(command.metadata())); | ||
sensors.registerSensor(context, Type.READ_BYTES); | ||
ExecutorLocals locals = ExecutorLocals.create(sensors); | ||
ImmutableSet<TableMetadata> tables = ImmutableSet.of(command.metadata()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always a single table, not sure we need the Set.
int perSensorSize = response.currentSize(MessagingService.current_version) / requestSensors.size(); | ||
requestSensors = RequestTracker.instance.get().getSensors(Type.INTERNODE_BYTES); | ||
requestSensors.forEach(sensor -> RequestTracker.instance.get().incrementSensor(sensor.getContext(), sensor.getType(), perSensorSize)); | ||
RequestTracker.instance.get().syncAllSensors(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the syncAllSensors
is redundant (and in other handlers) - it is already taken care of when mutation is applied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this comes after the mutation is applied right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. In that case, the syncAllSensors
call in the apply method (or when the row iterator is closed in the read case) is redundant. Should we just keep the call in the verb handler? Now that you made idempotent, I'm less concerned about it but still a small performance hit
requestSensors = RequestTracker.instance.get().getSensors(Type.INDEX_WRITE_BYTES); | ||
addSensorsToResponse(requestSensors, requestParam, tableParam, response); | ||
|
||
// Add internode bytes sensors to the response after updating each per-table sensor with the current response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented in the ReadCommandVerbHandler
regarding this - I mean if we go with the approximation, we need to make it consistent - here we are including the sensors values from above (WRITE_BYTES
and INDEX_WRITE_BYTES
) are already added, but in the read example, READ_BYTES
is added after we calculate the size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, we can refactor this.
* Cache the payload size where possible.
@aymkhalil This should address the points previously discussed: a14504f |
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
* Collect counter mutation WRITE_BYTES from replicas * Rely on replica count to accommodate for mutation replica sensor values * Empty-Commit * Track internode msg bytes and counts * Track internode sensors at the outbound sink level * Fix NPE for ks only sensors * Fix more NPEs in unit tests * Rebase on top of origin/cc-ucu * Remove ks from headers & track internode messages in MessagingService * Add internode sensors per ks and table * Opinionated changes to internode tracking (#1121) * Opinionated changes to internode tracking: * Removed message count as bytes should really be the most important/accurate. * Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes. * Added internode sensors to custom params in the same place where we add the other sensors. * Added internode sensors for request. * Added SensorsInternodeTest + bug fixes. * * Use only the payload size as internode bytes. * Cache the payload size where possible. * Use payload size for inbound messages --------- Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>