Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track internode msg bytes #1081

Merged
merged 13 commits into from
May 30, 2024
Merged

Track internode msg bytes #1081

merged 13 commits into from
May 30, 2024

Conversation

aymkhalil
Copy link

@aymkhalil aymkhalil commented Apr 12, 2024

Addresses: https://github.com/riptano/cndb/issues/9194

Tracking internode message bytes.

More generic implementation (i.e. leveraging InboundMessageHandlers like InternodeInboundMetrics have the following complications:

  • The work if we want to track internode messages globally. However per keyspace and table dimensions would require access the the deserialzed message and inverting the payload at runtime.
  • The life cycle of RequestSensors today (from sensor registration to synchronizing the sensors registry) begins on VerbHandler#doVerb. Trying to register and synchronise outside the verb handler boundaries to increment internode sensors would require major changes (probably where the verbs are dispatched in InboundSink)

@aymkhalil
Copy link
Author

  • Update internode sensors tracking to happen at the outboundSink level
  • Encoded KEYSPACE header to simplify encoding internode sensors in the message header. Please note that it is not possible to link the message to a keyspace in the Mutation case because the payload is empty. Even for reads, the message doesn't naturally hold keyspace info in the LocalDataResponse and even if it does, extracting will require inspecting the payload at runtime. I found the header approach to be simpler.
  • Internode sensors are now tracking at the keyspace level only, consequently, there are only tracking in the SensorsRegistry
  • Added SensorsInternodeMessageTest unit tests

919d1cf

@aymkhalil aymkhalil marked this pull request as ready for review April 17, 2024 23:22
Base automatically changed from cc-ucu-counter to cc-ucu May 10, 2024 23:15
@@ -28,6 +29,12 @@
*/
public final class SensorsCustomParams
{
/**

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit unclean, what about passing the keyspace via the existing executor locals mechanism? I.e. we could add a RequestTracker method to register/retrieve keyspaces involved in the current request, wdyt?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. AFAIK RequestTracker would map 1:1 with the keyspace so will add it to the constructor.

*/
public Context(String keyspace)
{
this(keyspace, null, null);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for not being great at remembering things (one more reason I guess to recap our discussions on github), but can you remind me why don't we also track table information?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Batch writes (one idea to mitigate this is to divide bytes and counts equally between tables)
  • The life cycle of RequestSensors today (from sensor registration to synchronizing the sensors registry) begins on VerbHandler#doVerb and ends on the callback. To track per table, two complexities arise:
    • There will be need to synchronize sensors values outside the verb handler boundary
    • We embed sensor value in the response message, that would require somehow injecting the sensor values into the message at the OutboundSink level which is not trivial.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I built on the idea you proposed of adding keyspace to RequestTracker to add all tables too in order to track table info. The caveat is:

  • We don't track internode messages per table per request bc the message size is only known after it's been built
  • The track per table across requests using SensorsRegistry - the very first message belong to an table will not contain internode sensors.

Let me know caa569f

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look and I'm not too convinced, it seems internode tracking adds quite a bit of confusion :/ Let me think if we can make it a bit better and more similar to the other sensors...

/**
* Tracks outbound messages size and count in Sensors Regsitry
*/
private boolean trackOutboundMessages(Message<?> message, InetAddressAndPort ignored)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding internode tracking responsibility to the registry doesn't look right to me: what about moving this code to MessagingService#listen() and rewrite it slightly to use the "standard" sensors API? It would probably be slightly more verbose, but overall cleaner IMO. WDYT?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is cleaner but requires a SensorsRegistry#syncAllSensors if we were to use the standard API. If there is need, we can expose an async variant or a public api to sync a single sensor directly.

@aymkhalil
Copy link
Author

Fixed PR diff cc52860

@aymkhalil aymkhalil force-pushed the cc-ucu-intenode branch 3 times, most recently from e59190b to 442dbff Compare May 13, 2024 23:44
@aymkhalil
Copy link
Author

@sbtourist PR is ready for second round of review. PTAL.

@@ -620,10 +625,32 @@ boolean isConnected(InetAddressAndPort address, Message<?> messageOut)
public void listen()
{
inboundSockets.open();
outboundSink.add(this::trackOutboundMessages);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a nit, but this would be best invoked first, so as soon as we start listening it's all setup.

*/
public Context(String keyspace)
{
this(keyspace, null, null);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look and I'm not too convinced, it seems internode tracking adds quite a bit of confusion :/ Let me think if we can make it a bit better and more similar to the other sensors...

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.
@aymkhalil aymkhalil changed the title Track internode msg bytes and counts Track internode msg bytes May 24, 2024
tables.forEach(tm -> {
Context context = Context.from(tm);
requestSensors.registerSensor(context, Type.INTERNODE_BYTES);
requestSensors.incrementSensor(context, Type.INTERNODE_BYTES, message.serializedSize(MessagingService.current_version) / tables.size());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message.serializedSize

I realized for inbound messages we use the whole message serialized size, should we use the payload size for consistency?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 8543504

@aymkhalil aymkhalil merged commit ad66f41 into cc-ucu May 30, 2024
433 of 444 checks passed
@aymkhalil aymkhalil deleted the cc-ucu-intenode branch May 30, 2024 17:05
@aymkhalil aymkhalil restored the cc-ucu-intenode branch May 30, 2024 17:09
@aymkhalil aymkhalil deleted the cc-ucu-intenode branch May 30, 2024 17:21
aymkhalil added a commit that referenced this pull request Jun 5, 2024
* Collect counter mutation WRITE_BYTES from replicas

* Rely on replica count to accommodate for mutation replica sensor values

* Empty-Commit

* Track internode msg bytes and counts

* Track internode sensors at the outbound sink level

* Fix NPE for ks only sensors

* Fix more NPEs in unit tests

* Rebase on top of origin/cc-ucu

* Remove ks from headers & track internode messages in MessagingService

* Add internode sensors per ks and table

* Opinionated changes to internode tracking (#1121)

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.

* Use payload size for inbound messages

---------

Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
aymkhalil added a commit that referenced this pull request Sep 12, 2024
* Collect counter mutation WRITE_BYTES from replicas

* Rely on replica count to accommodate for mutation replica sensor values

* Empty-Commit

* Track internode msg bytes and counts

* Track internode sensors at the outbound sink level

* Fix NPE for ks only sensors

* Fix more NPEs in unit tests

* Rebase on top of origin/cc-ucu

* Remove ks from headers & track internode messages in MessagingService

* Add internode sensors per ks and table

* Opinionated changes to internode tracking (#1121)

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.

* Use payload size for inbound messages

---------

Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
aymkhalil added a commit that referenced this pull request Sep 12, 2024
* Collect counter mutation WRITE_BYTES from replicas

* Rely on replica count to accommodate for mutation replica sensor values

* Empty-Commit

* Track internode msg bytes and counts

* Track internode sensors at the outbound sink level

* Fix NPE for ks only sensors

* Fix more NPEs in unit tests

* Rebase on top of origin/cc-ucu

* Remove ks from headers & track internode messages in MessagingService

* Add internode sensors per ks and table

* Opinionated changes to internode tracking (#1121)

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.

* Use payload size for inbound messages

---------

Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
djatnieks pushed a commit that referenced this pull request Sep 17, 2024
* Collect counter mutation WRITE_BYTES from replicas

* Rely on replica count to accommodate for mutation replica sensor values

* Empty-Commit

* Track internode msg bytes and counts

* Track internode sensors at the outbound sink level

* Fix NPE for ks only sensors

* Fix more NPEs in unit tests

* Rebase on top of origin/cc-ucu

* Remove ks from headers & track internode messages in MessagingService

* Add internode sensors per ks and table

* Opinionated changes to internode tracking (#1121)

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.

* Use payload size for inbound messages

---------

Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
djatnieks pushed a commit that referenced this pull request Sep 18, 2024
* Collect counter mutation WRITE_BYTES from replicas

* Rely on replica count to accommodate for mutation replica sensor values

* Empty-Commit

* Track internode msg bytes and counts

* Track internode sensors at the outbound sink level

* Fix NPE for ks only sensors

* Fix more NPEs in unit tests

* Rebase on top of origin/cc-ucu

* Remove ks from headers & track internode messages in MessagingService

* Add internode sensors per ks and table

* Opinionated changes to internode tracking (#1121)

* Opinionated changes to internode tracking:
* Removed message count as bytes should really be the most important/accurate.
* Tracked both inbound and outbound bytes, as writes take up inbound bytes, while reads take up outbound bytes.
* Added internode sensors to custom params in the same place where we add the other sensors.
* Added internode sensors for request.

* Added SensorsInternodeTest + bug fixes.

* * Use only the payload size as internode bytes.
* Cache the payload size where possible.

* Use payload size for inbound messages

---------

Co-authored-by: Sergio Bossa <sergio.bossa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants