decrements total queued mutation size when update idles#5236

Merged
keith-turner merged 1 commit into apache:2.1 from keith-turner:accumulo-5235
Jan 8, 2025

Conversation

@keith-turner
Contributor

fixes #5235

@keith-turner keith-turner added this to the 2.1.4 milestone Jan 8, 2025
@keith-turner
Contributor Author

Not sure how to test this. I plan to manually test it using the added logging before merging.

Contributor

@dlmarion dlmarion left a comment

This looks good to me.

@keith-turner
Contributor Author

Manually tested this using the following test.

public class TmpIT extends AccumuloClusterHarness {

  @Override
  public void configureMiniCluster(MiniAccumuloConfigImpl cfg, Configuration hadoopCoreSite) {
    // cfg.setProperty(Property.TSERV_TOTAL_MUTATION_QUEUE_MAX, "10");
    cfg.setProperty(Property.TSERV_UPDATE_SESSION_MAXIDLE, "10s");
  }

  @Test
  public void testUnclosedUpdateSessions() throws Exception {

    String table = getUniqueNames(1)[0];
    try (AccumuloClient c = Accumulo.newClient().from(getClientProps()).build()) {
      c.tableOperations().create(table);

      Random rand = new Random();

      for (int i = 0; i < 100; i++) {
        var ctx = (ClientContext) c;
        var tableId = ctx.getTableId(table);
        var extent = new KeyExtent(tableId, null, null);
        var tabletMetadata = ctx.getAmple().readTablet(extent, TabletMetadata.ColumnType.LOCATION);
        var location = tabletMetadata.getLocation();
        assertNotNull(location);
        assertEquals(TabletMetadata.LocationType.CURRENT, location.getType());

        TabletClientService.Iface client =
            ThriftUtil.getClient(ThriftClientTypes.TABLET_SERVER, location.getHostAndPort(), ctx);
        // Make the same RPC calls made by the BatchWriter, but pass a corrupt serialized mutation
        // in this try block.
        try {
          TInfo tinfo = TraceUtil.traceInfo();
          long sessionId = client.startUpdate(tinfo, ctx.rpcCreds(), TDurability.DEFAULT);

          byte[] row = new byte[32];
          // value size is between 100 and 999 bytes
          int valSize = rand.nextInt(900) + 100;
          byte[] value = new byte[valSize];

          rand.nextBytes(row);
          rand.nextBytes(value);

          // createTMutation is a helper that is not included in this comment
          client.applyUpdates(tinfo, sessionId, extent.toThrift(),
              List.of(createTMutation(Base64.getEncoder().encodeToString(row),
                  Base64.getEncoder().encodeToString(value))));
          // not closing update session, this will cause it to idle out on the server side.

        } finally {
          ThriftUtil.returnClient((TServiceClient) client, ctx);
        }
      }

      // wait well past the 10s max idle time so the abandoned sessions get cleaned up
      UtilWaitThread.sleep(60000);

    }
  }
}

and saw the following in the logs after enabling trace logging in the config. These are the last few lines of the cleanup; many more were printed.

TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 698
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 4410 after adding -698
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 990
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 3420 after adding -990
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 374
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 3046 after adding -374
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 1166
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 1880 after adding -1166
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 662
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 1218 after adding -662
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletClientHandler] TRACE: cleaning up abandoned update session, decrementing totalQueuedMutationSize by 1218
TabletServer_436604017.out:2025-01-08T16:38:48,687 [tserver.TabletServer] TRACE: totalQueuedMutationSize is now 0 after adding -1218
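
The accounting visible in these trace lines can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the commit: `QueuedSizeSketch`, `Server`, `Session`, `addQueued`, `queue`, and `cleanup` are hypothetical names. The sketch queues the six sizes that appear in the log excerpt above and then cleans each session up in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class QueuedSizeSketch {

  /** Hypothetical stand-in for the server-wide counter the logs call totalQueuedMutationSize. */
  public static final class Server {
    private final AtomicLong totalQueuedMutationSize = new AtomicLong();

    /** Adds delta (which may be negative) and returns the new total. */
    long addQueued(long delta) {
      return totalQueuedMutationSize.addAndGet(delta);
    }
  }

  /** Hypothetical update session that remembers how many bytes it added to the total. */
  public static final class Session {
    private final Server server;
    private long queuedBytes;

    Session(Server server) {
      this.server = server;
    }

    void queue(long bytes) {
      queuedBytes += bytes;
      server.addQueued(bytes);
    }

    /** Run when the session idles out: give back everything this session still has counted. */
    long cleanup() {
      long released = queuedBytes;
      queuedBytes = 0; // a second cleanup call then subtracts nothing
      return server.addQueued(-released);
    }
  }

  /** Queues one session per size, then cleans them up in order, recording the total after each. */
  public static List<Long> replay(long[] sizes) {
    Server server = new Server();
    List<Session> sessions = new ArrayList<>();
    for (long size : sizes) {
      Session session = new Session(server);
      session.queue(size);
      sessions.add(session);
    }
    List<Long> totals = new ArrayList<>();
    for (Session session : sessions) {
      totals.add(session.cleanup());
    }
    return totals;
  }

  public static void main(String[] args) {
    // Decrement sizes copied from the six cleanup lines in the log excerpt.
    long[] sizes = {698, 990, 374, 1166, 662, 1218};
    System.out.println(replay(sizes)); // prints [4410, 3420, 3046, 1880, 1218, 0]
  }
}
```

The running totals match the "is now" values in the excerpt, ending at 0. Without the decrement in the cleanup path, the total in this sketch would stay at 5108 after every session was gone.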

@keith-turner keith-turner merged commit d5756e8 into apache:2.1 Jan 8, 2025
@keith-turner keith-turner deleted the accumulo-5235 branch January 8, 2025 16:43