This repository has been archived by the owner on Jun 4, 2024. It is now read-only.

Do not hold on to block data in the plot worker #7235

Merged
merged 26 commits into main from caleb/13-12-23/feat/dont-keep-block-data on Dec 23, 2023

Conversation

@cfoust (Contributor) commented Dec 15, 2023

User-Facing Changes
Drastically reduce the amount of memory required for plotting.

Description
In the beginning, I moved the plot pipeline to a Web Worker in order to reduce the burden that dataset creation and downsampling put on the main thread. This was a big change, so to reduce the number of variables I kept the logic for aggregating messages the same. As a result, we had two copies of the message data used for plotting: one in the main thread and another in the worker. This approach cost a lot of memory, but interacting with plots was much faster.

This PR finally closes the loop and makes it so that the main thread sends the messages that plots need on demand; the worker only holds on to plot datasets and not the messages necessary to produce them. This greatly reduces the memory usage of plots, with the downside that the relevant raw messages must be transmitted to the worker in full when plot parameters change.

This adds a fair bit of complexity, so to make it easier to test and reason about, I broke out the logic for sending messages to clients into a standalone state machine that is used by useDatasets, and added a suite of tests.
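
For orientation, here is a rough sketch of the shape this describes, with purely illustrative names and types (the actual source is more involved): the main thread keeps the raw messages and the subscriptions, and re-sends only what a client needs when its parameters change.

type Topic = string;
type PlotMessage = { topic: Topic; receiveTime: number; message: unknown };
type SendToWorker = (clientId: string, events: PlotMessage[]) => void;

interface MainThreadState {
  // Raw messages stay on the main thread, keyed by topic.
  current: Record<Topic, PlotMessage[]>;
  // Which topics each plot client is subscribed to.
  subscriptions: Record<string, Topic[]>;
}

// When a client's plot parameters change, re-send only the raw messages that
// client needs; the worker rebuilds its dataset from them and then discards them.
function resendForClient(state: MainThreadState, clientId: string, send: SendToWorker): void {
  const topics = new Set(state.subscriptions[clientId] ?? []);
  const events = Object.values(state.current)
    .flat()
    .filter(({ topic }) => topics.has(topic));
  send(clientId, events);
}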

Known issues
There is still a timing issue affecting plot stories that I am investigating. Update: fixed!

@cfoust (Contributor, Author) commented Dec 19, 2023

I fixed the issue with current data. We now retransmit it to a specific client when plot params change. Right now we transmit all current data subscribed to by plots, not just what the client actually needs, so it's a little wasteful but I'm still polishing.

@defunctzombie (Contributor) commented, quoting the description:
relevant raw messages must be transmitted to the worker in full when plot parameters change.

What plot parameter changes cause this to happen?

let blockState = initBlockState();
let clients: Record<string, Client> = {};
let datasetsState: DatasetsState = initDatasets();
let callbacks: Record<string, (topics: SubscribePayload[]) => void> = {};

Contributor:

Not a fan of all these globals - why do we have globals here?

cfoust (author):

It's module-level state that exists outside of the React tree. I prefer not to connect state to React that doesn't need to be there (and, for example, does not on its own trigger component updates.)
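
As a sketch of the pattern being described (illustrative types and function bodies, not the actual source): the maps above are plain module-level singletons that ordinary functions mutate, so touching them never schedules a React render on its own.

type Client = { id: string };
type SubscribePayload = { topic: string };

let clients: Record<string, Client> = {};
let callbacks: Record<string, (topics: SubscribePayload[]) => void> = {};

// Called from outside the React tree (e.g. by the worker bridge); mutating
// these records does not trigger any component update by itself.
export function registerClient(id: string, onSubscribe: (topics: SubscribePayload[]) => void): void {
  clients[id] = { id };
  callbacks[id] = onSubscribe;
}

export function unregisterClient(id: string): void {
  delete clients[id];
  delete callbacks[id];
}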

@cfoust (Contributor, Author) commented Dec 20, 2023

What plot parameter changes cause this to happen?

Anything that affects the output points--so, basically, all plot parameter changes. Theoretically there are some kinds of mutations for which we could just modify the existing dataset, but without having the messages used to generate that dataset it's pretty tricky.

@cfoust (Contributor, Author) commented Dec 20, 2023

Just to recap:

  1. I realized that we also needed to retransmit current data to the worker when plot parameters changed, though fixing this is much simpler than the approach necessary for transmitting blocks.
  2. I made it so that when a client is reset, we only send the current data it needs rather than all of the accumulated current messages.

},
// eslint-disable-next-line @foxglove/no-boolean-parameters
setLive(value: boolean): void {
state = setLive(value, state);
},
unregister(id: string): void {
unregisterCleint(id: string): void {

Contributor:

Typo: Cleint

Comment on lines +82 to +88
// Only send the current events that the client can actually use. This also
// saves us from having to `structuredClone` unused data
const topics = new Set(R.uniq(getClientPayloads(client).map(({ topic }) => topic)));
void service?.addCurrentData(
  current.filter(({ topic }) => topics.has(topic)),
  id,
);

Contributor:

If all plot clients have a rolling data window (e.g. only the last 10 seconds are plotted), could we also filter out any current data that is not within that window?

cfoust (author):

That's a bit tricky and even potentially undesirable--as a user, I would not expect that to happen. For example, users seem to rely on the fact that data stays around (even if it's not visible) for the CSV export functionality

 * client specified by `clientId`.
 */
export function addCurrentData(
  events: readonly MessageEvent[],

Contributor:

Is this guaranteed to be at most one message per topic? Because otherwise we might drop some messages with R.map((messages) => messages.slice(-1), current) below.

cfoust (author):

That's only for a single-message plot--which necessarily will ignore anything but the most recent message
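
To make that concrete, here is a small, self-contained illustration of the R.map((messages) => messages.slice(-1), current) call referenced above (the Msg type is made up): only the most recent message per topic survives.

import * as R from "ramda";

type Msg = { topic: string; value: number };

const current: Record<string, Msg[]> = {
  "/temp": [
    { topic: "/temp", value: 1 },
    { topic: "/temp", value: 2 },
  ],
};

// => { "/temp": [{ topic: "/temp", value: 2 }] }
const latestOnly = R.map((messages) => messages.slice(-1), current);
console.log(latestOnly);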


if (isSingleMessage(params)) {
const plotData = buildPlot(
current: accumulate(

Contributor:

If all of these new current messages are already preloaded, do we still have to recompute the plot data and send it to the client? I would assume that the plot data wouldn't/shouldn't change in that case?

cfoust (author):

Theoretically yes, but it's a pretty negligible amount of compute, worth fixing later only if we really think it's a problem

Member:

Just for my understanding: in the case where we do have both blocks & current data, where is the logic that chooses to use the block data instead of the current data?

cfoust (author):

For generating the final dataset? That happens here

concatEffects((newState: State): StateAndEffects => {
const { pending, blocks } = newState;
const { pending } = newState;

Member:

What changed in this section? I am getting a bit lost in the details and might be missing the big picture. Is it that we now use applyBlockUpdate instead of splatting the pending data into blocks?

cfoust (author):

Yup, exactly. It was weird and complicated to have two different code paths for ingesting block data, one that used messages and one that used BlockUpdates.

Comment on lines +128 to +148
const allUpdates = pending.map(
(update: BlockUpdate): [next: BlockUpdate, applied: BlockUpdate] => {
const { updates } = update;
const [used, unused] = R.partition(
({ id: clientId }) => clientIds.includes(clientId),
updates,
);
return [
{ ...update, updates: unused },
{ ...update, updates: used },
];
},
);

const allNewTopics = getAllTopics(newState);
const newData = R.pick(allNewTopics, pending);
if (R.isEmpty(newData)) {
return [newState, []];
}
const newPending: BlockUpdate[] = allUpdates
.filter(([unused]) => unused.updates.length > 0)
.map(([unused]) => unused);

const newTopics = Object.keys(newData);
const updatesToApply: BlockUpdate[] = allUpdates
.filter(([, used]) => used.updates.length > 0)
.map(([, used]) => used);

Member:

I think I've understood what this is doing (though maybe I am missing something). There seems to be a lot of dancing around the difficulty of partitioning each item in a list and then extracting, or sort of "transposing", the results to end up with two lists.

I think it would be shorter, clearer, and more efficient if structured more like below – unless there is some subtlety that I have lost – but if there is, I think it is too subtle because I can't see it after several minutes of reading the code.

const newPending: BlockUpdate[] = [];
const updatesToApply: BlockUpdate[] = [];
for (const update of pending) {
  for (const { id: clientId } of update.updates) {
    if (clientIds.includes(clientId)) {
      updatesToApply.push(update);
    } else {
      newPending.push(update);
    }
  }
}

(I'm also not quite sure I understand why we have two levels of nested collections – pending updates where each update has its own array of updates?)

cfoust (author):

While this section could use some improvement--no denying that--your code snippet copies the BlockUpdate (including all of its updates) on every iteration. The subtlety is that ClientUpdates refer to data inside of the messages field on the BlockUpdate; you cannot just combine BlockUpdates as it seems you're suggesting in the parenthetical at the end of your comment.

I can think of ways to clean this up, however! It's a good call-out that this is hard to understand.

Member:

What do you mean by copy? I don’t see anything in my snippet that would be creating new objects, just organizing them into new arrays right?

cfoust (author):

“copy” here meaning “duplicate”, not in terms of memory layout. You are appending the entire BlockUpdate to the list for every ClientUpdate

cfoust (author):

Which, though it’s clearly just copying the reference, is a little counterintuitive since then you have to reaggregate them when you apply all the pending updates for a client that joins later

Member:

I see, I think I get your point now. Would this be more accurate?

const newPending: BlockUpdate[] = [];
const updatesToApply: BlockUpdate[] = [];
for (const blockUpdate of pending) {
  const [used, unused] = R.partition(
    ({ id: clientId }) => clientIds.includes(clientId),
    blockUpdate.updates,
  );
  if (used.length > 0) {
    updatesToApply.push({ ...blockUpdate, updates: used });
  }
  if (unused.length > 0) {
    newPending.push({ ...blockUpdate, updates: unused });
  }
}
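
For readers following this thread, a rough sketch of the shapes being discussed, inferred from the conversation; the exact field names and types in the source may differ.

type Messages = Record<string, unknown[]>; // per-topic message arrays

type ClientUpdate = {
  id: string; // the client this slice of the update is destined for
  update: {
    topic: string;
    shouldReset: boolean;
    // ...points into data held by the enclosing BlockUpdate's `messages`
  };
};

type BlockUpdate = {
  // One "step" forward in the block state machine, fanned out to its clients.
  updates: ClientUpdate[];
  messages: Messages;
};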

Comment on lines +50 to +53
// We aggregate all of the updates for each client and then apply them as a
// group. This is because we don't want the `shouldReset` field, which can
// reset the plot data, to throw away data we aggregated from an update we
// just applied.

Member:

It sounds like this means we don't necessarily want to apply the updates in order – why is that? I could imagine that if there is a list of updates with a reset=true in the middle, then it should theoretically reset things that were aggregated before it but not after it.

cfoust (author):

Maybe I could state this more clearly, but a single BlockUpdate contains a single "step" forward in the block state machine. In other words, this code cannot result in out-of-order application. The scenario it's avoiding is where you apply ClientUpdates from the updates field one at a time. If one update contains shouldReset=true, all of them will for a given client, but if you apply them one by one, you might reset the client's data entirely (with shouldReset) several times.

e.g., this goes from:

  shouldReset, update, shouldReset, update

to

  shouldReset, update, update
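
In code terms, a minimal self-contained sketch of that aggregation (names are illustrative): apply a client's pending updates as one group, so shouldReset clears the dataset at most once rather than once per update.

type Dataset = { points: number[] };
type PendingUpdate = { shouldReset: boolean; points: number[] };

function applyUpdatesAsGroup(dataset: Dataset, updates: PendingUpdate[]): Dataset {
  // Reset once if any update in the group asks for it...
  const base: Dataset = updates.some((u) => u.shouldReset) ? { points: [] } : dataset;
  // ...then append every update's data exactly once.
  return { points: [...base.points, ...updates.flatMap((u) => u.points)] };
}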

// If we get updates for clients that haven't registered yet, we've got to
// keep that data around and use it when they register
const clientIds = state.clients.map(({ id }) => id);
const unused = updates.filter(({ id }) => !clientIds.includes(id));

Member:

Random drive-by thought as I'm reading this – if we are usually accessing updates by specific client ids, would it make sense to store them grouped by id instead of in a single array?

cfoust (author):

While this is totally subjective and I'm not going to die on this hill, I have a mild preference in favor of lists over associative arrays, mostly because it makes mapping and filtering more ergonomic. In this case, however, it may make sense given that we just groupBy in applyBlockUpdate anyway, but I will think about it
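
For reference, the grouping in question is a one-liner either way, so the choice is mostly about where it lives; a small illustration with a made-up Update type:

import * as R from "ramda";

type Update = { id: string; topic: string };

const updates: Update[] = [
  { id: "a", topic: "/x" },
  { id: "b", topic: "/y" },
  { id: "a", topic: "/z" },
];

// Flat list in, per-client view out:
// => { a: [{ id: "a", ... }, { id: "a", ... }], b: [{ id: "b", ... }] }
const byClient = R.groupBy(({ id }) => id, updates);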

packages/studio-base/src/panels/Plot/processor/messages.ts (outdated comments, resolved)
packages/studio-base/src/panels/Plot/blocks.ts (outdated comments, resolved)
packages/studio-base/src/panels/Plot/processor/messages.ts (outdated comments, resolved)
Comment on lines 91 to 92
return {
...a,

Member:

I'm mildly concerned that a proliferation of reduce with object spreads will lead to a lot more CPU work being done than it would be with a loop more like const clientMessages = {}; for (const clientUpdate of updates) { clientMessages[clientUpdate.update.topic] = ... }. In less-hot code paths or when N is small I'm sure it's not a big deal but I am gathering that this will be happening rapidly many times in a row for each topic during preloading?

@cfoust (author) commented Dec 22, 2023

In the worst case this code block runs N*M times per client while preloading data, where N is the number of blocks and M is the number of topics, so usually this is pretty small. In relative terms, the compute cost of this is dwarfed by dataset generation and downsampling, where runtime complexity is a function of the number of messages.
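
To spell out the trade-off being discussed (illustrative types, not the actual source): the spread-based reduce copies the accumulator's keys on every step, while the in-place loop mutates a single object.

type TopicUpdate = { topic: string; messages: unknown[] };

// Quadratic-ish: each iteration spreads all previously accumulated keys.
function withReduce(updates: TopicUpdate[]): Record<string, unknown[]> {
  return updates.reduce<Record<string, unknown[]>>(
    (acc, { topic, messages }) => ({ ...acc, [topic]: messages }),
    {},
  );
}

// Linear: one accumulator object mutated in place.
function withLoop(updates: TopicUpdate[]): Record<string, unknown[]> {
  const clientMessages: Record<string, unknown[]> = {};
  for (const { topic, messages } of updates) {
    clientMessages[topic] = messages;
  }
  return clientMessages;
}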


@defunctzombie merged commit dcf781b into main on Dec 23, 2023
14 checks passed
@defunctzombie deleted the caleb/13-12-23/feat/dont-keep-block-data branch on December 23, 2023 at 20:24
@@ -24,6 +25,9 @@ import { BlockUpdate, ClientUpdate } from "../blocks";
import { Messages } from "../internalTypes";
import { isSingleMessage } from "../params";

// Maximum number of accumulated current messages before triggering a cull
const ACCUMULATED_CURRENT_MESSAGE_CULL_THRESHOLD = 50_000;

Contributor:

Should we also have an upper limit based on memory size? Depending on the message size, 50k messages might be way more than the allowed JS heap size, so holding on to the full message objects should be avoided.

Contributor:

Possibly - though I think we use typed arrays for storing this data, so it won't have the same kind of impact on the heap?

Contributor:

Ah sorry, you are right. I was referring to the wrong code section; I actually meant

datasetsState = updateCurrent(messages, datasetsState);
where we concatenate new current message events to the existing ones, and there seems to be no culling in place like there is here. I'm pretty sure this is one of the main reasons for OOM crashes.
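
A hypothetical sketch of the kind of guard being asked for, reusing the threshold added in this PR (the helper itself is not in the PR; the follow-up issue below tracks the real fix):

const ACCUMULATED_CURRENT_MESSAGE_CULL_THRESHOLD = 50_000;

function appendCurrent<T>(existing: T[], incoming: T[]): T[] {
  const combined = existing.concat(incoming);
  // Keep only the most recent messages once the threshold is exceeded.
  return combined.length > ACCUMULATED_CURRENT_MESSAGE_CULL_THRESHOLD
    ? combined.slice(combined.length - ACCUMULATED_CURRENT_MESSAGE_CULL_THRESHOLD)
    : combined;
}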

Contributor:

-> #7298
