
Reduce amount of subscription data stored in Elixir registry #1006

Conversation


zoldar commented Dec 22, 2020

This PR is a follow-up to a proposal that was initially posted on the Elixir Slack and on Elixir Forum.

Instead of storing the precomputed initial_phases, where the options (along with the context) are repeated several times over, we store just enough information to reconstruct them using Absinthe.Pipeline.for_document/2. That comes down to building a list of phases with the options inserted in a number of spots, which should not be computationally expensive.
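A rough sketch of the idea as proposed here (illustrative only, not the exact diff; schema and options stand in for whatever the subscription entry already carries):

# Rather than persisting the full, precomputed phase list, keep only the inputs
# needed to rebuild it when the subscription fires.
rebuilt_phases = Absinthe.Pipeline.for_document(schema, options)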

If I haven't missed anything obvious, that change should be transparent and non-breaking 😅

I have run a series of simple tests with absinthe's test suite as well as the test suite of a larger app I'm working on that uses subscriptions extensively. For the purpose of the experiment I added a line calculating the byte size of the serialized map stored in the Elixir Registry by Absinthe.Subscription.subscribe, like this:

# doc_value is the term stored in the Registry for the subscription; serializing it
# with :erlang.term_to_binary/1 gives a rough, relative measure of its size.
doc_value |> :erlang.term_to_binary() |> byte_size() |> IO.inspect(label: :PAYLOAD)

This is not the same amount of space it takes up in ETS memory, but it gives an idea in relative terms. I then ran the tests for both absinthe and the app without and with the patch applied. Here are the results:

https://gist.github.com/zoldar/9f06e95bfced82cd25793c25f2e632c5

As can be seen, memory usage can be lowered more than 10x this way. Given that, in our case, roughly 90% of VM memory in production is taken up by those ETS entries for subscriptions (multiple subscriptions per user), this would be a substantial win. What do you think?

UPDATE: Here's one more round of test runs against our app where the sum of memory occupied by the ETS tables for the key partitions is checked (10648 is subtracted from each table because that seems to be the base size on this particular setup):

https://gist.github.com/zoldar/d99c636a933725c73d2a4f7131fdd936
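For reference, a measurement along those lines could look roughly like the snippet below (illustrative only, not the exact code from the gist; partition_tables is an assumed list of the registry partition ETS table ids):

word_size = :erlang.system_info(:wordsize)

# :ets.info/2 reports the table size in machine words, so multiply by the word
# size to get bytes.
total_bytes =
  partition_tables
  |> Enum.map(fn table -> :ets.info(table, :memory) * word_size end)
  |> Enum.sum()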

@zoldar force-pushed the zoldar/reduce-size-of-subscription-data-kept-in-elixir-registry branch from aa4760e to 26ce61b on December 22, 2020 09:52
@benwilson512

Hey @zoldar this is really great! The current structure of some of this stuff definitely causes issues when the structures are flattened inside of :ets. The changes you're making here are very small and look pretty safe, but I need to make sure that the phases are the correct ones. I'll pull this down today and take a look, thanks!

@benwilson512 left a comment

I think the specific approach here can't work; for_document/2 is simply not equivalent to doc.initial_phases. However, if there are a lot of memory savings to be had, then perhaps we can do some "compression" of sorts on the pipeline that is registered, to minimize the duplication that occurs when it is flattened in the :ets table. Let me take a look and see.

lib/absinthe/subscription.ex
@zoldar force-pushed the zoldar/reduce-size-of-subscription-data-kept-in-elixir-registry branch from 26ce61b to 187e655 on December 28, 2020 13:10

zoldar commented Dec 28, 2020

@benwilson512 You are right, of course; I got hung up on Absinthe.run using that default pipeline 🤦

I have made a PoC of an alternative approach in 187e655 - as described in #1006 (comment).

I have run that version against our app's test suite again and the savings still seem pretty sizable:

https://gist.github.com/zoldar/f185d0896a217e36db13bc1f67f7f12e

I'm still going to verify it with additional tests, but when inspecting the serialized/deserialized data, it seems to behave correctly. What do you think about this approach?

@binaryseed

I'm curious if we can look at the broader context of this... what is it that is winding up in options that is getting duplicated & taking up a lot of memory?

@benwilson512

@binaryseed that's a good question. Often one killer thing for folks is the context; a large context can explode things very fast. Large inputs (variables) can do the same thing.

In Absinthe 2.0 I want to build a proper %Absinthe.Pipeline{} struct that would contain a simple phases key (a list of just the modules) and an options key; I think per-phase options add unnecessary complications.
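A purely hypothetical sketch of that idea (the struct below is not an existing Absinthe API; the phase modules are only examples):

defmodule Pipeline do
  # Phases are a plain list of modules; options live once at the pipeline level
  # instead of being repeated per phase.
  defstruct phases: [], options: []
end

pipeline = %Pipeline{
  phases: [Absinthe.Phase.Parse, Absinthe.Phase.Blueprint],
  options: [context: %{}, variables: %{}]
}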

I think the best option right now is to sort of re-create that structure by de-duplicating the options. Off the top of my head:

defp compress([], compressed_phases, opts_map, _index) do
  # Invert opts_map (opts => label) into label => opts for decompression.
  {Enum.reverse(compressed_phases), Map.new(opts_map, fn {opts, label} -> {label, opts} end)}
end

defp compress([phase | phases], compressed_phases, opts_map, index) do
  case phase do
    {phase, opts} ->
      {label, opts_map, index} = compress_opts(opts_map, opts, index)
      compress(phases, [{phase, label} | compressed_phases], opts_map, index)

    phase ->
      compress(phases, [phase | compressed_phases], opts_map, index)
  end
end

defp compress_opts(opts_map, opts, index) do
  case Map.fetch(opts_map, opts) do
    # Already seen these opts - reuse their existing label.
    {:ok, label} ->
      {label, opts_map, index}

    # New opts value - assign the next label.
    :error ->
      index = index + 1
      {index, Map.put(opts_map, opts, index), index}
  end
end

Basically, this turns a list of {phase, opts} tuples into a list of {phase, integer} tuples where the integer maps to a unique opts value. Haven't actually run this but I'm pretty sure it's roughly correct. Decompressing is as simple as:

Enum.map(compressed_phases, fn
  {phase, i} -> {phase, Map.fetch!(opts_map, i)}
  phase -> phase
end)
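Hypothetical usage of the sketch above (the initial accumulator values are an assumption, not something spelled out in this comment):

{compressed_phases, opts_map} = compress(doc.initial_phases, [], %{}, 0)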


zoldar commented Dec 29, 2020

@benwilson512 Just to double-check - have you seen 187e655? 😅 (mentioned in #1006 (comment))

@benwilson512 left a comment

Ah lol, I missed that, sorry. Looks great, and I'm glad to see there are still size savings.

lib/absinthe/subscription/phase_serializer.ex (outdated)

{{phase, options_label}, options_reverse_map}
else
new_index = map_size(options_reverse_map)
@benwilson512 left a comment

Ah clever, using the map size as an implicit indexer.
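Illustrative only: because each new label is taken as map_size/1 of the accumulator, every distinct options value gets the next integer without tracking a separate counter.

options_reverse_map = %{}

label_a = map_size(options_reverse_map)                                     # 0
options_reverse_map = Map.put(options_reverse_map, [context: %{}], label_a)

label_b = map_size(options_reverse_map)                                     # 1
options_reverse_map = Map.put(options_reverse_map, [context: %{a: 1}], label_b)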


zoldar commented Dec 29, 2020

@benwilson512 Cool! I'll get it into shape (apply feedback and add serializer tests) soon.


zoldar commented Dec 30, 2020

@benwilson512 The PR should now be fully ready for review 😅

One observation, though. I'm currently keeping a fork of an older tag of absinthe with the same patch applied. The fork does not contain this change: b407000. The telemetry phases there repeat the original options twice, with a customized event for each respective phase. For the current version of absinthe that means (provided this default pipeline serves as a base) that even the packed version will always store at least 3 full copies of the options from the get-go, with significant overlap.

I'm wondering whether it would be worth the effort to make the serializer handle this with an extra processing step (perhaps diffing the existing options and finding a common base)? Maybe there's a better way to handle it, though.

(That's definitely something beyond the scope of this PR).
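A hedged sketch of that "common base" idea (purely illustrative; none of these keys or events are Absinthe internals):

# Store the shared options once, plus only the per-phase extras, and rebuild the
# full options with Keyword.merge/2 on read.
base_opts = [context: %{current_user: 123}, variables: %{"id" => "42"}]

per_phase_extras = %{
  0 => [],
  1 => [event: [:my_app, :some_phase, :start]]
}

rebuild = fn label -> Keyword.merge(base_opts, Map.fetch!(per_phase_extras, label)) end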


zoldar commented Jan 6, 2021

If there's anything that needs addressing to make this patch acceptable, please let me know. Thanks!

@benwilson512 merged commit 5e54c50 into absinthe-graphql:master on Jan 9, 2021
@benwilson512

This looks great, thank you!
