Optimize State merge #126

Closed
wants to merge 4 commits into from

Conversation


@indrekj indrekj commented Jul 5, 2019

NOTE: Merge PRs 136 and 127 before this one. This PR has more complex changes and needs a more in-depth review.

I was investigating why Presence.list() was quite slow on our production servers. It took a couple of milliseconds when traffic was low, but over a few hundred milliseconds at peak. We have about 40K presence connections spread across 5 instances. So far it seems that processing heartbeats is the most expensive part, and it blocks those list calls.

During my investigation I discovered a few places that can be optimized. I've tried my best to ensure I didn't break anything, but the codebase is complex and I cannot be sure I didn't miss some edge case. All the tests pass, however.

All feedback is welcome, and if you have recommendations on what else to check or optimize, please let me know.

Optimizations

observe_removes

observe_removes was one of the slowest parts of the State merge
function. Each time observe_removes was called, we did a full ETS table
scan on the values table, copied every element except those from the local
replica, and then filtered them one by one. This was especially slow when
there were multiple nodes, each with many tracked processes. E.g. given 4
replicas each with 25K records, we had to scan all 100K records and then
copy 75K of them out of ETS for additional processing.

We had to go through this every time the delta was non-empty, even
if no records needed to be removed.

I created a new ETS table that maps tags to value keys. This lets us
work out which records we actually need to query before touching the
values table. In the best case, when the delta only contains adds, we
don't have to do any ETS queries at all and observe_removes is
effectively skipped.

With removes we now fetch only what's needed, and because of the tags
table we no longer have to do full table scans.
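
To make the idea concrete, here is a rough sketch with made-up table names and key shapes (this is not the PR's actual code): a second ETS table indexes values by tag, so the removes carried by a delta resolve to value keys with point lookups instead of a scan.

```elixir
# Hypothetical illustration of the tag index; the table names and key/record
# shapes are invented and are not the phoenix_pubsub internals.
values = :ets.new(:values_sketch, [:set, :public])
tags   = :ets.new(:tags_sketch, [:set, :public])

# Tracking a presence writes to both tables: the values table keyed by the
# presence key, and the tags table keyed by the tag (replica + clock).
tag = {:replica1, 1}
key = {self(), "room:lobby", "user123"}
:ets.insert(values, {key, %{status: "online"}, tag})
:ets.insert(tags, {tag, key})

# Applying a remote delta: if it carries no removes, ETS is not touched at
# all; otherwise each removed tag resolves to its value key directly, so the
# values table is never scanned.
removed_tags = [tag]

removed_keys =
  for t <- removed_tags, {^t, k} <- :ets.lookup(tags, t) do
    :ets.delete(tags, t)
    :ets.delete(values, k)
    k
  end
```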

compact

compact was another slow part. Most of the slowness came from the
`Enum.sort(MapSet.to_list(cloud))` call. I don't see why the cloud has
to be sorted, so I now pass it along unmodified and changed the
do_compact function to expect an unordered cloud.
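
For illustration, here is a minimal sketch (module name and data shapes are my own, not the PR's do_compact) of folding an unordered cloud of dots into a causal context. It shows why no global sort is required: a dot is absorbable exactly when it is contiguous with, or dominated by, the context, regardless of the order in which it is visited.

```elixir
defmodule CompactSketch do
  @moduledoc """
  Illustrative only: fold an unordered MapSet of {replica, clock} dots into a
  version-vector context (replica => highest contiguous clock). Passes repeat
  until no more dots can be absorbed, so the cloud never needs sorting.
  """

  def compact(context, cloud), do: do_compact(context, MapSet.to_list(cloud))

  defp do_compact(context, dots) do
    {context, remaining, absorbed?} =
      Enum.reduce(dots, {context, [], false}, fn {replica, clock} = dot, {ctx, rest, changed} ->
        current = Map.get(ctx, replica, 0)

        cond do
          # Contiguous with the context: fold the dot in.
          clock == current + 1 -> {Map.put(ctx, replica, clock), rest, true}
          # Already dominated by the context: drop the dot.
          clock <= current -> {ctx, rest, true}
          # A gap remains: keep the dot in the cloud for now.
          true -> {ctx, [dot | rest], changed}
        end
      end)

    if absorbed?, do: do_compact(context, remaining), else: {context, MapSet.new(remaining)}
  end
end

# {:r1, 2} can only be absorbed after {:r1, 1}, but the order in which the
# dots sit in the cloud does not matter.
CompactSketch.compact(%{}, MapSet.new([{:r1, 2}, {:r1, 1}, {:r2, 5}]))
#=> {%{r1: 2}, MapSet.new([{:r2, 5}])}
```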

Benchmarks

phoenix.pub_sub.bench

`mix phoenix.pub_sub.bench --size 100000 --delta-size 10000`

>> Merging delta with 10000 joins and leaves into 200000 element set...
   = 132.18ms    0.13s (OLD)
   = 49.9ms      0.05s (NEW)

Everything else stayed more or less the same.

my custom benchmark

I wrote a benchmark that spawned 4 VMs and set up a tracker on each of
them with a pool size of 1. I then added 10K records to each tracker,
divided between 10 topics, and after 5 seconds I updated 2.5K records on each
tracker (a rough sketch of the setup follows the output below). These are the merge calls:

Before:

[node1] >> Connected nodes: [:"node2@10.200.0.207", :"node3@10.200.0.207", :"node4@10.200.0.207"]
Starting pubsub with pool size of 1
[node1] >> Starting tracker with pool size of 1
[node1] >> MERGE                 16.44ms
[node1] >> Tracked 10000 clients
[node1] >> MERGE                 48.97ms
[node1] >> MERGE                 38.71ms
[node1] >> MERGE                 54.84ms
[node1] >> MERGE                 41.26ms
[node1] >> MERGE                 41.48ms
[node1] >> Updated 2.5e3 clients
[node1] >> MERGE                 62.81ms
[node1] >> MERGE                 68.38ms
[node1] >> MERGE                 77.4ms
[node1] >> MERGE                 37.67ms
[node1] >> MERGE                 21.12ms
[node1] >> EXIT

After:

[node1] >> Connected nodes: [:"node4@10.200.0.207", :"node2@10.200.0.207", :"node3@10.200.0.207"]
Starting pubsub with pool size of 1
[node1] >> Starting tracker with pool size of 1
[node1] >> MERGE                 18.82ms
[node1] >> Tracked 10000 clients
[node1] >> MERGE                 19.07ms
[node1] >> MERGE                 28.12ms
[node1] >> MERGE                 21.37ms
[node1] >> MERGE                 31.98ms
[node1] >> MERGE                 22.54ms
[node1] >> Updated 2.5e3 clients 
[node1] >> MERGE                 36.29ms
[node1] >> MERGE                 46.59ms
[node1] >> MERGE                 43.54ms
[node1] >> MERGE                 3.81ms
[node1] >> MERGE                 2.02ms
[node1] >> EXIT
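
For reference, the per-node loop of this benchmark was roughly shaped like the following sketch (the tracker name and metadata are placeholders, not the actual script; each node also boots its PubSub supervisor and a Phoenix.Tracker with pool_size: 1):

```elixir
# Hypothetical per-node benchmark loop: track 10K presences spread over 10
# topics, wait 5 seconds, then update 2.5K of them to generate new deltas.
topics = for i <- 0..9, do: "topic:#{i}"

tracked =
  for i <- 1..10_000 do
    pid = spawn(fn -> Process.sleep(:infinity) end)
    topic = Enum.at(topics, rem(i, 10))
    key = "user#{i}"
    {:ok, _ref} = Phoenix.Tracker.track(MyTracker, pid, topic, key, %{online_at: i})
    {pid, topic, key}
  end

Process.sleep(5_000)

# Each update produces a fresh delta that the other three nodes must merge.
for {pid, topic, key} <- Enum.take(tracked, 2_500) do
  {:ok, _ref} = Phoenix.Tracker.update(MyTracker, pid, topic, key, %{online_at: :updated})
end
```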

* State.new expects a second argument, which is the shard name.
* Extract was used wrongly: the same state object was used for all three
  arguments. The first argument should be the local state and the other
  two should describe the remote.
* State.replica_up/2 needs to be called, otherwise extract does not
  return a proper remote map.
* Added a couple of assert statements to ensure merging actually works and
  everything is set up correctly.
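
A hedged sketch of the corrected setup, based on the notes above (the replica and shard names are invented, and the exact State return shapes may differ between versions):

```elixir
alias Phoenix.Tracker.State

local  = State.new(:replica1, :local_shard)    # second argument is the shard name
remote = State.new(:replica2, :remote_shard)

# Without replica_up/2 the local state does not know about :replica2, and
# extract cannot build a proper map for that remote.
local = State.replica_up(local, :replica2)

# First argument is the local state; the other two describe the remote side.
extracted = State.extract(local, remote.replica, remote.context)
```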
@indrekj indrekj force-pushed the merge-perf-for-pr branch 3 times, most recently from 1b2a616 to 3fc1468 on July 6, 2019 at 20:41
Using Map.update has a few inefficiencies: both the initial value and the
update function are constructed even when they are not needed, and the
update involves calling an anonymous function.

Related: https://medium.com/learn-elixir/optimizing-work-with-nested-maps-in-elixir-3131da27a13e

I got about a 25% performance increase locally for the given reduce. In
the big picture this is a very minor improvement, though.
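
As an illustration of the pattern (not the exact code from this commit), here is the difference between Map.update/4 and an explicit match when grouping values by key:

```elixir
defmodule MapUpdateSketch do
  # Illustrative only: group values by key, first with Map.update/4, then with
  # an explicit pattern match that avoids allocating the default value and the
  # anonymous update fun for every element, and avoids the fun call itself.
  def group_with_update(pairs) do
    Enum.reduce(pairs, %{}, fn {key, value}, acc ->
      Map.update(acc, key, [value], &[value | &1])
    end)
  end

  def group_with_match(pairs) do
    Enum.reduce(pairs, %{}, fn {key, value}, acc ->
      case acc do
        %{^key => existing} -> %{acc | key => [value | existing]}
        _ -> Map.put(acc, key, [value])
      end
    end)
  end
end

# Both return %{a: [3, 1], b: [2]} for this input.
MapUpdateSketch.group_with_update([{:a, 1}, {:b, 2}, {:a, 3}])
MapUpdateSketch.group_with_match([{:a, 1}, {:b, 2}, {:a, 3}])
```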

indrekj commented Jan 7, 2020

Closing this for now. I will remove the compaction part, as I'm still not sure whether it is needed. I will add the observe_removes part to our own managed branch and do some testing in beta and production.

@indrekj indrekj closed this Jan 7, 2020
@indrekj indrekj deleted the merge-perf-for-pr branch January 7, 2020 15:51
@indrekj indrekj restored the merge-perf-for-pr branch January 7, 2020 15:51
@indrekj indrekj deleted the merge-perf-for-pr branch January 15, 2020 09:53