
feat(electric): Proposed speedup of Sent Rows Graph popping method #1389

Merged · 6 commits merged into main from msfstef/temp-test-improvement · Jun 25, 2024

Conversation

@msfstef (Contributor) commented on Jun 19, 2024

Addresses VAX-1977

While running Linearlite with multiple projects, I noticed that unsubscribes took a very long time, and I found that the `Graph.delete_vertices` operation takes by far the most time to run and scales poorly.

It doesn't make sense to me that deleting a batch of vertices from a tree graph should take this long, especially when traversing the whole graph is fast, so reconstructing it should not be much trouble (?). My conjecture is that the `delete_vertices` implementation, which interacts with a map quite a lot, is simply suboptimal.

The proposed change is a minimal diff that does the same thing; with the same workload I'm observing a >100x improvement in speed, with much better scaling properties. Obviously, when you get this kind of speedup it casts doubt on its correctness.

Instead of deleting vertices, we keep track of all edges to be removed, as well as the vertices to be removed. We then run `Graph.remove_edge` (which seems to have a fairly optimal implementation) for every edge to be popped, and instead of removing the vertices with `Graph.delete_vertices` we call `Graph.subgraph` with the diff of the graph's existing vertices minus the ones to be removed, which should generate a maximally connected subgraph containing the specified vertices.
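For illustration, here's a minimal sketch of the shape of this change, assuming the libgraph-style `Graph` functions named above; the real `pop_by_request_ids` bookkeeping is more involved, and the function and variable names here are hypothetical:

```elixir
# Sketch only: pop a set of request-related edges and vertices without
# ever calling Graph.delete_vertices/2.
defp pop_edges_and_vertices(graph, edges_to_remove, vertices_to_remove) do
  # Drop every edge belonging to the popped requests; the PR notes
  # Graph.remove_edge has a fairly optimal implementation.
  graph =
    Enum.reduce(edges_to_remove, graph, fn {from, to}, g ->
      Graph.remove_edge(g, from, to)
    end)

  # Instead of delete_vertices/2, rebuild via subgraph/2 with the set of
  # surviving vertices: a subtree rooted at a removed vertex is simply
  # never visited, rather than being cleaned up entry by entry.
  remaining = Graph.vertices(graph) -- vertices_to_remove
  Graph.subgraph(graph, remaining)
end
```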

Since the request-related edges have been removed, along with the now-disconnected vertices, the subgraph should be (?) the same as what we were producing before. I suspect it is more efficient because the implementation of `Graph.subgraph` is aimed at recreating a graph from scratch by traversing the existing one, rather than modifying the existing one in place. Edit: I also suspect that once a vertex rooting a subtree that will be removed entirely is marked as removed, constructing a subgraph skips that whole subtree, whereas `delete_vertices` will try to clean up a subtree that we're going to discard anyway (?).

Even if there is an issue with the correctness of this approach, I believe we could traverse the graph from the root to recreate it without the specified edges and vertices much faster than `Graph.delete_vertices` manages, even if that requires a more custom implementation.
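To make that concrete, here's a hedged sketch of what such a custom rebuild could look like: a traversal from the root that simply never visits removed vertices. This is a hypothetical helper, not code from this PR, and filtering of individual removed edges is omitted for brevity:

```elixir
# Hypothetical fallback: rebuild the graph by BFS from the root,
# skipping any vertex marked for removal (and, with it, its subtree).
defp rebuild_without(graph, root, vertices_to_remove) do
  removed = MapSet.new(vertices_to_remove)
  bfs(graph, [root], MapSet.new([root]), Graph.add_vertex(Graph.new(), root), removed)
end

defp bfs(_graph, [], _seen, acc, _removed), do: acc

defp bfs(graph, [v | queue], seen, acc, removed) do
  neighbors =
    graph
    |> Graph.out_neighbors(v)
    |> Enum.reject(&MapSet.member?(removed, &1))

  # Copy the surviving edges into the new graph.
  acc = Enum.reduce(neighbors, acc, &Graph.add_edge(&2, v, &1))

  unseen = Enum.reject(neighbors, &MapSet.member?(seen, &1))
  bfs(graph, queue ++ unseen, Enum.into(unseen, seen), acc, removed)
end
```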

Testing with Linearlite, varying the number of issues and comments split over 3 projects/shapes, the unsubscribe `Shapes.SentRowsGraph.pop_by_request_ids` call takes:

For 3k rows / shape (single shape)

  • old: ~1200ms
  • new: ~30ms

For 10k rows / shape (single shape)

  • old: ~11000ms
  • new: ~50ms

For 20k rows / shape (multi-shape)

Old
  • sub 1 shape, unsub 1 shape: ~46000ms
  • sub 2 shapes, unsub 1 shape: ~180000ms
  • (didn't bother with 3 shapes)
New
  • sub 1 shape, unsub 1 shape: ~100ms
  • sub 2 shapes, unsub 1 shape: ~300ms
  • sub 3 shapes, unsub 1 shape: ~400ms

@msfstef changed the title from "proposed improvement" to "feat(electric): Proposed speedup of Sent Rows Graph popping method" on Jun 20, 2024
@msfstef marked this pull request as ready for review on June 20, 2024
@icehaunter (Contributor) left a comment:


Great find!

@msfstef merged commit 5e16611 into main on Jun 25, 2024
7 checks passed
@msfstef deleted the msfstef/temp-test-improvement branch on June 25, 2024
msfstef added a commit that referenced this pull request Jun 26, 2024
Addresses
[VAX-1985](https://linear.app/electric-sql/issue/VAX-1985/large-unsubscribes-lead-client-with-wa-sqlite-driver-to-a-stack)

Apparently the method we were using, where we chained `OR` and `AND`
clauses in the `WHERE` clause, led to:
1. SQLite call stack overflow (at least with `wa-sqlite`) at around 6k
`OR` clauses, far below the max parameter limit
2. Sub-optimal performance in both SQLite and PGlite

I've converted the batching to use `IN` statements for both single-column queries and multi-column ones, such as composite primary keys (a sketch of the two query shapes follows the numbers below). Performance is significantly improved, and there is room for more improvement:

#### Old Batching
##### Shadow Table 30k deletes
- ~4700ms pglite
- ~500ms wa-sqlite (arbitrarily small batches to avoid overflow)

##### Comment 9k deletes
- ~700ms pglite
- ~2000ms wa-sqlite (max 6k batches)

#### New Batching
##### Shadow Table 30k deletes
- ~100ms pglite
- ~450ms wa-sqlite

##### Comment 9k deletes
- ~40ms pglite
- ~700ms wa-sqlite
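
As a hedged illustration of the two query shapes described above (the actual batching lives in the client drivers; the function, table, and column names here are hypothetical, and this sketch only covers the composite-key case):

```elixir
# Sketch only. Old shape, for contrast:
#   DELETE FROM t WHERE (a = ? AND b = ? AND c = ?) OR (a = ? AND ...) ...
# i.e. a deep OR/AND expression tree that overflowed wa-sqlite's call
# stack at around 6k groups. New shape: one flat row-value IN list.
def delete_batch_sql(table, columns, rows) do
  cols = "(" <> Enum.join(columns, ", ") <> ")"
  row_tuple = "(" <> Enum.map_join(columns, ", ", fn _ -> "?" end) <> ")"
  tuples = Enum.map_join(rows, ", ", fn _ -> row_tuple end)

  # e.g. DELETE FROM t WHERE (a, b, c) IN ((?, ?, ?), (?, ?, ?), ...)
  {"DELETE FROM #{table} WHERE #{cols} IN (#{tuples})", List.flatten(rows)}
end
```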


While the shadow table deletes take a similar time on wa-sqlite (~500ms), note that the shadow table uses a composite key of 3 string columns, so there's room for significant optimization there: concatenating the parameters and deleting based on the concatenated PK columns reduced the execution time from ~500ms to ~50ms. PGlite also saw a very small improvement, but it was hard to measure.
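A hedged sketch of that concatenation experiment (the separator choice and value escaping are glossed over here, and the names are hypothetical):

```elixir
# Sketch only: match one synthesized string per row instead of a
# 3-column row value. Assumes the separator never appears in values.
def delete_batch_concat_sql(table, columns, rows) do
  concat_expr = Enum.join(columns, " || '|' || ")
  placeholders = Enum.map_join(rows, ", ", fn _ -> "?" end)

  sql = "DELETE FROM #{table} WHERE #{concat_expr} IN (#{placeholders})"
  {sql, Enum.map(rows, &Enum.join(&1, "|"))}
end
```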

I think it's possible to avoid a composite primary key for the shadow table: since we know it's 3 string columns, they could be collapsed into one if that proves to be a significant optimization. That is outside the scope of this fix; I'm just raising it here as a potential optimization.


NOTE: this improvement is the result of the same investigation that yielded #1389; together they make Linearlite usable with >2k issues, whereas previously we ran into timeouts and call stack overflows.