This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Port script for finding/deleting unreferenced state groups into Synapse #7635

Open
erikjohnston opened this issue Jun 4, 2020 · 3 comments
Labels
A-Disk-Space things which fill up the disk A-Performance Performance, both client-facing and admin-facing z-p2 (Deprecated Label)

Comments

@erikjohnston
Member

erikjohnston commented Jun 4, 2020

There is a really simple Rust script here: https://github.com/erikjohnston/synapse-find-unreferenced-state-groups for finding unreferenced state groups. Unlike the compress state script there isn't really a reason for it to be in Rust, and given a) it's small and mostly just SQL and b) it needs a lot of rework anyway, we may as well port it to Python and add it to Synapse.

For now it should just be a script that can run independently (though part of the Synapse code), but we may wish to look into running it from within Synapse later.

c.f. #3364

@erikjohnston erikjohnston added z-p2 (Deprecated Label) A-Performance Performance, both client-facing and admin-facing labels Jun 4, 2020
@richvdh richvdh added the A-Disk-Space things which fill up the disk label Jun 11, 2020
@clokep
Contributor

clokep commented Jun 17, 2020

> needs a lot of rework anyway

@erikjohnston Can you expand upon what this means?

Were these the steps you had in mind?

  1. Pull out the SQL queries and port them to Python somewhere under synapse.*
  2. Add a script that actually goes ahead and finds/deletes the unreferenced state groups.

I think you also mentioned to be careful that this doesn't get applied to "new" state that might not be referenced yet but will soon.

@erikjohnston
Member Author

Yeah, exactly. We'd want all the functional bits of the script in synapse.* so that we can later add it to the admin API or run it automatically or whatever.

We probably also want to make sure that it can work incrementally, so that it doesn't have to process everything in one go.
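
As a rough illustration of what "incrementally" could mean here, one option is to walk the state group ID space in fixed-size windows and handle one window per run. The sketch below is only an assumption about how that might look: `state_group_batches` and its parameters are hypothetical names, not anything from the existing script or from Synapse.

```python
from typing import Iterator, Tuple


def state_group_batches(
    min_id: int, max_id: int, batch_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) ID windows covering min_id..max_id.

    Hypothetical helper: a caller could persist the last completed window
    and resume from there on the next run.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    low = min_id
    while low <= max_id:
        # Clamp the final window so it never runs past max_id.
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


if __name__ == "__main__":
    for low, high in state_group_batches(1, 10, 4):
        print(f"would process state groups {low}..{high}")
```

Note that a window can only bound which candidates are examined; deciding whether a candidate is still needed may require looking at references from outside the window.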

> I think you also mentioned to be careful that this doesn't get applied to "new" state that might not be referenced yet but will soon.

Yeah. The current script just takes a parameter for the maximum state group to look at, which you can get by looking for a state group that was persisted before the last restart.
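
To make the cutoff idea concrete, here is a minimal in-memory sketch in Python. It is not the logic of the Rust script or of Synapse: the function name, its arguments, and the definition of "unreferenced" used here (not pointed at by any event, and not an ancestor of any group that must be kept) are all assumptions for illustration. Any group above `max_state_group` is conservatively treated as live, so that "new" state and everything it builds on is left alone.

```python
from typing import Dict, Iterable, Optional, Set


def find_unreferenced_state_groups(
    all_groups: Iterable[int],
    prev_group: Dict[int, Optional[int]],
    referenced_by_events: Set[int],
    max_state_group: int,
) -> Set[int]:
    """Return state groups that nothing live depends on.

    Hypothetical model: prev_group maps a state group to the group it is
    a delta against (or None). A group is kept if an event references it,
    if it is newer than max_state_group, or if a kept group chains to it.
    """
    groups = set(all_groups)

    # Roots of the "keep" set: directly referenced, or too new to judge.
    stack = [
        sg for sg in groups
        if sg in referenced_by_events or sg > max_state_group
    ]

    needed: Set[int] = set()
    while stack:
        sg = stack.pop()
        if sg in needed:
            continue
        needed.add(sg)
        # Follow the delta chain so ancestors of kept groups survive too.
        prev = prev_group.get(sg)
        if prev is not None and prev in groups:
            stack.append(prev)

    return groups - needed


if __name__ == "__main__":
    edges = {2: 1, 3: 2, 4: 1, 6: 5}
    result = find_unreferenced_state_groups(
        all_groups=range(1, 7),
        prev_group=edges,
        referenced_by_events={3},
        max_state_group=5,
    )
    print(sorted(result))
```

In this toy data, group 4 is the only one that no event references and no kept group chains to; group 5 survives only because the too-new group 6 is built on it.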

@clokep
Contributor

clokep commented Oct 20, 2020

> but we may wish to look into running it from within synapse later.

Now that we have support for running background tasks on a separate worker, that worker might be a logical place to run it. I'm unsure of the impact on smaller deployments that don't have this worker, though.
