This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

purge_history_api.rst: add example script #1034

Closed
wants to merge 1 commit into from

Conversation

rubo77
Contributor

@rubo77 rubo77 commented Aug 21, 2016

In this pull request I add an example script to the Purge API README that automatically purges all data older than a certain time, with the option to really delete the data on your homeserver.

Signed-off-by: Ruben Barkow github@r.z11.de

@matrixbot
Member

Can one of the admins verify this patch?


# $ synctl start

# This could be set, so you don't need to prune every time after deleting some rows:
# $ sqlite3 homeserver.db "PRAGMA auto_vacuum = FULL;"
Contributor Author

It seems like auto_vacuum = FULL does not have any effect?
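
For context, SQLite applies a change of auto_vacuum mode to a database that already contains tables only after a subsequent VACUUM, which may be why the pragma alone appears to do nothing. A minimal sketch, assuming the sqlite3 command-line shell is installed and Synapse is stopped; `enable_auto_vacuum` and `auto_vacuum_mode` are hypothetical helper names, not part of the script in this PR:

```shell
# Hypothetical helper: switch an existing SQLite database to auto_vacuum = FULL.
# The pragma only records the desired mode; VACUUM rebuilds the file so the
# mode actually takes effect. Assumption: Synapse is stopped while this runs.
enable_auto_vacuum() {
    sqlite3 "$1" "PRAGMA auto_vacuum = FULL; VACUUM;"
}

# Print the mode stored in the database file: 0 = none, 1 = full, 2 = incremental.
auto_vacuum_mode() {
    sqlite3 "$1" "PRAGMA auto_vacuum;"
}
```

Calling `enable_auto_vacuum homeserver.db` and then `auto_vacuum_mode homeserver.db` should show whether the setting was persisted.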

@rubo77
Contributor Author

rubo77 commented Nov 18, 2016

I fixed the "database is locked" problem with a PRAGMA busy_timeout, so this can be merged now.

@kyrias
Contributor

kyrias commented Dec 3, 2016

Would be nice if this supported Postgres too, but you should probably recommend stopping synapse before running the query against the sqlite3 DB, because multiple writers to an sqlite3 DB is a recipe for corruption.

@rubo77
Contributor Author

rubo77 commented Dec 3, 2016

I thought the busy_timeout would sort this out?
And if you call via the API, the calls are processed one after another.

@rryan

rryan commented Dec 19, 2016

This was helpful, thanks! BTW the pragma busy timeout adds an additional line of output to BUFFER which prevents awk from properly selecting the event ID.

@rubo77
Contributor Author

rubo77 commented Dec 19, 2016

@rryan do you mean we have to change something else? Or are you just explaining why I used this?

EVENT_ID=$(echo $BUFFER|awk '{print $2}')
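
One way to avoid depending on how many lines the PRAGMA prints is to take the last non-empty line of the output instead of a fixed field. This is only a sketch, assuming the event ID is always the final line in `BUFFER`; `extract_event_id` is a hypothetical helper name:

```shell
# Hypothetical helper: return the last non-empty line of the sqlite3 output,
# so any extra lines printed before it (such as the busy_timeout value)
# are ignored regardless of how many there are.
extract_event_id() {
    printf '%s\n' "$1" | awk 'NF { last = $0 } END { print last }'
}
```

It could be used as `EVENT_ID=$(extract_event_id "$BUFFER")`; note the quotes around `$BUFFER`, which preserve the line breaks.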

@4nd3r
Contributor

4nd3r commented Dec 19, 2016

FYI #1621

@richvdh
Member

richvdh commented May 9, 2018

The API has changed now, so I'm afraid this is out of date.

@rubo77
Contributor Author

rubo77 commented Oct 14, 2018

@richvdh: the changes are not relevant to my script

@rubo77 rubo77 changed the base branch from master to develop October 14, 2018 20:24
@rubo77
Contributor Author

rubo77 commented Oct 14, 2018

I added hints on how to do this in Postgres too.

I also added the command to really delete local events and messages, which I think is what most users want to achieve.

Contributor

@aaronraimist aaronraimist left a comment

This PR is missing a changelog (that's why the build fails) https://github.com/matrix-org/synapse/blob/master/CONTRIBUTING.rst#changelog

…pi.rst

Signed-off-by: Ruben Barkow <github@r.z11.de>
@rubo77
Contributor Author

rubo77 commented Oct 30, 2018

Changelog added and commits squashed.

@rubo77
Contributor Author

rubo77 commented Nov 2, 2018

You could easily adapt the script to delete all rooms with a loop like

ROOMS=$(sudo -u postgres psql -t -A --dbname="$DBNAME" --command="SELECT room_id FROM rooms;" 2>/dev/null | grep -v 'Pager')

for ROOM_NAME in $ROOMS; do
  ...

With a more sophisticated SELECT you can pick only the rooms that are not encrypted, that is, rooms with no m.room.encrypted events:

psql -t -A --dbname="synapse" --command="SELECT room_id FROM rooms where room_id not in (SELECT distinct room_id FROM events where type ='m.room.encrypted');" | grep -v 'Pager'
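
The room loop can be sketched more defensively by reading the IDs line by line rather than relying on word splitting. This is a hedged sketch, not the script from this PR: `for_each_room` is a hypothetical helper, and the command passed to it (here called `purge_room` in the usage note) is assumed to exist elsewhere.

```shell
# Hypothetical sketch: read room IDs from stdin, one per line, and run the
# given command once per room with the room ID appended as the last argument.
# Blank lines and lines containing "Pager" (which the commands above filter
# out with grep) are skipped.
for_each_room() {
    while IFS= read -r room_id; do
        case "$room_id" in
            ''|*Pager*) continue ;;
        esac
        "$@" "$room_id"
    done
}
```

A possible use would be `psql -t -A --dbname="$DBNAME" --command="SELECT room_id FROM rooms;" | for_each_room purge_room`. The command run per room should not read from stdin, or it would consume the remaining room IDs.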

@rubo77
Contributor Author

rubo77 commented Nov 5, 2018

But I suggest purging old history only in rooms with many users, for example 10 or more:

select * from (select count(*) as numberofusers, room_id from current_state_events where type ='m.room.member' group by room_id) as q left join room_aliases a on q.room_id=a.room_id where q.numberofusers >= 10 order by numberofusers desc

Instead of purging everything before a certain age, I think it is better to purge only messages that are more than 1000 messages back in a room, so you keep old conversations that someone might still be interested in.

I think this would be the SELECT for the event to start the purge from:

select event_id from events where type='m.room.message' and room_id ='$ROOM' order by received_ts desc limit 1 offset 999
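
To make the number of messages to keep configurable, the query above could be generated by a small helper. A minimal sketch, assuming the table and column names from the query above; `build_cutoff_query` is a hypothetical name, and the quoting is limited to doubling single quotes in the room ID:

```shell
# Hypothetical helper: print a query that selects the KEEP-th most recent
# m.room.message event in a room (OFFSET is KEEP - 1, so KEEP=1000 gives 999).
# Usage: build_cutoff_query ROOM_ID KEEP
build_cutoff_query() {
    # Double any single quotes so the room ID is safe inside an SQL literal.
    cutoff_room=$(printf '%s' "$1" | sed "s/'/''/g")
    cutoff_keep=$2
    printf "SELECT event_id FROM events WHERE type = 'm.room.message' AND room_id = '%s' ORDER BY received_ts DESC LIMIT 1 OFFSET %d;\n" \
        "$cutoff_room" "$((cutoff_keep - 1))"
}
```

The output can then be passed to `psql --command=...` or `sqlite3` in the same way as the other queries in this thread.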

@richvdh
Member

richvdh commented Nov 6, 2018

Sorry, I don't think we're going to be able to accept a script of this complexity into the official documentation; we won't have the resources to maintain it.

We could consider putting something into contrib if you like.

@rubo77
Contributor Author

rubo77 commented Nov 6, 2018

I created a new PR, #4155, that adds a much enhanced script to contrib/scripts instead; it loops over more rooms and waits for each purge to finish before the next one starts.
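
The "wait for one purge to finish" step can be sketched as a polling loop. This is not the code from #4155; it is a hedged illustration in which `wait_for_purge` is a hypothetical helper, the status command is supplied by the caller, and the status words `complete` and `failed` are assumptions about what that command prints:

```shell
# Hypothetical helper: poll a status command until a purge has finished.
# Usage: wait_for_purge INTERVAL_SECONDS STATUS_CMD [ARGS...]
# Assumption: STATUS_CMD prints one word, and "complete" or "failed" mark
# the end of the purge; anything else means it is still running.
# Returns 0 on "complete", 1 on "failed", 2 if STATUS_CMD itself fails.
wait_for_purge() {
    purge_interval=$1
    shift
    while :; do
        purge_status=$("$@") || return 2
        case "$purge_status" in
            complete) return 0 ;;
            failed)   return 1 ;;
        esac
        sleep "$purge_interval"
    done
}
```

Inside a room loop, a call such as `wait_for_purge 5 get_purge_status "$PURGE_ID"` (with `get_purge_status` being whatever queries the server) would block until the current purge is done before the next one is started.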

7 participants