This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

When we redact events, any mxc content they refer to should be redacted too (SYN-216) #1263

Open
matrixbot opened this issue Dec 24, 2014 · 25 comments
Labels
A-Media-Repository Uploading, downloading images and video, thumbnailing T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. z-feature (Deprecated Label) z-privacy-sprint (Deprecated Label)

Comments

@matrixbot
Member

It's a bit of a disastrous thinko that we can redact events which point to stuff in the media repo, and that content is subsequently preserved even though the event is nuked. We should rm it and its caches too (assuming the HS is honouring redactions).

(Imported from https://matrix.org/jira/browse/SYN-216)

(Reported by @ara4n)

@matrixbot
Member Author

Jira watchers: @ara4n

@matrixbot
Member Author

matrixbot commented Dec 24, 2014

Links exported from Jira:

relates to SYN-576

@matrixbot matrixbot added A-Media-Repository Uploading, downloading images and video, thumbnailing z-p2 (Deprecated Label) z-feature (Deprecated Label) labels Nov 7, 2016
@matrixbot matrixbot changed the title When we redact events, any mxc content they refer to should be redacted too (SYN-216) When we redact events, any mxc content they refer to should be redacted too (https://github.com/matrix-org/synapse/issues/1263) Nov 7, 2016
@matrixbot matrixbot changed the title When we redact events, any mxc content they refer to should be redacted too (https://github.com/matrix-org/synapse/issues/1263) When we redact events, any mxc content they refer to should be redacted too (SYN-216) Nov 7, 2016
@ara4n ara4n added p1 and removed z-p2 (Deprecated Label) labels Jan 6, 2017
@ara4n
Member

ara4n commented Jan 6, 2017

We just had a minor disaster with this happening (the MXC URL was bridged to IRC, so redacting the content on Matrix was achieving nothing). This should be trivial to fix...

@erikjohnston
Member

erikjohnston commented Jan 6, 2017 via email

@ara4n
Member

ara4n commented Jan 8, 2017

I wonder whether a good enough compromise would be for HSes to purge redacted data after a few days (Windows Recycle Bin stylee), albeit with the option of configuring the retention per HS. The idea that sensitive data can be left visible to HS admins (and clogging up disk space) indefinitely, after being redacted, feels undesirable and unintuitive.
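
A minimal sketch of the "recycle bin" idea above, assuming the homeserver keeps a record of when each piece of media was redacted. The names `RETENTION_SECONDS`, `media_due_for_purge` and the `redacted_at` mapping are hypothetical and are not part of Synapse.

```python
from typing import Dict, List

# Hypothetical per-homeserver setting: keep redacted media for 3 days.
RETENTION_SECONDS = 3 * 24 * 60 * 60


def media_due_for_purge(
    redacted_at: Dict[str, float],
    now: float,
    retention_seconds: float = RETENTION_SECONDS,
) -> List[str]:
    """Return the mxc URIs whose redaction is older than the retention window.

    `redacted_at` maps an mxc URI to the Unix time (in seconds) at which the
    event referencing it was redacted. This bookkeeping table is an
    assumption made for illustration.
    """
    cutoff = now - retention_seconds
    # Sorted so that the result is deterministic regardless of dict order.
    return sorted(uri for uri, ts in redacted_at.items() if ts <= cutoff)
```

With a retention of zero, everything that has been redacted is eligible immediately; a larger value leaves a window in which the redacted media is still on disk.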

@jfrederickson

This came up in #matrix:matrix.org earlier today - as an HS admin, I would really really like to be able to configure my HS to purge redacted content. At the very least, I don't want my HS to continue to serve requests for it from the media repo.

Specifically in reference to illicit content, continuing to serve it from my HS could put me in a really tough spot, legally. And if it's redacted and therefore not easy to find in the first place...

@uhoreg
Member

uhoreg commented Nov 3, 2017

Of course, you have to be careful that the mxc content isn't referred to by a different event (possibly including an encrypted event).

@rkfg
Contributor

rkfg commented Dec 12, 2017

This is absolutely needed to keep the homeserver storage relatively small. I set it up on a VPS and it's growing constantly. It'll become a problem in several months. At the same time we should preserve some content and maybe events from deletion like avatars of users and rooms. It would not be nice to suddenly lose them after a maintenance cycle.

@Mikaela
Contributor

Mikaela commented Dec 19, 2018

Related: #1287.

Is #2369 a duplicate of this one?

@anoadragon453
Member

One tricky point is that we can't just have the server delete the media on event redaction, as in encrypted rooms it does not know what the attached mxc:// URL is.

@ara4n
Member

ara4n commented Aug 16, 2019

the redacting client can do it though.

@richvdh
Member

richvdh commented Aug 19, 2019

the redacting client can do it though.

Only if we propose a way to delete media across federation (https://github.com/matrix-org/matrix-doc/issues/790)

@dkasak
Member

dkasak commented Oct 13, 2021

I think it would make sense to solve this at least for the single HS case, by allowing a redacting client to delete the media it uploaded to its HS. Then later we could build on top of that to add a way for this to work over federation.
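
As a rough illustration of the single-HS case, the check below only permits deletion when the media lives on the local server and the requester is the user who uploaded it. The function names and the idea that the server stores an uploader per media item are assumptions for this sketch, not an existing Synapse API.

```python
def media_origin(mxc_uri: str) -> str:
    """Extract the server name from an mxc://<server_name>/<media_id> URI."""
    prefix = "mxc://"
    if not mxc_uri.startswith(prefix):
        raise ValueError(f"not an mxc URI: {mxc_uri!r}")
    server, _, media_id = mxc_uri[len(prefix):].partition("/")
    if not server or not media_id:
        raise ValueError(f"malformed mxc URI: {mxc_uri!r}")
    return server


def may_delete_media(
    requester: str, uploader: str, mxc_uri: str, local_server: str
) -> bool:
    """Hypothetical policy: only the uploader may delete, and only local media.

    Remote media is excluded because deleting it would need a federation
    mechanism, which is out of scope for the single-HS case.
    """
    return media_origin(mxc_uri) == local_server and requester == uploader
```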

@DMRobertson DMRobertson added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Oct 14, 2021
@benjaoming

benjaoming commented Dec 9, 2021

I 👍 @ara4n's suggestion #1263 (comment) - having a server-configurable retention time makes sense and is a sweet spot between many needs:

  1. Users: Privacy, some minimal right to be forgotten or at least have some control
  2. Moderators: Relevant moderation tools against abuse
  3. Sysadmins: Cleaning up and maintaining servers -- without having to manually free up space for what's most likely junk

journalctl for systemd has a nice option, --vacuum-time, which might serve as inspiration. Perhaps the same idea of time-based vacuuming can be applied to redacted media. For people who want redacted event media immediately purged, the retention time can be "0 seconds", and for people who want more moderator control, it can be configured higher.

This would seem related to #3479 (comment). The API for on-demand deletion specifies:

POST /_synapse/admin/v1/media/<server_name>/delete?before_ts=<before_ts>

So I imagine that something like delete?vacuum_redacted_expiry=10s could be a possible example of such an API, and similarly for purging remote caches. Of course, the server should clean up redacted media by itself, not only through API calls.
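
To make the suggestion concrete, here is a sketch of how a retention value such as "10s" or "0 seconds" could be turned into a cutoff timestamp of the kind a before_ts-style parameter takes. It assumes millisecond Unix timestamps; `parse_expiry`, `cutoff_ts_ms` and the accepted unit names are invented for illustration, and `vacuum_redacted_expiry` itself is only the proposal above, not an existing parameter.

```python
import re

# Assumed unit spellings; a real implementation would pick its own set.
_UNIT_SECONDS = {
    "s": 1, "second": 1, "seconds": 1,
    "m": 60, "minute": 60, "minutes": 60,
    "h": 3600, "hour": 3600, "hours": 3600,
    "d": 86400, "day": 86400, "days": 86400,
}


def parse_expiry(value: str) -> int:
    """Parse strings like '10s', '3d' or '0 seconds' into whole seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", value)
    if match is None:
        raise ValueError(f"unrecognised expiry: {value!r}")
    unit = match.group(2).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown unit in expiry: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[unit]


def cutoff_ts_ms(expiry: str, now_ms: int) -> int:
    """Media redacted at or before the returned timestamp would be vacuumed."""
    return now_ms - parse_expiry(expiry) * 1000
```

An expiry of "0 seconds" yields a cutoff equal to the current time, which corresponds to purging redacted media immediately.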

@locness3

What's the state of this? It feels bad to find out that you have basically no way of deleting media you uploaded to a homeserver that's not your own.

@alexshpilkin

@locness3 Note that a malicious homeserver is free to retain whatever it wants anyway and you won’t be able to tell (through technical means, barring pervasive DRM-equivalent mechanisms like Intel SGX), so this issue is purely about cooperating servers. While having some way to delete things would be good for giving users some peace of mind, relying on it for things you wanted to keep truly private probably won’t be a good idea even once it exists.

@Mikaela
Contributor

Mikaela commented May 7, 2022

Do I understand correctly that there is no reason for well-behaved homeservers to attempt removing child sexual abuse material from their media repositories, because it's always possible that a malicious homeserver doesn't wish to do that and thus the CSAM would still be in the Matrix federation forever?

@benjaoming

@Mikaela that's a good, sharp question. I would agree that the target for fixing this issue should be well-behaved servers. I think that what @alexshpilkin points out is more of a communication issue around that function, such that an individual who is redacting contents knows what is at stake (this could be a well-meaning person who accidentally uploaded a copy of their passport).

@alexshpilkin

alexshpilkin commented May 7, 2022

@Mikaela The comment above is correct: I was solely objecting to the implication that this function guarantees more than it actually does; that is, I wanted to say that there will always be

no way of deleting media you uploaded to a homeserver that's not your own

and to some extent it’s even a feature (there’s no way of deleting information you uploaded to a brain that’s not your own, either). AFAIU the Matrix designers used “redact” instead of “delete” in the first place in an attempt to avoid users assuming that the operation provides stronger confidentiality than it actually does (too bad they seem to have abandoned that terminology in Element).

That doesn’t mean that a streamlined way to ask a cooperating homeserver to delete things would not be useful—I fully acknowledge that many things are best-effort and still useful, such as delivery of data over the Internet :)

@FSG-Cat
Contributor

FSG-Cat commented Jul 18, 2022

AFAIU the Matrix designers used “redact” instead of “delete” in the first place in an attempt to avoid users assuming that the operation provides stronger confidentiality than it actually does (too bad they seem to have abandoned that terminology in Element).

No, it was probably not chosen because of the guarantees; it was probably chosen because it's EXACTLY what happens. Ever see a spy movie where a document is partially blacked out? Well, that's because it's a redacted copy, and this is exactly what we do in Matrix. We nuke part of the event; we don't delete it, we partially redact it.

As for the point that @Mikaela brings up: as far as I am concerned, malicious servers are to be completely ignored in this debate. Why? Because redactions already face this EXACT same issue. Since we have redactions for events, we can have redactions for media: the same issues are faced and have already been concluded to be acceptable.

@locness3

and to some extent it’s even a feature (there’s no way of deleting information you uploaded to a brain that’s not your own, either).

I find this reasonable when it comes to communicating between homeservers; however, if you're signed up on a homeserver that you do not own (matrix.org, as a completely random example :/ ), not being able to delete media you uploaded to it can make you feel out of control.

@shukon

shukon commented Aug 19, 2022

Of course, you have to be careful that the mxc content isn't referred to by a different event (possibly including an encrypted event).

One tricky point is that we can't just have the server delete the media on event redaction, as in encrypted rooms it does not know what the attached mxc:// URL is.

Wouldn't it be possible to copy the file in the media-repo and generate a new mxc-url when forwarding to an encrypted room and using a hard-link style counter for forwarding to an unencrypted room (since the hs can keep track of unencrypted mentions of the mxc-url without privacy issues)?
For mxc-url redactions in encrypted rooms, the client has to inform the server (and not doing so results in files without any pointers to them). And deduplication-wise, I think the incredible gain of space from actually being able to delete unused content would easily make up for the duplicated storage in the few edge cases where a file is forwarded to an encrypted room.
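
The hard-link style counting described above could be sketched roughly as follows. `MediaRefTracker` and its methods are hypothetical names; the sketch only covers references the homeserver can see (unencrypted events) and tracks a set of event IDs rather than a bare counter, so that redacting the same event twice cannot drive the count below its true value.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class MediaRefTracker:
    """Track which visible (unencrypted, unredacted) events reference each mxc URI."""

    def __init__(self) -> None:
        self._refs: DefaultDict[str, Set[str]] = defaultdict(set)

    def add_reference(self, mxc_uri: str, event_id: str) -> None:
        """Record that `event_id` points at `mxc_uri`."""
        self._refs[mxc_uri].add(event_id)

    def redact_event(self, mxc_uri: str, event_id: str) -> bool:
        """Drop one reference.

        Returns True only when this call removed the last known reference,
        i.e. when the media is now unreferenced and could be deleted.
        """
        refs = self._refs.get(mxc_uri)
        if refs is None:
            # Nothing tracked for this media: be conservative, do not delete.
            return False
        refs.discard(event_id)
        if refs:
            return False
        del self._refs[mxc_uri]
        return True
```

Media referenced from encrypted events never appears in this table, which is why the proposal gives those events their own copy and relies on the client to tell the server when to delete it.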

@asakura42

This comment was marked as abuse.

@bkil

bkil commented Sep 30, 2022

You can always inline smaller attachments within the body of the message as base64, in the URI anchor, or upload them to external servers, so you do have a workaround.

Also, I guess you should only attach files with sensitive content to E2EE rooms. Such files will be encrypted separately, and after the message is removed, its key will be lost, making the file contents unavailable.

@asakura42

@bkil

Also, I guess you should only attach files with sensitive content to E2EE rooms.

Yes, but that limits me in actions.
I do not know all the features of the Matrix network, but I am sure that it is worth avoiding the mistakes of Telegram and other messengers, which store media forever, even after the message is deleted. The point is that this gives users false confidence that the media they have uploaded will not remain on the server. On the contrary, the media remains on the server as literal dead weight.
