Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update federated posts to not cache sensitive images if not allow by local site #3253

Merged
merged 2 commits into from Jun 30, 2023

Conversation

lemmus-org
Copy link
Contributor

@lemmus-org lemmus-org commented Jun 22, 2023

This should prevent federated images from getting locally stored on servers that don't allow sensitive content. Instead, it will rely on the provided image URL if available.

Let me know if I'm overlooking anything obvious or there's a better way to approach.

Issue: #3276

@lemmus-org
Copy link
Contributor Author

I tried to fix the formatting, but git/CI haven't picked it up. I'll work on getting me dev environment configured to run the full suite.

@Nutomic
Copy link
Member

Nutomic commented Jun 22, 2023

Please include a link to the relevant issue.

@lemmus-org lemmus-org force-pushed the optional-sensitive-image-cache branch from fedf50a to 62c5071 Compare June 22, 2023 23:33
@lemmus-org
Copy link
Contributor Author

@Nutomic Updated to include relevant issue. Let me know if you need anything else.

@lemmus-org lemmus-org force-pushed the optional-sensitive-image-cache branch 2 times, most recently from ab512ab to 94b5c8b Compare June 22, 2023 23:39
{
(metadata, None) if !include_image => (metadata, page.image.map(|i| i.url.into())),
(metadata, thumbnail) => (metadata, thumbnail),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think you need this confusing match here. Should be enough to do something like let thumbnail_url = thumbnail_url.unwrap_or(page.image.url) below. Need to adjust it a bit to make it work.

Though actually it seems like a bad idea to use the full image as "thumbnail". Could end up with some multi MB images served as thumbnails which would take up lots of bandwidth. So I would rather leave the thumbnail as None.

We could also federate thumbnail urls from the original instance, but that will be more complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's definitely a much cleaner approach! I tried to keep the original logic the same as much as possible for now.

We are currently using thumbnail_url in into_json for federating outbound. I guess the problem still stands when someone is federating inbound with large images, for example from a non-lemmy server. Which would likely require cutting a thumbnail to minimize bandwidth issues...

That does fallback to the original problem of trying to ensure sensitive content isn't hosted locally when not configured. I've outlined a few possible followups on the ticket to explore later once I'm more familiar with both the codebase and pict-rs integration. It should be possible to solve for performance while respecting the local site's configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we are federating the thumbnail url in page.image (which is a bit weird naming). So you could change the code to always let thumbnail_url = page.image, no need to check for sensitive at all.

Problem with this approach is that a malicious instance could set a misleading thumbnail. But that would be easy to notice and then admins can block the instance, so I dont think its a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you're getting at. I do think it's a nice design having instances distribute some of the load from the origin by keeping a local copy (text/media). This way if the origin server is having uptime issues it largely only impacts its users and not those federating its content. Additionally, it means everyone is paying a fair share for their users' usage of the broader network.

That said, we do need some better controls or improvements to how the media is federated to avoid any potential legal issues. While I was looking for a band-aid for the short-term while exploring long-term solutions, your approach at least limits the blast radius entirely to the local site.

Ultimately, I'm happy to defer to what you think is best for now and we can always explore alternatives down the line. Let me know and I'll update accordingly.

@lemmus-org lemmus-org force-pushed the optional-sensitive-image-cache branch from 94b5c8b to af62ec4 Compare June 28, 2023 01:19
@Nutomic Nutomic merged commit 8107627 into LemmyNet:main Jun 30, 2023
1 check passed
@lemmus-org lemmus-org deleted the optional-sensitive-image-cache branch June 30, 2023 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants