This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Link preview should respect robots.txt #3242
Labels
O-Occasional
Affects or can be seen by some users regularly or most users rarely
S-Minor
Blocks non-critical functionality, workarounds exist.
T-Defect
Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Description
When a link is sent in Matrix, the client can use Synapse's integrated prefetcher to fetch the link preview. However, the prefetcher doesn't check whether the website allows bots, and it crawls the page regardless.
Synapse needs its own User-agent, and it should parse the robots.txt at the root of the domain the user wants to preview. If access for Synapse is denied, it should not visit the URL.
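A rough sketch of the requested check, using Python's standard `urllib.robotparser`. This is not Synapse code: the user-agent token `SynapseUrlPreview` and the helper names `robots_url_for` and `may_preview` are made up for illustration, and fetching robots.txt itself (with timeouts, caching, and error handling) is left out.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent token; the issue does not name one.
PREVIEW_USER_AGENT = "SynapseUrlPreview"


def robots_url_for(url: str) -> str:
    """Return the robots.txt location at the root of the URL's domain."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def may_preview(url: str, robots_txt: str,
                user_agent: str = PREVIEW_USER_AGENT) -> bool:
    """Decide whether `user_agent` may fetch `url`.

    `robots_txt` is the already-downloaded body of the site's robots.txt;
    an empty body is treated as "no restrictions".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example robots.txt that blocks the preview bot from /private/ only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: SynapseUrlPreview
Disallow: /private/

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    for candidate in ("https://example.com/private/page",
                      "https://example.com/public"):
        print(candidate, may_preview(candidate, EXAMPLE_ROBOTS_TXT))
```

With a check like this, the prefetcher would request the URL only when `may_preview` returns `True`, and would send the same user-agent token in its HTTP requests so that site owners can target it in robots.txt.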