add `robots.txt` and a `nofollow` robots meta tag to dissuade legitimate crawlers from crawling tor2web-ed instances #6292

zenmonkeykstop · 2022-02-18T19:01:08Z

We don't want the Source Interface (SI) to end up in traditional clearnet search engines via tor2web proxies. So add a restrictive `robots.txt` that disallows all crawlers that respect it, plus a `<meta name="robots" content="noindex,nofollow">` tag, which tells any search engine that ignores `robots.txt` not to index the page or follow any links on it. Fixes #6292.
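As a rough illustration of the two mechanisms described above, here is a minimal sketch using only the Python standard library. The `robots.txt` contents, the `is_allowed` and `render_head` helpers, and the page title are assumptions made for the example, not the code that was merged for this issue.

```python
from html import escape
from urllib.robotparser import RobotFileParser

# Assumed contents of a restrictive robots.txt: every user agent, every path.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# The meta tag quoted in the description, for crawlers that reach a page anyway.
ROBOTS_META = '<meta name="robots" content="noindex,nofollow">'


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if ROBOTS_TXT lets `user_agent` fetch `path` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


def render_head(title: str) -> str:
    """Build a minimal <head> that carries the robots meta tag (hypothetical helper)."""
    return f"<head>\n  <title>{escape(title)}</title>\n  {ROBOTS_META}\n</head>"


if __name__ == "__main__":
    # A well-behaved crawler should be refused for any path.
    print(is_allowed("Googlebot", "/"))
    print(render_head("Example page"))
```

The two layers are complementary: `robots.txt` only works for crawlers that fetch and honor it, while the meta tag is read from the page itself once it has been fetched.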

zenmonkeykstop mentioned this issue Feb 18, 2022

Improve Tor2Web detection and handling #6290

Closed

4 tasks

legoktm self-assigned this Feb 18, 2022

legoktm mentioned this issue Feb 18, 2022

Ask search engines not to index Source Interface #6299

Merged

4 tasks

zenmonkeykstop closed this as completed in #6299 Feb 24, 2022
