This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Respect robots.txt #2

Closed
ineffyble opened this issue Dec 31, 2022 · 3 comments

Comments

@ineffyble

If this tool is going to be trying to crawl my instance, I'd like it to respect robots.txt.

tedivm closed this as completed in ab0f97f on Dec 31, 2022
@tedivm
Owner

tedivm commented Dec 31, 2022

The update with robots.txt support was just deployed. Thanks for opening the ticket!

@ineffyble
Author

I'd recently made my domain_blocks return a HellPot, so I figured it'd be advantageous for you too. Thanks for responding so quickly!

@tedivm
Owner

tedivm commented Dec 31, 2022

No problem, and sorry for not checking the robots file right away! I added the user agent yesterday to make sure people could find me if the bot caused problems, and it's only been up for about two days now. I'm really hoping to turn this into a tool that makes it easier for people to avoid joining hostile/bigoted servers, and maybe even makes it easier for admins to block those servers, but it's still early days for the project. If you have any other feedback or ideas, I'm super happy to hear them!
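
For readers wondering what "respecting robots.txt" with a named user agent can look like, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the code from commit ab0f97f, which this thread does not show. The user agent string, the sample rules, and the `example.social` URLs are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent; the thread does not state the bot's actual name.
USER_AGENT = "example-instance-crawler"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True only if the parsed rules allow this user agent to request the URL."""
    return parser.can_fetch(user_agent, url)


# Example rules an instance admin might publish (illustrative only).
SAMPLE_ROBOTS = """\
User-agent: example-instance-crawler
Disallow: /api/v1/instance/domain_blocks

User-agent: *
Disallow:
"""

parser = build_parser(SAMPLE_ROBOTS)
blocked = may_fetch(parser, "https://example.social/api/v1/instance/domain_blocks")
allowed = may_fetch(parser, "https://example.social/about")
```

A crawler would call something like `may_fetch` before each request and skip any URL for which it returns `False`. Publishing a stable user agent, as described above, is what lets an admin write a rule that targets one bot without affecting others.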
