Feature request: robots.txt extras #35

Open
lbthomsen opened this issue Apr 20, 2016 · 2 comments

Comments

@lbthomsen

Hi,

Quick feature request. I like the fact that bwp-google-xml-sitemaps adds multisite sitemaps to robots.txt when a physical file is not present, but at the same time I need to throttle Bing.

A simple text box with a snippet to add to all virtual sitemaps would go a long way.

@kminh
Collaborator

kminh commented Apr 20, 2016

Hi,

Can you clarify your request? I don't quite get what you mean by "a snippet to add to all virtual sitemaps". Did you mean "virtual robots.txt files"?

@lbthomsen
Author

OK - in my case I have about 200 sites running in a subdomain multisite setup. I rely on the inclusion of sitemapindex.xml in the robots.txt of the "base" site (no subdomain). One problem I am constantly facing is BingBot running amok. At regular intervals, bingbot decides to send 5-10 requests per second to every single domain, and that in turn makes my server grind to a halt. There are two solutions to that problem - the first is blocking bingbot, at least temporarily, but that is obviously not an attractive solution. The better solution is to include this in every robots.txt:

User-agent: bingbot
Crawl-delay: 5

I can do that by saving the auto-generated robots.txt and editing it manually, but that means I have to remember to do it each time a new site is added, and that happens almost daily.
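
For illustration, the manually edited robots.txt for one sub-site would then end up containing roughly the following (the domain is a placeholder; the Sitemap line is what the plugin already generates, and any default directives would sit above it):

Sitemap: http://site1.example.com/sitemapindex.xml

User-agent: bingbot
Crawl-delay: 5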

So - the best solution would be the ability to somehow define some extra lines that will be appended to the generated robots.txt.

One way to do this would be a text box in the configuration. Another would be the ability to define a file/path whose contents would be appended to the robots.txt. Alternatively, a hardcoded filename that would always be appended IF it existed (/path/to/site/root/robots.include).
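
For what it's worth, here is a rough sketch of how the file-append variant could work as a small standalone snippet, using WordPress core's robots_txt filter (the filter itself is core WordPress; the robots.include location and the function name are just assumptions, and this is not something bwp-google-xml-sitemaps provides today):

<?php
// Rough sketch: append the contents of robots.include (if present) to the
// virtual robots.txt that WordPress serves. Hooked to the core 'robots_txt'
// filter; the file location and function name are hypothetical.
add_filter( 'robots_txt', 'append_robots_extras', 20, 2 );

function append_robots_extras( $output, $public ) {
    $extra_file = ABSPATH . 'robots.include'; // assumed path in the site root

    if ( is_readable( $extra_file ) ) {
        $output .= "\n" . file_get_contents( $extra_file );
    }

    return $output;
}

Dropped into an mu-plugin, something like this would run for every site in the network, so new sites would pick up the extra lines automatically.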

I hope that clarifies.
