Prevent google crawling, or make it faster. #24
We just had a Google bot start crawling "preview.lib.haxe.org". (Still not sure where it scraped the URL from, but oh well.)

It hit the File Browser, which currently displays a source file by opening the haxelib zip, unpacking the file, rendering it, and sending it to the client. Needless to say, with the tens (hundreds?) of thousands of files, this was causing significant strain on the server.

I've turned the preview site off for now until I fix this, either by having a faster (cached?) implementation, or by using robots.txt to block Google from the file browser section.
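For reference, the robots.txt approach mentioned above can be very small. This is only a sketch: the /files/ prefix is a hypothetical placeholder, since the thread doesn't say which path the file browser is actually served under.

```
# Hypothetical robots.txt for preview.lib.haxe.org
# Keep all crawlers (including Googlebot) out of the file browser.
# "/files/" is a placeholder path, not the real route.
User-agent: *
Disallow: /files/
```

Blocking via robots.txt stops the crawl load, but it doesn't make the pages any faster for human visitors, which is why caching also comes up in the comments below.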
Comments

Ah, that's odd. Google reads our mail!

Yes, I think that's a suitable solution for text files; we could cache them.

What is the state of this?

See #24. This mostly solves it, though I should still do some DB caching.
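The caching mentioned in the comments could be as simple as writing the rendered output to disk the first time a file is requested, so repeat hits (from crawlers or anyone else) no longer have to open and unpack the haxelib zip. A minimal sketch follows, in Python purely for illustration; the cache directory, function name, and the `render` callback are all hypothetical, not part of the actual codebase.

```python
# Illustrative only: cache the rendered form of a file extracted from a haxelib zip.
import hashlib
import os
import zipfile
from typing import Callable

CACHE_DIR = "/tmp/filebrowser-cache"  # hypothetical cache location

def get_rendered_file(zip_path: str, member: str,
                      render: Callable[[bytes], bytes]) -> bytes:
    """Return the rendered contents of `member` inside `zip_path`, caching on disk."""
    key = hashlib.sha1(f"{zip_path}:{member}".encode()).hexdigest()
    cached = os.path.join(CACHE_DIR, key)
    if os.path.exists(cached):
        # Cache hit: no zip extraction or rendering needed.
        with open(cached, "rb") as f:
            return f.read()
    # Cache miss: unpack and render once, then store the result for next time.
    with zipfile.ZipFile(zip_path) as zf:
        raw = zf.read(member)
    rendered = render(raw)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(cached, "wb") as f:
        f.write(rendered)
    return rendered
```

A real version would presumably also need to key the cache on the library version, so that publishing a new release invalidates stale entries.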