hgwsgi: disallow crawling hg.mozilla.org (Bug 1830926) r=glob
Crawling of hg.mozilla.org has become an increasingly frequent
problem. This commit makes the `robots.txt` file fully
restrictive, except for the main landing page and the pages
that only list the repositories hosted on hg.mozilla.org.

Differential Revision: https://phabricator.services.mozilla.com/D176928

--HG--
extra : moz-landing-system : lando
cgsheeh committed May 2, 2023
1 parent 0a1593c commit da25bd0
Showing 1 changed file with 12 additions and 7 deletions.
19 changes: 12 additions & 7 deletions hgwsgi/robots.txt
@@ -1,9 +1,14 @@
 User-agent: *
+Allow: /$
+Allow: /automation$
+Allow: /build$
+Allow: /ci$
+Allow: /conduit-testing$
+Allow: /hgcustom$
+Allow: /l10n$
+Allow: /l10n-central$
+Allow: /projects$
+Allow: /integration$
+Allow: /releases$
+Disallow: /

-Disallow: /*/annotate/
-Disallow: /*/graph
-Disallow: /*/graph/
-Disallow: /*/archive/
-Disallow: /*/diff/
-Disallow: /*/comparison/
-Disallow: /*/raw-*
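
To illustrate how the new rule set (the `Allow: ...$` lines plus the blanket `Disallow: /`) is likely to be read by a crawler, here is a minimal Python sketch. It assumes the longest-match semantics described in RFC 9309, where `*` matches any run of characters, a trailing `$` anchors the end of the path, and `Allow` wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical and not part of this repository; real crawlers may differ in detail.

```python
import re

# The rules introduced by this commit, in file order.
RULES = [
    ("allow", "/$"),
    ("allow", "/automation$"),
    ("allow", "/build$"),
    ("allow", "/ci$"),
    ("allow", "/conduit-testing$"),
    ("allow", "/hgcustom$"),
    ("allow", "/l10n$"),
    ("allow", "/l10n-central$"),
    ("allow", "/projects$"),
    ("allow", "/integration$"),
    ("allow", "/releases$"),
    ("disallow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' is a wildcard and a trailing '$' means "end of path".
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match semantics."""
    best_len, allowed = -1, True  # no matching rule means "allowed"
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, prefer Allow.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed


if __name__ == "__main__":
    for path in ["/", "/releases", "/releases/", "/mozilla-central/file/tip"]:
        print(path, is_allowed(path))
```

Under these assumptions, only the landing page and the exact repository-listing paths (for example `/releases`) are crawlable; anything beneath them, such as `/releases/` or any individual repository path, falls through to `Disallow: /`.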
