<div class='mp'>
<h1>Can I Crawl (this URL)</h1>
<p>Hosted robots.txt permissions verifier.</p>
<ul>
<li><a href=""><code>/</code></a> This page.</li>
<li><a href=""><code>/check</code></a> Runs the robots.txt verification check.</li>
</ul>
<h2 id="Description">Description</h2>
<p>Checks whether your User-Agent is allowed to crawl the provided URL. Pass in the destination URL and the service will download, parse, and check the site's <a href="">robots.txt</a> file for permissions. If crawling is allowed, it issues a <strong>3XX</strong> redirect; otherwise it returns a <strong>4XX</strong> code.</p>
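<p>The parse-and-check step described above can be approximated locally. The following is a minimal sketch, not the service's actual implementation: it uses Python's standard <code>urllib.robotparser</code>, and the function name <code>check_status</code>, the sample rules, and the <code>www.example.com</code> URLs are assumptions made for illustration. The real service downloads robots.txt from the destination host instead of reading an inline string.</p>
<pre><code>from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body used only for this sketch.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def check_status(robots_txt, user_agent, url):
    """Return 302 if user_agent may fetch url under robots_txt, else 403."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return 302 if parser.can_fetch(user_agent, url) else 403


# The root path is not covered by any Disallow rule, so this prints 302.
print(check_status(SAMPLE_ROBOTS_TXT, "MyCustomAgent", "http://www.example.com/"))
# Paths starting with /search are disallowed, so this prints 403.
print(check_status(SAMPLE_ROBOTS_TXT, "MyCustomAgent", "http://www.example.com/search"))
</code></pre>
<p>The sketch fixes the status codes at 302 and 403 to mirror the examples on this page; the description only promises a 3XX or 4XX response.</p>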
<h2 id="Examples">Examples</h2>
<h3 id="-curl-v-http-canicrawl-appspot-com-check-url-http-www-google-com-">$ curl -v</h3>
<pre><code>&lt; HTTP/1.0 302 Found
&lt; Location:
</code></pre>
<h3 id="-curl-v-http-canicrawl-appspot-com-check-url-http-www-google-com-search">$ curl -v</h3>
<pre><code>&lt; HTTP/1.0 403 Forbidden
&lt; Content-Length: 23
</code></pre>
<h3 id="-curl-H-User-Agent-MyCustomAgent-v-http-canicrawl-appspot-com-check-url-http-www-google-com-">$ curl -H'User-Agent: MyCustomAgent' -v</h3>
<pre><code>&gt; User-Agent: MyCustomAgent
&lt; HTTP/1.0 302 Found
&lt; Location:
</code></pre>
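<p>Note: the <a href="">robots.txt</a> of the site checked above disallows requests to <em>/search</em>, which is why the second example returns <strong>403</strong>.</p>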
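<p>A client can act on the response by looking only at the status class. The sketch below shows one way to build the <code>/check</code> request URL and interpret the result. The host name and the <code>url</code> query parameter are assumptions inferred from the example headings' <code>id</code> attributes, and the helper names <code>build_check_url</code> and <code>is_crawl_allowed</code> are hypothetical. No request is sent here; the status code is expected to come from whatever HTTP client you use, with redirect following disabled.</p>
<pre><code>from urllib.parse import urlencode

# Assumption: service location and parameter name are inferred, not documented.
SERVICE_CHECK_URL = "http://canicrawl.appspot.com/check"


def build_check_url(target_url):
    """Build the /check request URL with the target percent-encoded."""
    return SERVICE_CHECK_URL + "?" + urlencode({"url": target_url})


def is_crawl_allowed(status_code):
    """Map the service's response to a boolean: 3XX allowed, 4XX denied."""
    status_class = status_code // 100
    if status_class == 3:
        return True
    if status_class == 4:
        return False
    # Anything else (for example a 5XX) says nothing about permissions.
    raise ValueError("unexpected status code: %d" % status_code)


print(build_check_url("http://www.google.com/search"))
print(is_crawl_allowed(302))  # True
print(is_crawl_allowed(403))  # False
</code></pre>
<p>Raising on other status classes keeps a server error from being mistaken for a crawl permission in either direction.</p>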
<h2 id="License">License</h2>
<p>MIT License - Copyright (c) 2011 <a href="">Ilya Grigorik</a></p>
