
0.4.4

johnnagro committed May 21, 2009
1 parent 2de5c90 commit 6c24a9667ffbbf368907d5ce85c7ed7ca03ee195
@@ -1,3 +1,9 @@
+2009-05-21
+* fixed an issue with robots.txt on ssl hosts
+* fixed an issue with pulling robots.txt from disallowed hosts
+* fixed a documentation error with ExpiredLinks
+* Many thanks to Brian Campbell
+
2008-10-09
* fixed a situation with nested slashes in urls, thanks to Sander van der Vliet and John Buckley
README
@@ -62,13 +62,12 @@ scraping, collecting, and looping so that you can just handle the data.
=== Track cycles with a custom object
require 'spider'
-
class ExpireLinks < Hash
def <<(v)
- [v] = Time.now
+ self[v] = Time.now
end
def include?(v)
- [v] && (Time.now + 86400) <= [v]
+ self[v].kind_of?(Time) && (self[v] + 86400) >= Time.now
end
end
@@ -141,6 +140,7 @@ Matt Horan
Henri Cook
Sander van der Vliet
John Buckley
+Brian Campbell
With `robot_rules' from James Edward Gray II via
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/177589
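The corrected ExpireLinks example from the README hunk above can be tried on its own. The following is a minimal sketch, not the project's code: the `TTL` constant and the example.com URLs are assumptions added for illustration, while the two method bodies follow the `+` lines of the hunk (a `self` receiver on the hash lookups, and a check that the stored value is a Time before comparing).

```ruby
# Sketch based on the corrected README lines; TTL is a hypothetical
# name for the 86400 seconds (one day) hard-coded in the README.
class ExpireLinks < Hash
  TTL = 86_400

  # Record the time at which a URL was seen.
  def <<(v)
    self[v] = Time.now
  end

  # True only while the URL was recorded no more than TTL seconds ago.
  def include?(v)
    self[v].kind_of?(Time) && (self[v] + TTL) >= Time.now
  end
end

links = ExpireLinks.new
links << 'http://example.com/'

fresh  = links.include?('http://example.com/')      # recorded just now
unseen = links.include?('http://example.com/other') # never recorded

# Simulate an entry recorded two days ago (illustrative URL).
links['http://example.com/old'] = Time.now - 2 * ExpireLinks::TTL
stale = links.include?('http://example.com/old')

puts fresh, unseen, stale
```

With this definition an entry counts as seen for one day after it is recorded, so a spider using it would revisit a URL once the entry is older than that.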
@@ -81,8 +81,8 @@
<div id="description">
<p>
A specialized class using memcached to track items stored. It supports
-three operations: <a href="IncludedInMemcached.html#M000001">new</a>,
-&lt;&lt;, and <a href="IncludedInMemcached.html#M000003">include?</a> .
+three operations: <a href="IncludedInMemcached.html#M000015">new</a>,
+&lt;&lt;, and <a href="IncludedInMemcached.html#M000017">include?</a> .
Together these can be used to add items to the memcache, then determine
whether the item has been added.
</p>
@@ -105,9 +105,9 @@
<h3 class="section-bar">Methods</h3>
<div class="name-list">
- <a href="#M000002">&lt;&lt;</a>&nbsp;&nbsp;
- <a href="#M000003">include?</a>&nbsp;&nbsp;
- <a href="#M000001">new</a>&nbsp;&nbsp;
+ <a href="#M000016">&lt;&lt;</a>&nbsp;&nbsp;
+ <a href="#M000017">include?</a>&nbsp;&nbsp;
+ <a href="#M000015">new</a>&nbsp;&nbsp;
</div>
</div>
@@ -129,41 +129,33 @@ <h3 class="section-bar">Methods</h3>
<div id="methods">
<h3 class="section-bar">Public Class methods</h3>
- <div id="method-M000001" class="method-detail">
- <a name="M000001"></a>
+ <div id="method-M000015" class="method-detail">
+ <a name="M000015"></a>
<div class="method-heading">
- <a href="#M000001" class="method-signature">
+ <a href="IncludedInMemcached.src/M000015.html" target="Code" class="method-signature"
+ onclick="popupCode('IncludedInMemcached.src/M000015.html');return false;">
<span class="method-name">new</span><span class="method-args">(*a)</span>
</a>
</div>
<div class="method-description">
<p>
-Construct a <a href="IncludedInMemcached.html#M000001">new</a> <a
+Construct a <a href="IncludedInMemcached.html#M000015">new</a> <a
href="IncludedInMemcached.html">IncludedInMemcached</a> instance. All
arguments here are passed to MemCache (part of the memcache-client gem).
</p>
- <p><a class="source-toggle" href="#"
- onclick="toggleCode('M000001-source');return false;">[Source]</a></p>
- <div class="method-source-code" id="M000001-source">
-<pre>
-<span class="ruby-comment cmt"># File lib/spider/included_in_memcached.rb, line 39</span>
- <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">initialize</span>(<span class="ruby-operator">*</span><span class="ruby-identifier">a</span>)
- <span class="ruby-ivar">@c</span> = <span class="ruby-constant">MemCache</span>.<span class="ruby-identifier">new</span>(<span class="ruby-operator">*</span><span class="ruby-identifier">a</span>)
- <span class="ruby-keyword kw">end</span>
-</pre>
- </div>
</div>
</div>
<h3 class="section-bar">Public Instance methods</h3>
- <div id="method-M000002" class="method-detail">
- <a name="M000002"></a>
+ <div id="method-M000016" class="method-detail">
+ <a name="M000016"></a>
<div class="method-heading">
- <a href="#M000002" class="method-signature">
+ <a href="IncludedInMemcached.src/M000016.html" target="Code" class="method-signature"
+ onclick="popupCode('IncludedInMemcached.src/M000016.html');return false;">
<span class="method-name">&lt;&lt;</span><span class="method-args">(v)</span>
</a>
</div>
@@ -172,24 +164,15 @@ <h3 class="section-bar">Public Instance methods</h3>
<p>
Add an item to the memcache.
</p>
- <p><a class="source-toggle" href="#"
- onclick="toggleCode('M000002-source');return false;">[Source]</a></p>
- <div class="method-source-code" id="M000002-source">
-<pre>
-<span class="ruby-comment cmt"># File lib/spider/included_in_memcached.rb, line 44</span>
- <span class="ruby-keyword kw">def</span> <span class="ruby-operator">&lt;&lt;</span>(<span class="ruby-identifier">v</span>)
- <span class="ruby-ivar">@c</span>.<span class="ruby-identifier">add</span>(<span class="ruby-identifier">v</span>.<span class="ruby-identifier">to_s</span>, <span class="ruby-identifier">v</span>)
- <span class="ruby-keyword kw">end</span>
-</pre>
- </div>
</div>
</div>
- <div id="method-M000003" class="method-detail">
- <a name="M000003"></a>
+ <div id="method-M000017" class="method-detail">
+ <a name="M000017"></a>
<div class="method-heading">
- <a href="#M000003" class="method-signature">
+ <a href="IncludedInMemcached.src/M000017.html" target="Code" class="method-signature"
+ onclick="popupCode('IncludedInMemcached.src/M000017.html');return false;">
<span class="method-name">include?</span><span class="method-args">(v)</span>
</a>
</div>
@@ -198,16 +181,6 @@ <h3 class="section-bar">Public Instance methods</h3>
<p>
True if the item is in the memcache.
</p>
- <p><a class="source-toggle" href="#"
- onclick="toggleCode('M000003-source');return false;">[Source]</a></p>
- <div class="method-source-code" id="M000003-source">
-<pre>
-<span class="ruby-comment cmt"># File lib/spider/included_in_memcached.rb, line 49</span>
- <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">include?</span>(<span class="ruby-identifier">v</span>)
- <span class="ruby-ivar">@c</span>.<span class="ruby-identifier">get</span>(<span class="ruby-identifier">v</span>.<span class="ruby-identifier">to_s</span>) <span class="ruby-operator">==</span> <span class="ruby-identifier">v</span>
- <span class="ruby-keyword kw">end</span>
-</pre>
- </div>
</div>
</div>
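The source listings removed from this page show how the three documented operations fit together: `new` wraps a MemCache client, `<<` calls `add` with the item's string form as the key, and `include?` compares the stored value with the item. The sketch below reproduces that logic against a Hash-backed stand-in so it runs without a memcached server. `FakeMemCache` and `IncludedInFakeCache` are hypothetical names invented for this illustration, and the server string is a placeholder that the stand-in ignores.

```ruby
# Hypothetical stand-in for MemCache, backed by a Hash.
class FakeMemCache
  def initialize(*_args)
    @store = {}
  end

  # Like a memcached "add": store the value only if the key is absent.
  def add(key, value)
    @store[key] = value unless @store.key?(key)
  end

  def get(key)
    @store[key]
  end
end

# Mirrors the three methods from the removed source listings,
# with FakeMemCache swapped in for MemCache.
class IncludedInFakeCache
  def initialize(*a)
    @c = FakeMemCache.new(*a)
  end

  # Add an item, keyed by its string form.
  def <<(v)
    @c.add(v.to_s, v)
  end

  # True if the value stored under the item's key equals the item.
  def include?(v)
    @c.get(v.to_s) == v
  end
end

seen = IncludedInFakeCache.new('localhost:11211')
seen << 'http://example.com/'

known   = seen.include?('http://example.com/')
unknown = seen.include?('http://example.com/missing')

puts known, unknown
```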
@@ -93,7 +93,7 @@
<h3 class="section-bar">Methods</h3>
<div class="name-list">
- <a href="#M000011">start_at</a>&nbsp;&nbsp;
+ <a href="#M000029">start_at</a>&nbsp;&nbsp;
</div>
</div>
@@ -115,11 +115,12 @@ <h3 class="section-bar">Methods</h3>
<div id="methods">
<h3 class="section-bar">Public Class methods</h3>
- <div id="method-M000011" class="method-detail">
- <a name="M000011"></a>
+ <div id="method-M000029" class="method-detail">
+ <a name="M000029"></a>
<div class="method-heading">
- <a href="#M000011" class="method-signature">
+ <a href="Spider.src/M000029.html" target="Code" class="method-signature"
+ onclick="popupCode('Spider.src/M000029.html');return false;">
<span class="method-name">start_at</span><span class="method-args">(a_url, &amp;block)</span>
</a>
</div>
@@ -151,19 +152,6 @@ <h3 class="section-bar">Public Class methods</h3>
end
end
</pre>
- <p><a class="source-toggle" href="#"
- onclick="toggleCode('M000011-source');return false;">[Source]</a></p>
- <div class="method-source-code" id="M000011-source">
-<pre>
-<span class="ruby-comment cmt"># File lib/spider.rb, line 54</span>
- <span class="ruby-keyword kw">def</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">start_at</span>(<span class="ruby-identifier">a_url</span>, <span class="ruby-operator">&amp;</span><span class="ruby-identifier">block</span>)
- <span class="ruby-identifier">rules</span> = <span class="ruby-constant">RobotRules</span>.<span class="ruby-identifier">new</span>(<span class="ruby-value str">'Ruby Spider 1.0'</span>)
- <span class="ruby-identifier">a_spider</span> = <span class="ruby-constant">SpiderInstance</span>.<span class="ruby-identifier">new</span>({<span class="ruby-keyword kw">nil</span> =<span class="ruby-operator">&gt;</span> <span class="ruby-identifier">a_url</span>}, [], <span class="ruby-identifier">rules</span>, [])
- <span class="ruby-identifier">block</span>.<span class="ruby-identifier">call</span>(<span class="ruby-identifier">a_spider</span>)
- <span class="ruby-identifier">a_spider</span>.<span class="ruby-identifier">start!</span>
- <span class="ruby-keyword kw">end</span>
-</pre>
- </div>
</div>
</div>