
Switched the YARD doc formatting to markdown.

commit 38546a27c5a399f4b54e723de2dfbb87f61c8762 (1 parent: 349b338), committed by @postmodern on Feb 1, 2010
Showing with 78 additions and 78 deletions.
  1. +19 −19 History.rdoc → History.md
  2. +28 −28 README.rdoc → README.md
  3. +1 −1 Rakefile
  4. +5 −5 lib/spidr/agent.rb
  5. +3 −3 lib/spidr/auth_store.rb
  6. +1 −1 lib/spidr/cookie_jar.rb
  7. +1 −1 lib/spidr/filters.rb
  8. +20 −20 lib/spidr/page.rb
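
The change is mechanical throughout the diff: RDoc markup is replaced with its Markdown equivalent, so `=== Heading` becomes `### Heading`, `+code+` becomes backtick-quoted code, and `{text}[url]` links become `[text](url)`. The doc-comment pattern repeated in the lib/ hunks below, taken from the get_page documentation, looks like this before and after:

    # RDoc markup (before):
    # @return [Page, nil]
    #   The page for the response, or +nil+ if the request failed.

    # Markdown markup (after):
    # @return [Page, nil]
    #   The page for the response, or `nil` if the request failed.
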
History.rdoc → History.md
@@ -1,4 +1,4 @@
-=== 0.2.2 / 2010-01-06
+### 0.2.2 / 2010-01-06
* Require Web Spider Obstacle Course (WSOC) >= 0.1.1.
* Integrated the new WSOC into the specs.
@@ -15,7 +15,7 @@
* Renamed Spidr::Agent#get_session to {Spidr::SessionCache#[]}.
* Renamed Spidr::Agent#kill_session to {Spidr::SessionCache#kill!}.
-=== 0.2.1 / 2009-11-25
+### 0.2.1 / 2009-11-25
* Added {Spidr::Events#every_ok_page}.
* Added {Spidr::Events#every_redirect_page}.
@@ -44,9 +44,9 @@
* Added {Spidr::Events#every_zip_page}.
* Fixed a bug where {Spidr::Agent#delay} was not being used to delay
requesting pages.
-* Spider +link+ and +script+ tags in HTML pages (thanks Nick Plante).
+* Spider `link` and `script` tags in HTML pages (thanks Nick Plante).
-=== 0.2.0 / 2009-10-10
+### 0.2.0 / 2009-10-10
* Added {URI.expand_path}.
* Added {Spidr::Page#search}.
@@ -91,7 +91,7 @@
* Made {Spidr::Agent#visit_page} public.
* Moved to YARD based documentation.
-=== 0.1.9 / 2009-06-13
+### 0.1.9 / 2009-06-13
* Upgraded to Hoe 2.0.0.
* Use Hoe.spec instead of Hoe.new.
@@ -108,7 +108,7 @@
could not be loaded.
* Removed Spidr::Agent::SCHEMES.
-=== 0.1.8 / 2009-05-27
+### 0.1.8 / 2009-05-27
* Added the Spidr::Agent#pause! and Spidr::Agent#continue! methods.
* Added the Spidr::Agent#running? and Spidr::Agent#paused? methods.
@@ -121,15 +121,15 @@
* Made {Spidr::Agent#enqueue} and {Spidr::Agent#queued?} public.
* Added more specs.
-=== 0.1.7 / 2009-04-24
+### 0.1.7 / 2009-04-24
* Added Spidr::Agent#all_headers.
-* Fixed a bug where Page#headers was always +nil+.
+* Fixed a bug where Page#headers was always `nil`.
* {Spidr::Agent} will now follow the Location header in HTTP 300,
301, 302, 303 and 307 Redirects.
* {Spidr::Agent} will now follow iframe and frame tags.
-=== 0.1.6 / 2009-04-14
+### 0.1.6 / 2009-04-14
* Added {Spidr::Agent#failures}, a list of URLs which could not be visited.
* Added {Spidr::Agent#failed?}.
@@ -143,40 +143,40 @@
* Updated the Web Spider Obstacle Course with links that always fail to be
visited.
-=== 0.1.5 / 2009-03-22
+### 0.1.5 / 2009-03-22
-* Catch malformed URIs in {Spidr::Page#to_absolute} and return +nil+.
-* Filter out +nil+ URIs in {Spidr::Page#urls}.
+* Catch malformed URIs in {Spidr::Page#to_absolute} and return `nil`.
+* Filter out `nil` URIs in {Spidr::Page#urls}.
-=== 0.1.4 / 2009-01-15
+### 0.1.4 / 2009-01-15
* Use Nokogiri for HTML and XML parsing.
-=== 0.1.3 / 2009-01-10
+### 0.1.3 / 2009-01-10
* Added the :host option to {Spidr::Agent#initialize}.
* Added the Web Spider Obstacle Course files to the Manifest.
* Aliased {Spidr::Agent#visited_urls} to {Spidr::Agent#history}.
-=== 0.1.2 / 2008-11-06
+### 0.1.2 / 2008-11-06
* Fixed a bug in {Spidr::Page#to_absolute} where URLs with no path were not
- receiving a default path of <tt>/</tt>.
+ receiving a default path of `/`.
* Fixed a bug in {Spidr::Page#to_absolute} where URL paths were not being
- expanded, in order to remove <tt>..</tt> and <tt>.</tt> directories.
+ expanded, in order to remove `..` and `.` directories.
* Fixed a bug where absolute URLs could have a blank path, thus causing
{Spidr::Agent#get_page} to crash when it performed the HTTP request.
* Added RSpec spec tests.
* Created a Web-Spider Obstacle Course
(http://spidr.rubyforge.org/course/start.html) which is used in the spec
tests.
-=== 0.1.1 / 2008-10-04
+### 0.1.1 / 2008-10-04
* Added a reader method for the response instance variable in Page.
* Fixed a bug in {Spidr::Page#method_missing}.
-=== 0.1.0 / 2008-05-23
+### 0.1.0 / 2008-05-23
* Initial release.
* Black-list or white-list URLs based upon:
README.rdoc → README.md
@@ -1,18 +1,18 @@
-= Spidr
+# Spidr
-* http://spidr.rubyforge.org
-* http://github.com/postmodern/spidr
-* http://github.com/postmodern/spidr/issues
-* http://groups.google.com/group/spidr
+* [spidr.rubyforge.org](http://spidr.rubyforge.org/)
+* [github.com/postmodern/spidr](http://github.com/postmodern/spidr)
+* [github.com/postmodern/spidr/issues](http://github.com/postmodern/spidr/issues)
+* [groups.google.com/group/spidr](http://groups.google.com/group/spidr)
* irc.freenode.net #spidr
-== DESCRIPTION:
+## DESCRIPTION:
Spidr is a versatile Ruby web spidering library that can spider a site,
multiple domains, certain links or infinitely. Spidr is designed to be fast
and easy to use.
-== FEATURES:
+## FEATURES:
* Follows:
* a tags.
@@ -41,21 +41,21 @@ and easy to use.
* Custom proxy settings.
* HTTPS support.
-== EXAMPLES:
+## EXAMPLES:
-* Start spidering from a URL:
+Start spidering from a URL:
Spidr.start_at('http://tenderlovemaking.com/')
-* Spider a host:
+Spider a host:
Spidr.host('coderrr.wordpress.com')
-* Spider a site:
+Spider a site:
Spidr.site('http://rubyflow.com/')
-* Spider multiple hosts:
+Spider multiple hosts:
Spidr.start_at(
'http://company.com/',
@@ -65,30 +65,30 @@ and easy to use.
]
)
-* Do not spider certain links:
+Do not spider certain links:
Spidr.site('http://matasano.com/', :ignore_links => [/log/])
-* Do not spider links on certain ports:
+Do not spider links on certain ports:
Spidr.site(
'http://sketchy.content.com/',
:ignore_ports => [8000, 8010, 8080]
)
-* Print out visited URLs:
+Print out visited URLs:
Spidr.site('http://rubyinside.org/') do |spider|
spider.every_url { |url| puts url }
end
-* Print out the URLs that could not be requested:
+Print out the URLs that could not be requested:
Spidr.site('http://sketchy.content.com/') do |spider|
spider.every_failed_url { |url| puts url }
end
-* Search HTML and XML pages:
+Search HTML and XML pages:
Spidr.site('http://company.withablog.com/') do |spider|
spider.every_page do |page|
@@ -99,19 +99,19 @@ and easy to use.
value = meta.attributes['content']
puts " #{name} = #{value}"
- end
+ end
end
end
-* Print out the titles from every page:
+Print out the titles from every page:
Spidr.site('http://www.rubypulse.com/') do |spider|
spider.every_html_page do |page|
puts page.title
end
end
-* Find what kinds of web servers a host is using, by accessing the headers:
+Find what kinds of web servers a host is using, by accessing the headers:
servers = Set[]
@@ -121,23 +121,23 @@ and easy to use.
end
end
-* Pause the spider on a forbidden page:
+Pause the spider on a forbidden page:
spider = Spidr.host('overnight.startup.com') do |spider|
spider.every_forbidden_page do |page|
spider.pause!
end
end
-* Skip the processing of a page:
+Skip the processing of a page:
Spidr.host('sketchy.content.com') do |spider|
spider.every_missing_page do |page|
spider.skip_page!
end
end
-* Skip the processing of links:
+Skip the processing of links:
Spidr.host('sketchy.content.com') do |spider|
spider.every_url do |url|
@@ -147,15 +147,15 @@ and easy to use.
end
end
-== REQUIREMENTS:
+## REQUIREMENTS:
-* {nokogiri}[http://nokogiri.rubyforge.org/] >= 1.2.0
+* [nokogiri](http://nokogiri.rubyforge.org/) >= 1.2.0
-== INSTALL:
+## INSTALL:
- $ sudo gem install spidr
+ $ sudo gem install spidr
-== LICENSE:
+## LICENSE:
The MIT License
Rakefile
@@ -11,7 +11,7 @@ Hoe.spec('spidr') do
self.rspec_options += ['--colour', '--format', 'specdoc']
- self.yard_options += ['--protected']
+ self.yard_options += ['--markup', 'markdown', '--protected']
self.remote_yard_dir = 'docs'
self.extra_deps = [
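
The `--markup markdown` flag is what makes YARD render doc comments with a Markdown processor instead of the default RDoc markup, which the backtick changes above depend on. For projects not using Hoe, the same options could live in a `.yardopts` file at the project root; a minimal sketch, assuming only the flags shown in this Rakefile:

    --markup markdown
    --protected
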
lib/spidr/agent.rb
@@ -492,7 +492,7 @@ def enqueue(url)
# The page for the response.
#
# @return [Page, nil]
- # The page for the response, or +nil+ if the request failed.
+ # The page for the response, or `nil` if the request failed.
#
def get_page(url,&block)
url = URI(url.to_s)
@@ -525,7 +525,7 @@ def get_page(url,&block)
# The page for the response.
#
# @return [Page, nil]
- # The page for the response, or +nil+ if the request failed.
+ # The page for the response, or `nil` if the request failed.
#
# @since 0.2.2
#
@@ -557,7 +557,7 @@ def post_page(url,post_data='',&block)
# The page which was visited.
#
# @return [Page, nil]
- # The page that was visited. If +nil+ is returned, either the request
+ # The page that was visited. If `nil` is returned, either the request
# for the page failed, or the page was skipped.
#
def visit_page(url,&block)
@@ -585,8 +585,8 @@ def visit_page(url,&block)
# Converts the agent into a Hash.
#
# @return [Hash]
- # The agent represented as a Hash containing the +history+ and
- # the +queue+ of the agent.
+ # The agent represented as a Hash containing the `history` and
+ # the `queue` of the agent.
#
def to_hash
{:history => @history, :queue => @queue}
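
A quick usage sketch of the `to_hash` behavior documented above; the agent construction is illustrative, but the `:history` and `:queue` keys come straight from the method body:

    # Illustrative only: snapshot an agent's state.
    spider = Spidr::Agent.new
    state  = spider.to_hash

    state[:history] # URLs that have already been visited
    state[:queue]   # URLs still waiting to be visited
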
lib/spidr/auth_store.rb
@@ -24,7 +24,7 @@ def initialize
#
# @return [AuthCredential, nil]
# Closest matching {AuthCredential} values for the URL,
- # or +nil+ if nothing matches.
+ # or `nil` if nothing matches.
#
# @since 0.2.2
#
@@ -102,13 +102,13 @@ def add(url,username,password)
#
# Returns the base64 encoded authorization string for the URL
- # or +nil+ if no authorization exists.
+ # or `nil` if no authorization exists.
#
# @param [URI] url
# The url.
#
# @return [String, nil]
- # The base64 encoded authorization string or +nil+.
+ # The base64 encoded authorization string or `nil`.
#
# @since 0.2.2
#
lib/spidr/cookie_jar.rb
@@ -47,7 +47,7 @@ def each(&block)
# Host or domain name for cookies.
#
# @return [String, nil]
- # The cookie values or +nil+ if the host does not have a cookie in the
+ # The cookie values or `nil` if the host does not have a cookie in the
# jar.
#
# @since 0.2.2
lib/spidr/filters.rb
@@ -17,7 +17,7 @@ def self.included(base)
#
# @option options [Array] :schemes (['http', 'https'])
# The list of acceptable URI schemes to visit.
- # The +https+ scheme will be ignored if +net/https+ cannot be loaded.
+ # The `https` scheme will be ignored if `net/https` cannot be loaded.
#
# @option options [String] :host
# The host-name to visit.
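
Based on the `:schemes` option documented in this hunk (default `['http', 'https']`), restricting an agent to plain HTTP might look like the following sketch; the option name comes from the doc above, everything else is illustrative:

    # Sketch: only follow http:// links, so https:// URLs are filtered out
    # and the net/https fallback mentioned above never comes into play.
    agent = Spidr::Agent.new(:schemes => ['http'])
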
