Switched the YARD doc formatting to markdown.
postmodern committed Feb 1, 2010
1 parent 349b338 commit 38546a2
Showing 8 changed files with 78 additions and 78 deletions.
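
The changes in this diff are mostly mechanical: RDoc headings (`=== …`) become Markdown headings (`### …`), and RDoc inline markup (`+nil+`, `<tt>/</tt>`) becomes backtick code spans. A rough sketch of those substitutions as a hypothetical helper (not part of the commit, which edited the files by hand):

```ruby
# Hypothetical helper sketching the RDoc -> Markdown rewrites seen in this diff.
def rdoc_to_markdown(text)
  text
    .gsub(/^(=+)(?=\s)/) { '#' * $1.length }  # '=== Heading' -> '### Heading'
    .gsub(/\+(\S+)\+/, '`\1`')                # '+nil+'       -> '`nil`'
    .gsub(%r{<tt>(.*?)</tt>}, '`\1`')         # '<tt>/</tt>'  -> '`/`'
end

puts rdoc_to_markdown('=== 0.2.2 / 2010-01-06')
puts rdoc_to_markdown('Page#headers was always +nil+.')
```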
38 changes: 19 additions & 19 deletions History.rdoc → History.md
@@ -1,4 +1,4 @@
-=== 0.2.2 / 2010-01-06
+### 0.2.2 / 2010-01-06

* Require Web Spider Obstacle Course (WSOC) >= 0.1.1.
* Integrated the new WSOC into the specs.
@@ -15,7 +15,7 @@
* Renamed Spidr::Agent#get_session to {Spidr::SessionCache#[]}.
* Renamed Spidr::Agent#kill_session to {Spidr::SessionCache#kill!}.

-=== 0.2.1 / 2009-11-25
+### 0.2.1 / 2009-11-25

* Added {Spidr::Events#every_ok_page}.
* Added {Spidr::Events#every_redirect_page}.
@@ -44,9 +44,9 @@
* Added {Spidr::Events#every_zip_page}.
* Fixed a bug where {Spidr::Agent#delay} was not being used to delay
requesting pages.
-* Spider +link+ and +script+ tags in HTML pages (thanks Nick Plante).
+* Spider `link` and `script` tags in HTML pages (thanks Nick Plante).

-=== 0.2.0 / 2009-10-10
+### 0.2.0 / 2009-10-10

* Added {URI.expand_path}.
* Added {Spidr::Page#search}.
@@ -91,7 +91,7 @@
* Made {Spidr::Agent#visit_page} public.
* Moved to YARD based documentation.

-=== 0.1.9 / 2009-06-13
+### 0.1.9 / 2009-06-13

* Upgraded to Hoe 2.0.0.
* Use Hoe.spec instead of Hoe.new.
@@ -108,7 +108,7 @@
could not be loaded.
* Removed Spidr::Agent::SCHEMES.

-=== 0.1.8 / 2009-05-27
+### 0.1.8 / 2009-05-27

* Added the Spidr::Agent#pause! and Spidr::Agent#continue! methods.
* Added the Spidr::Agent#running? and Spidr::Agent#paused? methods.
@@ -121,15 +121,15 @@
* Made {Spidr::Agent#enqueue} and {Spidr::Agent#queued?} public.
* Added more specs.

-=== 0.1.7 / 2009-04-24
+### 0.1.7 / 2009-04-24

* Added Spidr::Agent#all_headers.
-* Fixed a bug where Page#headers was always +nil+.
+* Fixed a bug where Page#headers was always `nil`.
* {Spidr::Agent} will now follow the Location header in HTTP 300,
301, 302, 303 and 307 Redirects.
* {Spidr::Agent} will now follow iframe and frame tags.

-=== 0.1.6 / 2009-04-14
+### 0.1.6 / 2009-04-14

* Added {Spidr::Agent#failures}, a list of URLs which could not be visited.
* Added {Spidr::Agent#failed?}.
@@ -143,40 +143,40 @@
* Updated the Web Spider Obstacle Course with links that always fail to be
visited.

-=== 0.1.5 / 2009-03-22
+### 0.1.5 / 2009-03-22

-* Catch malformed URIs in {Spidr::Page#to_absolute} and return +nil+.
-* Filter out +nil+ URIs in {Spidr::Page#urls}.
+* Catch malformed URIs in {Spidr::Page#to_absolute} and return `nil`.
+* Filter out `nil` URIs in {Spidr::Page#urls}.

-=== 0.1.4 / 2009-01-15
+### 0.1.4 / 2009-01-15

* Use Nokogiri for HTML and XML parsing.

-=== 0.1.3 / 2009-01-10
+### 0.1.3 / 2009-01-10

* Added the :host option to {Spidr::Agent#initialize}.
* Added the Web Spider Obstacle Course files to the Manifest.
* Aliased {Spidr::Agent#visited_urls} to {Spidr::Agent#history}.

-=== 0.1.2 / 2008-11-06
+### 0.1.2 / 2008-11-06

* Fixed a bug in {Spidr::Page#to_absolute} where URLs with no path were not
-  receiving a default path of <tt>/</tt>.
+  receiving a default path of `/`.
* Fixed a bug in {Spidr::Page#to_absolute} where URL paths were not being
-  expanded, in order to remove <tt>..</tt> and <tt>.</tt> directories.
+  expanded, in order to remove `..` and `.` directories.
* Fixed a bug where absolute URLs could have a blank path, thus causing
{Spidr::Agent#get_page} to crash when it performed the HTTP request.
* Added RSpec spec tests.
* Created a Web-Spider Obstacle Course
(http://spidr.rubyforge.org/course/start.html) which is used in the spec
tests.

-=== 0.1.1 / 2008-10-04
+### 0.1.1 / 2008-10-04

* Added a reader method for the response instance variable in Page.
* Fixed a bug in {Spidr::Page#method_missing}.

-=== 0.1.0 / 2008-05-23
+### 0.1.0 / 2008-05-23

* Initial release.
* Black-list or white-list URLs based upon:
56 changes: 28 additions & 28 deletions README.rdoc → README.md
@@ -1,18 +1,18 @@
-= Spidr
+# Spidr

-* http://spidr.rubyforge.org
-* http://github.com/postmodern/spidr
-* http://github.com/postmodern/spidr/issues
-* http://groups.google.com/group/spidr
+* [spidr.rubyforge.org](http://spidr.rubyforge.org/)
+* [github.com/postmodern/spidr](http://github.com/postmodern/spidr)
+* [github.com/postmodern/spidr/issues](http://github.com/postmodern/spidr/issues)
+* [groups.google.com/group/spidr](http://groups.google.com/group/spidr)
* irc.freenode.net #spidr

-== DESCRIPTION:
+## DESCRIPTION:

Spidr is a versatile Ruby web spidering library that can spider a site,
multiple domains, certain links or infinitely. Spidr is designed to be fast
and easy to use.

-== FEATURES:
+## FEATURES:

* Follows:
* a tags.
@@ -41,21 +41,21 @@ and easy to use.
* Custom proxy settings.
* HTTPS support.

-== EXAMPLES:
+## EXAMPLES:

-* Start spidering from a URL:
+Start spidering from a URL:

Spidr.start_at('http://tenderlovemaking.com/')

-* Spider a host:
+Spider a host:

Spidr.host('coderrr.wordpress.com')

-* Spider a site:
+Spider a site:

Spidr.site('http://rubyflow.com/')

-* Spider multiple hosts:
+Spider multiple hosts:

Spidr.start_at(
'http://company.com/',
Expand All @@ -65,30 +65,30 @@ and easy to use.
]
)

-* Do not spider certain links:
+Do not spider certain links:

Spidr.site('http://matasano.com/', :ignore_links => [/log/])

-* Do not spider links on certain ports:
+Do not spider links on certain ports:

Spidr.site(
'http://sketchy.content.com/',
:ignore_ports => [8000, 8010, 8080]
)

-* Print out visited URLs:
+Print out visited URLs:

Spidr.site('http://rubyinside.org/') do |spider|
spider.every_url { |url| puts url }
end

-* Print out the URLs that could not be requested:
+Print out the URLs that could not be requested:

Spidr.site('http://sketchy.content.com/') do |spider|
spider.every_failed_url { |url| puts url }
end

-* Search HTML and XML pages:
+Search HTML and XML pages:

Spidr.site('http://company.withablog.com/') do |spider|
spider.every_page do |page|
@@ -99,19 +99,19 @@ and easy to use.
value = meta.attributes['content']

puts " #{name} = #{value}"
end
end
end
end

-* Print out the titles from every page:
+Print out the titles from every page:

Spidr.site('http://www.rubypulse.com/') do |spider|
spider.every_html_page do |page|
puts page.title
end
end

-* Find what kinds of web servers a host is using, by accessing the headers:
+Find what kinds of web servers a host is using, by accessing the headers:

servers = Set[]

@@ -121,23 +121,23 @@ and easy to use.
end
end

-* Pause the spider on a forbidden page:
+Pause the spider on a forbidden page:

spider = Spidr.host('overnight.startup.com') do |spider|
spider.every_forbidden_page do |page|
spider.pause!
end
end

-* Skip the processing of a page:
+Skip the processing of a page:

Spidr.host('sketchy.content.com') do |spider|
spider.every_missing_page do |page|
spider.skip_page!
end
end

-* Skip the processing of links:
+Skip the processing of links:

Spidr.host('sketchy.content.com') do |spider|
spider.every_url do |url|
@@ -147,15 +147,15 @@ and easy to use.
end
end

-== REQUIREMENTS:
+## REQUIREMENTS:

-* {nokogiri}[http://nokogiri.rubyforge.org/] >= 1.2.0
+* [nokogiri](http://nokogiri.rubyforge.org/) >= 1.2.0

-== INSTALL:
+## INSTALL:

-  $ sudo gem install spidr
+    $ sudo gem install spidr

-== LICENSE:
+## LICENSE:

The MIT License

2 changes: 1 addition & 1 deletion Rakefile
@@ -11,7 +11,7 @@ Hoe.spec('spidr') do

self.rspec_options += ['--colour', '--format', 'specdoc']

-  self.yard_options += ['--protected']
+  self.yard_options += ['--markup', 'markdown', '--protected']
self.remote_yard_dir = 'docs'

self.extra_deps = [
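YARD can typically pick up the same options from a `.yardopts` file at the project root instead of the Rakefile (shown as a sketch; this commit only edits the Rakefile):

```
--markup markdown
--protected
```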
10 changes: 5 additions & 5 deletions lib/spidr/agent.rb
@@ -492,7 +492,7 @@ def enqueue(url)
# The page for the response.
#
# @return [Page, nil]
-# The page for the response, or +nil+ if the request failed.
+# The page for the response, or `nil` if the request failed.
#
def get_page(url,&block)
url = URI(url.to_s)
@@ -525,7 +525,7 @@ def get_page(url,&block)
# The page for the response.
#
# @return [Page, nil]
-# The page for the response, or +nil+ if the request failed.
+# The page for the response, or `nil` if the request failed.
#
# @since 0.2.2
#
@@ -557,7 +557,7 @@ def post_page(url,post_data='',&block)
# The page which was visited.
#
# @return [Page, nil]
-# The page that was visited. If +nil+ is returned, either the request
+# The page that was visited. If `nil` is returned, either the request
# for the page failed, or the page was skipped.
#
def visit_page(url,&block)
@@ -585,8 +585,8 @@ def visit_page(url,&block)
# Converts the agent into a Hash.
#
# @return [Hash]
-# The agent represented as a Hash containing the +history+ and
-# the +queue+ of the agent.
+# The agent represented as a Hash containing the `history` and
+# the `queue` of the agent.
#
def to_hash
{:history => @history, :queue => @queue}
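As the doc comment above describes, `to_hash` simply exposes the agent's two internal collections. A minimal stand-in illustrating that pattern (illustrative only, not Spidr's actual class):

```ruby
# Illustrative stand-in for the history/queue bookkeeping described above.
class TinyAgent
  def initialize
    @history = []  # URLs already visited
    @queue   = []  # URLs waiting to be visited
  end

  def enqueue(url)
    @queue << url
  end

  def visit(url)
    @history << url
  end

  def to_hash
    {:history => @history, :queue => @queue}
  end
end

agent = TinyAgent.new
agent.enqueue('http://example.com/')
p agent.to_hash  # prints the history and queue
```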
6 changes: 3 additions & 3 deletions lib/spidr/auth_store.rb
@@ -24,7 +24,7 @@ def initialize
#
# @return [AuthCredential, nil]
# Closest matching {AuthCredential} values for the URL,
-# or +nil+ if nothing matches.
+# or `nil` if nothing matches.
#
# @since 0.2.2
#
@@ -102,13 +102,13 @@ def add(url,username,password)

#
# Returns the base64 encoded authorization string for the URL
-# or +nil+ if no authorization exists.
+# or `nil` if no authorization exists.
#
# @param [URI] url
# The url.
#
# @return [String, nil]
-# The base64 encoded authorization string or +nil+.
+# The base64 encoded authorization string or `nil`.
#
# @since 0.2.2
#
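The "base64 encoded authorization string" mentioned above is the standard HTTP Basic scheme. A minimal sketch of how such a value is built (example credentials are assumptions, and this is not Spidr's actual code):

```ruby
require 'base64'

# Build a Basic-auth Authorization value from a username/password pair.
def basic_auth_string(username, password)
  # encode64 appends a trailing newline, so strip it.
  Base64.encode64("#{username}:#{password}").strip
end

puts basic_auth_string('admin', 'secret')  # -> YWRtaW46c2VjcmV0
```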
2 changes: 1 addition & 1 deletion lib/spidr/cookie_jar.rb
@@ -47,7 +47,7 @@ def each(&block)
# Host or domain name for cookies.
#
# @return [String, nil]
-# The cookie values or +nil+ if the host does not have a cookie in the
+# The cookie values or `nil` if the host does not have a cookie in the
# jar.
#
# @since 0.2.2
2 changes: 1 addition & 1 deletion lib/spidr/filters.rb
@@ -17,7 +17,7 @@ def self.included(base)
#
# @option options [Array] :schemes (['http', 'https'])
# The list of acceptable URI schemes to visit.
-# The +https+ scheme will be ignored if +net/https+ cannot be loaded.
+# The `https` scheme will be ignored if `net/https` cannot be loaded.
#
# @option options [String] :host
# The host-name to visit.
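The `:schemes` fallback described above can be sketched like this (illustrative, not Spidr's exact code):

```ruby
# Default to http, and add https only when net/https can be loaded.
def supported_schemes
  schemes = ['http']
  begin
    require 'net/https'
    schemes << 'https'
  rescue LoadError
    warn 'net/https could not be loaded; https links will be ignored'
  end
  schemes
end

p supported_schemes
```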
