update spider example to crawl internal links of a site

commit fa8ac447a7828f565ec783babf54393a1b33ae34 1 parent 57b5738
@jaimeiniesta authored
Showing with 4 additions and 4 deletions.
  1. +4 −4 samples/spider.rb
8 samples/spider.rb
@@ -6,7 +6,7 @@
q = Queue.new
visited_links=[]
-puts "Enter a valid http url to spider it following external links"
+puts "Enter a valid http url to spider it following internal links"
url = gets.strip
page = MetaInspector.new(url)
@@ -20,9 +20,9 @@
puts "TITLE: #{page.title}"
puts "META DESCRIPTION: #{page.meta_description}"
puts "META KEYWORDS: #{page.meta_keywords}"
- puts "LINKS: #{page.links.size}"
- page.links.each do |link|
- if link[0..6] == 'http://' && !visited_links.include?(link)
+ puts "LINKS: #{page.internal_links.size}"
+ page.internal_links.each do |link|
+ if !visited_links.include?(link)
q.push(link)
end
end
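
The loop this change touches is a breadth-first crawl: pop a URL from the queue, inspect the page, and enqueue each internal link that has not been seen yet. A minimal sketch of that pattern follows. It is not the sample's actual code: `crawl_internal`, `fetch_internal_links`, `max_pages`, and the in-memory `site` hash are hypothetical names introduced for illustration, and a `Set` of seen URLs stands in for the sample's `visited_links` array so that a link is marked when it is enqueued rather than when it is visited.

```ruby
require 'set'

# Breadth-first crawl over internal links (illustrative sketch).
# fetch_internal_links is a hypothetical callable: given a URL, it returns
# an array of internal link URLs for that page.
def crawl_internal(start_url, fetch_internal_links, max_pages: 100)
  queue = [start_url]
  seen  = Set.new([start_url]) # URLs already enqueued
  order = []                   # URLs in the order they were visited

  until queue.empty? || order.size >= max_pages
    url = queue.shift
    order << url
    fetch_internal_links.call(url).each do |link|
      # Set#add? returns nil when the link is already present,
      # so each URL is enqueued at most once.
      queue.push(link) if seen.add?(link)
    end
  end

  order
end

# In-memory stand-in for a small site, so the sketch runs without network access.
site = {
  'http://example.com/'  => ['http://example.com/a', 'http://example.com/b'],
  'http://example.com/a' => ['http://example.com/', 'http://example.com/b'],
  'http://example.com/b' => ['http://example.com/c'],
  'http://example.com/c' => []
}
fetch = ->(url) { site.fetch(url, []) }

puts crawl_internal('http://example.com/', fetch)
```

To crawl a real site, the callable could wrap the call used in the diff above, for example `->(url) { MetaInspector.new(url).internal_links }`. The `max_pages` cap is an added safeguard, not part of the sample.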