Commit

update spider example to crawl internal links of a site
jaimeiniesta committed Dec 3, 2012
1 parent 57b5738 commit fa8ac44
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions samples/spider.rb
@@ -6,7 +6,7 @@
q = Queue.new
visited_links=[]

-puts "Enter a valid http url to spider it following external links"
+puts "Enter a valid http url to spider it following internal links"
url = gets.strip

page = MetaInspector.new(url)
@@ -20,9 +20,9 @@
puts "TITLE: #{page.title}"
puts "META DESCRIPTION: #{page.meta_description}"
puts "META KEYWORDS: #{page.meta_keywords}"
-puts "LINKS: #{page.links.size}"
-page.links.each do |link|
-if link[0..6] == 'http://' && !visited_links.include?(link)
+puts "LINKS: #{page.internal_links.size}"
+page.internal_links.each do |link|
+if !visited_links.include?(link)
q.push(link)
end
end
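
For readers who want to try the idea behind this change without network access, here is a minimal sketch of a crawl restricted to internal links. It uses only the Ruby standard library and makes several assumptions: `PAGES`, `same_host_links`, and `crawl` are hypothetical names invented for illustration, the in-memory `PAGES` hash stands in for fetched pages, and "internal" is taken to mean "same host as the page the link was found on". How MetaInspector's `internal_links` decides this is not shown in this commit.

```ruby
require 'uri'
require 'set'

# Hypothetical stand-in for fetched pages: URL => links found on that page.
PAGES = {
  'http://example.com/'  => ['http://example.com/a', 'http://other.org/x', 'http://example.com/b'],
  'http://example.com/a' => ['http://example.com/', 'http://example.com/b'],
  'http://example.com/b' => ['http://example.com/a', 'http://other.org/y']
}.freeze

# Assumption: a link is "internal" when its host equals the host of the page.
def same_host_links(page_url, links)
  host = URI.parse(page_url).host
  links.select { |link| URI.parse(link).host == host }
end

# Breadth-first crawl that enqueues each URL at most once.
def crawl(start_url, pages)
  queue = [start_url]
  seen = Set.new([start_url])
  order = []
  until queue.empty?
    url = queue.shift
    order << url
    same_host_links(url, pages.fetch(url, [])).each do |link|
      # Set#add? returns nil when the link was already recorded.
      queue << link if seen.add?(link)
    end
  end
  order
end

puts crawl('http://example.com/', PAGES)
```

The sketch records a URL as seen when it is enqueued rather than when it is visited, so a link that appears on several pages is queued only once. Whether the sample script behaves the same way depends on code outside the hunks shown here.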
