Merge pull request #8 from rromanchuk/secondary-descriptions

Secondary descriptions
2 parents d71b0b6 + 8d5a1de commit 1dc93819dfe7e021de570dad1f7eb861a50dc0e3 @jaimeiniesta committed Sep 26, 2011
Showing with 1,086 additions and 2 deletions.
  1. +13 −1 lib/meta_inspector/scraper.rb
  2. +1,060 −0 spec/fixtures/theonion-no-description.com.response
  3. +13 −1 spec/metainspector_spec.rb
@@ -21,7 +21,13 @@ def initialize(url)
# Returns the parsed document title, from the content of the <title> tag.
# This is not the same as the meta_title tag
def title
- @data.title ||= parsed_document.css('title').inner_html rescue nil
+ @data.title ||= parsed_document.css('title').inner_html.gsub(/\t|\n|\r/, '') rescue nil
+ end
+
+ # A description getter that first checks for a meta description and, if none is present,
+ # guesses by grabbing the first paragraph longer than 120 characters
+ def description
+ self.meta_description.nil? ? secondary_description : self.meta_description
end
# Returns the parsed document links
@@ -137,5 +143,11 @@ def absolutify_url(url)
def remove_mailto(links)
links.reject {|l| l.index('mailto')}
end
+
+ # Look for the first <p> block with more than 120 characters
+ def secondary_description
+ parsed_document.search('//p').map(&:text).find { |text| text.length > 120 } || ''
+ end
+
end
end
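
The fallback this commit introduces can be sketched in isolation. The snippet below is a minimal illustration, not MetaInspector's actual code: it assumes the paragraph texts have already been extracted into a plain array (the real scraper gets them from `parsed_document.search('//p')`), and the standalone helpers `clean_title`, `secondary_description`, and `description` with their explicit arguments are hypothetical stand-ins for the instance methods in the diff.

```ruby
# Hypothetical standalone versions of the methods touched by this commit.
# They take plain strings/arrays instead of a parsed Nokogiri document.

MIN_LENGTH = 120 # threshold used in the diff: strictly more than 120 characters

# Strip tabs and line breaks from a raw <title> string, as the new gsub does.
def clean_title(raw)
  raw.gsub(/\t|\n|\r/, '')
end

# Return the first paragraph longer than MIN_LENGTH, or '' if none qualifies.
def secondary_description(paragraphs, min_length = MIN_LENGTH)
  paragraphs.find { |text| text.length > min_length } || ''
end

# Prefer the meta description; fall back to the guessed one only when it is nil.
def description(meta_description, paragraphs)
  meta_description.nil? ? secondary_description(paragraphs) : meta_description
end

paragraphs = ['Short intro.', 'a' * 121, 'b' * 200]
puts description(nil, paragraphs).length
puts description('From the meta tag', paragraphs)
puts clean_title("\n\tThe Onion\r\n")
```

Note that a paragraph of exactly 120 characters does not qualify, because the comparison is `> 120`, and that an empty (non-nil) meta description would still win over the fallback.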