
Bumped version! Implemented a cool way of following the [More] button on posts, polls, etc. (to get all the comments) using recursion! Yay! Added a way of requesting these special pages in the [Request] class, with a flag. Methods in [Parser] aren't private anymore...
jcla1 committed Oct 6, 2012
1 parent 84097f7 commit b32ce865bbdaf879baefe98e98c4cda83d525050
Showing with 37 additions and 5 deletions.
  1. +31 −3 lib/hn2json/parser.rb
  2. +5 −1 lib/hn2json/request.rb
  3. +1 −1 lib/hn2json/version.rb
lib/hn2json/parser.rb
@@ -3,6 +3,9 @@ module HN2JSON
# Public: Parse HTML to produce HackerNews entities
class Parser
+
+ attr_reader :doc
+
def initialize response
html = response.html
@@ -148,9 +151,11 @@ def get_attrs_discussion entity
end
- private
+ #private
def get_voting_on table
+ fulltext = ''
+ voting_on = ''
end
def get_comments
@@ -168,9 +173,32 @@ def get_comments
# $tr = $('tr')
# $($tr[$tr.length - 3]).find('a').eq(0).attr('href')
- #trs = @doc.css('tr')
+ comments = get_comments_more doc, comments
+
+ return comments
+ end
+
+ def get_comments_more doc, comments, flag=false
+ trs = doc.css('tr .title a')
+
+ if trs.length == 0
+ return comments
+ end
+
+ url = trs.last['href']
+
+ url_regex = /\/x\?fnid=(.*)/
+
+ match = url_regex.match(url)
+
+ if match == nil
+ return comments
+ end
+
+ req = Request.new match[1], true
+ parser = Parser.new req
- #more_url = trs[trs.length - 3].css('a')[0]['href']
+ comments = comments + parser.get_comments
return comments
end
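The recursion introduced in this hunk can be hard to follow from the diff alone, so here is a minimal, network-free sketch of the same idea. It is not the code from the commit: `PAGES`, `MORE_URL_REGEX` and `collect_comments` are hypothetical names, the pages are an in-memory hash standing in for `Request` and `Parser`, and the `seen` guard is an extra safety check that the commit does not have. Only the `fnid` regex is taken from the diff.

```ruby
# Hypothetical in-memory stand-in for the HTTP layer: each page holds its
# comments plus the href of its [More] link (nil on the last page).
PAGES = {
  'start' => { comments: ['c1', 'c2'], more_href: '/x?fnid=abc' },
  'abc'   => { comments: ['c3'],       more_href: '/x?fnid=def' },
  'def'   => { comments: ['c4', 'c5'], more_href: nil }
}.freeze

# Same pattern as url_regex in the diff: captures the fnid token.
MORE_URL_REGEX = /\/x\?fnid=(.*)/

# Collect the comments on page `key`, then recurse into the page behind
# its [More] link. `seen` guards against a link cycle (an assumption
# added for this sketch, not part of the commit).
def collect_comments(key, seen = [])
  page = PAGES[key]
  return [] if page.nil? || seen.include?(key)

  match = MORE_URL_REGEX.match(page[:more_href].to_s)
  return page[:comments] if match.nil?

  page[:comments] + collect_comments(match[1], seen + [key])
end

p collect_comments('start')
```

The base case mirrors the two early returns in `get_comments_more`: stop when there is no link, or when the last link is not a `/x?fnid=` URL.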
lib/hn2json/request.rb
@@ -3,10 +3,14 @@ module HN2JSON
class Request
attr_accessor :html
- def initialize id
+ def initialize id, more_page=false
@base_url = "http://news.ycombinator.com/item?id="
@complete_url = @base_url + id.to_s
+ if more_page
+ @complete_url = "http://news.ycombinator.com/x?fnid=" + id
+ end
+
request_page
end
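The flag added to `Request#initialize` only switches which base URL the identifier is appended to. A small sketch of that selection, pulled out into a standalone helper, might look like the following. `build_url`, `ITEM_BASE` and `MORE_BASE` are hypothetical names for illustration; the two URLs are the ones in the diff. Calling `to_s` in both branches is a small deviation from the commit, which appends `id` directly on the more-page path.

```ruby
# Base URLs as they appear in the diff.
ITEM_BASE = 'http://news.ycombinator.com/item?id='
MORE_BASE = 'http://news.ycombinator.com/x?fnid='

# Hypothetical helper: pick the base URL from the flag, then append the
# identifier. to_s keeps integer item ids and string fnid tokens working.
def build_url(id, more_page = false)
  (more_page ? MORE_BASE : ITEM_BASE) + id.to_s
end

puts build_url(4_623_946)
puts build_url('abc123', true)
```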
lib/hn2json/version.rb
@@ -1,3 +1,3 @@
module HN2JSON
- VERSION = '0.0.2'
+ VERSION = '0.0.3'
end
