This repository has been archived by the owner on Nov 30, 2018. It is now read-only.

more fixes for weird cases of google play pages: mainly around missing sections
refaelos authored and chadrem committed Apr 3, 2018
1 parent 4bfc3e3 commit 7056a02
Showing 1 changed file with 19 additions and 13 deletions.
lib/market_bot/play/app.rb
@@ -55,14 +55,18 @@ def self.parse(html, opts = {})
 h2_additional_info = doc.at('h2:contains("Additional Information")')
 if h2_additional_info
   additional_info_parent = h2_additional_info.parent.next.children.children
-  result[:updated] = additional_info_parent.at('div:contains("Updated")').children[1].text
-  result[:installs] = additional_info_parent.at('div:contains("Installs")').children[1].text
-  result[:size] = additional_info_parent.at('div:contains("Size")').children[1].text
-  result[:current_version] = additional_info_parent.at('div:contains("Current Version")').children[1].text
-  result[:requires_android] = additional_info_parent.at('div:contains("Requires Android")').children[1].text
-  div_inapp_products = additional_info_parent.at('div:contains("In-app Products")')
-  result[:in_app_products_price] = div_inapp_products.children[1].text if div_inapp_products
-  developer_div = additional_info_parent.at('div:contains("Developer")')
+  node = additional_info_parent.at('div:contains("Updated")')
+  result[:updated] = node.children[1].text if node
+  node = additional_info_parent.at('div:contains("Size")')
+  result[:size] = node.children[1].text if node
+  node = additional_info_parent.at('div:contains("Current Version")')
+  result[:current_version] = node.children[1].text if node
+  node = additional_info_parent.at('div:contains("Requires Android")')
+  result[:requires_android] = node.children[1].text if node
+  node = additional_info_parent.at('div:contains("In-app Products")')
+  result[:in_app_products_price] = node.children[1].text if node
+
+  developer_div = additional_info_parent.xpath('div[./text()="Developer"]').first.parent #additional_info_parent.at('div:contains("Developer")')
   unless developer_div
     developer_div = additional_info_parent.at('div:contains("Contact Developer")')
   end
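
The added lines repeat one lookup-then-guard pair per label. The new `developer_div` line also calls `.parent` directly on the result of `.first`. That result is `nil` when the XPath matches nothing, so the line would raise `NoMethodError` before the `unless developer_div` fallback runs. A minimal sketch of a table-driven, nil-safe variant follows. `FIELD_LABELS`, `extract_fields`, and `find_developer_div` are hypothetical names that are not part of this commit, and `container` is assumed to be the object held in `additional_info_parent`, which responds to `at` and `xpath`.

```ruby
# Hypothetical label table; keys mirror the result keys set in the hunk above.
FIELD_LABELS = {
  updated: 'Updated',
  size: 'Size',
  current_version: 'Current Version',
  requires_android: 'Requires Android',
  in_app_products_price: 'In-app Products'
}.freeze

# Collects the text of the second child of each labelled div.
# Labels with no matching div, or with fewer than two children, are skipped.
def extract_fields(container, labels = FIELD_LABELS)
  labels.each_with_object({}) do |(key, label), result|
    node = container.at(%(div:contains("#{label}")))
    value_node = node && node.children[1]
    result[key] = value_node.text if value_node
  end
end

# Prefers the parent of a div whose text is exactly "Developer" and falls back
# to the "Contact Developer" div; returns nil when neither is present.
def find_developer_div(container)
  exact = container.xpath('div[./text()="Developer"]').first
  exact&.parent || container.at('div:contains("Contact Developer")')
end
```

Under these assumptions the hunk body would reduce to `result.merge!(extract_fields(additional_info_parent))` followed by `developer_div = find_developer_div(additional_info_parent)`.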
@@ -102,7 +106,7 @@ def self.parse(html, opts = {})
 href_q = URI(href).query
 if href_q
   q_param = href_q.split('&').select {|p| p =~ /q=/}.first
-  href = q_param.gsub('q=', '')
+  href = q_param.gsub('q=', '') if q_param
 end
 result[:privacy_url] = href
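
The guard added here stops the `nil` failure. The `/q=/` match still accepts any parameter whose name merely ends in `q` (for example `hq=`), and `gsub` removes every `q=` occurrence in the pair. A sketch of the same unwrapping with the standard library's `URI.decode_www_form` is shown below. `unwrap_redirect` is a hypothetical helper name, and unlike the code in the diff it also percent-decodes the value, which may or may not be wanted.

```ruby
require 'uri'

# Returns the value of the `q` query parameter when one is present,
# otherwise returns the href unchanged.
def unwrap_redirect(href)
  query = URI(href).query
  return href unless query

  # decode_www_form yields [name, value] pairs; match the name exactly.
  pair = URI.decode_www_form(query).find { |name, _value| name == 'q' }
  pair ? pair.last : href
end
```

With such a helper the hunk would become `result[:privacy_url] = unwrap_redirect(href)`.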

@@ -251,10 +255,12 @@ def self.parse(html, opts = {})
 result[:description] = doc.at_css('div[itemprop="description"]').inner_html.strip if doc.at_css('div[itemprop="description"]')
 result[:title] = doc.at_css('h1[itemprop="name"]').text

-node = doc.at_css('meta[itemprop="ratingValue"]')
-result[:rating] = node[:content].strip
-node = doc.at_css('meta[itemprop="ratingCount"]')
-result[:votes] = node[:content].strip.to_i
+if doc.at_css('meta[itemprop="ratingValue"]')
+  node = doc.at_css('meta[itemprop="ratingValue"]')
+  result[:rating] = node[:content].strip
+  node = doc.at_css('meta[itemprop="ratingCount"]')
+  result[:votes] = node[:content].strip.to_i
+end

 a_similar = doc.at_css('a:contains("Similar")')
 if a_similar
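
The new `if` checks only the `ratingValue` element. The `ratingCount` lookup inside it is unguarded, so a page with a rating value but no count would still raise on `node[:content]`. A sketch that treats the two elements as independently optional is given below. `meta_content` and `rating_fields` are hypothetical names, and each argument is assumed to be whatever `doc.at_css` returned (a node that answers `[:content]`, or `nil`).

```ruby
# Returns the stripped `content` attribute of a node, or nil when the node
# or the attribute is missing.
def meta_content(node)
  value = node && node[:content]
  value && value.strip
end

# Builds the rating fields from two independently optional nodes.
def rating_fields(rating_node, count_node)
  fields = {}
  rating = meta_content(rating_node)
  votes = meta_content(count_node)
  fields[:rating] = rating if rating
  fields[:votes] = votes.to_i if votes
  fields
end
```

A call such as `result.merge!(rating_fields(doc.at_css('meta[itemprop="ratingValue"]'), doc.at_css('meta[itemprop="ratingCount"]')))` would then replace the guarded block and query each selector once.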