io pagerank not displaying pagerank #74

Closed
oncletom opened this Issue Jan 5, 2012 · 2 comments


oncletom commented Jan 5, 2012

Hi,

I was testing the pagerank module, but no ranking is displayed.

echo "mastercard.com" | io -g pagerank
DEBUG: Running 1 worker..
DEBUG: Reading from STDIN
DEBUG: GET http://www.google.com/search?client=navclient-auto&ch=61246557760&features=Rank&q=info:http%3A%2F%2Fmastercard.com (request 71375)
DEBUG:   | GET /search?client=navclient-auto&ch=61246557760&features=Rank&q=info:http%3A%2F%2Fmastercard.com HTTP/1.1
DEBUG:   | Accept: */*
DEBUG:   | Accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
DEBUG:   | User-agent: node.io
DEBUG: 200 http://www.google.com/search?client=navclient-auto&ch=61246557760&features=Rank&q=info:http%3A%2F%2Fmastercard.com (response 71375)
DEBUG:   | Content-type: text/html; charset=ISO-8859-1
DEBUG:   | Date: Thu, 05 Jan 2012 16:04:50 GMT
DEBUG:   | Expires: -1
DEBUG:   | Cache-control: private, max-age=0
DEBUG:   | Set-cookie: PREF=ID=a543aa6fa4c101da:FF=0:TM=1325779490:LM=1325779490:S=yTUqU9zY2UdxZLg6; expires=Sat, 04-Jan-2014 16:04:50 GMT; path=/; domain=.google.com,NID=54=Y4ZGAplvT25VlbJi29DJFwg9agCOyLdRsW6zwynMHHKU0OFh-5T11NiIP3KYBGxoct0T73QTnT-ifMjSJJp9SeAystD7BtekdCYVUy9AWKfbuMC0qWEW5D9p_cfpZE24; expires=Fri, 06-Jul-2012 16:04:50 GMT; path=/; domain=.google.com; HttpOnly
DEBUG:   | P3p: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
DEBUG:   | Server: gws
DEBUG:   | X-xss-protection: 1; mode=block
DEBUG:   | X-frame-options: SAMEORIGIN
DEBUG:   | Transfer-encoding: chunked
DEBUG: Writing to STDOUT
mastercard.com,
OK: Job complete

chriso commented Jan 7, 2012

It looks like Google has changed the way their client queries pagerank. You used to be able to get pagerank from

http://www.google.com/search?client=navclient-auto&ch=<HASH>&features=Rank&q=info:<URL>

You could calculate the hash using this code, which was available on the internet. I'll have to look into whether there's a new way of obtaining pagerank.


oncletom commented Jan 9, 2012

Hi,

While looking at page_rankr, I found that it calls a different host: toolbarqueries.google.com.

A change occurred in early October '11; maybe it's related.

Anyway, http://www.google.com/search?client=navclient-auto&ch=<HASH>&features=Rank&q=info:<URL> should be replaced with http://toolbarqueries.google.com/tbr?client=navclient-auto&ch=<HASH>&features=Rank&q=info:<URL> (also works in HTTPS).
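
To make the change concrete, here is a minimal sketch of how the new query URL could be assembled. `buildPagerankUrl` is a hypothetical helper for illustration, not the actual node.io code; it assumes the checksum has already been computed elsewhere and takes it as an argument.

```javascript
// Hypothetical helper (not part of node.io): builds the toolbar query URL
// described above. The checksum is passed in; computing it is out of scope here.
function buildPagerankUrl(hash, url, useHttps) {
  // The endpoint is reported above to work over both HTTP and HTTPS.
  var scheme = useHttps ? 'https' : 'http';
  return scheme + '://toolbarqueries.google.com/tbr' +
    '?client=navclient-auto' +
    '&ch=' + hash +
    '&features=Rank' +
    // Percent-encode the target URL, as seen in the debug log.
    '&q=info:' + encodeURIComponent(url);
}

// Checksum and target URL taken from the debug log earlier in this issue.
console.log(buildPagerankUrl('61246557760', 'http://mastercard.com'));
```

Only the host and path differ from the old request; the query string is the same one shown in the debug output.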

I'll submit a pull request.

oncletom closed this Jan 9, 2012

chriso added a commit that referenced this issue Jan 9, 2012

Merge pull request #76 from oncletom/patch-2
Fixing rank calculation, cf. #74