
User from unknown with provider googlebot #918

Closed
anonymous-piwik-user opened this Issue · 9 comments

3 participants

@anonymous-piwik-user

I very often see a visitor from the country "unknown" with the provider "googlebot", resolution 1024 x 1024, browser Mozilla 5.0, and an unknown operating system.

I'm seeing this on a few sites.
Maybe a new version of Googlebot?
Keywords: googlebot

@robocoder

Can you check your web server log and give us a User Agent string? Sounds like Google's version of the Bing spambot.

@anonymous-piwik-user

"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

@robocoder

Reference: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80553

Maybe it's time to add an example bot tracking plugin to move bot-specific detection logic out of Visit.php...

@robocoder

Can you provide a few lines from your web server's access log showing the Googlebot requests? Thanks.

@anonymous-piwik-user

66.249.71.35 - - +0200 "GET /ro/tag/englisch/ HTTP/1.1" 200 10033 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /wp-content/plugins/simple-ajax-shoutbox/ajax_shoutbox_process.php?1252281600 HTTP/1.1" 200 83 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /sk/tag/linux/ HTTP/1.1" 200 10899 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
83.64.31.37 - - +0200 "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 - "-" "WordPress/2.8.4; http://blog.prasi.at"

66.249.71.35 - - +0200 "GET /tag/englisch/&rurl=translate.google.com&lang=de&usg=ALkJrhja1EyzL9WcZVjz7LgKxhfVOVrJEw HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /2009/06/20/jailbreak-iphone-os-3-0-ist-ab-sofort-verfugbar/ HTTP/1.1" 200 10777 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/nova-rock/&rurl=translate.google.com&lang=de&usg=ALkJrhi6maYH0aia7iAfuV7rpgHyGmOUMA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/magento/&rurl=translate.google.com&lang=de&usg=ALkJrhgLB2Y3EEXHn82mkWp80wufYKHKwA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/usb-stick/&rurl=translate.google.com&lang=de&usg=ALkJrhigtRYLjyjLlIUykrlJ13sz7ofdNw HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /en/about-me/ HTTP/1.1" 200 8398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /2009/06/23/left-4-dead-patch-diese-woche/&rurl=translate.google.com&lang=de&usg=ALkJrhhG8ZQLkAD2-GicJfebbgFaFc7bng HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/schweden/&rurl=translate.google.com&lang=de&usg=ALkJrhhPiufQvYU-FDqR0mTbZW8qKcqViA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

@mattab
Owner

The Provider plugin, when enabled, does a DNS lookup of the visitor's host.

Technically, Piwik should not require this DNS lookup to behave properly. It should always be optional, as it can cause performance issues if DNS latency goes up.
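
A minimal sketch of an optional, failure-tolerant reverse lookup, written in Python for illustration rather than taken from Piwik's PHP code. The names `reverse_lookup` and `provider_from_hostname`, the `enabled` flag, and the rule of using the second-to-last hostname label as the provider are all assumptions, not the Provider plugin's actual logic.

```python
import socket


def reverse_lookup(ip, enabled=True, resolver=socket.gethostbyaddr):
    """Return the hostname for ip, or None if lookups are disabled or fail.

    `enabled` and `resolver` are hypothetical knobs: the first makes the
    lookup optional, the second lets callers swap in a cache or a stub.
    """
    if not enabled:
        return None
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return None
    return hostname


def provider_from_hostname(hostname):
    """Assumed heuristic: use the second-to-last label as the provider."""
    if not hostname:
        return "unknown"
    labels = hostname.rstrip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]
```

With a hostname such as `crawl-66-249-71-35.googlebot.com` (an illustrative value, not one taken from the logs in this thread), `provider_from_hostname` yields `googlebot`. Note that the sketch only makes the lookup skippable; it does not bound how long a slow resolver may take.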

@robocoder

prasi: do you have a log line showing Googlebot fetching piwik.php?

matt: agreed, a similar latency issue arises with the honeypot suggestion in #653.

@anonymous-piwik-user

66.249.71.210 - - +0200 "GET /robots.txt HTTP/1.1" 200 21 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.210 - - +0200 "GET /piwik.php?idsite=1&url=http%3A%2F%2Fblog.prasi.at%2F&res=1024x1024&h=3&m=51&s=21&cookie=1&urlref=&rand=0.278324234&pdf=0&qt=0&realp=0&wma=0&dir=0&fla=0&java=0&gears=0&ag=0&action_name= HTTP/1.1" 200 43 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
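
The query string of the piwik.php request above can be decoded to see which values the bot sent, for example the `res=1024x1024` that shows up as the resolution in the report. This is a small Python sketch using only the standard library; the helper name `tracking_params` is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Request path copied from the access log line above.
REQUEST_PATH = (
    "/piwik.php?idsite=1&url=http%3A%2F%2Fblog.prasi.at%2F&res=1024x1024"
    "&h=3&m=51&s=21&cookie=1&urlref=&rand=0.278324234&pdf=0&qt=0&realp=0"
    "&wma=0&dir=0&fla=0&java=0&gears=0&ag=0&action_name="
)


def tracking_params(path):
    """Return the query parameters of a request path as a flat dict."""
    query = urlsplit(path).query
    # keep_blank_values=True keeps empty parameters such as urlref=.
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


params = tracking_params(REQUEST_PATH)
print(params["res"], params["url"])
```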

@robocoder

In [1470], fixes #918 and #958 - Filter out Googlebot and Bing bot
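
For readers who want to see the general idea, here is a hedged Python sketch of filtering by User-Agent. It is not the change made in [1470]; the token list and the name `is_known_bot` are assumptions chosen for illustration, and matching on the User-Agent alone will miss bots that present a browser-like string.

```python
# Assumed, illustrative tokens; not taken from Piwik's source.
BOT_TOKENS = ("googlebot", "msnbot", "bingbot")


def is_known_bot(user_agent):
    """Return True if the User-Agent contains one of the assumed bot tokens."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)
```

Applied to the User-Agent strings in the logs above, the Googlebot string matches and the WordPress cron string does not, so only the former would be dropped before a visit is recorded.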

@anonymous-piwik-user anonymous-piwik-user added this to the Piwik 0.4.4 milestone
This issue was closed.