Exclude all requests from all existing Google bot IP ranges #6503

Closed
abald124 opened this issue Oct 23, 2014 · 5 comments
Assignees
Labels
Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.

Comments

@abald124

Re: http://forum.piwik.org/read.php?15,120954

The 8 IPs we saw last month were as follows:

66.249.83.84
64.233.172.84
66.249.84.4
66.249.88.89
66.102.7.4
66.249.82.4
66.249.85.89
173.255.112.208

The first seven were Google Proxy; the last one was Google User Content.

@mattab mattab added the Task label Oct 23, 2014
@mattab mattab added this to the Piwik 2.9.0 milestone Oct 23, 2014
@mattab
Member

mattab commented Oct 23, 2014

Thanks for the report. Could you also by any chance find out the User-Agent for those requests?

@mattab mattab modified the milestones: Piwik 2.9.0, Piwik 2.10.0 Oct 23, 2014
@mattab
Member

mattab commented Dec 3, 2014

Hi @abald124, to be sure that these are from Google, could you paste one access.log line for each IP above? I'm especially interested in the User-Agent. Thanks!

@mattab mattab modified the milestones: Piwik 2.10.0 , Short term Dec 5, 2014
@abald124
Author

Here are a few examples:

66.249.83.204 - - [30/Nov/2014:06:37:48 -0600] "GET /product-category/bridal_collection/engagement_rings/ HTTP/1.1" 200 36220 "http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CFQQFjAD&url=http%3A%2F%2Ftenniesjewelry.com%2Fproduct-category%2Fbridal_collection%2Fengagement_rings%2F&ei=nA97VJWiJaz8nQfvWw&usg=AFQjCNElLa77wDUS03ZHB4RTiXe-LJkGSw" "Mozilla/5.0 (Linux; U; Android 2.3.7; en-us; ZTE V768 Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"

66.249.88.216 - - [30/Nov/2014:20:45:37 -0600] "GET / HTTP/1.1" 200 9529 "http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDwQFjAA&url=http%3A%2F%2Ftenniesjewelry.com%2F&ei=FdZ7VMbLJKqh2AXupwE&usg=AFQjCNGm0am5WMUh4CdTvsFqBaQPdODcoQ" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727)"

64.233.172.200 - - [01/Dec/2014:18:43:34 -0600] "GET /wp-content/themes/tennies/images/logo.jpg HTTP/1.1" 200 6487 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"
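
For anyone who wants to pull the IP and User-Agent out of lines like the ones above, here is a minimal sketch. It assumes the server writes the common "combined" access log format; the `COMBINED_LOG` pattern and the `parse_line` helper are illustrative names, not part of Piwik.

```python
import re

# Assumed layout ("combined" log format): IP, identity, user, [time],
# "request", status, size, "referer", "user-agent".
COMBINED_LOG = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of named fields for one log line, or None if it does not match."""
    match = COMBINED_LOG.match(line.strip())
    return match.groupdict() if match else None


# Third example line from this comment, split only for readability.
sample = (
    '64.233.172.200 - - [01/Dec/2014:18:43:34 -0600] '
    '"GET /wp-content/themes/tennies/images/logo.jpg HTTP/1.1" 200 6487 "-" '
    '"Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 '
    '(via ggpht.com GoogleImageProxy)"'
)

fields = parse_line(sample)
print(fields["ip"], "|", fields["user_agent"])
```

Note that only the third sample carries a Google-specific marker (`GoogleImageProxy`) in its User-Agent; the first two report ordinary browser strings, so the User-Agent alone would not identify them.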

@mattab
Member

mattab commented Dec 18, 2014

Checking manually, I found that 66.249.64.* through 66.249.95.* belong to Google.

Then I found this article, which lists all known IP ranges:

The following IP address ranges belong to Google:
    64.233.160.0 - 64.233.191.255
    66.102.0.0 - 66.102.15.255
    66.249.64.0 - 66.249.95.255
    72.14.192.0 - 72.14.255.255
    74.125.0.0 - 74.125.255.255
    209.85.128.0 - 209.85.255.255
    216.239.32.0 - 216.239.63.255

So we could just add those ranges as Googlebot, nice!

@mattab
Member

mattab commented Dec 18, 2014

These correspond to CIDR notations:

64.233.160.0/19
66.102.0.0/20
66.249.64.0/19
72.14.192.0/18
etc.
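
The range-to-CIDR correspondence can be checked mechanically. The sketch below uses Python's standard `ipaddress` module on the seven ranges quoted from the article; `RANGES` and `range_to_cidrs` are illustrative names, and this is not the code that went into Piwik.

```python
import ipaddress

# First/last address pairs as quoted from the article.
RANGES = [
    ("64.233.160.0", "64.233.191.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("66.249.64.0", "66.249.95.255"),
    ("72.14.192.0", "72.14.255.255"),
    ("74.125.0.0", "74.125.255.255"),
    ("209.85.128.0", "209.85.255.255"),
    ("216.239.32.0", "216.239.63.255"),
]


def range_to_cidrs(first, last):
    """Return the minimal list of CIDR blocks that exactly covers first..last."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(network) for network in networks]


for first, last in RANGES:
    print(f"{first} - {last} -> {', '.join(range_to_cidrs(first, last))}")
```

Each of the seven ranges collapses to a single block; the three not written out above come to 74.125.0.0/16, 209.85.128.0/17 and 216.239.32.0/19.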

Then I searched for these CIDR blocks and found them nicely listed in this Gmail help guide:

ip4:216.239.32.0/19
ip4:64.233.160.0/19
ip4:66.249.80.0/20
ip4:72.14.192.0/18
ip4:209.85.128.0/17
ip4:66.102.0.0/20
ip4:74.125.0.0/16
ip4:64.18.0.0/20
ip4:207.126.144.0/20
ip4:173.194.0.0/16

Great, the list of IPs directly from the source!
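
As a sanity check, the eight IPs from the original report can be tested against the blocks listed above. This is a minimal sketch using Python's standard `ipaddress` module; `GMAIL_SPF_BLOCKS`, `REPORTED_IPS` and `matching_block` are illustrative names, and the actual change in f79acd1 may work differently.

```python
import ipaddress

# CIDR blocks copied from the Gmail help guide list above.
GMAIL_SPF_BLOCKS = [
    "216.239.32.0/19", "64.233.160.0/19", "66.249.80.0/20", "72.14.192.0/18",
    "209.85.128.0/17", "66.102.0.0/20", "74.125.0.0/16", "64.18.0.0/20",
    "207.126.144.0/20", "173.194.0.0/16",
]
NETWORKS = [ipaddress.ip_network(block) for block in GMAIL_SPF_BLOCKS]

# The eight IPs from the opening comment of this issue.
REPORTED_IPS = [
    "66.249.83.84", "64.233.172.84", "66.249.84.4", "66.249.88.89",
    "66.102.7.4", "66.249.82.4", "66.249.85.89", "173.255.112.208",
]


def matching_block(ip):
    """Return the first listed block containing ip, or None if none does."""
    address = ipaddress.ip_address(ip)
    for network in NETWORKS:
        if address in network:
            return str(network)
    return None


for ip in REPORTED_IPS:
    print(f"{ip:16} {matching_block(ip) or 'not in list'}")
```

With these blocks, the first seven reported IPs match and 173.255.112.208 (the Google User Content address) does not, so it would need separate handling. The Gmail list also carries 66.249.80.0/20 rather than the wider 66.249.64.0/19, which leaves 66.249.64.0 through 66.249.79.255 uncovered.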

@mattab mattab self-assigned this Dec 18, 2014
@mattab mattab closed this as completed in f79acd1 Dec 18, 2014
@mattab mattab changed the title from "Possible Bots" to "Exclude all requests from all existing Google bot IP ranges" Dec 18, 2014
@innocraft-automation innocraft-automation removed this from the Backlog (Help wanted) milestone Sep 13, 2024
3 participants