-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lookup fails #3
Comments
I've been investigating the issue myself, and this is most definitely caused by the overlap of IP address ranges, using different netmasks. I've pinpointed the issue to segment_tree/lib/segment_tree.rb Line 87 in e050886
When the IP ranges are sorted, it just happens so that on the first run the the and But because the enquired IP address is further two places to the left at index 50 The binary search gives up, as it expects 69.171.251.1 to be after the current left, while in fact it is not. With the current implementation of SegmentTree using a sorted array and binary search, I don't see how this can be solved in here. Perhaps the list of IPs can be broken up beforehand into the least-common-denominator netmask and passed to SegmentTree as a list of same-sized netmask ranges. |
Hi, thanks for reporting and investigating the bug! So it seems that binary search implementation is fundamentally incorrect and the library requires an overhaul. I'm not sure if I'll have enough time to do that in the nearest future. In the meantime, can you try an interval tree for searching the IP-address? https://github.com/misshie/interval-tree seems to fit your needs. I tested it on your dataset and it seems to work fine: subnets = ["69.63.176.0/20", "66.220.144.0/20", "66.220.144.0/21", "69.63.184.0/21", "69.63.176.0/21", "74.119.76.0/22", "69.171.255.0/24", "173.252.64.0/18", "69.171.224.0/19", "69.171.224.0/20", "103.4.96.0/22", "69.63.176.0/24", "173.252.64.0/19", "173.252.70.0/24", "31.13.64.0/18", "31.13.24.0/21", "66.220.152.0/21", "66.220.159.0/24", "69.171.239.0/24", "69.171.240.0/20", "31.13.64.0/19", "31.13.64.0/24", "31.13.65.0/24", "31.13.67.0/24", "31.13.68.0/24", "31.13.69.0/24", "31.13.70.0/24", "31.13.71.0/24", "31.13.72.0/24", "31.13.73.0/24", "31.13.74.0/24", "31.13.75.0/24", "31.13.76.0/24", "31.13.77.0/24", "31.13.96.0/19", "31.13.66.0/24", "173.252.96.0/19", "69.63.178.0/24", "31.13.78.0/24", "31.13.79.0/24", "31.13.80.0/24", "31.13.82.0/24", "31.13.83.0/24", "31.13.84.0/24", "31.13.85.0/24", "31.13.86.0/24", "31.13.87.0/24", "31.13.88.0/24", "31.13.89.0/24", "31.13.90.0/24", "31.13.91.0/24", "31.13.92.0/24", "31.13.93.0/24", "31.13.94.0/24", "31.13.95.0/24", "69.171.253.0/24", "69.63.186.0/24", "31.13.81.0/24", "179.60.192.0/22", "179.60.192.0/24", "179.60.193.0/24", "179.60.194.0/24", "179.60.195.0/24", "185.60.216.0/22", "45.64.40.0/22", "185.60.216.0/24", "185.60.217.0/24", "185.60.218.0/24", "185.60.219.0/24", "129.134.0.0/16", "157.240.0.0/16", "157.240.8.0/24", "157.240.0.0/24", "157.240.1.0/24", "157.240.2.0/24", "157.240.3.0/24", "157.240.4.0/24", "157.240.5.0/24", "157.240.6.0/24", "157.240.7.0/24", "157.240.9.0/24", "157.240.10.0/24", "157.240.16.0/24", "157.240.19.0/24", "157.240.11.0/24", "157.240.12.0/24", "157.240.13.0/24", "157.240.14.0/24", "157.240.15.0/24", "157.240.17.0/24", "157.240.18.0/24", "157.240.20.0/24", "157.240.21.0/24", "157.240.22.0/24", "157.240.23.0/24", "157.240.0.0/17", "69.171.250.0/24", "157.240.24.0/24", "157.240.25.0/24", "199.201.64.0/24", "199.201.65.0/24", "199.201.64.0/22", "204.15.20.0/22", "157.240.192.0/24", "129.134.0.0/17", "157.240.198.0/24"]
ranges = subnets.map do |ip|
range = ip.to_range
(range.begin.to_i..range.end.to_i) # interval-tree can't work with IPAddr, hence converting to integer
end
tree = IntervalTree::Tree.new(ranges)
ip_to_search = IPAddr.new('69.171.251.1')
found = tree.search(ip_to_search.to_i).size > 1 |
Thank you for the suggestion! |
Hi,
I am an author of Legitbot, it's a library to detect whether a request comes from a valid Web crawler or from some kind of impersonator. I am using "segment_tree" to quickly lookup IPs in a list of Facebook ranges. And we have a problem when looking for a valid IP and getting negative result back: alaz/legitbot#10
I tried to come up with a smaller test case, but frankly it does not look too small. Here I have the full list of Facebook IP ranges, then I'm building a segment tree and the lookup fails:
But If I select only some part of ranges from the same list, then lookup in the resulting segment tree succeeds:
The most evident observation is that some ranges overlap and the list is not sorted. I shall appreciate your help on this issue.
Regards,
Alexander.
The text was updated successfully, but these errors were encountered: