Ensure that IP is anonymized before the heuristic is applied #2902

mattab opened this Issue Feb 4, 2012 · 6 comments

3 participants

Piwik Open Source Analytics member

AS per ULD recommendations we need to

  • ensure that the IP is anonymized before hashing/heuristics.
  • it should still be anonymized AFTER we tested for IP exclusion or we would break this feature
  • While we are at it, should we disable "Provider" plugin, when anonymization is enabled? because, if we do DNS reverse lookup on the anonymized IP, it won't work. If we do DNS reverse lookup on the IP, well this is not respecting user Privacy. So, I suggest we do not do the DNS reverse lookup in this case. (maybe this is already the case I haven't checked)

See Privacy Analytics


I'm opposed to hardcoded logic for mutual exclusion of activated plugins. Users can/should decide for themselves what level of compliance they'll have.

Otherwise we're looking at disablng Provider, UserCountry, and Geolocation plugins, by default.

Piwik Open Source Analytics member

I suggested skipping Provider reverse lookup only when IP is anonymized.

This is based on the belief (not tested!) that doing the reverse lookup on a wrong IP will maybe add performance overhead, and we don't want to add any slowness in the tracker.

Otherwise we're looking at disablng Provider, UserCountry, and Geolocation plugins, by default.

Will GeoIP work "partially" if given an anonymized address? If it will not work, then I think it makes sense to disable them, but only once we confirm that the Geoip lookup will fail (again for performance reasons).

But, we would always leave them enabled by default, since IP anonymization is disabled by default.


(In [5772]) Refs #2233, #2095, #2902 - set ip_address_mask_length and ip_address_pre_mask_length on anonymizeIP-plugin activation. Synchronize both variables on PrivacyManager call.


At least the User should be warned, which Plugins might not work with 100% accuracy, if the anonymize-IP-Plugin is enabled. If he disables all of these plugins, could be decided by the user himself. Other thoughts?

Piwik Open Source Analytics member

(In [5775]) Refs #2902

  • Provider plugin does not do dns reverse lookup when ip was anonimized since it does not work and is slow to fail
  • Changed recommended "IP Anonymization" from 1 byte to 2 bytes as per recos in http://piwik.org/privacy/
  • Now IP is anonmized only after IP Exclude was tested. All code should use the anonymized IP. If there is a need later to access the non original raw IP, we can add this in the code, but for now there is no such use case

Refs #1823

  • For Geoip it means that the plugin will only have access to the anonymized IP.
    • It would be nice if we could still guess the country at least from the anonymized IP. But I suppose that the IP ranges are not evenly distributed between countries and several countries would share the same IP ranges (with the last 2 bytes removed)...
    • If guessing country or region, is not possible from the anonymized IP, it is an acceptable tradeoff that GeoIP does not work when IP anonymization is enabled, since it's the condition to respect visitors privacy. Once confirmed, we could document this in the Anonymize IP UI tooltip.
Piwik Open Source Analytics member

(In [5776]) Fixes #2902

  • removed ip_address_pre_mask_length setting -- now there is only one setting which anonymises after testing for IP exclusion
    • Reverted the "synchronize" from [5772] since not needed anymore
  • Added integration test to test that IP is indeed anonymized
  • Added integration test to test that IP is anonymized AFTER testing for ip exclusion
@mattab mattab added this to the 1.7 Piwik 1.7 milestone Jul 8, 2014
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment