[GDPR] Ip Anonymisation #322

Open
rolebi opened this issue Mar 30, 2020 · 22 comments
Labels
enhancement New feature or request privacy product

Comments

@rolebi

rolebi commented Mar 30, 2020

Hi,

Is there a way to anonymise the IP address that is stored with each log? This is required for compliance with privacy regulations in Europe (GDPR/ePrivacy).

The full IP shouldn't be available at all in the source data and shouldn't be used for GeoIP resolution. Only the "pseudo-anonymized" IP can be stored and used for GeoIP resolution.

For lack of an official source, I refer you to the GA documentation: https://support.google.com/analytics/answer/2763052?hl=en

This feature is designed to help site owners comply with their own privacy policies or, in some countries, recommendations from local data protection authorities, which may prevent the storage of full IP address information.

The IP anonymization feature in Analytics sets the last octet of IPv4 user IP addresses and the last 80 bits of IPv6 addresses to zeros in memory shortly after being sent to the Analytics Collection Network. The full IP address is never written to disk in this case.
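
To make the masking rule quoted above concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `anonymize_ip` is hypothetical and only illustrates the rule (zero the last octet for IPv4, the last 80 bits for IPv6); it is not how Google Analytics or Datadog implement it.

```python
import ipaddress


def anonymize_ip(raw: str) -> str:
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    addr = ipaddress.ip_address(raw)
    # Keep the first 24 bits for IPv4 and the first 48 bits (128 - 80) for IPv6.
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us pass a host address; the host bits are masked to zero.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

The masking has to happen before the address is stored or used for GeoIP resolution, otherwise the full IP still exists in the source data.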

@hdelaby
Contributor

hdelaby commented Mar 30, 2020

Hi @rolebi

This feature is coming very soon. The choice between all these options will be available in Datadog's interface.

@rolebi
Author

rolebi commented Mar 30, 2020

Great : )

@bcaudan bcaudan added the enhancement New feature or request label Apr 2, 2020
@bcaudan bcaudan changed the title from "[Logs] [GDPR] Ip Anonymisation" to "[GDPR] Ip Anonymisation" Jun 18, 2020
@hereismass

hereismass commented Jul 3, 2020

Very interested in this as well. We would love to use RUM and Logs, but because we can't anonymise IPs to be GDPR compliant, we are not using them.
Great that you are working on it :)

@chrys-unito

Is there any news on this? We're also looking to anonymize network details, or to have an option to remove them altogether.

@tchock

tchock commented Nov 30, 2020

We would also love to have this feature.
@hdelaby is there any update regarding this?

What we did to make it work right now is:

  • We cloned the browser logs pipeline (to be able to manipulate it)
  • We disabled the geoip processor
  • We added a new string builder processor on the network.client.ip attribute path and replaced its value with [removed]

Hope this will be helpful for the others.
But we would need a proper solution for this. This is basically making this feature not usable for companies in the EU.

@hdelaby
Contributor

hdelaby commented Dec 1, 2020

Hi @tchock thanks a lot for raising this. Apologies for the delay! Here is the situation for now:

RUM
It is possible to keep the geoip data (country, city, etc) while getting rid of the IP address. A configuration option will be available in a settings page in the UI in Q1. For now, these requests will need to go through support@datadoghq.com.

Browser Logs
The workaround suggested above is the right one if you want to remove all geoip information and will be documented appropriately. We will also document an alternate version of this workaround in order to keep all geoip information without storing the IP address. Our support will also be able to help configure it.

The vast majority of users actually need the IP address and geoIP data, which is why they are enabled by default. On logs specifically, we are constrained by how integration pipelines work: there's no simpler way to customize them. Once again, thank you for your patience here. I will reply with the appropriate documentation links once they're live.

@henningms

Any updates on this? :D

@willhowlett

+1 for updates. There's mention of the workaround above being documented. Did this ever happen? Many thanks

@omaratpxt

any news ?

@alexander-schneider

Update?

@AdelUnito

AdelUnito commented Sep 29, 2022

> This feature is coming very soon. The choice between all these options will be available in Datadog's interface.

Hi, @hdelaby Any updates on the feature?

@bcaudan
Contributor

bcaudan commented Sep 30, 2022

Hello,

The situation is still the same: to remove IP addresses from RUM data, you need to go through support@datadoghq.com.
We still want to build something in the UI and have planned work around that but no ETA to share yet.

We'll let you know here if we have any update on the topic.

@johnkors

@bcaudan , we don't use RUM, but we still want to avoid logging IP/geo in the regular browser intake. We've contacted support, and they've only linked us to beforeSend etc, which ofc does not work. AFAIK, the network part is added to the logs not in the SDK here, but on the ingestion level (or similar, outside of our control).

Any recommendation? Is that something support is able to solve similar to for RUM?

@bcaudan
Contributor

bcaudan commented Oct 24, 2022

@johnkors for browser logs, did you try the workaround mentioned above?

@johnkors

@bcaudan Not sure how that would work for browser logs. We're never sending anything related to network. It's appended at datadog servers.

@henningms

> @bcaudan , we don't use RUM, but we still want to avoid logging IP/geo in the regular browser intake. We've contacted support, and they've only linked us to beforeSend etc, which ofc does not work. AFAIK, the network part is added to the logs not in the SDK here, but on the ingestion level (or similar, outside of our control).
>
> Any recommendation? Is that something support is able to solve similar to for RUM?

One workaround is to set up a simple HTTP proxy between your clients and Datadog. This ensures that the IP/Geo information is never forwarded from the clients to Datadog, regardless of whether Datadog would store it or not (it would still show up in your own access logs etc).
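
As a rough illustration of the proxy idea, the sketch below shows only the header-filtering step such a proxy could apply before forwarding a request. The function name `headers_to_forward` and the header list are hypothetical and not exhaustive. It assumes the intake could otherwise pick up the original client address from forwarding headers; without them it would only see the proxy's own address as the connection source.

```python
from typing import Dict

# Headers that commonly carry the original client address.
# Illustrative list only; adjust it to whatever your infrastructure adds.
CLIENT_IP_HEADERS = {
    "x-forwarded-for",
    "x-real-ip",
    "forwarded",
    "true-client-ip",
}


def headers_to_forward(incoming: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the request headers without any client-IP headers."""
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() not in CLIENT_IP_HEADERS
    }


print(headers_to_forward({
    "Content-Type": "text/plain;charset=UTF-8",
    "X-Forwarded-For": "203.0.113.77",
}))
```

The proxy must also avoid adding such a header itself, and its own access logs need the same care since they still see the real client address.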

@johnkors

@henningms Hi ;) Yeah, that's our last resort.

@henningms

henningms commented Oct 24, 2022

> @henningms Hi ;) Yeah, that's our last resort.

Hi! 😂

It's quickly becoming my default in the projects 😅 Allows us to control what is sent and eases the minds of the legal/GDPR team

@bcaudan
Contributor

bcaudan commented Oct 24, 2022

> @johnkors for browser logs, did you try the workaround mentioned above?
>
> @bcaudan Not sure how that would work for browser logs. We're never sending anything related to network. It's appended at datadog servers.

The workaround mentioned above allows you to customize what is done by Datadog servers.

@johnkors

> The workaround mentioned above allows you to customize what is done by Datadog servers.

Sorry, I misread "cloning" as a code change in this repo (as in a fork). My fault. I'll try out the pipeline mods. Thanks.

@JacquesDoubell

Any update on this request? Seems like a feature that many would find useful. The workaround mentioned above might not be viable for everyone.

@bcaudan
Contributor

bcaudan commented Jul 25, 2023

Hello,

Here is the current state:

RUM
You can choose whether or not you want to include IP or geolocation data from the Datadog UI, more details in the doc.

Logs
You can remove geolocation data by:

  • cloning the browser logs pipeline (to be able to manipulate it)
  • disabling the geoip processor

You can anonymise the IP by:

  • creating a new pipeline after the browser logs pipeline
  • adding a string builder processor to replace network.client.ip attribute value with [removed]
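
For readers who prefer to script the IP anonymisation steps rather than click through the UI, the sketch below builds what such a pipeline definition might look like as a JSON payload. The function name `ip_removal_pipeline` is hypothetical, and the field names (`string-builder-processor`, `template`, `target`, `is_replace_missing`) and the `source:browser` filter are assumptions based on my recollection of the Datadog Logs Pipelines API. Check them against the API reference before use. The code only builds and prints the payload; it does not call Datadog.

```python
import json


def ip_removal_pipeline(name: str = "Anonymise client IP") -> dict:
    """Build a pipeline definition that overwrites network.client.ip."""
    return {
        "name": name,
        "is_enabled": True,
        # Assumption: browser logs are tagged with source:browser.
        "filter": {"query": "source:browser"},
        "processors": [
            {
                # Assumed processor type and field names; verify against the docs.
                "type": "string-builder-processor",
                "name": "Replace client IP",
                "is_enabled": True,
                "template": "[removed]",
                "target": "network.client.ip",
                "is_replace_missing": False,
            }
        ],
    }


print(json.dumps(ip_removal_pipeline(), indent=2))
```

As described above, this pipeline has to run after the browser logs pipeline so that the attribute already exists when it is overwritten.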
