Telemetry should be disabled by default at the system level #15527

Open
coffee-squirrel opened this issue May 17, 2023 · 15 comments

@coffee-squirrel

coffee-squirrel commented May 17, 2023

Expected Behavior

Telemetry should be disabled by default (i.e. opt-in) at a system level.

Current Behavior

Currently (i.e. in the 5.1.0 release), telemetry_enabled defaults to true. Unless an admin changes that property to false, telemetry must be actively disabled on a per-user basis.

The in-app message (below) does not clearly indicate this data collection defaults to "on".

We would like to collect anonymous usage data to help us prioritize improvements and make Graylog better in the future.
We do not collect personal data, sensitive information, or content such as logs in your instances.
Learn more on our Privacy Policy.
You can turn data collection off or on any time in the user profile.

Knowing what I was looking for, I searched the documentation and found it mentioned on the Planning Your Deployment > Frequently Asked Questions page, which is not the most intuitive spot for something like this. The associated 5.1.0 changelog item is very short.

Possible Solutions

  • telemetry_enabled (i.e. data collection) defaults to false
  • telemetry_enabled is moved into a global runtime setting (defaulting to disabled), and admins have the ability to turn it on or off (perhaps initially with a one-time prompt)
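
The two proposals above can be combined into a single rule: data is sent only when the system-wide switch is on and the individual user has explicitly agreed. A minimal Python sketch of that rule follows. TelemetrySettings and telemetry_active are hypothetical names used for illustration and are not part of Graylog's code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TelemetrySettings:
    """Hypothetical model of the proposed opt-in behaviour."""

    # System-wide switch set by an admin; False models the proposed default.
    system_enabled: bool = False
    # Per-user choice; None means the user has never been asked.
    user_opt_in: Optional[bool] = None


def telemetry_active(settings: TelemetrySettings) -> bool:
    """Return True only if the admin enabled telemetry AND the user agreed."""
    return settings.system_enabled and settings.user_opt_in is True


if __name__ == "__main__":
    # Fresh install: nothing is sent until both levels say yes.
    print(telemetry_active(TelemetrySettings()))
    print(telemetry_active(TelemetrySettings(system_enabled=True, user_opt_in=True)))
```

With these defaults, an upgrade that nobody has reviewed sends nothing, which is the behaviour requested in this issue.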

Context

Upgraded an environment to 5.1.0, saw the in-app message, checked the user profile option, and saw it was enabled. I then tracked back the PRs involved to find the telemetry_enabled property (and later found the page mentioned above).

Support ticket #1628167428.

Your Environment

  • Graylog Version: 5.1.0
@defnull

defnull commented Jul 25, 2023

The telemetry feature may actually violate GDPR in most European countries. It sends very detailed cluster, device and usage data, including identifying personal information (client IP and long-lived tracking cookies), to servers in an unknown jurisdiction (telemetry.graylog.cloud sits behind CloudFront, and Graylog's headquarters are outside the EU), and it does so before the user has a chance to opt out. GDPR is pretty clear that stuff like this has to be strictly opt-in.

To add insult to injury, the Privacy Policy link does not actually link to a privacy policy, and the graylog.com Privacy Policy is not applicable to data collection performed by third-party or cloud-hosted Graylog nodes.

Data sent without consent:

  • Client info: IP, OS, browser and version, language, screen and window resolution, user role count, user team count, user admin status. The tracking cookie sticks for a full year.
  • Cluster info: Graylog version, OS details (name, version, arch), creation date, installed plugins, node count, user count, license count, processor count, Java heap size, traffic per month, search cluster type and node count.
  • Usage data: Visited pages or performed actions including full URL (which may be internal, or reveal company association).

Even if you do not care about personal privacy, the leaked information may be enough to count as a security incident in some tightly controlled environments.

The config parameter to disable telemetry server-wide (or the fact that admins need to take action if they do not want all users to be tracked by default) was not mentioned in the official changelog or update guides. This oversight (or deliberate decision) is very concerning for a product that is supposed to handle potentially sensitive data.

@benedikt-wegmann

🌶

@StefanTheGerman

StefanTheGerman commented Sep 18, 2023

Customer HS#1914022806 is also asking for a way to disable it by default, or for a switch that works at a global level.

@melvin-suter

> The telemetry feature may actually violate GDPR in most European countries. It sends very detailed cluster, device and usage data, including identifying personal information (client IP and long-lived tracking cookies), to servers in an unknown jurisdiction (telemetry.graylog.cloud sits behind CloudFront, and Graylog's headquarters are outside the EU), and it does so before the user has a chance to opt out. GDPR is pretty clear that stuff like this has to be strictly opt-in.
>
> To add insult to injury, the Privacy Policy link does not actually link to a privacy policy, and the graylog.com Privacy Policy is not applicable to data collection performed by third-party or cloud-hosted Graylog nodes.
>
> Data sent without consent:
>
>   • Client info: IP, OS, browser and version, language, screen and window resolution, user role count, user team count, user admin status. The tracking cookie sticks for a full year.
>   • Cluster info: Graylog version, OS details (name, version, arch), creation date, installed plugins, node count, user count, license count, processor count, Java heap size, traffic per month, search cluster type and node count.
>   • Usage data: Visited pages or performed actions including full URL (which may be internal, or reveal company association).
>
> Even if you do not care about personal privacy, the leaked information may be enough to count as a security incident in some tightly controlled environments.
>
> The config parameter to disable telemetry server-wide (or the fact that admins need to take action if they do not want all users to be tracked by default) was not mentioned in the official changelog or update guides. This oversight (or deliberate decision) is very concerning for a product that is supposed to handle potentially sensitive data.

This is a violation of privacy laws. EVERY collection of data must be opt-in. If this isn't gonna be fixed soon, I will have to stop recommending Graylog, and a lot of customers in Europe will have to stop using it.

@user29835461

user29835461 commented Jan 8, 2024

The data gathered seems to include at least the username. Since PII is defined as anything that can be used to identify a person, and usernames and email addresses definitely are PII, this data processing is governed in the European Union by the GDPR.

A user's consent should be a voluntary, individualized, informed, and unambiguous expression of will. The so-called "click-through" UI element that Graylog uses to ask for permission does not meet these criteria, because it doesn't properly inform the user how the PII will be used and, as far as I can see, doesn't allow declining.

Enforcing GDPR should be fairly easy because Graylog also operates in Germany. The maximum fine is 20 million euros or 4% of annual worldwide turnover, whichever is higher.

@defnull

defnull commented Jan 8, 2024

This is not a technical issue, but a management decision and there are 1.5k open issues in this repository (plus an unknown number of private issues in the customer issue tracker). Unless someone escalates this issue, there will probably be no official response here. I would not hold my breath.

@kroepke
Member

kroepke commented Jan 9, 2024

> The data gathered seems to include at least username. Since PII is defined by anything that can be used to identify a person, and username/email addresses definitely are PII, this data processing is governed in the European Union by the GDPR.

Hi @user29835461,

I'd like to address at least this part, because the data collected does not include user names or email addresses. In fact, we make an effort not to send that kind of data.
We also do not store IP addresses; we remove them before storage, after looking up city-level location information.

If you can point us to where overly detailed information about an actor is collected, please let us know so we can address it, because that would be unintentional.

Thank you!

@defnull

defnull commented Jan 9, 2024

@kroepke It does not really matter if usernames are transferred (they are not, AFAIK) because IP addresses already count as PII, tracking cookies require consent and third party data processing (PostHog Cloud) needs to be disclosed. But thanks for confirming that Graylog Inc knows about this issue.

@user29835461

It doesn't matter if the system throws the data away. PII has already been processed at that point. There is also the liability issue that arises from data leakage to third parties, especially as the cookie with the information is set one level towards the root. Again, what Graylog intends to do with the PII is only part of the issue.

@Ghostbird

I'd like this to be opt-in as well. Personally, when services provide me with an opt-in, I might enable it. If it's opt-out, I will go to great lengths to block it. It's good that Graylog provides an environment level disable. It was pretty hard to find that option though. In fact, I found the code and credentials that submit the telemetry data first, and it occurred to me that if I could not disable it, I could either block that URL, or periodically send poisoned telemetry data.

@melvin-suter

> I'd like this to be opt-in as well. Personally, when services provide me with an opt-in, I might enable it. If it's opt-out, I will go to great lengths to block it. It's good that Graylog provides an environment level disable. It was pretty hard to find that option though. In fact, I found the code and credentials that submit the telemetry data first, and it occurred to me that if I could not disable it, I could either block that URL, or periodically send poisoned telemetry data.

@Ghostbird you got that system wide setting on hand by any chance?

@defnull

defnull commented Apr 17, 2024

# /etc/graylog/server/server.conf 
telemetry_enabled = false

And yes, it is not mentioned in the server.conf documentation and not present in the default or example config files shipped with the distribution. This is a very convenient oversight, or it was deliberately hidden to make it harder to opt out.
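
To audit an existing installation, something like the following Python sketch could report the effective value. It assumes server.conf uses simple key = value lines with # comments, and that a missing key means the 5.1.0 default of true described at the top of this issue. telemetry_enabled_from_conf is a hypothetical helper, not a Graylog tool.

```python
def telemetry_enabled_from_conf(conf_text: str, default: bool = True) -> bool:
    """Return the effective telemetry_enabled value found in server.conf text.

    Assumption: plain "key = value" lines, "#" starts a comment line, and the
    last occurrence of the key wins. A missing key falls back to `default`.
    """
    value = default
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, separator, raw_value = line.partition("=")
        if separator and key.strip() == "telemetry_enabled":
            value = raw_value.strip().lower() == "true"
    return value


if __name__ == "__main__":
    sample = "# /etc/graylog/server/server.conf\ntelemetry_enabled = false\n"
    print(telemetry_enabled_from_conf(sample))
```

To check a real node, read the file contents (for example from /etc/graylog/server/server.conf) and pass them to the function.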

@Ghostbird

Ghostbird commented Apr 17, 2024

The OP links it (with a comment that it's hard to find). It's a single line below the normal per-user telemetry opt-out.

Summarised:

Graylog administrators may set the related property in the server.conf file to telemetry_enabled = false.

I'm using Docker Compose, so I added:

  environment:
    GRAYLOG_TELEMETRY_ENABLED: "false"

Note: When this option is set, you will still see the telemetry as enabled on the user settings screen. This is confusing, but I'm giving Graylog the benefit of the doubt, and assume they do honour their documented features.

@defnull

defnull commented Apr 17, 2024

> Note: When this option is set, you will still see the telemetry as enabled on the user settings screen. This is confusing, but I'm giving Graylog the benefit of the doubt, and assume they do honour their documented features.

If you can click the checkbox and can change its value, then telemetry is not globally disabled. The checkbox should be in "disabled" state (not clickable) if telemetry was disabled globally.

@Ghostbird

Ghostbird commented Apr 17, 2024

Interesting, that's definitely not the case. Why would the value not be passed properly through docker, even though other options are?

I've manually hacked it into the defaults of the docker volume, and then it works after a restart.

I'll do some checks to see why this is.

Conclusion

I made an error in the environment variable key in the Docker Compose file. It works as documented.

@defnull Thanks for the follow-up, otherwise I would've accidentally left it enabled.
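
The kind of key mistake described above is easy to miss, since the setting simply did not take effect. As a rough guard, and assuming the Docker image's convention of deriving variable names by prefixing the server.conf property with GRAYLOG_ and upper-casing it, a small Python check could flag near misses in a Compose environment mapping. Both function names below are hypothetical.

```python
import difflib
from typing import Dict, List


def graylog_env_key(prop: str) -> str:
    """Map a server.conf property name to its assumed environment variable key."""
    return "GRAYLOG_" + prop.upper()


def suspicious_keys(environment: Dict[str, str], prop: str) -> List[str]:
    """Return keys that look like typos of the expected variable.

    An empty list means either the exact key is present, or nothing in the
    mapping resembles it at all.
    """
    expected = graylog_env_key(prop)
    if expected in environment:
        return []
    # Keys that are close to, but not exactly, the expected name.
    return difflib.get_close_matches(expected, list(environment), n=3, cutoff=0.8)


if __name__ == "__main__":
    env = {"GRAYLOG_TELEMETRY_ENABLE": "false"}  # deliberately mistyped key
    print(graylog_env_key("telemetry_enabled"))
    print(suspicious_keys(env, "telemetry_enabled"))
```

This only catches spelling slips. Whether the server actually honours the setting is still best confirmed in the UI, using the greyed-out checkbox test mentioned earlier in the thread.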
