Unexpected data exposure by default value of "[registry].registry to announce" #2760
Comments
Hi, The hostname sent does not include the domain (cannot be resolved), and the IP address of the host or the browser is not stored by the registry. You can turn any netdata into a registry, so it is pretty simple once you decide to deploy it, to use your own (and most people do). You are right, the wiki page did not document it tracks hostnames (although it is obvious it does it, since the menu shows hostnames). I added it now: https://github.com/firehol/netdata/wiki/mynetdata-menu-item#what-data-the-registry-maintains |
The workaround is fairly simple, as I noted, but just imagine a hostname such as "whitehouse-rhel5.2-webhost-13". You wouldn't need a domain to glean some vital data. ;) |
There's also the argument that hostnames such as that shouldn't exist. If somebody is using such hostnames though, it's very likely that they will get leaked via some other software as well (email software for example). That particular example should be encoded as: '13.webhost.whitehouse.gov'. All the information it contains other than the OS and version (which should never be present in a hostname for a production system) is stuff that should be encoded in conventional DNS, for exactly this reason. |
Well, I clearly understand the issue here. But netdata still requires a registry. Shipping netdata without a working default means that it would be impossible for people to use it and understand what it is and why they need it. netdata already provides all the means for changing hostnames for the registry and setting up your own registry. So, I don't think we should change the default. Of course, I have identified this issue and I think the default registry should be covered by some "license". This is why I have opened issue #1919 to settle this issue. However, this requires some work (accept the terms, etc), and I had planned it for v1.8.0, but due to the number of bugfixes, I decided to release v1.8.0 without it and plan it for v1.9.0. Keep in mind that the global health service, which is planned as the key new feature of v1.9.0, will increase the exposure. Such a license will then be required by all means. |
With the current default configuration Transparent means for me not somewhere in the wiki. Or would you agree every user should read the whole wiki before using I do not think we should change the default. I agree with @ktsaou: without shipping a working default, people will not use it and/or will keep asking how to configure it. But I think we do not want to lose or frustrate users by exposing data they did not want to expose without mentioning it. |
hm... this again will not solve the problem. netdata is installed via many ways and several of them will not allow users to see this info. It is probably better to change the way the registry works. So, before sending any information to the registry, a new call will be made from the browser to the registry, to check if the registry knows the user. This call will return What do you think? |
This is maybe more work, but it will affect every installation and is a better solution! |
@ktsaou I don't understand your suggestion. What is a "user" in the sense you are using here? Why does it matter if the "user" is known rather than just checking the "terms accepted" flag in the browser's local storage? Also, without saving a "user rejects the terms" flag also, you are setting up users to get a UX annoyance. |
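The point about storing both an "accepted" and a "rejected" flag can be sketched as a small three-state check. This is only an illustration of the idea raised in the comment above, not netdata's implementation: the storage key `registry_consent`, the function names, and the use of a plain dict as a stand-in for the browser's local storage are all assumptions made for the example.

```python
from enum import Enum


class Consent(Enum):
    UNSET = "unset"        # the user has never been asked
    ACCEPTED = "accepted"  # the user agreed to the terms
    REJECTED = "rejected"  # the user declined; do not ask again


# Hypothetical storage key; a dict stands in for the browser's local storage.
CONSENT_KEY = "registry_consent"


def load_consent(storage: dict) -> Consent:
    """Read the stored decision, treating missing or unknown values as UNSET."""
    try:
        return Consent(storage.get(CONSENT_KEY))
    except ValueError:
        return Consent.UNSET


def record_choice(storage: dict, accepted: bool) -> None:
    """Persist the answer either way, so a refusal is remembered too."""
    choice = Consent.ACCEPTED if accepted else Consent.REJECTED
    storage[CONSENT_KEY] = choice.value


def next_action(storage: dict) -> str:
    """Decide what the dashboard should do before contacting the registry."""
    consent = load_consent(storage)
    if consent is Consent.ACCEPTED:
        return "send"    # talk to the registry
    if consent is Consent.REJECTED:
        return "skip"    # stay silent and do not prompt again
    return "prompt"      # ask the user once
```

Because a refusal is stored as well, a user who declines is not prompted on every page load, which is the UX annoyance the comment warns about.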
Does anyone know how to disable this feature? Following this advice to change settings in In case it helps, I am building with the docker/makeself toolchain from this repo. This feature should be more clearly advertised and easier to configure (regardless of the technical details or policy re: revealing hostnames). |
@tjohnston01 currently you can't disable the registry. You can set up your own registry though. Have you tried it? |
So for example if I set
The netdata machine will send GETs to itself. "enabled = no" in this case just means that this machine does not act as a registry (but it still sends these GET requests). Correct? It would be good to have a way to disable both parts of this. |
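The two separate roles discussed here (acting as a registry versus announcing one to browsers) can be sketched by rendering a `[registry]` section as text. This is a minimal sketch that uses only the option names mentioned in this thread (`enabled` and `registry to announce`); the helper name `registry_section` is hypothetical, and the exact layout netdata expects in netdata.conf is an assumption.

```python
import configparser
import io


def registry_section(acts_as_registry: bool, announce_url: str) -> str:
    """Render a [registry] section as text (hypothetical helper)."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names exactly as written
    cfg["registry"] = {
        # Whether this node itself serves as a registry.
        "enabled": "yes" if acts_as_registry else "no",
        # Which registry the dashboard tells the browser to contact.
        "registry to announce": announce_url,
    }
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


# The workaround from this thread: announce the local host so the
# browser does not contact the public registry.
print(registry_section(False, "http://localhost:19999"))
```

The sketch makes the point of the comment visible: `enabled` only controls whether the node serves as a registry, while `registry to announce` is a separate value that still directs the browser somewhere.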
I agree that having an option to disable this is a good idea, and I'd argue that it really should be opt-in, not opt-out. Have some pop-up on the dashboard if the config to enable/disable connecting to the registry is unset, to prompt the user to opt in if they want or opt out if they don't, probably with something about links in alerts not working correctly without it and the fact that potentially identifying data gets sent to a central server run by the netdata project if it's enabled. Also, as it currently stands, I'm pretty certain that this is not EU GDPR compliant, and in particular because the hostnames are functionally identifiable data and are stored, it probably needs to be adjusted to be compliant (IANAL though). |
Issue #1919 will turn this to opt-in. Abstract from: https://www.eugdpr.org/gdpr-faqs.html
A URL can hardly be mapped to a person. However, the person cookie the registry uses can be used to identify a person (it tracks the browser the person uses). Once #1919 is implemented, the registry will also have the person's email. The only data processing done by netdata using these data is explained in detail at https://github.com/firehol/netdata/wiki/mynetdata-menu-item. We don't expose these data to 3rd parties and we don't process them in any other way. So, I think we can close this and continue at #1919 |
i just found out today about the whole public registry and what it does hidden inside the ui. and well this is a showstopper for me and a huge loss of trust! so i came here.. i think its partly the wrong discussion here: its not about "a working default" but "a secure default" as netdata IS working even without the need of users to expose data to any thirdparty server. the only way to fix this is to run one node as registry. i've not seen it in the documentation that this is mandatory to prevent data exposure to thirdparties. and a tool you run on every machine should have a secure default setup. i completely do not understand why its not really possible to turn off the need of a registry at all. after all it just shows me some urls of my netdata instances.. ah right, i have bookmarks (and yes we have other tools that orchestrate our infrastructure, and generating a link to each node to connect via http...19999 is trivial). btw: under gdpr even your ip is a personal attribute. and the registry gives personal-identifiers (aka trackingcookie) to visitors without telling them and without consent. its actually hidden in the ui and only visible if you watch traffic. its no good and not trustworthy habit to do so and possibly even not legal. for the love of your otherwise such wonderful tool: please turn this off by default. |
The registry provides a lot more functionality than browser bookmarks. For example, pan or zoom the charts on netdata server A and then click on server B. The panning/zooming is maintained. Mark an area (with alt or control + area select) on a chart on netdata A and then click netdata B. It will be maintained too. Scroll at a mysql server on netdata A and then click netdata B. If B runs a mysql server, it will automatically scroll to it. And many more... The registry is a key attribute in our roadmap. It is the entity that will eventually allow us to provide unified cross server dashboards. So that, no matter how many netdata you have, you will use them all as one integrated application. So, we don't plan to remove it. We actually move towards the opposite direction: enhance it significantly. For example, the next version of the registry may provide OAUTH (for authentication), central health monitoring, cross server custom dashboards editor, storage and sharing, store personal settings for all your netdata, etc. Keep in mind that GDPR is a rulebook for processing personal data. It does not forbid personal data processing. It only enforces a ruleset to do it. Personal data are all those that somehow are associated with a person's identity (everything you can use to identify a person). The current version of the registry does not associate any of the data it maintains with any person's identity. That is, no one can identify the person using a netdata registry cookie (except of course the owner of the cookie). So, the current version of the registry is not related to GDPR. However, the next version will, since it will require from us to somehow login (it will know our emails and this uniquely identifies a person). |
dont get me wrong: but lets collect the facts about your public registry registry.my-netdata.io:
i don't think i need to explain further that this is not good in terms of security. if we dig deeper in that call to your registry we see there, your public registry responds back with:
to sum it up:
as proposed i think good habit would be: do not enable your public registry by default, let user opt-in and opt-out and explain what you do with that data |
I like your sensitivity, but you mix up things. All netdata dashboards have this entry as the last entry of the Read it. It explains everything in detail. None of the information is personal.
To become "personal data", we need a person's identity. Something like:
Only in this case the IP or the cookie is personal information, under GDPR. The registry does not currently have this extra information that personalizes the data. Netdata is distributed. This is the way it works. You can't stop it or change it. A registry is required. The default public registry has been carefully designed to avoid any personal information, it is well documented and in case you are so sensitive, we have given you the option to install your own. Think a bit of it: You publish a site and you add google analytics to it. Google analytics collects a lot more information compared to what the netdata registry collects. Do your users have to opt-in to it? No. Why opt-in is not required in this case? Because it does not collect personal data. The only requirement is to let your users know you use a cookie. This is what the "What is this?" menu entry on all netdata dashboards does. |
i think i raised enough other problems. so please do not only respond/discuss if my ip is under the gdpr a personal data attribute which has to be treated equivalent to my name, email address (or all others here: https://eugdprcompliant.com/personal-data/ ) or not. at the end even to falsify just this point does not matter if you understand and sum up all my other points. no one ever integrates google-analytics into their own private dashboard that possibly holds secret information. thats a rather stupid point and has nothing to do with this discussion here! but you're right who ever thinks its unsafe to have google analytics integrated into his own private dashboard should not use netdata without own registry, as it de facto leaks the same amount and quality of data. besides if you ever add it, you do it by your own decision. but netdata does this without telling you and directly at first load of ui. |
@ktsaou This:
Is not entirely accurate. A registry is only required for certain features. Not everybody needs the my-netdata menu, and nowhere near everyone needs links in the notifications. They may be useful features for how you use it, but the way you use it isn't the only way to use it. I personally never use the my-netdata menu (I've got custom dashboards that scrape-together the info that I need to track across multiple systems into one place, so I don't need quick switching between dashboards), and I also never use the links in the notifications (when I get a notification, I don't go to the Netdata dashboard, I log in to the system in question and see what's wrong, because I usually have a pretty good idea what's going on based on the notifications I have). So, for me personally, the registry is essentially useless right now, I just let the dashboard talk to it because I have no particular reason not to. |
in my opinion a fair workflow would be like this and it would also solve another "problem":
this would solve those issues:
|
@toastbrotch, I agree with you about how bad it is to have the public registry enabled by default—you are stating the case well. There's a problem with allowing a visitor to the Web UI to control the setting, though. Shouldn't that be controlled within the netdata configuration itself, which is on the host, not in the browser? (I.e. I wouldn't assume that a mere visitor should have the rights to make that decision.) So then we're back to—even if the admin who set up netdata did intend to use the public registry, the user of the netdata Web UI may not want to send data there. I think this becomes a question of preferences within the Web UI. Regardless of what registry is or isn't configured on the host, remember that the name of the configuration value is registry to announce. After the registry is announced, it's still up to the browser (the Web UI) to actually send data to that registry or not. I propose that a prompt appear in the Web UI when first visiting a netdata instance using your browser:
|
An update on what has happened on this.
Although issue #1919 (on opt-in to share server hostnames and URLs) was closed, the new registry will require opt-in. See the section "login to enable the registry" in #3990 We will be able to close this issue when #3990 is live. |
Currently the netdata team doesn't have enough capacity to work on this issue. We will be more than glad to accept a pull request with a solution to the problem described here. This issue will be closed after another 60 days of inactivity. |
#3990 is not done yet. This should not be closed. |
It was incorrectly labeled as discussion. We're close to merging #5095, which will resolve this. |
Opt-in via signing in was implemented in #5095 |
I have to add: https://www.theregister.com/2022/01/31/website_fine_google_fonts_gdpr/ Even an IP address is enough personal information, so any connection to a third party needs explicit consent from a user. |
We are anonymizing IPs everywhere where we suspect they may be leaked. The agent hasn't been using GA for a while now, we just have PostHog there, which is an open source project. I haven't seen any IPs there either, though I did just ask them to double check and verify. |
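For readers unfamiliar with what "anonymizing IPs" usually involves, one common approach is to truncate the address so that only the network part is kept. The sketch below shows that general technique only; it is not taken from netdata or PostHog, and the function name and the /24 and /48 prefix lengths are assumptions chosen for illustration.

```python
import ipaddress


def anonymize_ip(addr: str) -> str:
    """Zero the host part of an address (illustrative prefix lengths)."""
    ip = ipaddress.ip_address(addr)
    # Keep the first 24 bits of IPv4 and the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))
print(anonymize_ip("2001:db8:1234:5678::1"))
```

Truncation of this kind reduces how precisely a stored address points at one visitor, but it does not change the fact that the full address is visible to whichever server the browser connects to, which is the objection raised in the following comment.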
It does not matter what you do. If I access example.com, of course my IP address gets leaked to them and the hoster they use, but it should not be leaked without explicit acceptance by the user to any third party (e.g. Google) |
We'll take the risk then. As long as the 3rd party provider (which I repeat isn't Google) is verifiably not storing personal data, we believe we'd win in court. The benefits of a sizeable percentage of anonymous statistics far outweigh the small risk of being ordered to do it in a different way. |
The default value in /opt/netdata/etc/netdata/netdata.conf is:
This is a public registry. What this means is, anytime you navigate with your web browser to the netdata console for a given host, your browser sends information about that host to the public registry. At least its hostname is exposed.
For some organizations hostnames represent sensitive information, so this should be documented better and probably should not be the default.
(A workaround is to set the value to the bogus registry http://localhost:19999, so the web browser will attempt to send the data to the host it resides on, which won't do anything.)