Opt in for error reports and usage statistics #2766
Comments
Well, if it's really anonymous, then it complies with GDPR, because GDPR does not apply to anonymous data. |
related discussion: #2497 |
I read some old issues and history regarding this, and I think the best option would be to file a complaint with the official GDPR bureau in Greece to get this checked, as the creators of the software have refused to fix this for a long time. As stated on the balena website, they have an office in Athens, so any legal action can easily be enforced there. At the very least, my IP is leaked when 'anonymous data' is transmitted (without my consent!), and this is already enough to violate laws. It is really bad to see something like this happen in FOSS software... I now have to choose to go back to |
Please see #2599 (comment) |
Please see more #2599 (comment) |
Hi everyone. I wanted to give an update on where we are with this issue. There are multiple issues raised, so I will address them individually. We should separate the discussion between what's legal and required by GDPR and requests that go beyond what the law requires. Specifically, GDPR requires opt-in consent for personally identifiable data, not for anonymous data collection. It is not our intention, nor is it useful for us, to collect personally identifiable information (see the Purpose section below). So the first question is "Are we collecting personally identifiable information by mistake?" and the second question is "Is making the usage statistics opt-in the best decision for the project?"

Personal data collection

We conducted an extensive audit of all the data we collect from the Etcher application to make sure no personally identifiable data is collected by mistake. Collecting data by mistake might sound strange, but it can easily happen in a desktop application. For example, the mixpanel library will include information about the current system user by default when run in an Electron app. Whenever we became aware of such issues in the past we promptly fixed them. The results of our investigation showed that Etcher will make connections to the following systems:
The large number of unintended connections happened as a side effect of loading content from our balena.io website, which includes these libraries automatically. Furthermore, we audited all the data we collect to make sure none can be characterised as personally identifiable. To do this properly, we are consulting our EU-based lawyers, who can provide an expert opinion on what the GDPR and EU law in general require. It is important to refrain from making legal claims unless one is intimately familiar with the legislation. Unfortunately, there have been a number of legal claims in this and other threads with questionable validity. To make this extremely clear, we are taking the law seriously and are investing time, money, and effort to consult experts in the field to guide us on this matter. We do this because it is the right thing to do. We've done it before (for balenaCloud) and we'll happily do it for all the products we offer. Even though our conversation with our legal team is still ongoing, we have identified a couple of cases where PII is sent to our data collection system. Sentry, our error collection tool, will log a stack trace when Etcher hits a critical error, and that trace can potentially include a filesystem path containing the user's username. The IP address of the event was also logged.

Purpose of data collection

With the legal stuff out of the way, I wanted to touch on the reason we are collecting data, which will hopefully help guide the discussion about whether it should be an opt-in or opt-out feature. For most software engineers, writing an image flashing application sounds easy. After all, at its very core it is a simple block copy operation that we've known how to do for ages. It can't possibly be that complex. However, this is far from the truth! After releasing Etcher for the first time, and as the tool was gaining adoption, we were seeing it run on more and more obscure combinations of systems. 
This produced a (very) long tail of issues that we couldn't have predicted or tested during development. It was through constantly sifting through error reports and measuring success rates across deployed versions that we managed to reach the level of quality that you see today. When we say that usage data helps develop Etcher we're not talking about some abstract possibility. This is very real and has shaped the Etcher we know and love. The list of bugs fixed is endless.

Discussion on making collection opt-in

With the full context fleshed out we can now re-engage in the discussion of making data collection opt-in. As mentioned above, we have to make the decision that is best for the project and somehow balance what users expect from a privacy point of view with what users expect from a robust piece of software. Given the benefits we've already seen, this is not a clear-cut decision. At the same time, the user base of Etcher has grown tremendously and one could argue that most issues have already been seen. Unfortunately I don't have a concrete way forward to offer just yet, but we haven't ruled it out as a possibility. Finally, to steer the discussion in the right direction I will change the title of the issue to just the opt-in discussion. @rradar if you still think there is a legal issue please open a separate ticket clearly explaining the problem. Rest assured that we are working with our legal professionals to ensure we are not breaking the law. |
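To illustrate the stack-trace problem described in the comment above (a logged path that contains the user's name), here is a minimal TypeScript sketch of path scrubbing. It is not Etcher's or Sentry's actual code: the helper name `scrubUserPaths`, the `<user>` placeholder, and the list of path patterns are all assumptions made for illustration.

```typescript
// Hypothetical helper, not taken from Etcher or the Sentry SDK.
// It replaces the user-name segment of common home-directory paths with a
// placeholder so that a stack trace no longer reveals the local account name.
const HOME_PATH_PATTERNS: RegExp[] = [
  // Linux and macOS: /home/<name>/..., /Users/<name>/...
  /(\/(?:home|Users)\/)[^\/\s:]+/g,
  // Windows: C:\Users\<name>\...
  /([A-Za-z]:\\Users\\)[^\\\s:]+/g,
];

function scrubUserPaths(text: string, placeholder: string = "<user>"): string {
  // Apply every pattern in turn, keeping the path prefix (capture group 1)
  // and swapping only the user-name segment.
  // Limitation: a user name containing whitespace is only partly replaced.
  return HOME_PATH_PATTERNS.reduce(
    (result, pattern) => result.replace(pattern, `$1${placeholder}`),
    text,
  );
}

// Example: scrub a stack-trace line before it leaves the machine.
const scrubbedLine = scrubUserPaths(
  "Error: EACCES at /home/alice/.config/etcher/app.js:10:5",
);
console.log(scrubbedLine);
```

A function like this would have to run before any report leaves the machine, for example inside a pre-send hook such as the Sentry JavaScript SDK's `beforeSend` callback. It only addresses the path issue. The IP address is visible to the receiving server regardless of the payload.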
Maybe aim for higher than "not breaking the law"… |
Easy! I want to flash completely privately. #2890
Indeed. It is a clear-cut decision. No way around it. |
@rradar I think we're confusing what this issue is about. You can already disable usage statistics from your settings so if you want to flash completely privately by all means use this feature. It's why we put it there in the first place. The only current bug with the feature, which we are working on right now and we'll release a fixed version in the following days, is that some libraries make a call to a remote server even if you merely But please try to keep the discussion on point. What we're discussing here is if anonymous usage statistics should be opt-in. User choice is and will continue to be a feature of Etcher. |
The collection of so called "anonymous usage statistics" needs to be opt-in. Otherwise everyone (even people who prefer not to leak their data) will be forced to participate in the data collection. The setting which is implemented right now leaks data before it can be turned off -> NO GO! |
@thefaj I have repeated this many times but for some reason you seem to ignore it. We're not using Google Analytics. We're only using Mixpanel and Sentry. Secondly, we actually do send anonymous data and strip events from personal information. If you believe this is false you have to provide counter evidence. The code is there for you to inspect. Until then your claim means nothing.
Personal insults are not allowed in this community, please remove this comment. Next time there won't be a warning. |
It turns out that collecting user data without explicit consent means that you end up violating the consent of your users in a fraction of cases where that's not what the user wants. Doing things with a user's computer that they don't want makes your software malware. It is only "not a clear-cut decision" if you don't mind violating the consent of your users, which is a despicable stance, if indeed you hold it. Please default data collection to off. Ask users on a first launch with a modal, if you wish. But do not use the network without explicit permission.
Hiding behind an "is it illegal?" to mask the fact that you violate user consent is not something you should be doing. It is rude and immoral, and you should strive to conduct your business in an ethical and respectful fashion. |
Said @petrosagg:
There is an argument to be made that this is not a personal insult, but in fact an accurate objective description of the current state of affairs. If circumstances are such that failing to protect one's own privacy would result in danger, then Etcher's privacy-compromising default settings are indeed dangerous. It is also clear that releasing the current, consent-violating-by-default version, is an unethical business practice, which would have to be undertaken by unethical people, necessarily. It wasn't an accident or oversight, it was a clear and definitive choice made by Balena staff, to place bug acquisition data over that of user consent. The only thing remaining for Balena to do is to remedy this failure. |
@sneak we are in agreement on this. The reason I brought up legality was only because there were claims that we are breaking the law, which had to be addressed. If you read my comment above I try to steer the conversation away from the legality and towards what is best for Etcher.
The problem is that this decision is not in a vacuum. I could reformulate your statement as "It is only "not a clear-cut decision" if you don't mind ignoring the fraction of users that cannot use the software because of their peculiar setup." Is ignoring accessibility not despicable?
This is not the widely accepted definition of the term malware. You are using it to have this extra "punch" in your message. That's not good faith discourse. From Wikipedia:
But even ignoring that, how far can you stretch this definition? What is software allowed to do on a user's computer in the first place? You say it shouldn't use the network, but one could say it shouldn't use the disk to store state, shouldn't take up too much real estate on the screen, etc. There is a reason we are ok with some things but not others. This reason, at least for us, is not a deontological one. Since ethics have come up a lot, we are thinking under a consequentialist framework. When this decision was made we had concluded that making data collection opt-out would cause some harm to users who don't want to be tracked at all, but that it would be less than the harm caused by not improving Etcher from error reports. That's it. Everything else being equal I would choose no tracking every time too. But everything else is not equal. Finally, the reason this issue has remained open is that we agree we should re-evaluate what the best state for Etcher is at the moment. For example, we understand that a lot of the big issues have already been fixed using previously collected data, and that at the current size of the user base it could be that the people who would opt in make a representative sample. If we had a clear way forward I would have closed the issue stating our position. We would be extremely happy if you could provide different angles from which to weigh the inconvenience of opting out without a modal against better software, without resorting to "This is bad, full stop" type statements. |
It's not that it's bad. It's that you should not use a user's computer to do things that user does not want. If you don't know if the user wants it or not, ask. But don't assume and proceed, because then in some set of cases you do what the user does not want, which is a universally bad thing, regardless of the benefits to you or to other users. |
I wonder if the term spyware is more valid for Etcher than malware. Or at least that this software comes bundled with spyware from a user's point of view. "Spyware is a software that aims to gather information about a person or organization, sometimes without their knowledge, and send such information to another entity without the consumer's consent." https://en.wikipedia.org/wiki/Spyware Knowledge and consent are the key. Neither is given to or asked for by Etcher. |
I'm hesitant to get involved in this heated conversation, but @rradar says "gather information about a person" and @petrosagg has said that the data collected is anonymous, and I'd say that you can't really classify "anonymous data" as "information about a person". |
Anonymous data:
The real issue is consent, though. |
This affirms that the creators of this software have no respect for user consent. This will need to be forked. |
If that was as blatantly true as you keep claiming, then there wouldn't be any opt-out button at all. IIRC that opt-out option has been there from the very beginning.
Yup, nothing at all stopping you doing that. Hooray for Open Source 🙂 |
@vengerst "What user consent is about? The data when writing? I think etcher should have access to the data" [...] Is that like buying a pen and having everything I write with it accessible to the vendor? Hello? @lurch [...] "there wouldn't be any opt-out button at all. IIRC that opt-out option has been there from the very beginning." You mean the opt-out button which never really worked and "accidentally" leaked information about your habits to no fewer than 9 servers, including GAF, even when it was set NOT to send data?... ...you can really see how seriously balena takes the user's choice (not to mention consent) and privacy 😞 |
Thoughts on what opt-in could look like:
First-start modal: "Hello, we are the balena team, working hard..... Can we have your data to make the world a better place?" -> If I say no, I don't want Etcher to make any network connection. (A second prompt could ask whether Etcher is allowed to phone home to check if a new version is available...)
If an error happens during flashing or while using the program, a modal could be presented to the user: "We caught an error. To give us a chance to solve this, you can upload the crash report to the balena cloud now"
Settings: "Error Reports and usage statistics" (initially turned off) |
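The opt-in flow proposed above could be sketched roughly as follows in TypeScript. This is only an illustration under stated assumptions: `createReporter`, `ReportingSettings`, and the `send` callback are hypothetical names, not Etcher's code. The sketch shows only the gating logic, with reporting off until the user answers the first-start prompt with "yes".

```typescript
// Hypothetical sketch of consent-gated reporting; all names are illustrative.
type Consent = "unknown" | "granted" | "denied";

interface ReportingSettings {
  // Stays "unknown" until the user answers the first-start prompt.
  consent: Consent;
}

function createReporter(
  settings: ReportingSettings,
  send: (event: string) => void, // stands in for the real network call
) {
  return {
    // Store the user's answer to the first-start modal.
    recordAnswer(allow: boolean): void {
      settings.consent = allow ? "granted" : "denied";
    },
    // Forward an event only if consent was explicitly granted.
    // Returns true when the event was sent and false when it was dropped.
    report(event: string): boolean {
      if (settings.consent !== "granted") {
        return false;
      }
      send(event);
      return true;
    },
  };
}

// Example: nothing is sent before the user has answered the prompt.
const exampleOutbox: string[] = [];
const exampleReporter = createReporter({ consent: "unknown" }, (event) => {
  exampleOutbox.push(event);
});
exampleReporter.report("flash-error"); // dropped, consent is still "unknown"
exampleReporter.recordAnswer(true);
exampleReporter.report("flash-error"); // sent
console.log(exampleOutbox.length);
```

Treating "unknown" the same as "denied" is the point of the design: no code path reaches the network before the user has answered. The per-error upload modal and the update check mentioned above would need their own, separate consent values.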
Action item: Removing all instances of those connections from Etcher 🤔 |
Just came here because this fancy version of
and I just could not believe it. Not cool. |
@tcurdt See the earlier comment where @petrosagg acknowledges that some of these services were being included accidentally. |
@lurch I am on version 1.5.63. That should be the latest release. |
Yeah, the benefit of doubt went out the window long ago. This app is a cesspool of spyware |
It would be nice if you did not use that sort of language. Please edit your comment and be civilized. |
@tcurdt Did it show those connections after disabling the analytics as well? Did you try restarting the application after disabling the anonymous analytics? |
Asking because, as you correctly mentioned, we got rid of some of those connections without releasing a new version (besides the content that comes from our marketing website, such as the success banner), while other connections needed a version update (e.g. mixpanel, whose library was still included despite not being used) |
Just found the setting that was enabled. Will give that a try. Thanks! |
And ask for the user's permission before exposing a user's IP! That's why we need opt-in! To comply with the laws! (The same applies to the ads shown while flashing.) I still don't get why balena is having such a hard time with the laws even though they are aware of the situation (see post from @petrosagg) 😞 |
@tcurdt I just tried this myself with Etcher v1.5.70 and the "Anonymously report errors and usage statistic to balena.io" option disabled and I saw no connections to mixpanel, google, doubleclick, or any other analytics service. Can you retry your test with the latest version? To be clear, there were connections to our static site,
@rradar did Github ask you if you want to log your IP before connecting to Github? No, because this is how the internet works. Etcher loads a small content page from the internet as part of its functionality, you can't do that without doing a TCP connection just like you can't have a webpage on the internet without receiving connections from an IP address |
Patronizing, much @petrosagg |
@thefaj you can claim things about "most of the rest of us" as much as I or anyone else can. It's your personal opinion, I'll grant you that much. Putting that aside, I didn't mean to sound patronizing and I apologize if I sounded that way. I'm pointing out that displaying a webpage inherently includes doing a TCP connection, just like when visiting a webpage. If you have a suggestion on how to do that I'd be very interested to hear the solution and even implement it, but as far as I know you can't load a webpage without a TCP connection from your IP. |
@petrosagg we can argue about the stupidity of GDPR as much as we like - but it is the reality we live in. Of course it can establish TCP/IP connections - but in the EU this now means that a lot of legal information should be provided to the user for doing so. Legal issues aside - why a fancy version of |
@tcurdt to be clear, I don't think GDPR is stupid; in fact I quite like it and have even used my rights as an EU citizen. From our talk with lawyers, it didn't sound like there needs to be a notice just for loading content. As for loading the webpage, it is not required for writing the image. If you attempt to write one with no internet it will work just fine. We use the webpage to display a featured project while the write is happening. The featured project is a DIY project, usually with a Raspberry Pi, that our team has created for the users of Etcher. For example, currently it walks you through making a Bluetooth sound receiver that connects to your stereo. We believe that this is high-quality content that helps both the users, by presenting an interesting project, and our organisation, by letting us continue funding the development of this project. |
I guess there is a difference between typing in a URL and clicking a link, and just loading a resource from an application. And at least for every third party one should provide information about what is happening with that kind of information. Even if it is "we are storing nothing" - but IANAL. I like the idea of GDPR but I am not a fan of the implementation, so to speak. Anyway! Thanks for clarifying about the offline support. |
This thread has been dead for a few months, but I want to step in and add my voice to say that not all balena users feel so strongly about this. I use etcher and balenaCloud on a regular basis, and am glad the Balena team is tracking crash reports and usage data. If they keep it anonymous, and it helps them make etcher faster and more stable, then I approve. As for the tutorial, well, that seems like a way to fund development. People who decide to build the projects will get some experience with the Balena platform and may decide to buy paid services at some point. I have read the devblogs, and it sounds like a surprisingly large amount of work went into making Balena stable and fully cross-platform. To me, promoting actually useful tutorials seems like a pretty benign way to make money compared to some of the alternatives, like targeted advertising. For those of you asking why Etcher uses more than 400 MB, it is because it is built with Electron, which means most of Chromium gets bundled into each app. In exchange for this size trade-off you get the ability to write the app with HTML, CSS, JavaScript, and familiar browser and Node.js APIs. This is what allows a small company like Balena, who are not really in the business of making desktop apps, to put out something of really high quality like Etcher. I long for the day when a more lightweight framework for building apps with web technologies comes on the scene, but until then I would rather have a bloated app than nothing at all. To @rradar, @thefaj, and others: you make some good points, and I almost agree with you that the tracking should be opt-in. But I disagree with your tone in this discussion. You are demanding that the Balena team respect what you consider to be your "rights" in a very disrespectful manner. It seems like your verbal abuse is getting in the way of persuading people. 
I think you might have been able to push me, and possibly the balena team, over the fence into the "no tracking" camp if you had presented your viewpoint more tactfully. Cheers! |
That's fine, you're more than welcome to consent to the tracking of your crash reports and your usage data. You are not in a position, however, to consent for the data of other people who are not you. |
I think we made our case pretty clear, there's no need to keep the discussion going. |
NOTICE:
I just installed balena etcher via the debian repository. After installing I started with
balena-etcher-electron
got a
and the ui was presented. When I clicked the settings wheel top right I saw that a 'service' called
which is/was activated by default.
I didn't dig deeper, but I'm almost sure there was already a data leak before I was able to deactivate this 'feature'. I'm also not a lawyer, but under the new European laws this is surely not tolerable anymore.
Please make this option opt-in. Thanks