Discussion about user tracking and error reporting #1770

Closed
TheAssassin opened this issue Oct 14, 2017 · 12 comments

Comments

@TheAssassin

I just opened the AppImage of Etcher and out of curiosity had a look at the settings page. I found a "disable tracking" button there, so I had a look at the code and it looks like data is sent even before being able to reach that button.

Apart from my not wanting to be tracked by any kind of software (although it seems to be getting more popular with Electron, since it's easier to get the damn Google analytics etc. up and running...), the big problem is that the user is not notified about the tracking and can only opt out after data has been collected.

While I'd appreciate an opt-in solution (which a modal dialog on first run is sufficient for, where I can check/uncheck a box), the EU courts have decided that opt-out is sufficient. However, tracking must not occur until the user has had a chance to disable it, which also implies that the user needs to get notified about it in the first place.

Therefore, the authors of Etcher have collected data unlawfully about me (and, well, all the other users who ever opened the application).

Just to be clear: I do not intend to take legal action for now (that'd be hard anyway, and I wouldn't complain here first), but I want to make sure this behavior is terminated immediately. Not only is it annoying for the reasons mentioned above, it's also against the principles of free software, and I can't see a reason to do so with such a simple application. On top of that, it is an unlawful action under EU jurisdiction.

So, at least make sure that you comply with EU law if you don't want to get sued, which requires a non-tracked opt-out possibility. Furthermore, I'd suggest that you not just make it opt-out, but opt-in. Even if you collect less data, you won't lose any relevant information. I'd just add a modal on first startup, notify the user that you plan to track, and ask them to hit either yes or no (without making the yes button huge or so). That's probably the easiest and safest way.

But, seriously, did you collect any useful information that had a reasonable impact on the development of this software? I have a feeling this data collection is just made for the sake of collecting data that "could be useful some day". And that's not really a justification for collecting data. Please, think about getting rid of it entirely, because any open source person will love that (feel free to call it your new USP).

@jhermsmeier
Contributor

But, seriously, did you collect any useful information that had a reasonable impact on the development of this software? I have a feeling this data collection is just made for the sake of collecting data that "could be useful some day". And that's not really a justification for collecting data.

The only data we're collecting are errors and events (app start, flash, done, etc.), without user-identifiable information – we're not collecting random user data for the sake of it; and since this is all open source, you can even just read in the source what's being sent off. And yes, this data has helped track down a ton of issues across all operating systems, as well as helped us prioritise what to fix quickly.

@jhermsmeier
Contributor

Either way, we'll get to this the coming week, and see what the situation is, and what needs changing.

@TheAssassin
Author

Well, it looks like Google's analytics services (without digging in too far), and that's not really a system for collecting errors and such. Hard to believe that there's no additional data collected (considering my experiences with tracking in software), but it appears that you're right about it, so thanks for clarifying this.

If it's just about collecting errors, you should consider switching to a more comfortable system called Sentry. I self-host this for a few projects, and we're really happy using it. It focuses exclusively on error reporting and provides frameworks for all major languages. What makes it great is that it automatically aggregates similar error reports and recently even added an option with which you can contact the reporter about the issue if they agree to it. I guess it's better suited for this.

Either way, data is collected without the user's explicit consent (they don't agree to it at all; they can only deactivate it if they find that tiny cog symbol), and that appears to be unlawful to me. (Same problems arise with sentry, if you're interested, I can provide information how we solved that issue.) Great to hear you'll be looking into it.

@jhermsmeier
Contributor

Well, it looks like Google's analytics services (without digging in too far)

Where did you see that? Google Analytics isn't used in Etcher, we use Sentry for the error tracking, and Mixpanel for the events through resin-corvus.

Same problems arise with sentry, if you're interested, I can provide information how we solved that issue.

Which problems do you mean in respect to Sentry? If there's something we might not be doing properly, I'd love to hear about it.

@TheAssassin
Author

I just saw a module "analytics" that got loaded, but the Sentry modules are usually named "raven".

The problem I was talking about from the beginning is that the data collection does not comply with the European laws. Please see my first post about them.

First of all, I wouldn't call it "usage statistics" if it is really not about tracking the usage. That was what primarily caught my attention. I'd be interested in what you consider "usage" tracking and what you consider "error reporting", because that hasn't become clear so far. You basically just talked about Sentry and the error reporting.

I even suggested a "fix" for the entire situation. Considering the new information I got from you, I come up with this set of actual solutions:

  • Disable tracking/reporting entirely (my favorite, but I can understand that this is not an option for you)
  • Disable tracking by default, but keep the entry in the options where it can be switched on (opt-in)
  • Ask the user on first run in a modal dialog whether they want to report the data or not (without favoring either option, in a neutral manner) (opt-in)
  • Show a modal dialog informing the user that you are tracking them, with a direct link to the options where they can switch it off (opt-out, the only realistic way to do it in compliance with EU law)
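The options above differ mainly in what counts as agreement. A minimal sketch of that decision, assuming a hypothetical `ConsentState` record and `mayTrack` function (neither is taken from Etcher's code), might look like this:

```typescript
// Hypothetical names throughout; this is not Etcher's actual code.
type ConsentPolicy = "disabled" | "opt-in" | "opt-out";

interface ConsentState {
  policy: ConsentPolicy;
  // True once the first-run notice or dialog has been shown.
  userWasInformed: boolean;
  // The user's explicit choice, or null if they have not chosen yet.
  userChoice: boolean | null;
}

function mayTrack(state: ConsentState): boolean {
  // Option 1: tracking is removed entirely.
  if (state.policy === "disabled") return false;
  // Under either remaining policy, nothing is sent before the user is informed.
  if (!state.userWasInformed) return false;
  // Opt-in: only an explicit "yes" enables tracking.
  if (state.policy === "opt-in") return state.userChoice === true;
  // Opt-out: not disagreeing counts as agreement, but only after the notice.
  return state.userChoice !== false;
}
```

The only difference between the opt-in and opt-out branches is how an unanswered choice (`null`) is treated.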

In any case, please explain to the user what is being sent (in detail) and where exactly it is sent (i.e., the receiver; "resin.io" seems a bit generic). I believe this is a requirement to comply with the law anyway. You might post that on the Etcher homepage, but you should definitely show it in-app as well. This measure increases the amount of trust people have in you guys, and that's a big plus.

@alexandrosm
Contributor

alexandrosm commented Oct 15, 2017 via email

@TheAssassin
Author

This still happens without informing the user, and even if it complied with the law (you might want to have someone verify that; I am not a lawyer, but I have strong doubts regarding EU law and the Privacy Shield), it's bad taste to "just do it TM" by default, and I'd vote for informing the user on startup, with the possibility to deactivate it entirely without being tracked, which is currently not possible... Everything else is annoying for the user, and is definitely bad PR for you.

we have made special effort to not collect any personally identifiable information

Since it is difficult to verify what is sent (and I am really sure you need to set up and publish a privacy policy for this kind of usage tracking/error reporting; not everyone is a programmer who "can just read the code"), all these words just lower your trustworthiness for me.

A few more suggestions:

  • Split up usage and error tracking to two options, because most people are not willing to be tracked on usage, but don't have a problem with the error reporting
  • Really, just inform the users, and tell them what is sent
    • Add a little popup to the options menu and tell them "By checking this box, you agree to sending .... to Resin.io <insert your address here>"
    • Instead of activating error reporting by default or in this options thingy, why not show a pop-up on errors asking "There was a problem, do you agree to send the following data to <recipient>: <complete, detailed list of data fields and values>"
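A minimal sketch of the two suggestions above, assuming a hypothetical `ReportingSettings` shape with separate flags and a hypothetical `describeErrorReport` helper that renders the complete payload for the user before anything is sent (the field names and the recipient string are placeholders, not Etcher's):

```typescript
// Hypothetical settings shape: usage tracking and error reporting are
// separate switches, and both default to off in this sketch.
interface ReportingSettings {
  usageTracking: boolean;
  errorReporting: boolean;
}

const defaultSettings: ReportingSettings = {
  usageTracking: false,
  errorReporting: false,
};

type Payload = Record<string, string | number | boolean>;

// Builds the text of a confirmation prompt listing every field and value
// that would be sent, so the user agrees to exactly what is shown.
function describeErrorReport(recipient: string, payload: Payload): string {
  const header =
    "There was a problem. Do you agree to send the following data to " +
    recipient + "?";
  const lines = Object.keys(payload)
    .sort()
    .map((key) => "  " + key + ": " + String(payload[key]));
  return [header].concat(lines).join("\n");
}
```

Rendering the prompt from the same object that is later transmitted keeps the list shown to the user from drifting out of sync with what is actually sent.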

I really can't recommend Etcher to people right now: data is sent, but no one can tell what's stored or how it's used, and I'm pretty sure that you're able to find out who did what at any time, being able to identify single users. Even if you say it's anonymized, it's pretty much impossible to verify what's actually stored; IP addresses etc. (which are personal data in the EU, and as far as I know Sentry stores them by default; that alone might be reason enough to file a complaint) might be stored, which could be misused eventually. Especially the usage data, whose purpose isn't clear (your app is just not that complex so any kind of usage tracking is justified to me as a potential user), is worrying.

I really think you underestimate the risks of this kind of data collection, and I hope I'll get to analyzing this further...

@alexandrosm
Contributor

alexandrosm commented Oct 15, 2017 via email

@jviotti
Contributor

jviotti commented Oct 15, 2017

Hi @TheAssassin,

I think we started off on the wrong foot.

the EU courts have decided that opt-out is sufficient. However, tracking must not occur until the user has had a chance to disable it. Therefore, the authors of Etcher have collected data unlawfully about me

You're claiming that the EU decided that opt-out from analytics is sufficient (which is what we're currently doing), but then that we collected data unlawfully about you. The term "unlawfully" is contradicted by your first statement, so I don't really understand the claim here.

Also, I find it ironic that you're posting this message on GitHub, a platform that is already tracking you without anonymising information, and without allowing you to opt out in any way. Based on your strong claims, I'd have expected you to reach out in some other way.

  • it's also against the principles of free software
  • it's bad taste to "just do it TM" by default,
  • Please, think about getting rid of it entirely, because any open source person will love that

As a software engineer I understand your claims, but please notice that most of them are being backed by your personal taste and concerns about privacy, rather than by laws.

We don't aim to be a purist open source project and a role model of privacy rights, but an image writer that is the best out there and gets the job done as effectively as possible, which anonymous analytics and error reports have proved invaluable in achieving. I'll happily take your suggestion about a modal explaining what's going on, as it's a nice suggestion, and ignore the strong words you used to write it up.

your app is just not that complex so any kind of usage tracking is justified to me as a potential user

I encourage you to try to tackle some of the obscure edge cases in certain setups that have been reported in the issue tracker. The core logic of the application is simple, but dealing with 3 major operating systems, their various versions, and edge cases on some users' machines depending on their own configurations caused issues that, honestly, we wouldn't be able to fix without the detailed information we get from the anonymous analytics we send.

In any case, thank you very much for noticing that Mixpanel is triggering a couple of events at startup even when it's disabled, before it gets a chance to be shut down. We'll track and fix this issue at #1772, and if you really want to prevent anyone from tracking you, I suggest blocking outgoing network traffic to known analytics/error reporting/ads domains using a firewall.
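One way to avoid events slipping out before the client is shut down is to check the setting at send time. The following is a minimal sketch under that assumption; `createTracker`, `isEnabled`, and `send` are hypothetical names, and this is not the fix that was made in #1772:

```typescript
// Hypothetical wrapper: the transport is injected, and the user's setting
// is consulted on every call instead of relying on a later shutdown.
type Send = (event: string) => void;

function createTracker(isEnabled: () => boolean, send: Send) {
  return {
    // Returns true if the event was handed to the transport.
    track(event: string): boolean {
      if (!isEnabled()) return false;
      send(event);
      return true;
    },
  };
}
```

With this shape, a startup event is subject to the same check as every later event.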

I created new tickets to track some of your suggestions (#1773, #1774), which I think can greatly improve the product, so let's close this one and continue the discussions there.

Have a great day!

@jviotti jviotti closed this as completed Oct 15, 2017
@TheAssassin
Author

What I am referring to is that there is no detailed information for the non-coder user about what is sent, and that is a problem. I suggested this a few times (it might have appeared more aggressive than intended to you; I am not a US citizen and might not understand the language the same way you do, although I have a strong opinion on usage tracking and I am trying to explain my points here as well as my legal worries). Calling me "unreasonable" is rude as well; I never called anyone unreasonable, but to me, you are "just a US company" (similar to me being "just a random guy" to you, I guess) which I have only met in relation to Etcher so far, and I'm trying to express my points from an "I'm not a programmer, but just a user" point of view.

I am not trying to offend anyone here; I am really trying to contribute (I do use Etcher occasionally, and like it, hence I'm writing here), and if this issue appeared to be a threat to you, I apologize and hope you can still consider my words. I admitted to having been wrong previously, and to show my willingness to reach a consensus, I changed the title of the issue. So, please let me try to revise what I got to know so far, and how I see this.

To me, it seems like you are trying to implement these data collection features in a way that does not harm user privacy, but benefit from the collection making development of this application easier for you. I guess that fits your point of view as well. Instead of using Google Analytics (which I wrongly claimed in the first place, sorry about that and thanks for clearing up @jhermsmeier), you use a combination of Sentry for error tracking, and Mixpanel for usage analytics. I'm trying to leave personal taste out of the following, and focus on the legal issues.

The Sentry data tracking is, from a user's point of view, totally reasonable. I understand the reasons, and I stated that I would've activated just this if that were possible. As a software developer, there's nothing more valuable than getting a sufficient amount of information with a bug report. As said before, it is hard to verify what is sent. You suggested analysing the network traffic, which is totally possible for me, and I might do so at some point. However, I stated that I am worried about the collection of data. This is because I know what Sentry saves by default, and there is a difference between sending information and storing it. The latter is usually the cause of issues. What I mean is that Sentry stores IP addresses along with the other information, and those are considered private data in the EU. Considering that the transmission of data is occurring under the terms of the EU-US privacy shield (as it's not stated otherwise), US companies need to follow this decision unless they ask for the users' consent to store anything classified as private information. The IP address bit is the only thing that is able to identify users in the Sentry reports; other than that, it's probably anonymized. This is also what makes this a problem legally: as long as IPs are stored, there is an issue; if they are not, everything's fine here.
I'd be very happy if you could tell me whether those are stored by your Sentry instance, and if they are, please think about turning off this behavior of Sentry, as I believe IPs don't have any value in error reporting systems, and that'll fix any potential issues here.
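As an illustration of the client-side half of this, here is a minimal sketch that drops an address field from a report before it is handed to any transport. The `ErrorReport` shape and `withoutIpAddress` are hypothetical and are not Sentry's API. Whether the receiving server records the connecting address is a server-side setting that this sketch does not address.

```typescript
// Hypothetical report shape; real error-reporting payloads differ.
interface ErrorReport {
  message: string;
  release?: string;
  ipAddress?: string;
}

// Returns a copy of the report without the address field, leaving the
// original object untouched.
function withoutIpAddress(report: ErrorReport): ErrorReport {
  const copy: ErrorReport = { ...report };
  delete copy.ipAddress;
  return copy;
}
```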

Regarding usage tracking, Mixpanel is (not only to me, though this is nothing that would matter legally) a non-trustworthy company considering what they promise to do with the collected data. I have looked at their site, privacy policies and a few related media posts, and that is definitely worrying to me, especially since with any kind of app (whether it's Electron or not), it gets harder to protect oneself (Mixpanel in the browser, for example, is less of an issue, as my plugins can filter it out, but as soon as I open an app on my desktop, until I get to deactivating the tracking, I can't do very much to turn that off except for non-trivial solutions like hosts file modifications and such). This kind of tracking is legally worrying for the following reason: the user is not informed which of their private data (and it really is private data) is transferred to a US company (Mixpanel) on behalf of you guys (Resin) under the terms of the EU-US privacy shield, which violates the principle of having to get the users' consent before sending data. The opt-out solution you have is a start, but the user still needs to agree. They can do so passively (opt-out), but because they are not informed about the possibility of turning it off, the opt-out is not implemented in a satisfying way (that is, having a chance to deactivate, or otherwise agree by "not disagreeing"). Therefore, I suggested showing a little modal dialog on first run with a link to the options page (tracking-free, of course), or just showing a modal with a yes/no (that'd be opt-in then, and is IMO simpler to implement, because that first modal dialog can be implemented without any tracking code, so rather than having to keep the state of "first run" until the options page is reached (IIRC it contains tracking code, too), you can just keep everything as-is). See #1773.
(The problems in #1772 by the way should be fixed whether you consider the rest of my words or not, they just do the opposite of what the checkbox tells.)

I highly appreciate @jviotti's efforts (especially #1774), and I'll add my two cents to the new tickets if necessary. #1774 is important too, because without it, the user can't sufficiently agree to any kind of data collection if you're not telling them what they agree to. It's like asking for a signature on an NDA without giving them the full text before they sign.

I hope I could clear up what I think might violate EU laws (and the EU-US privacy shield), and if you need me to post the relevant paragraphs (like, if the kind of "moral" argument didn't convince you yet and you need the legal justifications for changes as well), I can do so, but that'll take time as it's additional effort.

There is a nice blog post that explains some of the Mixpanel backgrounds, why it's bad to activate this without informing the user properly, and why being transparent and open is a good thing, also PR wise: http://jnorthrop.me/privacy-considerations-with-mixpanel-people-analytics
The Wikipedia article (yes, I know, it's not necessarily reliable) on the GDPR contains a list of principles on how to implement data protection properly: https://en.wikipedia.org/wiki/Privacy_policy#European_Union

Regarding GitHub: Yes, the platform does attempt to collect data, but the browser plugin argument mentioned above applies here. Also, I have decided to share data with GitHub, and know what they're collecting (they have that privacy policy etc.).

Based on your strong claims, I'd have expected you to reach out in some other way.

I like discussing such issues in public. But I'm curious what way you'd have suggested.

Have a great day!

You, too, everyone!

@alexandrosm
Contributor

Thanks for trying to be clearer, and improving the tone of the conversation.

I will further consult with our lawyers, but looking at the US-EU Privacy Shield page, it seems that joining the program is voluntary for companies, so it's unclear what legal claim you are making. We have not joined this programme, so how could it apply to us? Please understand that we take claims that we are breaking laws extremely seriously, but you have to actually be clear about exactly what you are claiming otherwise we can't really comment in an informed way.

One thing I want to say before we move further, is that suggestions like opt-in dialogues etc have been rejected in the past, and will be again in the future, because they damage the simplicity and usability of the application.

Etcher tries hard to be the simplest application for performing the task at hand. Every single word on the application is questioned for usefulness, and we try hard to reduce the amount of decisions users have to make to use the application. This is why users love Etcher, and why it's been successful.

Offering a legal notice on start-up with complex details and multiple checkboxes is completely against this philosophy. Data privacy is of course important, and we take pains to ensure we don't collect more information than is needed. But usability is as big a concern for us. We of course want what we do to be legal, but beyond that, we're not willing to compromise the UX of the application.

As such, I'll close the "opt-in dialog" issue, and continue the conversation here about things we could do.

One thing I'm thinking about is to buffer the "application start" and "settings page open" and other such events until the user clicks on "select image". As such, if they were to follow the path to disable analytics, no events would be sent. I think it's fair to give people the opportunity to opt-out without having any data sent.
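A minimal sketch of this buffering idea, assuming a hypothetical `BufferedTracker` class whose `release()` would be called on the first "select image" click and whose `disable()` would be called from the settings page (the names and wiring are assumptions, not Etcher's implementation):

```typescript
// Hypothetical buffer: events are held in memory until release() is called.
// If the user disables analytics first, the held events are discarded.
class BufferedTracker {
  private buffer: string[] = [];
  private released = false;
  private enabled = true;
  private send: (event: string) => void;

  constructor(send: (event: string) => void) {
    this.send = send;
  }

  track(event: string): void {
    if (!this.enabled) return;
    if (this.released) {
      this.send(event);
      return;
    }
    this.buffer.push(event);
  }

  // Called when the user turns analytics off; nothing buffered is ever sent.
  disable(): void {
    this.enabled = false;
    this.buffer = [];
  }

  // Called on the first "select image" click; flushes the held events in order.
  release(): void {
    if (!this.enabled || this.released) return;
    this.released = true;
    for (const event of this.buffer) this.send(event);
    this.buffer = [];
  }
}
```

A user who opens the settings page and disables analytics before selecting an image never triggers a send, while the default path only delays the early events rather than losing them.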

@probonopd

probonopd commented Oct 16, 2017

Etcher tries hard to be the simplest application for performing the task at hand.

And that really shows. (Sorry for being OT here, but I always wanted to say "thank you" for this.)

I am not a lawyer, but personally I would be more than happy to voluntarily provide some information back at the end of using the application, if it showed me what information it wants to transfer to the server. What I don't like is having to fear some abstract "tracking" where one can only guess which unrelated information, such as my shoe size and bank account information, might or might not get phoned home. Hence, imho, transparency (showing which data is sent, and why) would be key. (But this is merely a personal preference rather than a legal claim.)

@jviotti jviotti added the type:ux label Dec 5, 2017

5 participants