Skip to content

Add dependency on Scarf for installation analytics #2012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 25, 2020
Merged

Add dependency on Scarf for installation analytics #2012

merged 5 commits into from
Mar 25, 2020

Conversation

aviaviavi
Copy link
Contributor

This change adds a devDependency on Scarf for installation analytics to support the maintainers of React Table.

Discussion from everyone is welcome! Some background on the library and the problem it's solving can be found in this post here: https://github.com/scarf-sh/scarf-js/blob/master/WHY.org. The package README (https://github.com/scarf-sh/scarf-js) has more info on exactly what data is sent on each install.

The analytics introduced here are opt-out by default, so data will only be collected from users who explicitly enable it. Because it has been added as a devDependency, it also won't be triggered during production builds that depend on react-table.

The data Scarf collects will help maintainers by keeping them informed about how the package is being used and will help drive sponsorship to support all the work that goes into React Table.

@tannerlinsley
Copy link
Collaborator

This looks really great. How exactly would we be able to view the gathered stats?


```bash
# If you'd like to support react-table by sending installation analytics:
$ export SCARF_ANALYTICS=true

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that the analytics are disabled by default for now, but I wish the mechanism for opting in was easier or more fluid. I was thinking something like a post-install hook that would just prompt the user for their permissions to collect analytics like this:

$ Would you like to send usage/support analytics to React Table? (Y/n): _

They hit enter for yes (or change it), and it edits the package.json file to denote their choice. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! We can get this added to scarf-js. We'll just need to make sure that wait time for user input isn't too long to avoid slowing down any builds, but a short wait for input seems fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this trivial? I would love to wait for this if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally possible! It's not quite trivial, but definitely straight-forward. Working on this now and will have it ready soon.

Copy link
Contributor Author

@aviaviavi aviaviavi Mar 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR on scarf's side is just about ready to go, just had one question for you in scarf-sh/scarf-js#4 (comment)

@aviaviavi
Copy link
Contributor Author

How exactly would we be able to view the gathered stats?

The statistics will be available on https://scarf.sh. You can make an account, create a new library entry for react-table, and everything will be shown there with access controls to authorize any of your co-maintainers too.

@tannerlinsley
Copy link
Collaborator

As per our conversation on Twitter DMs, let's make this opt-in by default with no prompt and simply display that we are collecting anonymous usage statistics and that they can opt out if they want.

@aviaviavi
Copy link
Contributor Author

Done.

There's one extra change in here to call out - I had to move this into dependencies, not devDependencies. It turns out that transitive dependencies are not installed with devDependencies at all, so in order to make that happen for packages that depend on react-table, it needs to be a proper dependency

@aviaviavi aviaviavi changed the title Add dev dependency on Scarf for installation analytics Add dependency on Scarf for installation analytics Mar 25, 2020
@tannerlinsley
Copy link
Collaborator

That's totally fine. It's not getting imported and is super lightweight, so no worries there.

@tannerlinsley tannerlinsley merged commit d0a9b97 into TanStack:master Mar 25, 2020
@iamstarkov
Copy link

this is not great

@tannerlinsley
Copy link
Collaborator

@iamstarkov

  • Analytics are viewable by me and me only. I use this information to know more about who my users are and what direction the library should take in the future.
  • You can opt out several different ways, including your entire machine and/or the package.json method.
  • If you are a hobbyist user of React Table, I don't see how supplying an install count to my analytics is an unfair trade.
  • If you happen to be a commercial entity and are using React Table for commercial purposes, then I see no reason that you would want to hide your company's name from my analytics unless you are attempting to mask your companies use of my software. Again, you can opt out.

With that said, I would love to hear any concerns that aren't adequately covered by these points.

@iamstarkov
Copy link

iamstarkov commented Apr 29, 2020

first of all Scarf isnt GDPR-friendly.
then there are several open attack vectors for spreading malware, which im not comfortable with

  • aws bucket on which it is being stored, can be compromised and "end-user" devs won't be aware or notified
  • no integrity checks on the aws artifact download enabling MitM attack
  • no way to verify 1 to 1 mapping between scarf's GitHub releases and npm releases, enabling stealthy substitution in the npm package's code downloaded from the npm registry. This is how few malware npm worms had started.

@tannerlinsley
Copy link
Collaborator

Well, those all sound like great points you should bring up to @aviaviavi. Pending proof that these things are for sure not covered in security, I will take appropriate action.

@tannerlinsley
Copy link
Collaborator

I'm curious now where this would actually fall on GDPR compliance since any personal data as I've been told is being anonymized before ingestion. The IP address to my knowledge is also thrown out after any publicly known company information is mapped from it. Does GDPR apply to information about a business as well? Or just individuals?

@aviaviavi
Copy link
Contributor Author

aviaviavi commented Apr 29, 2020

It looks like there's some confusion here, so let me clarify a few things:

Scarf currently has two open-source projects on Github, and it looks like they are getting conflated a bit here. There's scarf-js (https://github.com/scarf-sh/scarf-js), the npm library that is being used by react-table and our topic of discussion here. Then there's scarf (https://github.com/aviaviavi/scarf), the experimental system package manager, which is entirely separate from react-table.

Scarf-js does not have the security vulnerabilities described. Scarf-js is hosted on github and is published to npm, but aws/s3 are not in the picture on Scarf's side. There is no s3 bucket to be compromised on Scarf's side, and no artifact to pull down and verify. It is not susceptible to MitM attacks as described. The only integrity check that's relevant here is the one that the js package manager is responsible for when pulling from npm's registry, and this will take place properly like for any other js package. It's not even currently using the github "releases" feature, though that will be a good idea for us to go back and tag releases 😄. If you're skeptical about any of these claims the code is always available.

The attack vectors @iamstarkov mentioned are actually pertaining to the package manager, which is still very much an experimental beta project. These things will certainly be fixed by the time a stable release is produced, but those issues are not relevant to react-table.

As for GDPR compliance - which part of GDPR are you claiming we are violating? I'm not aware of any issues with the scarf-js library. As @tannerlinsley said, raw IP addresses get discarded after we look up the metadata, and we will certainly respond to any inbound requests to delete data, as stated on the website's policies. We fully intend to comply with GDPR, and any issues that arise will be addressed quickly.

Overall your security concerns are taken seriously. If you find issues with scarf-js, they will be resolved.

@tannerlinsley
Copy link
Collaborator

That's good to hear :) Thanks for the quick response @aviaviavi!

@glompix
Copy link

glompix commented Jun 16, 2020

why? i have a HIPAA app and this is going to force me to fork

@aviaviavi
Copy link
Contributor Author

aviaviavi commented Jun 16, 2020

Hi @glompix - scarf-js does not impact the HIPAA compliance of your application, so there's no need to fork. These analytics are only collected when you run npm install. This means that data is only being collected from machines that are building your application, and is completely out of the picture for any users of your application. Even then, no PII is being collected here.

Happy to dig into any specifics if you have more questions or concerns!

@samsch
Copy link

samsch commented Jun 30, 2020

Linking this discussion here: final-form/react-final-form#822

As per the notes there, react-table will also be recommended against or forked until scarf is removed.

@tannerlinsley
Copy link
Collaborator

Let me set this straight, scarf is not malware.

It’s an analytics package that is gdrp compliant out of the box, collects install counts, version counts, and publicly linkable company information at dev-install time (and nothing else) and reports that information to a dashboard that only I have access to.

By definition, is not even close to malware, nor does it have any more malicious capability on your machine that any other npm library you install every day.

Now, that said, if you really wish that I personally don’t see your company’s name and website on my internal list of users, then just opt out.

If you choose not to opt out for whatever reason, and you just have some issue with my personal choice to include analytics in my library that I distribute to you for free... then sure. Fork it, change it, get yourself stuck. It’s open source after all.

But for heavens sake, get your terms right. Usages analytics is nothing even close to your “malware” claims.

@tannerlinsley
Copy link
Collaborator

More to say on this matter over at the issue above, but I'll extract one of my comments here:

It looks like this will just come down to maintainer's choice on if they consider an associated company name, website, and install count sensitive data. Clearly, this specific project belongs to @erikras, so that decision is up to him. Personally, I have been able to use scarf to reach out to companies that use my libraries and offer support, consultation, and sometimes facilitate a sponsorship to help fund my open source work. For now, I think I can say that as long as scarf isn't doing anything illegal, mishandling personally identifiable data, or causing harm to my users, I will continue to use it to help fund my open source work. It has ultimately done zero harm to my users and allows them to keep using a library that is well-maintained.

@tannerlinsley
Copy link
Collaborator

The scarf-js maintainer also offered even more insight to the discussion here.

@tannerlinsley
Copy link
Collaborator

Scarf has been removed from all of my libraries following a more granular discussion in the Reactiflux Discord. At ease.

@iamstarkov
Copy link

@tannerlinsley what made you change your mind?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants