Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc(decision): Document sensitive data collected #70

Merged
merged 8 commits into from
Feb 10, 2023

Conversation

antonpirker
Copy link
Member

@antonpirker antonpirker commented Jan 27, 2023

We need an exact but concise documentation on what sensitive data our SDKs collect. This should be available in the SDKs documentation on docs.sentry.io and be specific to all the integrations that each SDK supports.

Rendered RFC

@antonpirker antonpirker force-pushed the rfc/document-sensitive-data-collected branch from 2369bc0 to 44a4651 Compare January 27, 2023 09:04
Copy link
Member

@philipphofmann philipphofmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for A) to get started. B) would be great in general for docs and not only for PII, but it's way harder to set up.


# Unresolved questions

- How do we guarantee, that the documentation stays up to date with the implementation?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Pros:

- Generated from code, so it should be always up to date
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a process to ensure that developers mark code collecting PII as such. A PR template checkbox as suggested in #70 (comment) could help.

- Documentation need to be kept up to date with seperate PR in `sentry-docs` repo when changes to SDK are made
- Documentation for different versions of the SDK not solved yet

## B) Automatic documentation creation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to work on such a project, not just for this specific case. Our docs often need to be updated when we add new options / features to our SDKs. Ideally, they would be linked, so they are always up to date, but I think that's a complex problem to solve. The book Living Documentation has great tips on automatic doc creation.


# Options Considered

## A) Table in docs of each integration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like A a lot - also for the reason that we could use this to play around with the idea of keeping a "feature list" of our SDKs documented somewhere, so people can compare which SDKs have what features.

B seems like a lot of effort which may not be worth it at the current time. Perhaps we can re-evaluate when we have more docs personnel?


## B) Automatic documentation creation

If we go with _Option B)_ in [RFC-0062 Controlling PII and Credentials in SDKs](https://github.com/getsentry/rfcs/pull/62) we could add doc strings in the code of the implemented `EventScrubber` and then generate documentation from this code to render a table similar to the one in Option A) in this RFC.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to live in docs or can it simply be the code itself documenting what we consider sensitive data? Docs generation could be a future improvement if enough people ask for plain text version vs just looking at the code. This would also auto solve the problem of versioning the docs for people stuck on older versions of the SDKs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC was created because people were asking for documentation on what data is collected, so only having it in the code is not enough. :-)

But I guess when we have the EventScrubber we can have a documentation on how the event scrubbing works in general and then point to the default EventScrubbers code to the people that want to know the details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was referring to the other RFCs outcome of having a default filtering logic which could then also serve as docs (at least for people able to read the code).


# Options Considered

## A) Table in docs of each integration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like it'll be hard to get a nice B without a properly implemented A, so IMO option A is a good start, and then we might consider implement B for languages/frameworks that have relevant tooling and conventions (versus starting to build a one-fits-all solutions and then regretting it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants