Sentry: Obfuscation and dealing with sensitive data #7958
Comments
PII Data filtering
This report covers an investigation of PII data sent to Sentry. Sentry could expose PII data in the following fields:
Sensitive information could be handled on two sides:
Also, there is an option of sending data through a middle layer before it reaches Sentry.

Client side scrubbing

Using hooks
Client side data scrubbing can be implemented by setting the before_send and before_breadcrumb hooks in the init method:

```python
import sentry_sdk

def before_send(event, hint):
    # Blank out the sensitive field before the event leaves the process.
    event["sensitive-info"] = None
    return event

def before_breadcrumb(crumb, hint):
    # Same idea for a breadcrumb, before it is added to the scope.
    crumb["sensitive-info"] = None
    return crumb

sentry_sdk.init(before_send=before_send,
                before_breadcrumb=before_breadcrumb)
```

The before_send hook is called before an event is sent to the Sentry server, which allows PII filtering to be applied first. An event can be a message, a handled or unhandled exception, or transaction data. The before_breadcrumb hook is called before a breadcrumb is added to the scope. A breadcrumb is an event attached to an exception to give a clue about what preceded it.

Disabling default integrations
Some events (such as HTTP requests or logging) can be sent automatically by default integrations. We can disable or override the default integrations to minimize the data sent to Sentry.

Server side scrubbing
By default, the Sentry server applies server side scrubbing based on regular expressions and on field names that may contain secrets (password, key, access_token, etc.). See the "Example" section for details. More details on default server side scrubbing: https://docs.sentry.io/product/data-management-settings/scrubbing/server-side-scrubbing/
Server side scrubbing can also be tuned to match custom regular expressions and to filter out emails, PEM keys, IP addresses, SSNs and so on (details: https://docs.sentry.io/product/data-management-settings/scrubbing/advanced-datascrubbing/).

Example
Here are some examples of server side scrubbing when running Sentry on existing connectors (no client side filtering was applied):
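Coming back to the client side hooks: blanking a single fixed key only goes so far, because secrets can sit at any depth of an event. Here is a minimal sketch of a field-name based scrubber that walks the whole payload. It is not the code from the PR. The `SENSITIVE_KEY_RE` denylist, the `"[Filtered]"` placeholder and the `scrub` helper are assumptions made for illustration, and the sketch uses only the standard library so it can be tried without the SDK installed.

```python
import re

# Assumed denylist of key-name fragments; extend it to match your connectors.
SENSITIVE_KEY_RE = re.compile(
    r"password|secret|token|api_key|authorization", re.IGNORECASE
)
REDACTED = "[Filtered]"  # assumed placeholder value


def scrub(value):
    """Return a copy of value with sensitive-looking keys redacted at any depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED if SENSITIVE_KEY_RE.search(str(key)) else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [scrub(item) for item in value]
    return value


def before_send(event, hint):
    # Same signature as the hook above; returns the scrubbed copy.
    return scrub(event)


def before_breadcrumb(crumb, hint):
    return scrub(crumb)


# Wiring (requires the sentry_sdk package):
# sentry_sdk.init(before_send=before_send, before_breadcrumb=before_breadcrumb)
```

Note that this only catches secrets stored under a recognizable key name. A secret embedded in a message string or a URL would pass through, so it is a complement to value based filtering rather than a replacement for it.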
@avida I see that Sentry does server side filtering, but in my opinion this is not sufficiently safe. As an Airbyte user, I would be pretty shocked if Airbyte sent my API keys to a 3rd party and let them handle the filtering. I would require that secrets are not sent at all. What are our options for implementing that? It seems like it might be difficult, since each source would have to implement this in a custom way? Could the next step here be to demo this for one of the connectors you showed in the example, e.g. Iterable? How could we be 100% certain that the filtering is working correctly?
@sherifnada I've updated the Sentry PR with client side sensitive data scrubbing. It is implemented at the CDK level and works for every connector.
I've tried to come up with unit tests that override the HTTP transport and cover every possible case of sensitive data transfer (including transactions, contexts, different events and integrations). Looks like everything works fine.
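To give an idea of what "overriding the HTTP transport" can look like in a test, here is a minimal sketch. It is not the test from the PR and does not use the real SDK transport interface. `RecordingTransport`, `capture`, `leaked` and `make_before_send` are hypothetical names, and the scrubber assumes the secret values are known up front (for example, taken from the connector config).

```python
import json


def make_before_send(secrets):
    """Build a before_send-style hook that masks known secret values anywhere in an event."""

    def mask(value):
        if isinstance(value, str):
            for secret in secrets:
                if secret:  # skip empty strings, which would match everywhere
                    value = value.replace(secret, "****")
            return value
        if isinstance(value, dict):
            return {mask(key): mask(item) for key, item in value.items()}
        if isinstance(value, (list, tuple)):
            return [mask(item) for item in value]
        return value

    def before_send(event, hint):
        return mask(event)

    return before_send


class RecordingTransport:
    """Hypothetical stand-in for the HTTP transport: stores payloads instead of sending them."""

    def __init__(self):
        self.payloads = []

    def send(self, event):
        self.payloads.append(json.dumps(event))


def capture(event, before_send, transport):
    """Hypothetical pipeline: run the hook, then hand the result to the transport."""
    scrubbed = before_send(event, {})
    if scrubbed is not None:  # a hook returning None drops the event
        transport.send(scrubbed)


def leaked(transport, secrets):
    """List every secret that still appears in a recorded payload."""
    return [s for s in secrets for payload in transport.payloads if s in payload]
```

A test would then push events of every kind (messages, exceptions, transactions, contexts) through `capture` and assert that `leaked(transport, secrets)` is empty. Checking the serialized payload instead of individual fields means a secret hiding in an unexpected place still fails the test.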
demo sounds great! Let's do it
Here we explore all the options that Sentry provides for obfuscation and dealing with sensitive data.