Sentry: Obfuscation and dealing with sensitive data #7958

keu · 2021-11-14T21:39:49Z

Here we explore all options that Sentry provides for obfuscation and dealing with sensitive data

avida · 2021-11-24T14:46:39Z

PII Data filtering

This report covers the investigation of PII data sent to Sentry.

Sentry could expose PII data in the following fields:

Data set directly by the application via context/tags/log messages
Data captured by default integrations
Frame variables captured in the stack trace of an unhandled exception
A transaction's span name/description/context

Sensitive information can be handled on two sides:

Client-side scrubbing before data is sent to the sentry.io server
Server-side scrubbing after data has been sent to sentry.io

There is also the option of sending data through a middle layer before it reaches Sentry.

Client side scrubbing

Using hooks

Client-side data scrubbing can be implemented by setting the before_send and before_breadcrumb hooks in the init method:

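import sentry_sdk

def before_send(event, hint):
    # Blank out a sensitive field before the event leaves the process
    event["sensitive-info"] = None
    return event

def before_breadcrumb(crumb, hint):
    # Same idea for a breadcrumb before it is added to the scope
    crumb["sensitive-info"] = None
    return crumb

sentry_sdk.init(before_send=before_send,
                before_breadcrumb=before_breadcrumb)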

The before_send hook is called before an event is sent to the Sentry server. It allows PII filtering to be applied before the event is sent. An event can be a message, a handled/unhandled exception, or transaction data.

The before_breadcrumb hook is called before a breadcrumb is added to the scope. A breadcrumb is an event attached to an exception to give a clue about what preceded it.
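As a slightly fuller illustration of the two hooks, here is a minimal sketch that masks values by key name anywhere in the event or breadcrumb. The deny-list `SENSITIVE_KEYS`, the `MASK` string and the `scrub` helper are assumptions made for this example, not part of the SDK.

```python
# Hypothetical deny-list; extend it to match the secrets your connectors handle.
SENSITIVE_KEYS = {"password", "api_key", "access_token", "client_secret", "authorization"}
MASK = "[Filtered]"


def scrub(value):
    """Return a copy of value with every sensitive-looking key masked."""
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in SENSITIVE_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists, which is fine for JSON-like payloads.
        return [scrub(item) for item in value]
    return value


def before_send(event, hint):
    # Returning the scrubbed copy lets the SDK send it instead of the original.
    return scrub(event)


def before_breadcrumb(crumb, hint):
    return scrub(crumb)
```

Key-based masking only catches secrets stored under a recognizable name; a secret embedded in a free-text message or URL would need value-based matching on top of this.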

Disabling default integrations

Some events (such as HTTP requests or logging) can be sent automatically by the default integrations. We can disable or override the default integrations to minimize the data sent to Sentry.
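A minimal sketch of turning the default integrations off, using the SDK's `default_integrations` and `send_default_pii` init options. The `SENTRY_OPTIONS` name and the guarded import are assumptions made so the snippet also loads where the SDK is not installed.

```python
# Options passed to sentry_sdk.init(); both keys are existing SDK options.
SENTRY_OPTIONS = {
    "default_integrations": False,  # do not auto-enable logging, stdlib, excepthook, etc.
    "send_default_pii": False,      # already the SDK default, stated explicitly here
}

try:
    import sentry_sdk
except ImportError:  # keeps the sketch importable where the SDK is absent
    sentry_sdk = None

if sentry_sdk is not None:
    # No DSN is passed here; the SDK falls back to the SENTRY_DSN environment
    # variable and sends nothing when that is not set either.
    sentry_sdk.init(**SENTRY_OPTIONS)
```

Individual integrations can still be re-enabled selectively through the `integrations` list of `sentry_sdk.init` once the defaults are off.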

Server side scrubbing

By default, the Sentry server applies server-side scrubbing based on regular expressions and on field names that may contain secrets (password, key, access_token, etc.). See the "Example" section for details.

More details on default server-side scrubbing: https://docs.sentry.io/product/data-management-settings/scrubbing/server-side-scrubbing/

Server-side scrubbing can also be tuned to match custom regular expressions and to filter out emails, PEM keys, IP addresses, SSNs and so on (details: https://docs.sentry.io/product/data-management-settings/scrubbing/advanced-datascrubbing/).

Example

Here are some examples of server-side scrubbing when running Sentry on existing connectors (no client-side filtering was applied):


Scrubbing Authorization header for context data (asana source)


Filtering url from integration's breadcrumb and context (iterable source)


No PII filtering for api key inside url for transaction event (iterable source)


No PII filtering for api key and email inside url for transaction event (pipedrive source)


Filtering out client_secret and access_token from frame variables of stacktrace (surveymonkey source)
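The two unfiltered cases above (an API key and an email left inside a URL) suggest redacting URLs on the client. A minimal sketch, assuming the secrets travel as query parameters: `SECRET_PARAMS`, `redact_url` and `redact_span_urls` are hypothetical names, and which hook receives transaction events depends on the SDK version.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters treated as secrets.
SECRET_PARAMS = {"api_key", "api_token", "access_token", "email"}


def redact_url(url):
    """Mask the values of secret query parameters, keeping the rest of the URL."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SECRET_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def redact_span_urls(event):
    """Redact span descriptions in place (assumes the event has a 'spans' list)."""
    for span in event.get("spans", []):
        if isinstance(span.get("description"), str):
            span["description"] = redact_url(span["description"])
    return event
```

For example, `redact_url("https://api.example.com/lists?api_key=abc123&limit=10")` keeps `limit=10` and replaces the key value with `REDACTED`.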

sherifnada · 2021-12-03T07:49:09Z

@avida I see that Sentry does server side filtering, but in my opinion this is not sufficiently safe. As an Airbyte user, I would be pretty shocked if Airbyte sent my API keys to a 3rd party and let them handle the filtering. I would require that secrets are not sent at all. What are our options for implementing that? It seems like it might be difficult since each source would have to implement this in a custom way?

Could the next step here be to demo this for one of the connectors you showed in the example e.g: Iterable?

How could we be 100% certain that the filtering is working correctly?

avida · 2021-12-10T08:01:48Z

@sherifnada I've updated the Sentry PR with client-side sensitive data scrubbing. It is implemented at the CDK level and works for every connector.
We can do a demo on today's sync call.

How could we be 100% certain that the filtering is working correctly?

I've tried to come up with unit tests that override the HTTP transport and cover every possible case of sensitive data transfer (including transactions, contexts, different events and integrations). Looks like everything works fine.
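The general shape of such a test can be sketched without the SDK. This is not the code from the PR: `RecordingTransport`, `deliver` and the toy `before_send` below are hypothetical stand-ins that only mimic the order "hook first, transport second".

```python
import json

SECRET = "s3cr3t-token"  # marker value planted in the test event


class RecordingTransport:
    """Hypothetical stand-in for an HTTP transport: stores payloads instead of sending them."""

    def __init__(self):
        self.payloads = []

    def capture_event(self, event):
        self.payloads.append(json.dumps(event))


def before_send(event, hint):
    """Toy scrubber under test: masks a single known key in 'extra'."""
    extra = dict(event.get("extra", {}))
    if "access_token" in extra:
        extra["access_token"] = "[Filtered]"
    return {**event, "extra": extra}


def deliver(event, transport):
    """Apply the hook, then hand the result to the transport."""
    scrubbed = before_send(event, hint=None)
    if scrubbed is not None:
        transport.capture_event(scrubbed)


def test_secret_never_reaches_transport():
    transport = RecordingTransport()
    deliver({"message": "sync failed", "extra": {"access_token": SECRET}}, transport)
    assert transport.payloads, "expected one recorded payload"
    assert all(SECRET not in payload for payload in transport.payloads)
```

Searching the serialized payload for the planted marker checks the whole event at once, so the test does not need to know which field the secret might leak through.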

sherifnada · 2021-12-10T08:30:08Z

demo sounds great! Let's do it

keu added the type/enhancement New feature or request label Nov 14, 2021

keu mentioned this issue Nov 14, 2021

[EPIC] Integrate Sentry #7956

Closed

4 tasks

sherifnada added area/connectors Connector related issues area/reliability labels Nov 15, 2021

sherifnada added this to the Connectors Nov 26 2021 milestone Nov 15, 2021

avida self-assigned this Nov 24, 2021

sherifnada modified the milestones: Connectors Nov 26 2021, Connectors Dec 10 2021 Nov 29, 2021

sherifnada unassigned avida Nov 29, 2021

VasylLazebnyk modified the milestones: Connectors Dec 10 2021, Connectors Dec 24 2021 Dec 13, 2021

VasylLazebnyk mentioned this issue Dec 21, 2021

Integrate Sentry for performance and errors tracking. #8248

Merged

VasylLazebnyk linked a pull request Dec 21, 2021 that will close this issue

Integrate Sentry for performance and errors tracking. #8248

Merged

sherifnada closed this as completed Dec 24, 2021
