feat(replays): Enable PII scrubbing for all organizations #1678

Merged
cmanallen merged 5 commits into master from replays-enable-pii-scrubbing
Jan 12, 2023

Conversation

cmanallen
Member

@cmanallen cmanallen commented Dec 6, 2022

#skip-changelog

@cmanallen cmanallen requested a review from a team December 6, 2022 18:39
@cmanallen cmanallen merged commit 588e46a into master Jan 12, 2023
@cmanallen cmanallen deleted the replays-enable-pii-scrubbing branch January 12, 2023 05:21
jjbayer added a commit that referenced this pull request Jan 12, 2023
We noticed an increase in total processing time which we suspect was
caused by #1678.

Add a statsd metric to measure replay recording processing in isolation.
olksdr added a commit that referenced this pull request Jan 16, 2023
…1747)

Reverts #1678

This is the change suspected of increasing event processing time. It may
also have introduced a memory leak.
jan-auer added a commit that referenced this pull request Jan 18, 2023
* master: (35 commits)
  ref(actix): Migrate ProjectUpstream to `relay_system::Service` (#1727)
  feat(general): Add unknown SessionStatus variant (#1736)
  ref: Convert integration tests about dropping transactions to unit tests (#1720)
  release: 0.8.16
  ci: Skip redundant self-hosted E2E on library release (#1755)
  doc(changelog): Add relevant changes to python changelog (#1753)
  feat(profiling): Add profile context (#1748)
  release: 23.1.0
  profiling(fix): use an unpadded base64 encoding (#1749)
  Revert "feat(replays): Enable PII scrubbing for all organizations" (#1747)
  feat: Switch from base64 to data-encoding (#1743)
  instr(replays): Add timer metric to recording processing (#1742)
  feat(replays): Use Annotated struct definition for replay-event parsing (#1582)
  feat(sessions): Retire session duration metric (#1739)
  feat(general): Scrub all fields with IP address (#1725)
  feat(replays): Enable PII scrubbing for all organizations (#1678)
  chore(project): Add backoff mechanism for fetching projects (#1726)
  feat(profiling): Add new measurement units for profiling (#1732)
  chore(toolchain): update rust to 1.66.1 (#1735)
  ref(actix): Migrate server actor to the "service" arch (#1723)
  ...
jjbayer pushed a commit that referenced this pull request Jan 26, 2023
Co-authored-by: Oleksandr <1931331+olksdr@users.noreply.github.com>
jjbayer added a commit that referenced this pull request Jan 27, 2023
After deploying #1678, we saw a
rise in memory consumption. We narrowed down the reason to
deserialization of replay recordings, so this PR attempts to replace
those deserializers with more efficient versions that do not parse an
entire `serde_json::Value` to get the tag (`type`, `source`) of the
enum.

A custom deserializer is necessary because serde does not support
[integer tags for internally tagged
enums](serde-rs/serde#745).

- [x] Custom deserializer for `NodeVariant`, based on serde's own
`derive(Deserialize)` of internally tagged enums.
- [x] Custom deserializer for `recording::Event`, based on serde's own
`derive(Deserialize)` of internally tagged enums.
- [x] Custom deserializer for `IncrementalSourceDataVariant`, based on
serde's own `derive(Deserialize)` of internally tagged enums.
- [x] Box all enum variants.
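
As a rough illustration of the last checklist item, the sketch below compares the size of an enum with inline variants against one whose variants are boxed. The types `LargePayload`, `SmallPayload`, `Unboxed`, and `Boxed` are hypothetical stand-ins, not the actual `NodeVariant` or `recording::Event` definitions, and the sketch only shows the general effect of boxing on enum size, not the memory behavior measured in production.

```rust
use std::mem::size_of;

// Hypothetical payloads; the real replay recording types are not shown here.
#[allow(dead_code)]
struct LargePayload {
    bytes: [u8; 256],
}

#[allow(dead_code)]
struct SmallPayload {
    id: u32,
}

// Every value of this enum reserves room for the largest variant.
#[allow(dead_code)]
enum Unboxed {
    Large(LargePayload),
    Small(SmallPayload),
}

// Each variant stores only a heap pointer, so the enum itself stays small.
#[allow(dead_code)]
enum Boxed {
    Large(Box<LargePayload>),
    Small(Box<SmallPayload>),
}

/// Returns the in-memory size of (Unboxed, Boxed) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Unboxed>(), size_of::<Boxed>())
}

fn main() {
    let (unboxed, boxed) = sizes();
    println!("unboxed enum: {unboxed} bytes");
    println!("boxed enum:   {boxed} bytes");
}
```

The trade-off is one extra heap allocation per value, in exchange for a smaller enum when many values are stored in a `Vec` and most of them are small variants.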

### Benchmark comparison

Ran a criterion benchmark on `rrweb.json`. It does not tell us anything
about memory consumption, but the reduced CPU usage points to simpler
deserialization:

#### Before

```
rrweb/1                 time:   [142.37 ms 148.17 ms 155.61 ms]
```

#### After

```
rrweb/1                 time:   [31.474 ms 31.801 ms 32.137 ms]
```
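
For reference, the two criterion estimates above can be turned into a speedup factor and a relative time reduction. This is a small arithmetic sketch that uses only the middle (point) estimates reported above; the function names are made up for illustration.

```rust
/// Ratio of the old time to the new time.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}

/// Percentage of the original time that was saved.
fn reduction_percent(before_ms: f64, after_ms: f64) -> f64 {
    (1.0 - after_ms / before_ms) * 100.0
}

fn main() {
    // Middle estimates from the "Before" and "After" runs.
    let before_ms = 148.17;
    let after_ms = 31.801;

    println!("speedup:   {:.2}x", speedup(before_ms, after_ms));
    println!("reduction: {:.1}%", reduction_percent(before_ms, after_ms));
}
```

With these inputs the point estimates correspond to roughly a 4.7x speedup, or about 78.5% less time per iteration.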

#skip-changelog

---------

Co-authored-by: Colton Allen <cmanallen90@gmail.com>
Co-authored-by: Oleksandr <1931331+olksdr@users.noreply.github.com>