Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Store breadcrumbs to disk for OOM events #2347

Merged
merged 36 commits into from
Nov 14, 2022

Conversation

kevinrenskers
Copy link
Contributor

@kevinrenskers kevinrenskers commented Nov 1, 2022

📜 Description

We're storing breadcrumbs to disk, so that when an OOM event happens, we can attach the breadcrumbs again.

💡 Motivation and Context

Closes #1759.

💚 How did you test it?

Unit tests

📝 Checklist

  • I reviewed the submitted code
  • I added tests to verify the changes
  • I updated the docs if needed
  • Review from the native team if needed
  • No breaking changes

🔮 Next steps

* master:
  meta: add log source to format; update log format (#2337)
  test: disable flaky testFlush_CalledMultipleTimes_ImmediatelyReturnsFalse (#2339)
  fix: dont require profiler to start on main thread (#2345)
@github-actions
Copy link

github-actions bot commented Nov 1, 2022

Fails
🚫 Please consider adding a changelog entry for the next release.
Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Store breadcrumbs to disk for OOM events ([#2347](https://github.com/getsentry/sentry-cocoa/pull/2347))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against 6143773

@kevinrenskers
Copy link
Contributor Author

Just a work in progress to get the basic idea up and running, and to get some early feedback. It works now (yay!), but we're not limiting the number of stored breadcrumbs at all. See my comment here: #1759 (comment).

@github-actions
Copy link

github-actions bot commented Nov 2, 2022

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 1221.48 ms 1241.13 ms 19.65 ms
Size 20.75 KiB 374.17 KiB 353.41 KiB

Baseline results on branch: master

Startup times

Revision Plain With Sentry Diff
4e037c4 1214.71 ms 1235.20 ms 20.49 ms
0979ac6 1217.96 ms 1244.88 ms 26.92 ms
f5f8910 1222.41 ms 1251.78 ms 29.37 ms
202334c 1265.41 ms 1277.34 ms 11.93 ms
8b040e4 1234.76 ms 1244.71 ms 9.95 ms
347a8e9 1243.50 ms 1265.90 ms 22.40 ms
5025d2e 1248.52 ms 1251.72 ms 3.20 ms
0fdf0b2 1245.88 ms 1247.69 ms 1.82 ms
bdfccaa 1202.83 ms 1228.96 ms 26.13 ms
b15627c 1256.48 ms 1264.68 ms 8.20 ms

App size

Revision Plain With Sentry Diff
4e037c4 20.50 KiB 361.79 KiB 341.29 KiB
0979ac6 20.50 KiB 342.31 KiB 321.81 KiB
f5f8910 20.75 KiB 368.04 KiB 347.29 KiB
202334c 20.51 KiB 331.79 KiB 311.28 KiB
8b040e4 20.50 KiB 333.54 KiB 313.04 KiB
347a8e9 20.51 KiB 333.16 KiB 312.65 KiB
5025d2e 20.51 KiB 331.79 KiB 311.28 KiB
0fdf0b2 20.51 KiB 332.90 KiB 312.39 KiB
bdfccaa 20.50 KiB 361.80 KiB 341.29 KiB
b15627c 20.50 KiB 337.76 KiB 317.25 KiB

Previous results on branch: feat/1759-breadcrumbs-to-disk

Startup times

Revision Plain With Sentry Diff
a67286f 1242.78 ms 1265.92 ms 23.14 ms
1729eef 1228.40 ms 1260.57 ms 32.17 ms
89ade98 1237.24 ms 1237.96 ms 0.72 ms
ca80c8e 1220.92 ms 1251.56 ms 30.64 ms
6f5fc8a 1233.64 ms 1240.54 ms 6.90 ms
6a4dfdb 1253.67 ms 1270.86 ms 17.18 ms
4823b35 1264.90 ms 1273.52 ms 8.62 ms
91e662f 1233.63 ms 1249.32 ms 15.69 ms
f183f14 1236.53 ms 1283.19 ms 46.66 ms
5badce4 1233.43 ms 1269.22 ms 35.79 ms

App size

Revision Plain With Sentry Diff
a67286f 20.75 KiB 372.05 KiB 351.30 KiB
1729eef 20.75 KiB 371.87 KiB 351.12 KiB
89ade98 20.75 KiB 372.04 KiB 351.29 KiB
ca80c8e 20.75 KiB 370.56 KiB 349.81 KiB
6f5fc8a 20.50 KiB 364.30 KiB 343.79 KiB
6a4dfdb 20.75 KiB 372.03 KiB 351.28 KiB
4823b35 20.50 KiB 364.22 KiB 343.72 KiB
91e662f 20.75 KiB 370.57 KiB 349.81 KiB
f183f14 20.50 KiB 364.23 KiB 343.72 KiB
5badce4 20.75 KiB 371.87 KiB 351.11 KiB

@kevinrenskers
Copy link
Contributor Author

Ok, writing to 2 breadcrumb files is now done, and the scope observers now get a serialized breadcrumb, so that we don't have to do that work twice (now that we have 2 scope observers).

The code all works and I am happy with the direction. Would be great to get some input before I write unit tests for all this new code. cc @philipphofmann @brustolin

@brustolin
Copy link
Contributor

Why is this breadcrumb feature tie to OOM? I understand that OOM crashes cause the SDK to loose all breadcrumbs. So does ANR, or any other reason the system kills the app.

I dont get it why do we have a SentryOutOfMemoryScopeObserver when it could be a BreadcrumbManager with no OOM feature dependency

@kevinrenskers
Copy link
Contributor Author

Why is this breadcrumb feature tie to OOM? I understand that OOM crashes cause the SDK to loose all breadcrumbs. So does ANR, or any other reason the system kills the app.

I dont get it why do we have a SentryOutOfMemoryScopeObserver when it could be a BreadcrumbManager with no OOM feature dependency

It's only used for sending breadcrumbs to OOM events. So if you don't use the OOM integration, there is no need to write breadcrumbs to disk.

Not sure if I am following what you are saying?

@brustolin
Copy link
Contributor

It's only used for sending breadcrumbs to OOM events. So if you don't use the OOM integration, there is no need to write breadcrumbs to disk.

Not sure if I am following what you are saying?

Im saying that I believe this should not be used only for OOM events, but as the standard way to keep track of Breadcrumbs and to be used for every event.

@kevinrenskers
Copy link
Contributor Author

Reading them back from disk and deserializing them one by one isn't something you want to do for every event.

@brustolin
Copy link
Contributor

Reading them back from disk and deserializing them one by one isn't something you want to do for every event.

You only need to do this during app start. Then you keep a ring buffer in memory.

* master:
  test: Delete empty OOMLogicTests (#2361)
  fix: profiling transaction timestamps (#2359)
  fix: profiling transaction thread IDs (#2358)
  test: Use flush for macOS-SPM sample (#2360)
  release: 7.30.0
  fix: SentryCrash writing nan for invalid number (#2348)
  HTTP Client errors (#2308)
  ci: Call make for analyze (#2353)
  fix: CoreData tracking entity name retrieval (#2329)
  fix: sampled profile format backend validation errors (#2350)
  build(deps): bump github/codeql-action from 2.1.29 to 2.1.30 (#2351)
  build(deps): bump github/codeql-action from 2.1.28 to 2.1.29 (#2344)
  fix: Fixed flaky testTimezoneChangeNotificationBreadcrumb (#2335)

# Conflicts:
#	Sentry.xcodeproj/project.pbxproj
@kevinrenskers
Copy link
Contributor Author

I really do not understand what you are talking about, can you please start from the beginning and explain what you want to achieve wrt breadcrumbs and "every event". What is it about the current system you want changed, and how can this PR help with that? Because this PR only writes things to disk, which is only needed for OOM events, so I don't understand what you want to achieve.

kevinrenskers and others added 3 commits November 7, 2022 09:47
* master:
  ci: Update brew before installing clang-format (#2365)
  docs: Why GH actions formats code differently (#2363)
@kevinrenskers
Copy link
Contributor Author

kevinrenskers commented Nov 8, 2022

Ran benchmarks with 1000 breadcrumbs, with and without the observer:

import XCTest

class SentryOOMScopeObserverBenchmark: XCTestCase {
    func testWithoutObserver() {
        let sut = Scope()

        let crumb = TestData.crumb

        // About 0.2
        self.measure {
            var count = 0
            while count < 1000 {
                sut.add(crumb)
                count += 1
            }
        }
    }

    func testWithObserver() {
        let sut = Scope()
        let options = Options()
        let currentDate = TestCurrentDateProvider()
        let fileManager = try! SentryFileManager(options: options, andCurrentDateProvider: currentDate)
        let observer = SentryOutOfMemoryScopeObserver(maxBreadcrumbs: 100, fileManager: fileManager)
        sut.add(observer)

        let crumb = TestData.crumb

        // About 0.215
        self.measure {
            var count = 0
            while count < 1000 {
                sut.add(crumb)
                count += 1
            }
        }
    }
}

So a difference of about 0.015 seconds for 1000 breadcrumbs. I ran these tests 10 times and took the average.

I haven't committed this benchmark test class, as I don't think it makes sense to always run this in CI. I could add it and immediately disable it, but not sure if that makes sense either. What do you think?

* master:
  chore: Add match_local to Fastfile
  Revert "feat: Track usage of the enableCaptureFailedRequests option (#2368)" (#2371)
  chore: Remove PrivateSentrySDKOnly.h from umbrella header (#2349)
  feat: Track usage of the enableCaptureFailedRequests option (#2368)
  Update SentryNetworkTrackerIntegrationTests.swift (#2366)
  Fix typos in NetworkTrackerIntegrationTests (#2367)
  build(deps): bump github/codeql-action from 2.1.30 to 2.1.31 (#2364)
@philipphofmann
Copy link
Member

So a difference of about 0.015 seconds for 1000 breadcrumbs. I ran these tests 10 times and took the average.

That seems acceptable. Could you please add this info as a code comment to https://github.com/getsentry/sentry-cocoa/blob/c5a6d9fda03c778d37908493b0419faf3e72d6a7/Sources/Sentry/include/SentryOutOfMemoryScopeObserver.h

I haven't committed this benchmark test class, as I don't think it makes sense to always run this in CI. I could add it and immediately disable it, but not sure if that makes sense either. What do you think?

The benchmark tests won't run in CI, as the benchmarks are strictly tied together with a machine configuration. The benchmarks you recorded would only be used on an M1 with the exact same configuration that you have. They wouldn't run on my machine with an Intel CPU. For more info, please check #1482.

@philipphofmann
Copy link
Member

What still bothers me is that we do file I/O on the main thread with this approach. I would guess that most of OOM issues don't happen immediately after a breadcrumb is added. If you do a memory-hungry operation, the SDK should have enough time to store a breadcrumb to disk, even on a background thread. I would rather do the I/O on a background thread and accept that the breadcrumbs might not be 100% up to date.

* master:
  fix: Call UIDevice methods on the main thread (#2369)

# Conflicts:
#	Sentry.xcodeproj/project.pbxproj
Copy link
Member

@philipphofmann philipphofmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty great. Thanks @kevinrenskers.

Tests/SentryTests/SentrySDKTests.swift Show resolved Hide resolved
develop-docs/README.md Outdated Show resolved Hide resolved
develop-docs/README.md Show resolved Hide resolved
Co-authored-by: Philipp Hofmann <philipp.hofmann@sentry.io>
Copy link
Member

@philipphofmann philipphofmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM

@philipphofmann
Copy link
Member

Please ping Markus Wahl in #1852, once this PR gets merged, and ask him if he now has enough context to give us a repro for #1852.

Copy link
Contributor

@brustolin brustolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@philipphofmann philipphofmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @kevinrenskers.

* master:
  Report pre-warmed app starts (#1969)
  fix: Too long flush duration (#2370)
  release: 7.30.2
  fix: profile payload issues (#2375)
@kevinrenskers kevinrenskers merged commit 46deabf into master Nov 14, 2022
@kevinrenskers kevinrenskers deleted the feat/1759-breadcrumbs-to-disk branch November 14, 2022 09:30
kevinrenskers added a commit that referenced this pull request Nov 14, 2022
…ents

* master:
  feat: Store breadcrumbs to disk for OOM events (#2347)
  Report pre-warmed app starts (#1969)
  fix: Too long flush duration (#2370)
  release: 7.30.2
  fix: profile payload issues (#2375)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Periodically store breadcrumbs to disk for OOMs
4 participants