Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUMM-1026 Use sysctl to gather a more accurate App Launch Time #381

Merged
merged 5 commits into from Jan 28, 2021

Conversation

acgh
Copy link
Contributor

@acgh acgh commented Jan 25, 2021

What and why?

ObjcAppLaunchHandler is the existing mechanism that observes early steps of the App Launch sequence to provide a Launch Time for further reporting by RUMViewScope into RUMActionEvent.Action for the start_application event.

This currently relies on Objective-C's runtime initialisation which calls classes' +(void)load method as soon as they are loaded into the runtime. This happens before the UIApplication.sharedInstance is instantiated, and this provides an early point in time regarding the launch sequence.
The limitations are that the "earliness" is not guaranteed and depends on many factors, so it might lead to not accounting for some launch time.

This is why we aim to use a more accurate and stable source of information for the App Launch time.

Note that in that PR, we don't change the existing definition of App Launch Time as we still include the time to load the main UI, create/setup the app delegate.

How?

In that PR, we refactor the pair ObjcAppLaunchHandler/AppLaunchTimer into a single AppLaunchHandler class that becomes an implementation detail for a C function AppLaunchTime().

The AppLaunchTimer class still implements and reacts to +(void)load to set up some observation on UIApplication.
Note the AppLaunchTimer is only kept alive by the capturing blocks until it has computed the times.

More specifically, AppLaunchTimer calls sysctl() to get a kinfo_proc for the current process (getpid()) and especially looks at p_starttime.

Review checklist

  • Run the Example.app target in Simulator/on Device and check that the values are provided and make sense.
  • Make sure each commit and the PR mention the Issue number or JIRA reference
  • Feature or bugfix MUST have appropriate tests (unit, integration)
    • This depends on Foundation/UIKit/System APIs that can't be mocked so this wouldn't be straightforward.

@acgh acgh requested a review from a team as a code owner January 25, 2021 13:04
@@ -16,10 +16,7 @@ internal protocol LaunchTimeProviderType {

internal class LaunchTimeProvider: LaunchTimeProviderType {
var launchTime: TimeInterval? {
// Following dynamic dispatch synchronisation mutes TSAN issues when running tests from CLI.
objc_sync_enter(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't reproduce the TSAN issues locally, but I believe that if we don't involve any Objc objects (and call a C function) in that call paths there can't be any dynamic dispatches involved to begin with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable 👍. My suspicion behind this TSAN issue was indeed around the dynamic dispatch with this call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had to come back to using objc_sync_enter()/objc_sync_exit(). I could not reproduce locally the test failure, but based on Bitrise logs, TSAN reported a potential data race on the call to AppLaunchTime() where a write happened according to TSAN. When reporting, TSAN crashes on a SIGABRT.

The only write in that C function might come from pthread_rwlock_rdlock(&rwLock); which in itself must be thread safe (otherwise nothing can be). So anyway, this is back here, it might be extraneous but let's trust TSAN for now.

Copy link
Contributor

@xgouchet xgouchet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really smart approach, we might reuse this on Android too !

// Knowing how this value is collected and that it is written only once, the use of
// `atomic` property is enough for thread safety. It guarantees that neither partial
// nor garbage value will be returned.
@property (atomic) NSTimeInterval timeToApplicationBecomeActive;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe atomic is already the default for a @property. That being said, I don't think this was necessary due the inherent sequencing of setting/getting this property throughout the App lifecycle.

Copy link
Contributor Author

@acgh acgh Jan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like LaunchTimeProviderTests.testThreadSafety() disagrees with my assumptions 🤔
see: https://app.bitrise.io/build/87853c6c2bfa5825#?tab=log

@acgh
Copy link
Contributor Author

acgh commented Jan 25, 2021

Thanks @xgouchet for the swift review! I'll keep that PR open a little bit longer to get @buranmert and @ncreated's opinion on this change :)

Copy link
Contributor

@buranmert buranmert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't know sysctl gives that info, TIL 👍

Comment on lines 87 to 90
} else {
// Fallback to less accurate delta with DD's framework load time
_processStartTime = _frameworkLoadTime;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know documented cases where res != 0?
this is quite a longshot but if res is practically always 0 and if we can find a way to execute code pre-main in Swift, we can call sysctl from Swift and remove this ObjC code. eventually, we can also remove _Datadog_Private

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the documentation it's only going to be 0 or -1:

Upon successful completion, the value 0 is	returned; otherwise the
     value -1 is returned and the global variable errno	is set to indicate the
     error.

In general, we should be able to call sysctl() at any time on the Swift side, but to get close to a relevant app launch time without the Objc +(void)load trick we should either ask for explicit calls from our users (which could be easily overlooked), or integrate this mechanism into the Datadog/singleton setup that is hopefully done as part of the UIApplication.delegate application(...:didFinishLaunching...) by our users.
An obvious downside of that is spreading the whole app launch time state across our SDK, along with potential discrepancies if our assumptions of Datadog's setup are wrong.

// This is called at the `_Datadog_Private` load time, keep the work minimal
NSTimeInterval now = CFAbsoluteTimeGetCurrent();
// Schedule a call for the first turn of run loop before the main nib/storyboard is loaded:
[NSRunLoop.mainRunLoop performInModes:@[NSRunLoopCommonModes] block:^{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some piece of offline feedback @buranmert shared with me:
This run loop callback is specifically there as a mean to call markEndOfUIApplicationMain at the right time. This is a value that we will not use yet. So maybe it would be better to take note of this solution for later and get rid of this block (i.e. just alloc/init and call scheduleUIApplicationDidFinishLaunchingMarking inline).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO: if we don't use it → remove it as it's just a dead code. Do I miss something 🤔?

[[NSNotificationCenter defaultCenter] removeObserver:self];
- (void)scheduleUIApplicationDidFinishLaunchingMarking {
id __block token = [NSNotificationCenter.defaultCenter
addObserverForName:UIApplicationDidFinishLaunchingNotification
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been using UIApplicationDidBecomeActiveNotification before, but now we use UIApplicationDidFinishLaunchingNotification - if they do not mark the same moment in time, this changes the definition of RUM "launch time" for iOS.

How is UIApplicationDidFinishLaunchingNotification different than the first UIApplicationDidBecomeActiveNotification?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the UIApplication.delegate is done with the callbacks and UIWindow is made key and visible, the sequence of notifications will be UIApplicationDidFinishLaunchingNotification then UIApplicationDidBecomeActiveNotification. So I believe they happen close to each other (depending on potential observers and the work they do).

That being said UIApplicationDidBecomeActiveNotification is a repeating notification that is meant for the whole app lifecycle (although by removing the observer this is only fires initially) and UIApplicationDidFinishLaunchingNotification fits with the semantic of App Launch Time better.

I don't think it really changes the definition of RUM "launch time" but I'm ok using UIApplicationDidBecomeActiveNotification if that is safer as it's still related to the app first appearance anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it 👍. I also did some digging around and realised that "finished launching" relates to cold start, whereas "did become active" is for returning from background. Both should work fine and mean the same in our implementation 👌.

@property (atomic) NSTimeInterval timeToApplicationBecomeActive;
// Note that the following properties will be set early enough in the App lifecycle to avoid any race conditions without any specific synchronization means.
static NSTimeInterval TimeToApplicationFinishedLaunching = 0.0;
static NSTimeInterval TimeToLoadApplicationRootUI = 0.0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any usages of this value - we only set it, but never read / process. Do we need it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I've sent this along for reference in case we consider this additional step eventually, but I'm going to clean this up in a new commit.

Copy link
Collaborator

@ncreated ncreated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to use the process start time 👌. I left few questions and suggestions. Main blocker for me is the presence of code that adds complexity but is not used.

@acgh acgh changed the title RUMM-1026 Use sysctl and NSRunLoop to gather a more accurate App Launch Time RUMM-1026 Use sysctl to gather a more accurate App Launch Time Jan 26, 2021
Copy link
Collaborator

@ncreated ncreated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG 🎸!

// https://developer.apple.com/documentation/uikit/app_and_environment/responding_to_the_launch_of_your_app/about_the_app_launch_sequence

// A Read-Write lock to allow concurrent reads of TimeToApplicationDidBecomeActive, unless the initial (and only) write is locking it.
static pthread_rwlock_t rwLock;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't it work with objc_sync_enter/exit in LaunchTimeProvider? now we have a global lock staying around, not great :/

@acgh acgh merged commit 782473e into master Jan 28, 2021
@acgh acgh deleted the acgh/RUMM-1026-use-process-start-time-in-app-launch-time branch January 28, 2021 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants