Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preheating combined with image loading produces access races #224

Closed
michaelnisi opened this issue Apr 19, 2019 · 11 comments

Comments

Projects
None yet
2 participants
@michaelnisi
Copy link
Contributor

commented Apr 19, 2019

Preheating combined with image loading produces access races because locks are accessed from different threads. TSan traps _createSession and loadImage accessing the ImagePipeline lock from different threads. These are the two methods in ImagePipeline.

    private func getNextTaskId() -> Int {
        os_unfair_lock_lock(&lock)
        defer { os_unfair_lock_unlock(&lock) }
        nextTaskId += 1
        return nextTaskId
    }

    private func getNextSessionId() -> Int {
        os_unfair_lock_lock(&lock)
        defer { os_unfair_lock_unlock(&lock) }
        nextSessionId += 1
        return nextSessionId
    }
WARNING: ThreadSanitizer: Swift access race (pid=9524)
  Modifying access of Swift variable at 0x7b4000006ed8 by thread T9:
    #0 $s4Nuke13ImagePipelineC13getNextTaskId33_64FA2D582EFB9D605832F3421CEE67ADLLSiyF <null> (Nuke:x86_64+0x1f327)
    #1 $s4Nuke13ImagePipelineC04loadB04with8progress10completionAA0B4TaskCAA0B7RequestV_yAA0B8ResponseCSg_s5Int64VAPtcSgyAN_AC5ErrorOSgtcSgtF <null> (Nuke:x86_64+0x1e72e)
    #2 $s4Nuke14ImagePreheaterC16_startPreheating33_337920196C15B3DB24A28FE684F1274ELL4withyAA0B7RequestV_tFyyyccfU_ <null> (Nuke:x86_64+0x9ad10)
    #3 $s4Nuke14ImagePreheaterC16_startPreheating33_337920196C15B3DB24A28FE684F1274ELL4withyAA0B7RequestV_tFyyyccfU_TA <null> (Nuke:x86_64+0x9e880)
    #4 $s4Nuke9OperationC5startyyF <null> (Nuke:x86_64+0x7e327)
    #5 $s4Nuke9OperationC5startyyFTo <null> (Nuke:x86_64+0x7e580)
    #6 __NSOQSchedule_f <null> (Foundation:x86_64+0x97f5a)
    #7 _dispatch_client_callout <null> (libdispatch.dylib:x86_64+0x3db4)

  Previous modifying access of Swift variable at 0x7b4000006ed8 by thread T3:
    #0 $s4Nuke13ImagePipelineC16getNextSessionId33_64FA2D582EFB9D605832F3421CEE67ADLLSiyF <null> (Nuke:x86_64+0x1f7c7)
    #1 $s4Nuke13ImagePipelineC14_createSession33_64FA2D582EFB9D605832F3421CEE67ADLL4withAA0b7LoadingE0AELLCAA0B7RequestV_tF <null> (Nuke:x86_64+0x214a8)
    #2 $s4Nuke13ImagePipelineC013_startLoadingB033_64FA2D582EFB9D605832F3421CEE67ADLL3for8handlersyAA0B4TaskC_AA0bE7SessionAELLC8HandlersVtF <null> (Nuke:x86_64+0x1fb2c)
    #3 $s4Nuke13ImagePipelineC04loadB04with8progress10completionAA0B4TaskCAA0B7RequestV_yAA0B8ResponseCSg_s5Int64VAPtcSgyAN_AC5ErrorOSgtcSgtFyycfU_ <null> (Nuke:x86_64+0x1f053)
    #4 $s4Nuke13ImagePipelineC04loadB04with8progress10completionAA0B4TaskCAA0B7RequestV_yAA0B8ResponseCSg_s5Int64VAPtcSgyAN_AC5ErrorOSgtcSgtFyycfU_TA <null> (Nuke:x86_64+0x1f5eb)
    #5 $sIeg_IeyB_TR <null> (Nuke:x86_64+0x1f130)
    #6 __tsan::invoke_and_release_block(void*) <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x676ab)
    #7 _dispatch_client_callout <null> (libdispatch.dylib:x86_64+0x3db4)

  Location is heap block of size 248 at 0x7b4000006e00 allocated by main thread:
    #0 malloc <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x4a4fa)
    #1 swift_slowAlloc <null> (libswiftCore.dylib:x86_64+0x2ce3c8)
    #2 globalinit_33_64FA2D582EFB9D605832F3421CEE67AD_func0 <null> (Nuke:x86_64+0x1af40)
    #3 dispatch_once <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x682d4)
    #4 dispatch_once_f <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x683c0)
    #5 swift_once <null> (libswiftCore.dylib:x86_64+0x2f0578)
    #6 $s4Nuke14ImagePreheaterC8pipeline11destination25maxConcurrentRequestCountAcA0B8PipelineC_AC11DestinationOSitcfcfA_ <null> (FeedKit:x86_64+0x74586)
    #7 $s7FeedKit15ImageRepositoryCACycfc <null> (FeedKit:x86_64+0x73ec3)
    #8 $s7FeedKit15ImageRepositoryCACycfC <null> (FeedKit:x86_64+0x73da8)
    #9 globalinit_33_88DD861C5C731547631CB11144617E34_func1 <null> (FeedKit:x86_64+0x7425e)
    #10 dispatch_once <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x682d4)
    #11 dispatch_once_f <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x683c0)
    #12 swift_once <null> (libswiftCore.dylib:x86_64+0x2f0578)
    #13 $s6Podest6Config33_5FF95AB3DAF026F375C61CEF09A78E42LLC14freshImageRepo7FeedKit6Images_pyKF <null> (Podest:x86_64+0x10024a8a9)
    #14 globalinit_33_5FF95AB3DAF026F375C61CEF09A78E42_func7 <null> (Podest:x86_64+0x10024e311)
    #15 dispatch_once <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x682d4)
    #16 dispatch_once_f <null> (libclang_rt.tsan_iossim_dynamic.dylib:x86_64+0x683c0)
    #17 swift_once <null> (libswiftCore.dylib:x86_64+0x2f0578)
    #18 $s6Podest19QueueViewControllerC10dataSourceAA0b4DataF0CvgAFyXEfU_ <null> (Podest:x86_64+0x10016baa7)
    #19 $s6Podest19QueueViewControllerC10dataSourceAA0b4DataF0Cvg <null> (Podest:x86_64+0x10016b777)
    #20 $s6Podest19QueueViewControllerC11viewDidLoadyyF <null> (Podest:x86_64+0x100173be1)
    #21 $s6Podest19QueueViewControllerC11viewDidLoadyyFTo <null> (Podest:x86_64+0x1001740d0)
    #22 -[UIViewController loadViewIfRequired] <null> (UIKitCore:x86_64+0x34b43a)
    #23 $s6Podest18RootViewControllerC11viewDidLoadyyFTo <null> (Podest:x86_64+0x100198280)
    #24 -[UIViewController loadViewIfRequired] <null> (UIKitCore:x86_64+0x34b43a)
    #25 start <null> (libdyld.dylib:x86_64+0x1540)

  Thread T9 (tid=6762333, running) is a GCD worker thread

  Thread T3 (tid=6762311, running) is a GCD worker thread

SUMMARY: ThreadSanitizer: Swift access race (/Users/michael/Library/Developer/Xcode/DerivedData/podest-aliqrfnsmnyeqnccmouxgsyzcjmx/Build/Products/Debug-iphonesimulator/Nuke.framework/Nuke:x86_64+0x1f327) in $s4Nuke13ImagePipelineC13getNextTaskId33_64FA2D582EFB9D605832F3421CEE67ADLLSiyF

🛠 PR that resolves this follows, not to merge but to discuss. It looks like we are missing a point of synchronisation.

@kean

This comment has been minimized.

Copy link
Owner

commented Apr 19, 2019

Hi, @michaelnisi! This is the change that I made in the latest version to replace deprecated OSAtomic CAS functions which were used previously to make sure each task/session has a unique ID. I'm a bit surprised that TSan complains about this.

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2019

Oh, I found it rather obvious—the preheater is on its operation queue and the common image loading rolls in on the main queue, but maybe I got it wrong. Please check the PR.

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2019

Actually 🤔 Maybe we just need another lock… The problem might be that nextTaskId and nextSessionId are sharing the lock. 🔀 What do you think?

@kean

This comment has been minimized.

Copy link
Owner

commented Apr 19, 2019

It seems like yet another false positive form TSan 😔

What I'm trying to achieve here seems fairly reasonable to me – I want to have as little synchronization as possible which is as fast as possible. Tasks can be safely created and started from any thread. But in order to ensure that each task has a unique Id, I need to increment the counter atomically. And there doesn't seem to be a any clear ways to do that in Swift anymore unfortunately.

I tried using UUID instead but it was too slow.

We should probably switch to locks instead. They are almost as fast as atomic operations or spinlocks when they don't have to go to the kernel level.

@kean

This comment has been minimized.

Copy link
Owner

commented Apr 19, 2019

What I'm doing here is similar to what URLSession does with its task IDs https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/URLSession/URLSession.swift#L463

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2019

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 19, 2019

Have you measured the difference? How much slower would that be?

@kean

This comment has been minimized.

Copy link
Owner

commented Apr 20, 2019

I think Swift Foundation implementation is a bit naive – workQueue.sync waits until all other potentially large work items dispatched using workQueue.async are executed. I doubt that's how the "actual" URLSession works.

Have you measured the difference?

Not in this case but I was profiling these things heavily when I was working on the performance focused releases: Nuke 4, Nuke 6. In general, I try to use non-blocking synchronization like Atomic operations as much as possible.

NSLock – unless it needs to go to the kernel level – is also quite fast. It's still a reference type so it takes a bit of time to allocate it which is not a problem in ImagePipeline but might be in ImageTask.

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

Yes, using the workQueue for access caught my attention too. But I think, a dedicated queue, doing nothing else, for accessing these two ID Integers synchronously is probably fast enough. This is something I’ve been wanting to measure for a while now, comparing NSLock to queue.sync access control.

@michaelnisi

This comment has been minimized.

Copy link
Contributor Author

commented Apr 20, 2019

These problems are in an unfamiliar domain for me, low level performance optimisations aren’t my bread and butter, I must admit. 🔮

@kean kean added the bug label Apr 24, 2019

@kean kean added this to the 7.6.2 milestone Apr 24, 2019

@kean

This comment has been minimized.

Copy link
Owner

commented Apr 24, 2019

Fixed in https://github.com/kean/Nuke/releases/tag/7.6.2, see release notes for more info.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.