XCode 14.2/Swift 5.7 memory corruption on Task creation when app is launched on iOS < 16. #63420

Closed
mstyura opened this issue Feb 3, 2023 · 17 comments · Fixed by #63524 or #63525
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. concurrency Feature: umbrella label for concurrency language features crash Bug: A crash, i.e., an abnormal termination of software run-time crash Bug → crash: Swift code crashed during execution standard library Area: Standard library umbrella swift 5.7

Comments

@mstyura

mstyura commented Feb 3, 2023

Description
Task creation in some contexts can lead to corruption of the application's memory. Examples can be seen in the demo application code.

Steps to reproduce

  1. Download demo application source code: https://github.com/mstyura/Swift-Task-Heap-Curruption-on-iOS-15-and-earlier.
  2. Select scheme TaskCrasher - Release - libgmalloc - use with arm64 simulator;
  3. Run demo app on arm64 iOS simulator with iOS 15.4;

Expected behavior
The application does not crash.

Actual behavior
The demo app crashes with libgmalloc enabled. Unfortunately, Address Sanitizer is unable to catch the issue.

[image]

I actually encountered the problem in a real production application on an iOS device, where the app randomly crashes on a memory access to address 0x100000000 + offset, where the offset is usually +0x10, +0x20 or -0x8, in application or system library code.
Also, in the real application libmalloc was able to detect at some point that memory had been corrupted, but it was unclear what was responsible for the corruption.

Environment

  • Swift compiler version info
swift-driver version: 1.62.15 Apple Swift version 5.7.2 (swiftlang-5.7.2.135.5 clang-1400.0.29.51)
Target: arm64-apple-macosx13.0
  • Xcode version info
Xcode 14.2
Build version 14C18
  • Deployment target:
    Checked that the issue happens with iOS 13, iOS 14 and iOS 15 as the deployment target.

Additional context:
An app that I observed crashing worked fine when built with Xcode 13.4.1 and Swift 5.6, but experienced significant, random crashes with "bad access" errors on addresses like 0x100000020 when built with Xcode 14.2 or Xcode 14.1 and Swift 5.7. As a result, it's currently not feasible to target iOS versions lower than 16 and use Swift 5.7 with async functions and Tasks without encountering random crashes.

UPD: In the demo application, the issue only happens when compiled with -O (optimize for speed), but not when compiled with -Osize (optimize for size) or -Onone.

@mstyura mstyura added bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. triage needed This issue needs more specific labels labels Feb 3, 2023
@mstyura
Author

mstyura commented Feb 7, 2023

Hello @DougGregor and @rjmccall! Sorry for bothering you by mentioning you directly here. Could you please take a look at the issue? Perhaps you can tell how severe the actual problem is. Thanks a lot in advance, Yury!

@tbkka
Contributor

tbkka commented Feb 7, 2023

CC: @mikeash

@mikeash
Contributor

mikeash commented Feb 7, 2023

@mstyura Thanks a lot for the nice reproducer. I'm able to replicate the crash here. I'll see if I can figure out what's going on.

@mikeash
Contributor

mikeash commented Feb 7, 2023

I uncommented just willCauseHeapCorruption1 to focus on a single thing. This creates a task and passes in the AsyncFunctionPointer $sxIeghHr_xs5Error_pIegHrzo_s8SendableRzs5NeverORs_r0_lTRyt_Tg5043$s11TaskCrasher19globalAsyncFunctionyyYaFyyI6YbcfU_Tf3npf_nTu (demangles to async function pointer to function signature specialization <Arg[1] = [Constant Propagated Function : closure #1 @Sendable () async -> () in TaskCrasher.globalAsyncFunction() async -> ()]> of generic specialization <()> of reabstraction thunk helper <A, B where A: Swift.Sendable, B == Swift.Never> from @escaping @callee_guaranteed @Sendable @async () -> (@out A) to @escaping @callee_guaranteed @async () -> (@out A, @error @owned Swift.Error)).

This async function pointer specifies an initial context size of 16. However, the runtime assumes that it will be able to store an AsyncContext struct, which is 32 bytes. Thus we run off the end.

In the current version of the concurrency runtime, AsyncContext is only 16 bytes. A Flags field was removed from it in commit aca744b. Looks like the compiler part of this change unconditionally shrinks the type, but the old runtime still writes to the now-missing Flags field:

  initialContext->Flags = AsyncContextKind::Ordinary;

I think this is usually harmless, as this field will usually extend into whatever else the task uses the initial context for. But if there is no other such use, we don't allocate enough memory, and this dead write turns into a memory smasher. The compiler probably needs to pad this value a bit when targeting older OSes. @rjmccall can you take a look?
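The mismatch described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `OldAsyncContext` and `NewAsyncContext` are hypothetical stand-ins, every field is modeled as a plain 64-bit value, and the real runtime declarations carry alignment and pointer-authentication attributes that are left out here. The helper `flagsFitWithin` is likewise invented for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the runtime's AsyncContext on a 64-bit target.
// These are NOT the real declarations; field widths are assumptions.
struct NewAsyncContext {
  std::uint64_t Parent;
  std::uint64_t ResumeParent;
};

struct OldAsyncContext {
  std::uint64_t Parent;
  std::uint64_t ResumeParent;
  std::uint64_t Flags;  // field that newer runtimes no longer have
};

// True when the whole Flags field lies inside an allocation of
// `allocatedBytes` bytes that starts at the beginning of the context.
constexpr bool flagsFitWithin(std::size_t allocatedBytes) {
  return offsetof(OldAsyncContext, Flags) + sizeof(std::uint64_t) <= allocatedBytes;
}

static_assert(sizeof(NewAsyncContext) == 16, "two 8-byte fields");
static_assert(offsetof(OldAsyncContext, Flags) == 16, "Flags follows them");
static_assert(!flagsFitWithin(16), "a 16-byte initial context has no room for Flags");
```

With an initial context size of 16, the modeled `Flags` field begins exactly where the allocation ends, which is the out-of-bounds write discussed in this comment.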

@mstyura
Author

mstyura commented Feb 8, 2023

I think this is usually harmless, as this field will usually extend into whatever else the task uses the initial context for

@mikeash could you please elaborate on why writing outside of "bounds" is safe, since, as I understand it, this always happens when running on an old runtime? I've observed that crashes do not occur when a lambda captures external variables, and I'm curious about how these variables are stored. I wonder if writing to the "Flags" field might potentially interfere with the data structures related to the context of the lambda capture. Thank you.

In the current version of the concurrency runtime, AsyncContext is only 16 bytes. A Flags field was removed from it in commit aca744b. Looks like the compiler part of this change unconditionally shrinks the type, but the old runtime still writes to the now-missing Flags field.

I understand that it's a natural behavior for a compiler that was compiled with the AsyncContext definition without flags, but with a backward-compatible runtime that assumes the presence of flags. In light of this, perhaps it would be sensible to reintroduce the "flags" field in the form of a private and unusable field (aka placeholder), in order to preserve size assumptions and avoid any potential breakage.

@ktoso ktoso added the concurrency Feature: umbrella label for concurrency language features label Feb 8, 2023
@mikeash
Contributor

mikeash commented Feb 8, 2023

When creating a task, the compiler generates an "initial context size" which tells the runtime how much extra space to allocate. This space is then used for various task-local stuff, like variables that persist across suspension points. The runtime also puts an AsyncContext at the start of this area. The way it's supposed to look is:

AsyncContext             Task-local whatnot
+--------+--------------+-------------------+
| Parent | ResumeParent | ...other stuff... |
+--------+--------------+-------------------+

That's on newer concurrency runtimes, after the commit I linked above. On older concurrency runtimes, it's supposed to look like this:

AsyncContext                     Task-local whatnot
+--------+--------------+-------+-------------------+
| Parent | ResumeParent | Flags | ...other stuff... |
+--------+--------------+-------+-------------------+

However, with a new compiler that thinks AsyncContext is always smaller, it actually looks like this when running on an older runtime:

AsyncContext             Task-local whatnot
+--------+--------------+-------+
| Parent | ResumeParent | Flags |
+--------+--------------+-------+-----------+
                        | ...other stuff... |
                        +-------------------+

The Flags field overlaps with what follows, which is, forgive the use of technical jargon, Bad.

It looks like this is usually harmless, as the runtime sets Flags and then it's never used again. This happens before any other stuff is written, so that space will be reused and everything works fine. The problem occurs when there is no other stuff. Then you end up with this picture:

AsyncContext
+--------+--------------+-------+
| Parent | ResumeParent | Flags |
+--------+--------------+-------+
                        ^
                        |
                        end of the allocation is here!

If nothing else needs to be stored, the compiler asks the runtime to allocate just enough space for Parent and ResumeParent. That means that Flags points off the end of the allocation. If we're lucky (or running with Guard Malloc), Flags lies in an unmapped page and crashes. If we're unlucky, Flags lines up with some other random chunk of memory, and smashes whatever happens to be there.

I don't think we need to go so far as to reintroduce Flags, but we do need to ensure that the compiler reserves enough space for it in the allocation when targeting older OSes.
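The two outcomes in the diagrams (harmless overlap versus a write past the end) can be modeled with a small simulation. This is a sketch, not runtime code: the function name, the 64-byte arena, the canary byte, and the assumed 8-byte `Flags` write at offset 16 are all illustrative choices. The first `contextSize` bytes of the arena play the role of the task's initial context, and everything after them stands in for unrelated neighboring memory.

```cpp
#include <cstddef>

// Returns true if the modeled old-runtime write to Flags lands in memory
// beyond the first `contextSize` bytes, i.e. it clobbers a neighbor.
// Expects contextSize <= 64.
constexpr bool oldRuntimeSmashesNeighbor(std::size_t contextSize) {
  constexpr std::size_t kArenaSize = 64;
  constexpr std::size_t kFlagsOffset = 16;  // assumed: after Parent and ResumeParent
  constexpr std::size_t kFlagsWidth = 8;    // assumed width of the Flags store
  constexpr unsigned char kCanary = 0xAA;

  // Mark everything outside the context with a canary pattern.
  unsigned char arena[kArenaSize] = {};
  for (std::size_t i = contextSize; i < kArenaSize; ++i) {
    arena[i] = kCanary;
  }

  // Model of `initialContext->Flags = ...`: store a placeholder value.
  for (std::size_t i = 0; i < kFlagsWidth; ++i) {
    arena[kFlagsOffset + i] = 0x00;
  }

  // Any damaged canary byte means memory outside the context was written.
  for (std::size_t i = contextSize; i < kArenaSize; ++i) {
    if (arena[i] != kCanary) {
      return true;
    }
  }
  return false;
}
```

In this model `oldRuntimeSmashesNeighbor(16)` is true, because nothing follows `ResumeParent` inside the context, while a context of 32 bytes absorbs the write in space that the task would overwrite later anyway.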

@rjmccall
Contributor

rjmccall commented Feb 8, 2023

Yes, I think setting a minimum size is probably the way to go.
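A minimum size could look roughly like the following. This is a hypothetical sketch, not the compiler's actual change: the function name, the boolean parameter, and the constant are invented for illustration, and the 32-byte value is taken from the AsyncContext size quoted earlier in this thread for older 64-bit runtimes.

```cpp
#include <algorithm>
#include <cstddef>

// Assumed minimum, based on the 32-byte AsyncContext size mentioned above.
constexpr std::size_t kMinInitialContextSizeOldRuntime = 32;

// Hypothetical helper: clamp the initial context size the compiler emits
// when the deployment target may run on an older concurrency runtime.
constexpr std::size_t adjustedInitialContextSize(std::size_t requested,
                                                 bool targetsOldRuntime) {
  return targetsOldRuntime
             ? std::max(requested, kMinInitialContextSizeOldRuntime)
             : requested;
}
```

Under these assumptions a request of 16 bytes becomes 32 when an old runtime is possible, and larger requests pass through unchanged, so only the smallest contexts pay for the padding.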

@mstyura
Author

mstyura commented Feb 21, 2023

Hello @rjmccall! Sorry for bothering you. I've downloaded Xcode 14.3 beta 1 with swiftc version swiftlang-5.8.0.117.11 clang-1403.0.22.8.60 (released on 16 Feb 2023 according to https://xcodereleases.com/) and tried the demo app in it. It still crashes under libgmalloc. This might be because the 2 relevant PRs with fixes:

  1. [5.8] Round the initial context size of tasks up to 32 on 64-bit <=5.6 runtimes #63525 (merged on February 9)
  2. [5.8] Fix backdeploy compat-56 ABI #63531 (merged on February 14)

are just not included in the 5.8.0.117.11 version of swiftc. Is there any way to know this for sure, that is, a way to map a swiftc version to a git commit? Thanks a lot in advance, Yury.

@AnthonyLatsis AnthonyLatsis added run-time crash Bug → crash: Swift code crashed during execution crash Bug: A crash, i.e., an abnormal termination of software labels Feb 21, 2023
@AnthonyLatsis
Collaborator

@mstyura You can try out the latest 5.8 snapshot to know for certain.

@AnthonyLatsis AnthonyLatsis added standard library Area: Standard library umbrella swift 5.7 and removed triage needed This issue needs more specific labels labels Feb 21, 2023
@mstyura
Author

mstyura commented Feb 21, 2023

@AnthonyLatsis the problem is that whenever I use a custom toolchain with Xcode, I get the following error when trying to launch the app with the debugger attached:

Details

Could not launch “TaskCrasher”
Domain: IDEDebugSessionErrorDomain
Code: 3
Failure Reason: LLDB provided no error string.
User Info: {
    DVTErrorCreationDateKey = "2023-02-21 15:14:32 +0000";
    DVTRadarComponentKey = 855031;
    IDERunOperationFailingWorker = DBGLLDBLauncher;
    RawUnderlyingErrorMessage = "LLDB provided no error string.";
}
--

Analytics Event: com.apple.dt.IDERunOperationWorkerFinished : {
    "device_model" = "iPhone14,6";
    "device_osBuild" = "15.4 (19E240)";
    "device_platform" = "com.apple.platform.iphonesimulator";
    "launchSession_schemeCommand" = Run;
    "launchSession_state" = 1;
    "launchSession_targetArch" = arm64;
    "operation_duration_ms" = 1;
    "operation_errorCode" = 3;
    "operation_errorDomain" = IDEDebugSessionErrorDomain;
    "operation_errorWorker" = DBGLLDBLauncher;
    "operation_name" = IDERunOperationWorkerGroup;
    "param_consoleMode" = 0;
    "param_debugger_attachToExtensions" = 0;
    "param_debugger_attachToXPC" = 0;
    "param_debugger_type" = 3;
    "param_destination_isProxy" = 0;
    "param_destination_platform" = "com.apple.platform.iphonesimulator";
    "param_diag_MainThreadChecker_stopOnIssue" = 0;
    "param_diag_MallocStackLogging_enableDuringAttach" = 0;
    "param_diag_MallocStackLogging_enableForXPC" = 1;
    "param_diag_allowLocationSimulation" = 0;
    "param_diag_checker_tpc_enable" = 0;
    "param_diag_gpu_frameCapture_enable" = 0;
    "param_diag_gpu_shaderValidation_enable" = 0;
    "param_diag_gpu_validation_enable" = 1;
    "param_diag_memoryGraphOnResourceException" = 0;
    "param_diag_queueDebugging_enable" = 0;
    "param_diag_runtimeProfile_generate" = 0;
    "param_diag_sanitizer_asan_enable" = 0;
    "param_diag_sanitizer_tsan_enable" = 0;
    "param_diag_sanitizer_tsan_stopOnIssue" = 0;
    "param_diag_sanitizer_ubsan_stopOnIssue" = 0;
    "param_diag_showNonLocalizedStrings" = 0;
    "param_diag_viewDebugging_enabled" = 0;
    "param_diag_viewDebugging_insertDylibOnLaunch" = 1;
    "param_install_style" = 0;
    "param_launcher_UID" = 2;
    "param_launcher_allowDeviceSensorReplayData" = 0;
    "param_launcher_kind" = 0;
    "param_launcher_style" = 1;
    "param_launcher_substyle" = 0;
    "param_runnable_appExtensionHostRunMode" = 0;
    "param_runnable_productType" = "com.apple.product-type.application";
    "param_testing_launchedForTesting" = 0;
    "param_testing_suppressSimulatorApp" = 0;
    "param_testing_usingCLI" = 0;
    "sdk_canonicalName" = "iphonesimulator16.4";
    "sdk_osVersion" = "16.4";
    "sdk_variant" = iphonesimulator;
}
--


System Information

macOS Version 13.1 (Build 22C65)
Xcode 14.3 (21801.3) (Build 14E5197f)
Timestamp: 2023-02-21T16:14:32+01:00

The workaround suggested here hasn't helped.

@AnthonyLatsis
Collaborator

Hm. cc @shahmishal

@shahmishal
Member

@JDevlieghere @adrian-prantl can you help look into this issue?

@jasonmolenda
Contributor

Hi @mstyura RawUnderlyingErrorMessage = "LLDB provided no error string."; usually means that either lldb-rpc-server was not involved in the failure, or lldb-rpc-server crashed. When lldb-rpc-server crashes, you'll find a crash log in ~/Library/Logs/DiagnosticReports. Do you see anything in that directory?

@jasonmolenda
Contributor

FWIW I tried to repro this with a slightly different Xcode and swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a-osx.pkg, and did not reproduce the failure.

@mstyura
Author

mstyura commented Feb 22, 2023

@jasonmolenda thanks for the hint. ~/Library/Logs/DiagnosticReports had a crash report:

Report
{"app_name":"lldb-rpc-server","timestamp":"2023-02-22 09:58:32.00 +0100","app_version":"","slice_uuid":"c8fe70f6-1e03-35ed-b1cf-be9b10758b7d","build_version":"","platform":1,"share_with_app_devs":0,"is_first_party":1,"bug_type":"309","os_version":"macOS 13.1 (22C65)","roots_installed":0,"incident_id":"D164F174-2269-4220-8742-5E2AB6ACF1FE","name":"lldb-rpc-server"}
{
  "uptime" : 26000,
  "procRole" : "Unspecified",
  "version" : 2,
  "userID" : 502,
  "deployVersion" : 210,
  "modelCode" : "MacBookPro18,1",
  "coalitionID" : 3056,
  "osVersion" : {
    "train" : "macOS 13.1",
    "build" : "22C65",
    "releaseType" : "User"
  },
  "captureTime" : "2023-02-22 09:58:32.1027 +0100",
  "incident" : "D164F174-2269-4220-8742-5E2AB6ACF1FE",
  "pid" : 68342,
  "translated" : false,
  "cpuType" : "ARM-64",
  "roots_installed" : 0,
  "bug_type" : "309",
  "procLaunch" : "2023-02-22 09:58:32.0185 +0100",
  "procStartAbsTime" : 634239149465,
  "procExitAbsTime" : 634241145322,
  "procName" : "lldb-rpc-server",
  "procPath" : "\/Applications\/Xcode_14.3_beta.app\/Contents\/SharedFrameworks\/LLDBRPC.framework\/Versions\/A\/Resources\/lldb-rpc-server",
  "parentProc" : "Xcode",
  "parentPid" : 68266,
  "coalitionName" : "com.apple.dt.Xcode",
  "crashReporterKey" : "0226B24A-284E-4EB7-74B1-5552D11514AE",
  "responsiblePid" : 68266,
  "responsibleProc" : "Xcode",
  "wakeTime" : 566,
  "sleepWakeUUID" : "071FE87B-9A5E-4E63-A86D-F8574E81268C",
  "sip" : "enabled",
  "exception" : {"codes":"0x0000000000000000, 0x0000000000000000","rawCodes":[0,0],"type":"EXC_CRASH","signal":"SIGABRT"},
  "termination" : {"code":1,"flags":518,"namespace":"DYLD","indicator":"Library missing","details":["(terminated at launch; ignore backtrace)"],"reasons":["Library not loaded: @rpath\/Python3.framework\/Versions\/3.8\/Python3","Referenced from: <FE0FAA10-6242-3AC6-B7A2-8FD049E7988C> \/Library\/Developer\/Toolchains\/swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a.xctoolchain\/System\/Library\/PrivateFrameworks\/LLDB.framework\/Versions\/A\/LLDB","Reason: tried: '\/Library\/Developer\/Toolchains\/swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a.xctoolchain\/System\/Library\/PrivateFrameworks\/Python3.framework\/Versions\/3.8\/Python3' (no such file), '\/usr\/lib\/swift\/Python3.framework\/Versions\/3.8\/Python3' (no such file, not in dyld cache), '\/System\/Volumes\/Preboot\/Cryptexes\/OS\/usr\/lib\/swift\/Python3.framework\/Versions\/3.8\/Python3' (no such file), '\/Library\/Developer\/Toolchains\/swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a.xctoolchain\/System\/Library\/PrivateFrameworks\/LLDB.framework\/Versions\/A\/..\/..\/..\/..\/..\/..\/..\/..\/Library\/Frameworks\/Python3.framework\/Versions\/3.8\/Python3' (no such file), '\/Library\/Developer\/Toolchains\/swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a.xctoolchain\/System\/Library\/PrivateFra"]},
  "extMods" : {"caller":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"system":{"thread_create":0,"thread_set_state":114,"task_for_pid":17},"targeted":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"warnings":0},
  "faultingThread" : 0,
  "threads" : [{"triggered":true,"id":824673,"threadState":{"x":[{"value":6},{"value":1},{"value":6126364592},{"value":216},{"value":6126363568},{"value":0},{"value":0},{"value":0},{"value":32},{"value":9},{"value":1},{"value":10},{"value":0},{"value":54},{"value":3},{"value":18446744073709551615},{"value":521},{"value":6909596484,"symbolLocation":392,"symbol":"__simple_bprintf"},{"value":0},{"value":0},{"value":6126363568},{"value":216},{"value":6126364592},{"value":1},{"value":6},{"value":8448488256,"symbolLocation":0,"symbol":"gProcessInfo"},{"value":6126367680},{"value":6126367272},{"value":0}],"flavor":"ARM_THREAD_STATE64","lr":{"value":6910089728},"cpsr":{"value":4096},"fp":{"value":6126363520},"sp":{"value":6126363456},"esr":{"value":1442840704,"description":" Address size fault"},"pc":{"value":6910046608,"matchesCrashFrame":1},"far":{"value":4343496704}},"frames":[{"imageOffset":463248,"symbol":"__abort_with_payload","symbolLocation":8,"imageIndex":0},{"imageOffset":506368,"symbol":"abort_with_payload_wrapper_internal","symbolLocation":104,"imageIndex":0},{"imageOffset":506420,"symbol":"abort_with_payload","symbolLocation":16,"imageIndex":0},{"imageOffset":41124,"symbol":"dyld4::halt(char const*)","symbolLocation":328,"imageIndex":0},{"imageOffset":28824,"symbol":"dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*)","symbolLocation":4204,"imageIndex":0},{"imageOffset":24004,"symbol":"start","symbolLocation":2404,"imageIndex":0}]}],
  "usedImages" : [
  {
    "source" : "P",
    "arch" : "arm64e",
    "base" : 6909583360,
    "size" : 568164,
    "uuid" : "487cfdeb-9b07-39bf-bfb9-970b61aea2d1",
    "path" : "\/usr\/lib\/dyld",
    "name" : "dyld"
  }
],
  "sharedCache" : {
  "base" : 6908936192,
  "size" : 3434283008,
  "uuid" : "00a1fbb6-43e1-3c11-8483-faf0db659249"
},
  "vmSummary" : "ReadOnly portion of Libraries: Total=991.7M resident=0K(0%) swapped_out_or_unallocated=991.7M(100%)\nWritable regions: Total=9248K written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=9248K(100%)\n\n                                VIRTUAL   REGION \nREGION TYPE                        SIZE    COUNT (non-coalesced) \n===========                     =======  ======= \nSTACK GUARD                       56.0M        1 \nStack                             8176K        1 \nVM_ALLOCATE                         16K        1 \n__AUTH                              46K       11 \n__AUTH_CONST                        70K       38 \n__DATA                            1311K       40 \n__DATA_CONST                      7964K       42 \n__DATA_DIRTY                        78K       22 \n__LINKEDIT                       863.0M        4 \n__OBJC_CONST                        11K        5 \n__OBJC_RO                         65.4M        1 \n__OBJC_RW                         1986K        1 \n__TEXT                           128.7M       44 \ndyld private memory                512K        2 \n===========                     =======  ======= \nTOTAL                              1.1G      213 \n",
  "legacyInfo" : {
  "threadTriggered" : {

  }
},
  "trialInfo" : {
  "rollouts" : [
    {
      "rolloutId" : "63582c5f8a53461413999550",
      "factorPackIds" : {

      },
      "deploymentId" : 240000002
    },
    {
      "rolloutId" : "60f8ddccefea4203d95cbeef",
      "factorPackIds" : {

      },
      "deploymentId" : 240000022
    }
  ],
  "experiments" : [

  ]
}
}

Basically, the Swift toolchain ships an LLDB that attempts to load Python3.framework, but the toolchain does not contain Python3.framework. I manually added a symlink to Xcode's Python3.framework to resolve the issue: sudo ln -nfs /Applications/Xcode_14.2.app/Contents/Developer/Library/Frameworks/Python3.framework /Library/Developer/Toolchains/swift-5.8-DEVELOPMENT-SNAPSHOT-2023-02-20-a.xctoolchain/System/Library/PrivateFrameworks/Python3.framework. Now I can debug apps with custom Swift toolchains. Thanks for the help!

@jasonmolenda
Contributor

@mstyura excellent, thanks for getting to the bottom of it. Ah, I see from the LLDB LC_RPATHs that LLDB looks for the installed Xcode.app as /Applications/Xcode.app to find Python, but you have Xcode installed as Xcode_14.2.app. The Xcode app bundle is normally self contained, but the CommandLineTools lldb has an assumption about the name. I'll look into what we might be able to do to make this more resilient.

@JDevlieghere
Contributor

As @jasonmolenda pointed out, Xcode and the Command Line Tools (CLT) are self contained and each contain a copy of the Python 3 framework, which allows us to have a relative RPATH from LLDB. There's (purposely) no Python 3 in the operating system so there's no relative RPATH that can guarantee you to find the Python framework from /Library/Developer/Toolchains. As a workaround, we have two absolute RPATHs: one for Xcode installed in /Applications/Xcode.app and one to the /Library/Developer/CommandLineTools.
