Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add _stdlib_random for more platforms #1

Merged
merged 8 commits into from Feb 10, 2018

Conversation

benrimmington
Copy link

I've tested (using macOS only):

  • arc4random_buf
  • /dev/urandom
  • getentropy
  • SecRandomCopyBytes

I haven't tested BCryptGenRandom for Windows, but it will require Bcrypt.dll (and/or Bcrypt.lib).
https://msdn.microsoft.com/en-us/library/windows/desktop/aa375458(v=vs.85).aspx

Should _stdlib_random be available in Swift code?
Maybe as a public/testable method of UnsafeMutableRawBufferPointer?
How can we test that _stdlib_random fills the entire buffer with random bytes?

@Azoy
Copy link
Owner

Azoy commented Feb 7, 2018

Every looks pretty good, but FreeBSD's arc4random still appears to use RC4 (which is now considered non-cryptographically secure due to RC4 weakness). Source It also appears that this hasn't been updated since 1997. We could explore looking at RAND_bytes Source I could be looking at the wrong man pages, and correct me if I'm wrong, but this looks like the best route for FreeBSD.

Other than that, everything else looks correct. I know @xwu had brought up the idea of a more primitive solution to getting random bytes, and I think that's definitely something we can look at. I would have to post it on evolution, but I don't think there would be much disagreement there.

In terms of testing, that's a hard one that I can't quite answer right now. Most of these methods are coming from the kernel, and if the kernel tells us that it's given us n bytes, then we should trust it in believing it gave us n bytes. I still think there should be a way for us to test it, but like I said, not something I can answer quite yet. One last thing, when you say you tested arc4random, did you test both implementations or just the mac version?

@xwu
Copy link

xwu commented Feb 7, 2018

Agree that it's best to avoid arc4random on platforms where it's known not to be cryptographically secure, if there's an alternative.

However, it appears that arc4random on FreeBSD is cryptographically secure. We could also look into using getentropy for all BSDs as an alternative (also implemented on macOS).

@Azoy
Copy link
Owner

Azoy commented Feb 7, 2018

In the link you posted, it states

The arc4rand() function uses the RC4 algorithm to generate successive
pseudo-random bytes. The arc4random() function uses arc4rand() to gener-
ate pseudo-random numbers in the range from 0 to (2**32)-1.

@@ -13,14 +13,34 @@
#if defined(__APPLE__)
#define _REENTRANT
#include <math.h>
#include <CoreFoundation/CoreFoundation.h>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this kosher for the standard library? We could do without the detailed error message, IMO, if that simplifies dependencies.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xwu This dependency already exists. But I could remove the SecCopyErrorMessageString call.

@@ -332,44 +352,101 @@ swift::_stdlib_cxx11_mt19937_uniform(__swift_uint32_t upper_bound) {
}

#if defined(__APPLE__)
#include <Security/Security.h>

SWIFT_RUNTIME_STDLIB_INTERNAL
void swift::_stdlib_random(void *buf, __swift_size_t nbytes) {
if (__builtin_available(macOS 10.12, iOS 10, tvOS 10, watchOS 3, *)) {
Copy link

@xwu xwu Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why isn't this conditional written in the form of a macro? And why not just stick with SecRandomCopyBytes for all Apple platforms?

(I recognize that this isn't being changed here, but I thought I'd ask.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea so Greg Parker brought this up as well in the actual pr, but I know a few on the mailing list had said that the Darwin solution should use arc4random. In addition to those comments, the whole family of arc4random is advertised as being always successful on the mac man pages man arc4random, so removing the possibility of error in something I'm in favor of. Although, I do see using one solution for all versions as an easier implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why isn't this conditional written in the form of a macro?

@xwu The original was using macros. I suggested __builtin_available because it should be a runtime conditional. AFAIK, until the stdlib is a system library, MAC_OS_X_VERSION_MIN_REQUIRED will always be MAC_OS_X_VERSION_10_9 (see supported target platforms).

@xwu
Copy link

xwu commented Feb 7, 2018

@Azoy You're right. I didn't see that part. I was misled by the earlier documentation:

The arc4rand() function will return very good quality random numbers, better suited for security-related purposes.

Ugh. RAND_bytes is an OpenSSL function, and therefore likely out of the question. We can reasonably default, I think, to reading /dev/(u)random directly on FreeBSD. There is a patch to replace RC4 with ChaCha20 which hasn't been merged after nearly a year. Swift can always opt later versions of FreeBSD into the arc4random-based code.

@Azoy
Copy link
Owner

Azoy commented Feb 7, 2018

Reading from /dev/random sounds about right. @xwu how do you feel about interfacing _stdlib_random through something like UnsafeMutableRawBufferPointer like Ben mentioned? I know you were advertising for something like this pretty early on. I think it's a fine idea imo and shouldn't have much disagreement over on evolution. The core team expressed "general use" case design, and I don't think that addition takes away from what's already proposed. I think some will argue that since this addition is a bit more primitive that it should provide some sort error handling, but I think that takes away from general use design. This also brings up an interesting discussion on how to let UnsafeMutableRawBufferPointer interface with other RNGs since the only thing they can do is produce unsigned integers.

@xwu
Copy link

xwu commented Feb 7, 2018

I think the proposal should be as minimal as possible, so I'd rather not. _stdlib_random should be an internal method only, IMO. It should be testable via the properly designed API to generate a random UInt32/UInt64/DoubleWidth<UInt64>/etc.

@Azoy
Copy link
Owner

Azoy commented Feb 7, 2018

Yeah I can respect that. If users needed random bytes they could simply make a series of calls to UInt8 or Random.default.

@Azoy
Copy link
Owner

Azoy commented Feb 7, 2018

Also, would adding more platform support need some mention of in evolution? I assume by now we've agreed to make the default rng a csrng, so extending that to more platforms shouldn't be an issue no? Just a simple change of the proposal?

@benrimmington
Copy link
Author

We can reasonably default, I think, to reading /dev/(u)random directly on FreeBSD.

LibC++ also uses "/dev/urandom" by default:

We could use "/dev/urandom" for Android, Cygwin, FreeBSD, Haiku, and (possibly all versions of) Linux.

@benrimmington
Copy link
Author

If _stdlib_random can be used internally to fill large buffers, it should be tested somehow. When the buffer is larger than 256 bytes, there will be successive calls to getentropy or getrandom or _stdlib_read, so I'd like to test that the pointer arithmetic is correct.

A public method on UnsafeMutableRawBufferPointer might be used to fully initialize the Mersenne Twister, which has 19937 bits of internal state.
http://www.pcg-random.org/posts/cpp-seeding-surprises.html
(Or you could call Random.default.next() 312 times.)

@benrimmington
Copy link
Author

One last thing, when you say you tested arc4random, did you test both implementations or just the mac version?

@Azoy I've only tested on macOS. For other platforms, I've checked online manual pages, etc.

@xwu
Copy link

xwu commented Feb 7, 2018

@benrimmington Large buffers can be tested using DoubleWidth<DoubleWidth<DoubleWidth<DoubleWidth<UInt64>>>>, etc.

@benrimmington
Copy link
Author

I've tried this in the integrated REPL (because LLDB doesn't build for me).

***  You are running Swift's integrated REPL,  ***
***  intended for compiler and stdlib          ***
***  development and testing purposes only.    ***
***  The full REPL is built as part of LLDB.   ***
***  Type ':help' for assistance.              ***
(swift) typealias UInt128 = DoubleWidth<UInt64>
(swift) typealias UInt256 = DoubleWidth<UInt128>
(swift) typealias UInt512 = DoubleWidth<UInt256>
(swift) typealias UInt1024 = DoubleWidth<UInt512>
(swift) typealias UInt2048 = DoubleWidth<UInt1024>
(swift) 
(swift) UInt8.random(in: 0 ... .max)
// r0 : UInt8 = 158
(swift) UInt16.random(in: 0 ... .max)
// r1 : UInt16 = 8540
(swift) UInt32.random(in: 0 ... .max)
// r2 : UInt32 = 639028176
(swift) UInt64.random(in: 0 ... .max)
// r3 : UInt64 = 11283549236613816025
(swift) UInt128.random(in: 0 ... .max)
// r4 : UInt128 = (0, 14176280962140810311)
(swift) UInt256.random(in: 0 ... .max)
// r5 : UInt256 = (0, 6701370950449278658)
(swift) 
(swift) UInt512.random(in: 0 ... .max) // 1 second
// r6 : UInt512 = (0, 3662136189629791096)
(swift) 
(swift) UInt1024.random(in: 0 ... .max) // 10 seconds
// r7 : UInt1024 = (0, 7596761616392362926)
(swift) 
(swift) UInt2048.random(in: 0 ... .max) // 120 seconds
// r8 : UInt2048 = (0, 14280515300012998372)

UInt2048.random (i.e. 256 bytes) takes 2 minutes!

I don't think DoubleWidth.random is supported. The result will be a truncated UInt64, if I'm reading the ClosedRange.random and RandomNumberGenerator.next methods correctly.

Also, the _stdlib_random function is only used in the Random.default.next method, to return a UInt64 result. So the DoubleWidth tests won't help.

@xwu
Copy link

xwu commented Feb 7, 2018

Great: so we've demonstrated at least several bugs with the implementation:

  • It should not take 2 minutes to retrieve 256 bytes of randomness
  • The result should not be truncated to 64 bits
  • Generating an unbounded integer should be a primitive operation where _stdlib_random is used to fill the underlying bytes of memory

After those are addressed--and they must--_stdlib_random can be adequately tested via DoubleWidth.

@benrimmington
Copy link
Author

It should not take 2 minutes to retrieve 256 bytes of randomness

This might be a problem with DoubleWidth or the integrated REPL.
For example, UInt2048.max on its own takes 90 seconds.

The result should not be truncated to 64 bits

I think the result is sign-extended from a random 64-bit value.
But I haven't looked too closely at the Swift code yet.

Generating an unbounded integer should be a primitive operation where _stdlib_random is used to fill the underlying bytes of memory

Does the existing design allow this?
What happens when the generator argument isn't Random.default?

@xwu
Copy link

xwu commented Feb 7, 2018

There are various issues with DoubleWidth speed, but whatever those are shouldn't impact the performance of getting a certain number of random bytes.

Either the existing design for random numbers is flawed or its implementation is flawed, or both, but that is what should happen (at least in optimized code) as there is no security reason otherwise. This is a sine qua non that any reasonable design for random numbers must be able to meet.

If the RNG used is not the default, that's a different kettle of fish, but I strongly advocated for (and will do so again) stripping that entire part of the design out of the standard library.

@Azoy
Copy link
Owner

Azoy commented Feb 8, 2018

Sorry, I was unaware of this API. I successfully prevented truncation to UInt64 for all types which are greater than 64 bits (using Xiaodi's earlier implementation). Primitive operations like _stdlib_random are not the root cause of long elapsed time to retrieve 256 bytes (you can test this by directly arc4random_buf(&random, MemoryLayout<UInt2048>.size) which only takes a second or so. You can probably get the roughly the same results with other methods of getting randomness.) I've concluded that the performance tank is due to issues with DoubleWidth or the REPL as Ben mentioned. I use Nate's implementation to generate FixedWidthIntegers ClosedRange and Range. There might be optimizations to be made there. You can see that either the REPL or DoubleWidth has issues when it simply wants to print itself (print(random)) which takes a minute or two.

@Azoy
Copy link
Owner

Azoy commented Feb 8, 2018

For reference on it successfully generating 256 bytes,

// random : UInt2048 = (138960784646314793880539886029324821466946539449800443724570716748568926883141730223018453597179127740877518226535476874204958397750032686409339245312849599134409520934263865560454060251032864550921819014778114286454956199832791058862793378692701274617009890163700323833133172711640424681505912360731390412394, 102307650499519674829200418635993667542789979026650630878379668899878853613577651217383117247610455913076411481708620934252890054697832942869023213462139479253644415711502638465595878867195524452026546909583717141857479124682987677041918547177638431454354215758859878126365306290771961732934612729362444659922)

@xwu
Copy link

xwu commented Feb 8, 2018

@Azoy I've been made aware that there are compilation time issues with DoubleWidth; it's possible that Ben is experiencing some of that in the REPL. If we can't get those sorted out, then it's possible that DoubleWidth would be removed from the next release of the standard library and we'd need some other way to test.

I'm not saying here that _stdlib_random is the cause of long elapsed time; I'm saying that something about the overall design of the random number API may be involved because generating a random value between 0 and UInt2048.max should be nearly exactly as fast as arc4random_buf(&random, MemoryLayout<UInt2048.size>). And if the API design does not allow that, then there is something wrong with the API design.

@Azoy
Copy link
Owner

Azoy commented Feb 8, 2018

IMO, I think the easiest solution would be something like this:

var random: UInt2048 = 0
Random.default.next(&random)
print(random)

This would allow unbounded integers direct access to the performance of _stdlib_random, but this breaks the core teams "general use" case design IMO. This would be easy to make exclusive for Random.default (since I think most of the use case would be for security purposes), but this disallows for external RNGs which is something that is very needed. The obstacles that DoubleWidth has to go through with the current implementation does impede on performance, so direct access to _stdlib_random is something that is probably going to have to be added.

@xwu
Copy link

xwu commented Feb 8, 2018

No, no. You misunderstand. I don't want a special API for UInt2048. I care deeply that the API that's supposed to generate any integer value within a range functions correctly and with good performance, and currently we're not sure that it can do either of those. If it can, great.

@Azoy
Copy link
Owner

Azoy commented Feb 8, 2018

No, no. I didn't intend to make it come off as a special API for UInt2048. It was meant to show a general purpose backdoor access to _stdlib_random. In regards to current implementation, it does correctly generate any integer value within a range. The part where it fails (only with large int types mind you) is good performance. This is definitely something we need to look over if we want to get this accepted for Swift 5.

@benrimmington
Copy link
Author

@Azoy, I think this is ready. Are you able to test on Linux?

@Azoy
Copy link
Owner

Azoy commented Feb 9, 2018

@benrimmington I rebased a commit yesterday which caused one of my older commits to appear in your pr. Can you pull from origin to fix that? And yes, I should be able to test Linux.

@Azoy
Copy link
Owner

Azoy commented Feb 9, 2018

I can squash and merge when I get home

@Azoy Azoy merged this pull request into Azoy:random-unification Feb 10, 2018
@benrimmington benrimmington deleted the random-unification branch February 10, 2018 15:10
Azoy pushed a commit that referenced this pull request Mar 25, 2018
* Remove refs to Countable ranges

* Add `_stdlib_random` for more platforms

* Use `getrandom` (if available) for Android, Cygwin

* Reorder the `_stdlib_random` functions

* Also include <features.h> on Linux

* Add `#error TODO` in `_stdlib_random` for Windows

* Colon after Fatal Error

Performance improvement for Random

gybify ranges

Fix typo in 'basic random numbers'
Add _stdlib_random as a testable method

Switch to generic constraints

Hopefully link against bcrypt
Azoy pushed a commit that referenced this pull request May 4, 2018
[test] Update diagnostic test for SR-7599
Azoy pushed a commit that referenced this pull request May 5, 2018
* Remove refs to Countable ranges

* Add `_stdlib_random` for more platforms

* Use `getrandom` (if available) for Android, Cygwin

* Reorder the `_stdlib_random` functions

* Also include <features.h> on Linux

* Add `#error TODO` in `_stdlib_random` for Windows

* Colon after Fatal Error

Performance improvement for Random

gybify ranges

Fix typo in 'basic random numbers'
Add _stdlib_random as a testable method

Switch to generic constraints

Hopefully link against bcrypt

Fix some implementation details

1. Uniform distribution is now uniform
2. Apply Jens' method for uniform floats

Fix a lineable attribute
Azoy pushed a commit that referenced this pull request Dec 8, 2018
The standard library has two versions of the `abs(_:)` function:

```
func abs<T : SignedNumeric>(_ x: T) -> T where T.Magnitude == T
func abs<T : SignedNumeric & Comparable>(_ x: T) -> T
```

The first is more specialized than the second because `T.Magnitude` is
known to conform to `Comparable`. Indeed, it’s a more specialized
implementation that returns `magnitude`.

However, this overload behaves oddly: in the expression `abs(-8)`, the type
checker will pick the first overload because it is more specialized. That’s
a general guiding principle for overloading: pick the most specialized
overload that works.

However, to select that overload, it needs to pick a type for the literal
“8” for which that overload works, and it chooses `Double`. The “obvious”
answer, `Int`, doesn’t work because `Int.Magnitude == UInt`.

There is a conflict between the two rules, here: we prefer more-specialized
overloads (but we’ll fall back to less-specialized if those don’t work) and we prefer to use `Int` for integer literals (but we’ll fall back to `Double` if it doesn’t work). We have a few options from a type-checker
perspective:

1. Consider the more-specialized-function rule to be more important
2. Consider the integer-literals-prefer-`Int` rule to be more important
3. Call the result ambiguous and make the user annotate it

The type checker currently does #1, although at some point in the past it
did #2. Moving forward, #1 is a better choice because it prunes the number
of overloads that need to be considered: if the more-specialized overload
succeeds its type-check, the others need not be considered. It’s also
easier to reason about than the literal-scoring approach, because there can
be a direct definition for “more specialized than” that can be reasoned
about.

I think we should dodge the issue by removing the more-specialized version
of `abs(_:)`. Its use of `magnitude` seems unlikely to provide a
significant performance benefit, and the presence of overloading either
forces us to consider both overloads always (which is bad for type checker
performance) or accept the regression that `abs(-8)` is `Double`. Better
to eliminate the overloading and, if needed in the future, find a better
way to introduce the more-specialized implementation without it being a
separate signature.

Fixes rdar://problem/42345366.
Azoy pushed a commit that referenced this pull request Jan 15, 2019
The standard library has two versions of the `abs(_:)` function:

```
func abs<T : SignedNumeric>(_ x: T) -> T where T.Magnitude == T
func abs<T : SignedNumeric & Comparable>(_ x: T) -> T
```

The first is more specialized than the second because `T.Magnitude` is
known to conform to `Comparable`. Indeed, it’s a more specialized
implementation that returns `magnitude`.

However, this overload behaves oddly: in the expression `abs(-8)`, the type
checker will pick the first overload because it is more specialized. That’s
a general guiding principle for overloading: pick the most specialized
overload that works.

However, to select that overload, it needs to pick a type for the literal
“8” for which that overload works, and it chooses `Double`. The “obvious”
answer, `Int`, doesn’t work because `Int.Magnitude == UInt`.

There is a conflict between the two rules, here: we prefer more-specialized
overloads (but we’ll fall back to less-specialized if those don’t work) and we prefer to use `Int` for integer literals (but we’ll fall back to `Double` if it doesn’t work). We have a few options from a type-checker
perspective:

1. Consider the more-specialized-function rule to be more important
2. Consider the integer-literals-prefer-`Int` rule to be more important
3. Call the result ambiguous and make the user annotate it

The type checker currently does #1, although at some point in the past it
did #2. Moving forward, #1 is a better choice because it prunes the number
of overloads that need to be considered: if the more-specialized overload
succeeds its type-check, the others need not be considered. It’s also
easier to reason about than the literal-scoring approach, because there can
be a direct definition for “more specialized than” that can be reasoned
about.

I think we should dodge the issue by removing the more-specialized version
of `abs(_:)`. Its use of `magnitude` seems unlikely to provide a
significant performance benefit, and the presence of overloading either
forces us to consider both overloads always (which is bad for type checker
performance) or accept the regression that `abs(-8)` is `Double`. Better
to eliminate the overloading and, if needed in the future, find a better
way to introduce the more-specialized implementation without it being a
separate signature.

Fixes rdar://problem/42345366.

(cherry picked from commit 85d488d)
Azoy pushed a commit that referenced this pull request Mar 11, 2019
Preparations for landing numist/swift
Azoy pushed a commit that referenced this pull request Jul 31, 2019
To display a failure message in the debugger, create a function in the debug info which has the name of the failure message.
The debug location of the trap/cond_fail is then wrapped into this function and the function is declared as "inlined".
In case the debugger stops at the trap instruction, it displays the inline function, which looks like the failure message.
For example:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x0000000100000cbf a.out`testit3(_:) [inlined] Unexpectedly found nil while unwrapping an Optional value at test.swift:14:11 [opt]
   11
   12   @inline(never)
   13   func testit(_ a: Int?) -> Int {
-> 14     return a!
   15   }
   16

This change is currently not enabled by default, but can be enabled with the option "-Xllvm -enable-trap-debug-info".
Enabling this feature needs some changes in lldb. When the lldb part is done, this option can be removed and the feature enabled by default.
Azoy pushed a commit that referenced this pull request Jul 31, 2019
To display a failure message in the debugger, create a function in the debug info which has the name of the failure message.
The debug location of the trap/cond_fail is then wrapped into this function and the function is declared as "inlined".
In case the debugger stops at the trap instruction, it displays the inline function, which looks like the failure message.
For example:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x0000000100000cbf a.out`testit3(_:) [inlined] Unexpectedly found nil while unwrapping an Optional value at test.swift:14:11 [opt]
   11
   12   @inline(never)
   13   func testit(_ a: Int?) -> Int {
-> 14     return a!
   15   }
   16

This change is currently not enabled by default, but can be enabled with the option "-Xllvm -enable-trap-debug-info".
Enabling this feature needs some changes in lldb. When the lldb part is done, this option can be removed and the feature enabled by default.
Azoy pushed a commit that referenced this pull request Sep 27, 2019
Due to the fact that `matchCallArgument` can't and
doesn't take types into consideration while matching
arguments to parameters, when both arguments are
un-labeled, it's impossible to say which one is missing:

func foo(_: Int, _: String) {}
foo("")

In this case first argument is missing, but we end up with
two fixes - argument mismatch (for #1) and missing argument
(for #2), which is incorrect so it has to be handled specially.
Azoy pushed a commit that referenced this pull request Dec 17, 2019
Azoy pushed a commit that referenced this pull request Apr 18, 2021
[CodeCompletion] Migrate some tests to batch completion test #1
Azoy pushed a commit that referenced this pull request Jun 28, 2021
* Synchronize both versions of actor_counters.swift test

* Synchronize on Job address

Make sure to synchronize on Job address (AsyncTasks are Jobs, but not
all Jobs are AsyncTasks).

* Add fprintf debug output for TSan acquire/release

* Add tsan_release edge on task creation

without this, we are getting false data races between when a task
is created and immediately scheduled on a different thread.

False positive for `Sanitizers/tsan/actor_counters.swift` test:
```
WARNING: ThreadSanitizer: data race (pid=81452)
  Read of size 8 at 0x7b2000000560 by thread T5:
    #0 Counter.next() <null>:2 (a.out:x86_64+0x1000047f8)
    #1 (1) suspend resume partial function for worker(identity:counters:numIterations:) <null>:2 (a.out:x86_64+0x100005961)
    #2 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef)

  Previous write of size 8 at 0x7b2000000560 by main thread:
    #0 Counter.init(maxCount:) <null>:2 (a.out:x86_64+0x1000046af)
    #1 Counter.__allocating_init(maxCount:) <null>:2 (a.out:x86_64+0x100004619)
    #2 runTest(numCounters:numWorkers:numIterations:) <null>:2 (a.out:x86_64+0x100006d2e)
    #3 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef)
    apple#4 main <null>:2 (a.out:x86_64+0x10000a175)
```

New edge with this change:
```
[4357150208] allocate task 0x7b3800000000, parent = 0x0
[4357150208] creating task 0x7b3800000000 with parent 0x0
[4357150208] tsan_release on 0x7b3800000000    <<< new release edge
[139088221442048] tsan_acquire on 0x7b3800000000
[139088221442048] trying to switch from executor 0x0 to 0x7ff85e2d9a00
[139088221442048] switch failed, task 0x7b3800000000 enqueued on executor 0x7ff85e2d9a00
[139088221442048] enqueue job 0x7b3800000000 on executor 0x7ff85e2d9a00
[139088221442048] tsan_release on 0x7b3800000000
[139088221442048] tsan_release on 0x7b3800000000
[4357150208] tsan_acquire on 0x7b3800000000
counters: 1, workers: 1, iterations: 1
[4357150208] allocate task 0x7b3c00000000, parent = 0x0
[4357150208] creating task 0x7b3c00000000 with parent 0x0
[4357150208] tsan_release on 0x7b3c00000000    <<< new release edge
[139088221442048] tsan_acquire on 0x7b3c00000000
[4357150208] task 0x7b3800000000 waiting on task 0x7b3c00000000, going to sleep
[4357150208] tsan_release on 0x7b3800000000
[4357150208] tsan_release on 0x7b3800000000
[139088221442048] getting current executor 0x0
[139088221442048] tsan_release on 0x7b3c00000000
...
```

rdar://78932849

* Add static_cast<Job *>()

* Move TSan release edge to swift_task_enqueueGlobal()

Move the TSan release edge from `swift_task_create_commonImpl()` to
`swift_task_enqueueGlobalImpl()`.  Task creation itself is not an event
that needs synchronization, but rather that task creation "happens
before" execution of that task on another thread.

This edge is usually added when the task is scheduled via
`swift_task_enqueue()` (which then usually calls
`swift_task_enqueueGlobal()`).  However, not all task scheduling goes
through the `swift_task_enqueue()` funnel as some places call the more
specific `swift_task_enqueueGlobal()` directly.  So let's annotate this
function (duplicate edges aren't harmful) to ensure we cover all
schedule events, including newly-created tasks (our original problem
here).

rdar://78932849

Co-authored-by: Julian Lettner <julian.lettner@apple.com>
Azoy pushed a commit that referenced this pull request Dec 2, 2021
Case #1. Literal zero = natural alignment
   %1 = integer_literal $Builtin.Int64, 0
   %2 = builtin "uncheckedAssertAlignment"
	(%0 : $Builtin.RawPointer, %1 : $Builtin.Int64) : $Builtin.RawPointer
   %3 = pointer_to_address %2 : $Builtin.RawPointer to [align=1] $*Int

   Erases the `pointer_to_address` `[align=]` attribute:

Case #2. Literal nonzero = forced alignment.

   %1 = integer_literal $Builtin.Int64, 16
   %2 = builtin "uncheckedAssertAlignment"
	(%0 : $Builtin.RawPointer, %1 : $Builtin.Int64) : $Builtin.RawPointer
   %3 = pointer_to_address %2 : $Builtin.RawPointer to [align=1] $*Int

   Promotes the `pointer_to_address` `[align=]` attribute to a higher value.

Case #3. Folded dynamic alignment

   %1 = builtin "alignof"<T>(%0 : $@thin T.Type) : $Builtin.Word
   %2 = builtin "uncheckedAssertAlignment"
	(%0 : $Builtin.RawPointer, %1 : $Builtin.Int64) : $Builtin.RawPointer
   %3 = pointer_to_address %2 : $Builtin.RawPointer to [align=1] $*T

   Erases the `pointer_to_address` `[align=]` attribute.
Azoy pushed a commit that referenced this pull request Dec 13, 2021
Update DifferentiableProgrammingImplementation.md
Azoy pushed a commit that referenced this pull request Sep 25, 2022
While trying to reuse the liveness-points analysis originally in DI for
injecting actor hops for more general purposes, Pavel and I discovered
that the point at which we are injecting the hops might not have
fully-computed the liveness information.

That appears to be the case because we were computing the fully-initialized
points before having processed destroy/releases of TheMemory. While this
most likely had no influence on the actor hop injection, it does affect
what the outgoing AvailabilitySet contains for a block. In particular, for
this example:

```swift
struct X {
  init(cond: Bool) {
    var _storage: (name: String, age: Int)
    _storage.name = ""
    if cond {
      _storage.age = 30
    } else {
      _storage.age = 40
    }
  }
}
```

But because we are determine the full initialization points before processing
the destroy, the liveness analysis doesn't iterate to correctly determine the
out-availability of block 1 and 3 (corresponding to the then and else blocks
in the example above). Here's the debug output showing that issue:

```
*** Definite Init looking at:   %5 = mark_uninitialized [var] %4 : $*(name: String, age: Int) // users: %37, %12, %22, %32

Get liveness 0, #1 at   assign %11 to %13 : $*String                    // id: %14
Get liveness 1, #1 at   assign %21 to %23 : $*Int                       // id: %24
  Get liveness for block 1
    Iteration 0
    Result: (yn)
Get liveness 1, #1 at   assign %31 to %33 : $*Int                       // id: %34
  Get liveness for block 3
    add block 2 to worklist
    Iteration 0
      Block 2 out: (yn)
    Iteration 1
      Block 2 out: (yn)
    Result: (yn)
full-init-finder: rejecting bb0 b/c non-Yes OUT avail
full-init-finder: rejecting bb1 b/c non-Yes OUT avail
full-init-finder: rejecting bb2 b/c no non-load uses.
full-init-finder: rejecting bb3 b/c non-Yes OUT avail
full-init-finder: rejecting bb4 b/c no non-load uses.
Get liveness 0, #2 at   destroy_addr %5 : $*(name: String, age: Int)    // id: %37
  Get liveness for block 4
    add block 3 to worklist
    add block 1 to worklist
    Iteration 0
      Block 1 out: (yy)
      Block 3 out: (yy)
    Iteration 1
      Block 1 out: (yy)
      Block 3 out: (yy)
    Result: (yy)
```

So, this patch basically just sinks the computation so it happens after, so that
we force the incremental liveness analysis to also consider the liveness at the
point of the destroy, but before having done any other transformations or modifications
to the CFG to handle a destroy of something partially initialized.
Azoy pushed a commit that referenced this pull request Nov 25, 2023
Azoy added a commit that referenced this pull request Jan 16, 2024
Co-authored-by: Karoy Lorentey <klorentey@apple.com>
Azoy pushed a commit that referenced this pull request Feb 4, 2024
[Inclusive-Language] change comment with "sanity" to "soundness"
Azoy pushed a commit that referenced this pull request Feb 23, 2024
Azoy pushed a commit that referenced this pull request May 1, 2024
Add a new demangler option which excludes a closure's type signature.

This will be used in lldb.

Closures are not subject to overloading, and so the signature will never be used to 
disambiguate. A demangled closure is uniquely identifiable by its index(s) and parent.

Where opaque types are involved, the concrete type signature can be quite complex. This 
demangling option allows callers to avoid printing the underlying complex nested 
concrete types.

Example:

before: `closure #1 (Swift.Int) -> () in closure #1 (Swift.Int) -> () in main`
after: `closure #1 in closure #1 in main`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants