Use OOL transfers for macOS payloads. #199

Merged
merged 6 commits into master from mac-ool on Aug 15, 2018

Conversation

6 participants
@jdm
Member

jdm commented May 29, 2018

Fixes #98. This was much easier than figuring out how to calculate if a message would exceed the maximum limit, and all tests continue to pass.

@pcwalton

Collaborator

pcwalton commented May 29, 2018

Seems fine, r=me. We can fix this later if it turns out to be a perf problem.

src/test.rs Outdated
@@ -428,3 +428,12 @@ fn test_reentrant() {
sender.send(null.clone()).unwrap();
assert_eq!(null, receiver.recv().unwrap());
}
#[test]
fn test_large_send() {

@antrik

antrik May 29, 2018

Contributor

We don't use test_ prefixes in Rust...

BTW, is there some specific reason why you put this test in src/test.rs rather than src/platform/test.rs, where the existing big data tests are?

(This would make it easier to just copy the existing big_data() test as huge_data() or so, thus addressing the problems pointed out below...)

src/test.rs Outdated
#[test]
fn test_large_send() {
let payload = vec![5u32; 1024 * 1024 * 12];

@antrik

antrik May 29, 2018

Contributor

Such a simplistic "pattern" doesn't ensure that the data was actually transferred correctly. IIRC one of the first bugs I fixed in ipc-channel was obscured by that, which is why I modified the existing tests in src/platform/test.rs...

src/test.rs Outdated
let payload = vec![5u32; 1024 * 1024 * 12];
let (tx, rx) = ipc::channel().unwrap();
tx.send(payload.clone()).unwrap();
let received_payload = rx.recv().unwrap();

@antrik

antrik May 29, 2018

Contributor

This won't fly on platforms that block while sending large messages. (Including unix, as well as the upcoming windows.) See existing big_data() test...
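
For reference, a minimal sketch of what such a test could look like, combining both points above: a non-trivial byte pattern so corruption would actually be detected, and sending from a separate thread so a back-end that blocks on large sends cannot deadlock the test. The huge_data name, the 64 MiB size, and the use of ipc::channel with std::thread are illustrative assumptions, not code from this PR:

#[test]
fn huge_data() {
    // Varying pattern: corrupted, truncated, or reordered data changes the result.
    let payload: Vec<u8> = (0..64 * 1024 * 1024u32).map(|i| (i % 251) as u8).collect();
    let expected = payload.clone();
    let (tx, rx) = ipc::channel().unwrap();
    // Send on another thread, since some back-ends block while
    // transferring messages this large.
    let sender = std::thread::spawn(move || {
        tx.send(payload).unwrap();
    });
    let received: Vec<u8> = rx.recv().unwrap();
    sender.join().unwrap();
    assert_eq!(received, expected);
}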

@antrik

Contributor

antrik commented May 29, 2018

@jdm can you please run the benchmarks, and post before/after results? Intuitively, I would expect messages smaller than a page size to become significantly slower...

@jdm

Member

jdm commented May 30, 2018

Oh good, benchmarks yield this on master:

test ipc::receiver_set::add_and_remove_100_closed_receivers ... bench:   1,526,152 ns/iter (+/- 306,578)
test ipc::receiver_set::add_and_remove_10_closed_receivers  ... thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Custom(Custom { kind: Other, error: StringError("Unknown Mach error: 3") }) }', libcore/result.rs:916:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: bench failed

@jdm force-pushed the mac-ool branch from 5a2817d to a1bb188 on May 30, 2018

@antrik

Contributor

antrik commented May 30, 2018

@jdm well, now that you mention it, it's kinda expected that benchmarks would fail, since they measure sizes up to 16 MiB... I don't understand though why it panics at this particular point. Maybe the output is misleading due to buffering or something?...

Just try removing the tests that are too big, and check the remaining results I guess?

(Or remove any other problematic tests. Though in that case it would be very desirable to create a proper test case to reproduce the problem...)

@jdm

Member

jdm commented May 30, 2018

With fixes to the library to allow the benchmarks to complete:

 name                                                    before ns/iter  after ns/iter  diff ns/iter   diff %  speedup
 ipc::receiver_set::add_and_remove_100_closed_receivers  1,823,385       1,436,453          -386,932  -21.22%   x 1.27
 ipc::receiver_set::add_and_remove_10_closed_receivers   111,555         100,058             -11,497  -10.31%   x 1.11
 ipc::receiver_set::add_and_remove_1_closed_receivers    12,855          10,877               -1,978  -15.39%   x 1.18
 ipc::receiver_set::create_and_destroy_empty_set         1,980           1,825                  -155   -7.83%   x 1.08
 ipc::receiver_set::create_and_destroy_set_of_1          9,683           9,660                   -23   -0.24%   x 1.00
 ipc::receiver_set::create_and_destroy_set_of_10         99,107          86,550              -12,557  -12.67%   x 1.15
 ipc::receiver_set::create_and_destroy_set_of_100        1,332,879       1,313,073           -19,806   -1.49%   x 1.02
 ipc::receiver_set::send_on_100_of_100                   816,479         990,687             174,208   21.34%   x 0.82
 ipc::receiver_set::send_on_1_of_1                       2,765           4,420                 1,655   59.86%   x 0.63
 ipc::receiver_set::send_on_1_of_100                     2,833           4,366                 1,533   54.11%   x 0.65
 ipc::receiver_set::send_on_1_of_20                      2,787           4,686                 1,899   68.14%   x 0.59
 ipc::receiver_set::send_on_1_of_5                       2,767           4,453                 1,686   60.93%   x 0.62
 ipc::receiver_set::send_on_20_of_100                    84,450          116,330              31,880   37.75%   x 0.73
 ipc::receiver_set::send_on_20_of_20                     83,894          116,547              32,653   38.92%   x 0.72
 ipc::receiver_set::send_on_2_of_5                       6,249           9,344                 3,095   49.53%   x 0.67
 ipc::receiver_set::send_on_5_of_100                     16,881          27,673               10,792   63.93%   x 0.61
 ipc::receiver_set::send_on_5_of_20                      16,944          27,459               10,515   62.06%   x 0.62
 ipc::receiver_set::send_on_5_of_5                       16,923          28,898               11,975   70.76%   x 0.59
 ipc::transfer_empty                                     2,133           4,333                 2,200  103.14%   x 0.49
 ipc::transfer_receivers_00                              2,307           11,483                9,176  397.75%   x 0.20
 ipc::transfer_receivers_01                              2,682           11,826                9,144  340.94%   x 0.23
 ipc::transfer_receivers_08                              4,548           14,427                9,879  217.22%   x 0.32
 ipc::transfer_receivers_64                              17,836          32,875               15,039   84.32%   x 0.54
 ipc::transfer_senders_00                                2,201           10,900                8,699  395.23%   x 0.20
 ipc::transfer_senders_01                                2,450           11,220                8,770  357.96%   x 0.22
 ipc::transfer_senders_08                                3,187           12,079                8,892  279.01%   x 0.26
 ipc::transfer_senders_64                                7,151           16,628                9,477  132.53%   x 0.43
 platform::create_channel                                6,186           6,984                   798   12.90%   x 0.89
 platform::transfer_data_00_1                            1,932           10,576                8,644  447.41%   x 0.18
 platform::transfer_data_01_2                            1,942           10,320                8,378  431.41%   x 0.19
 platform::transfer_data_02_4                            1,961           9,135                 7,174  365.83%   x 0.21
 platform::transfer_data_03_8                            1,935           8,994                 7,059  364.81%   x 0.22
 platform::transfer_data_04_16                           1,927           9,002                 7,075  367.15%   x 0.21
 platform::transfer_data_05_32                           1,942           9,070                 7,128  367.04%   x 0.21
 platform::transfer_data_06_64                           2,073           8,893                 6,820  328.99%   x 0.23
 platform::transfer_data_07_128                          2,062           9,066                 7,004  339.67%   x 0.23
 platform::transfer_data_08_256                          2,199           9,099                 6,900  313.78%   x 0.24
 platform::transfer_data_09_512                          2,145           9,252                 7,107  331.33%   x 0.23
 platform::transfer_data_10_1k                           2,168           11,614                9,446  435.70%   x 0.19
 platform::transfer_data_11_2k                           2,291           11,181                8,890  388.04%   x 0.20
 platform::transfer_data_12_4k                           3,409           12,503                9,094  266.76%   x 0.27
 platform::transfer_data_13_8k                           11,665          16,044                4,379   37.54%   x 0.73
 platform::transfer_data_14_16k                          21,808          24,807                2,999   13.75%   x 0.88
 platform::transfer_data_15_32k                          30,883          40,757                9,874   31.97%   x 0.76
 platform::transfer_data_16_64k                          50,192          74,566               24,374   48.56%   x 0.67
 platform::transfer_data_17_128k                         76,741          140,521              63,780   83.11%   x 0.55
 platform::transfer_data_18_256k                         128,156         274,191             146,035  113.95%   x 0.47
 platform::transfer_data_19_512k                         225,774         554,454             328,680  145.58%   x 0.41
 platform::transfer_data_20_1m                           462,109         1,127,664           665,555  144.03%   x 0.41
 platform::transfer_data_21_2m                           1,001,955       2,606,104         1,604,149  160.10%   x 0.38
 platform::transfer_data_22_4m                           4,000,435       6,415,716         2,415,281   60.38%   x 0.62
 platform::transfer_data_23_8m                           8,585,005       12,063,144        3,478,139   40.51%   x 0.71
@antrik

Contributor

antrik commented May 31, 2018

Seeing these results, is anyone still in favour of going forward with this approach?...

I can think of a bunch of avenues for improving efficiency here. One potential way would be to attempt optimising the IpcSharedMemory mechanism in general. The copy and allocation in from_bytes() could be avoided I think -- but that would be a rather invasive change (breaking the public interface I believe), since IpcSharedMemory would have to keep a reference to the original memory. Also, AIUI it would preclude use of the deallocate option while sending; and make cloning SHM regions more tricky -- so I'm not sure it would actually be an optimisation... (OTOH, it might in fact avoid the need for cloning?)

Instead of changing the IpcSharedMemory mechanism, we could do the out-of-line transfer manually -- thus allowing the above optimisation with less effort, and without breaking anything. (Not sure whether there are any other optimisations to be done here...)

Either way, I'm pretty sure performance for small messages would still be very poor -- so regardless of any out-of-line transfer improvements, I believe that method must be used only conditionally. Since the code for in-line transfers is already there, I hope adding the conditional shouldn't be too hard... The most interesting question there is how the receiver recognises whether out-of-line transfer was used or not. (Having a zero in-line data size might work? Not sure.)
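
To make the conditional concrete, here is a rough sketch of a sender-side choice between inline and out-of-line transfer. The Transfer enum and INLINE_THRESHOLD constant are hypothetical names for illustration; OsIpcSharedMemory::from_bytes() is the existing constructor discussed above, and the receiver-side tagging question is left open:

// Hypothetical cutoff below which inline transfer is attempted.
const INLINE_THRESHOLD: usize = 45 * 1024 * 1024;

enum Transfer<'a> {
    Inline(&'a [u8]),
    OutOfLine(OsIpcSharedMemory),
}

fn choose_transfer(data: &[u8]) -> Transfer<'_> {
    if data.len() <= INLINE_THRESHOLD {
        // Small payloads ride along in the message body itself.
        Transfer::Inline(data)
    } else {
        // Large payloads are copied into a shared-memory region and
        // transferred out-of-line.
        Transfer::OutOfLine(OsIpcSharedMemory::from_bytes(data))
    }
}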

@antrik

Contributor

antrik commented May 31, 2018

@jdm BTW, out of curiosity: what kind of machine did you run this on? The results are significantly worse across the board than on my current GNU/Linux system; and some are even worse than on my previous, much slower machine... Mach IPC is sure not known to be very efficient, compared to newer micro-kernels -- but I seriously wouldn't have expected it to do worse than Unix sockets...

@jdm

Member

jdm commented May 31, 2018

My original change supported both inline and out-of-line transfers, but I couldn't figure out how to automatically determine if an inline transfer would exceed the undocumented port send limits. I'll try just choosing a size and adding a boolean to the message payload that indicates which kind of transfer is in use.

@jdm

Member

jdm commented May 31, 2018

As for machine in use, this is a 2015 MBP.

@antrik

Contributor

antrik commented May 31, 2018

Does the kernel return a clear error when attempting to send an overly big message? If so, maybe you could just fall back to OOL in that case... (IIRC the unix back-end did that at one point.)

If you keep track across calls of the size at which this starts happening, this wouldn't even cause significant overhead on repeated huge sends...

@pcwalton

Collaborator

pcwalton commented May 31, 2018

So I believe the relevant kernel check is:

https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/ipc/ipc_kmsg.c#L1561

Which references the ipc_kmsg_max_body_space value defined here:

https://github.com/apple/darwin-xnu/blob/master/osfmk/ipc/ipc_init.c#L122

That value works out to (64 * 1024 * 1024 * 3) / 4 == 50331648 bytes, i.e. 48 MiB.

I think it'd be best to not rely on this and to instead dynamically determine the maximum size. Maybe we could have a global atomic "max safe size" variable that starts at, say, 45MB and if we get MACH_SEND_TOO_LARGE we go ahead and send out-of-line, then halve the value for the next go-round.

@antrik

Contributor

antrik commented May 31, 2018

@pcwalton that's more or less what I had in mind -- except that instead of starting with a fixed limit and halving on failure, I'd just record "that's the smallest size yet that failed"; so future sends at or above that size will go for OOL directly.
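
In other words, the bookkeeping could be as simple as the following sketch (helper names are hypothetical; the actual mach_msg() call, the out-of-line retry, and the race-freeness of the update are left out here):

use std::sync::atomic::{AtomicUsize, Ordering};

// Smallest inline size observed to fail with MACH_SEND_TOO_LARGE; sends at or
// above this size go straight to out-of-line. Starting unbounded means the
// shrinking path actually gets exercised.
static SMALLEST_FAILED_INLINE_SIZE: AtomicUsize = AtomicUsize::new(usize::MAX);

fn inline_send_worth_trying(len: usize) -> bool {
    len < SMALLEST_FAILED_INLINE_SIZE.load(Ordering::Relaxed)
}

fn note_inline_send_too_large(len: usize) {
    // Remember the size that just failed; see the review discussion further
    // down about making this update race-free.
    SMALLEST_FAILED_INLINE_SIZE.store(len, Ordering::Relaxed);
}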

@jdm

Member

jdm commented May 31, 2018

Benchmarks for a static cutoff of 45 MiB:

 name                                                    before ns/iter  after2 ns/iter  diff ns/iter   diff %  speedup
 ipc::receiver_set::add_and_remove_100_closed_receivers  1,502,693       1,464,374            -38,319   -2.55%   x 1.03
 ipc::receiver_set::add_and_remove_10_closed_receivers   117,146         99,343               -17,803  -15.20%   x 1.18
 ipc::receiver_set::add_and_remove_1_closed_receivers    12,959          11,513                -1,446  -11.16%   x 1.13
 ipc::receiver_set::create_and_destroy_empty_set         2,129           1,833                   -296  -13.90%   x 1.16
 ipc::receiver_set::create_and_destroy_set_of_1          10,028          9,706                   -322   -3.21%   x 1.03
 ipc::receiver_set::create_and_destroy_set_of_10         88,978          89,486                   508    0.57%   x 0.99
 ipc::receiver_set::create_and_destroy_set_of_100        1,328,397       1,303,029            -25,368   -1.91%   x 1.02
 ipc::receiver_set::send_on_100_of_100                   813,873         820,226                6,353    0.78%   x 0.99
 ipc::receiver_set::send_on_1_of_1                       2,898           2,909                     11    0.38%   x 1.00
 ipc::receiver_set::send_on_1_of_100                     2,835           2,797                    -38   -1.34%   x 1.01
 ipc::receiver_set::send_on_1_of_20                      2,884           2,848                    -36   -1.25%   x 1.01
 ipc::receiver_set::send_on_1_of_5                       2,837           2,803                    -34   -1.20%   x 1.01
 ipc::receiver_set::send_on_20_of_100                    85,857          88,973                 3,116    3.63%   x 0.96
 ipc::receiver_set::send_on_20_of_20                     84,916          89,276                 4,360    5.13%   x 0.95
 ipc::receiver_set::send_on_2_of_5                       6,293           6,571                    278    4.42%   x 0.96
 ipc::receiver_set::send_on_5_of_100                     17,090          16,940                  -150   -0.88%   x 1.01
 ipc::receiver_set::send_on_5_of_20                      17,166          16,899                  -267   -1.56%   x 1.02
 ipc::receiver_set::send_on_5_of_5                       17,078          17,020                   -58   -0.34%   x 1.00
 ipc::transfer_empty                                     2,156           2,166                     10    0.46%   x 1.00
 ipc::transfer_receivers_00                              2,248           2,255                      7    0.31%   x 1.00
 ipc::transfer_receivers_01                              2,657           2,655                     -2   -0.08%   x 1.00
 ipc::transfer_receivers_08                              4,496           4,512                     16    0.36%   x 1.00
 ipc::transfer_receivers_64                              17,995          18,266                   271    1.51%   x 0.99
 ipc::transfer_senders_00                                2,256           2,316                     60    2.66%   x 0.97
 ipc::transfer_senders_01                                2,496           2,613                    117    4.69%   x 0.96
 ipc::transfer_senders_08                                3,191           3,273                     82    2.57%   x 0.97
 ipc::transfer_senders_64                                7,126           7,280                    154    2.16%   x 0.98
 platform::create_channel                                6,236           6,428                    192    3.08%   x 0.97
 platform::transfer_data_00_1                            1,982           1,954                    -28   -1.41%   x 1.01
 platform::transfer_data_01_2                            2,019           1,971                    -48   -2.38%   x 1.02
 platform::transfer_data_02_4                            2,002           1,961                    -41   -2.05%   x 1.02
 platform::transfer_data_03_8                            1,993           1,957                    -36   -1.81%   x 1.02
 platform::transfer_data_04_16                           2,012           2,061                     49    2.44%   x 0.98
 platform::transfer_data_05_32                           2,023           1,963                    -60   -2.97%   x 1.03
 platform::transfer_data_06_64                           2,156           2,108                    -48   -2.23%   x 1.02
 platform::transfer_data_07_128                          2,174           2,108                    -66   -3.04%   x 1.03
 platform::transfer_data_08_256                          2,167           2,192                     25    1.15%   x 0.99
 platform::transfer_data_09_512                          2,159           2,279                    120    5.56%   x 0.95
 platform::transfer_data_10_1k                           2,220           2,247                     27    1.22%   x 0.99
 platform::transfer_data_11_2k                           2,418           2,306                   -112   -4.63%   x 1.05
 platform::transfer_data_12_4k                           3,356           3,517                    161    4.80%   x 0.95
 platform::transfer_data_13_8k                           8,998           8,990                     -8   -0.09%   x 1.00
 platform::transfer_data_14_16k                          19,431          21,430                 1,999   10.29%   x 0.91
 platform::transfer_data_15_32k                          31,892          29,587                -2,305   -7.23%   x 1.08
 platform::transfer_data_16_64k                          49,643          45,732                -3,911   -7.88%   x 1.09
 platform::transfer_data_17_128k                         76,313          76,559                   246    0.32%   x 1.00
 platform::transfer_data_18_256k                         124,937         138,614               13,677   10.95%   x 0.90
 platform::transfer_data_19_512k                         225,821         223,777               -2,044   -0.91%   x 1.01
 platform::transfer_data_20_1m                           461,280         459,871               -1,409   -0.31%   x 1.00
 platform::transfer_data_21_2m                           1,007,193       991,089              -16,104   -1.60%   x 1.02
 platform::transfer_data_22_4m                           3,916,091       3,849,321            -66,770   -1.71%   x 1.02
 platform::transfer_data_23_8m                           8,557,108       8,188,485           -368,623   -4.31%   x 1.05
@jdm

Member

jdm commented May 31, 2018

It's not really clear to me why so many benchmarks would claim to get faster when the actual code changes mean that they are branching a bit more and allocating a bit more space for every message.

@antrik

Contributor

antrik commented May 31, 2018

These benchmarks are unfortunately quite noisy -- the discrepancies seem in line with random variations I have seen myself. You'd probably get the same amount of variation between runs with the same code.

Having said that, I have occasionally seen some unexpected variations when doing code changes... My best guess are compiler optimisation anomalies. (Including interactions with the test bench.)

@jdm force-pushed the mac-ool branch 2 times, most recently from 168b949 to 9806974 on May 31, 2018

@antrik

Just a quick remark for now. Properly reviewing all the fragile pointer wrangling will take longer I'm afraid...

@@ -309,11 +310,13 @@ enum SendData<'a> {
OutOfLine(Option<OsIpcSharedMemory>),
}
const MAX_INLINE_SIZE: usize = 45 * 1024 * 1024;
lazy_static! {
static ref MAX_INLINE_SIZE: AtomicUsize = AtomicUsize::new(45 * 1024 * 1024);

@antrik

antrik May 31, 2018

Contributor

I'd rather just go with usize::max_value() for the starting value... Not least to make sure the auto-shrinking code actually gets exercised.

(On that note, to really test this, the test case would need to do a sequence of sends at sizes above and below 48 MiB...)

@antrik

Contributor

antrik commented May 31, 2018

Fixups look fine; please autosquash, so I can r+ once I'm done reviewing the pointer wrangling.

(Probably won't be today, though...)

@jdm force-pushed the mac-ool branch from 371ebeb to 426524a on May 31, 2018

@jdm

Member

jdm commented Jun 1, 2018

As written, this is causing surprising behaviour in Servo that I'm investigating. Please hold off merging.

@antrik

Contributor

antrik commented Jun 1, 2018

@jdm I just realised that the performance difference for large transfers might actually be real: since you are sending an extra bool value before the data (which is 4 bytes presumably?), that changes the alignment of the data. For the Unix back-end, I found that ensuring the data is always 8-byte aligned makes quite a difference in some cases...

@antrik

Contributor

antrik commented Jun 1, 2018

Another thing I just realised is that we do not seem to have any tests for transferring large amounts of data along with channels/SHM regions in the same message... This could be quite relevant here?

@antrik

Contributor

antrik commented Jun 1, 2018

Actually, that's not entirely true: there is big_data_with_sender_transfer() and big_data_with_{n}_fds() -- but nothing for SHM I think; and the huge data test you added doesn't have either...
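
A sketch of what such a combined test might look like, bundling a shared-memory region with a large payload in one message; the names, sizes, and imports are illustrative assumptions, while IpcSharedMemory::from_bytes() is the existing constructor:

#[test]
fn huge_data_with_shared_memory() {
    let payload: Vec<u8> = (0..64 * 1024 * 1024u32).map(|i| (i % 251) as u8).collect();
    let shmem = IpcSharedMemory::from_bytes(&[0xba; 1024]);
    let expected = payload.clone();
    let (tx, rx) = ipc::channel().unwrap();
    let sender = std::thread::spawn(move || {
        // Large payload and SHM region travel in the same message.
        tx.send((payload, shmem)).unwrap();
    });
    let (received, shmem_back): (Vec<u8>, IpcSharedMemory) = rx.recv().unwrap();
    sender.join().unwrap();
    assert_eq!(received, expected);
    assert_eq!(&shmem_back[..], &[0xba; 1024][..]);
}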

bors-servo added a commit that referenced this pull request Jun 5, 2018

Auto merge of #200 - servo:bench, r=pcwalton
Allow benchmarks to run to completion on macOS

These commits fix a series of panics that I encountered while attempting to run benchmarks on macOS. The benchmarks now complete, as can be observed in #199.
@jdm

Member

jdm commented Jun 5, 2018

The benchmarks look a little more even changing the bool to a usize:

 name                                                    before ns/iter  after2 ns/iter  diff ns/iter   diff %  speedup
 ipc::receiver_set::add_and_remove_100_closed_receivers  1,483,612       1,494,973             11,361    0.77%   x 0.99
 ipc::receiver_set::add_and_remove_10_closed_receivers   104,406         106,324                1,918    1.84%   x 0.98
 ipc::receiver_set::add_and_remove_1_closed_receivers    11,527          11,499                   -28   -0.24%   x 1.00
 ipc::receiver_set::create_and_destroy_empty_set         1,899           2,095                    196   10.32%   x 0.91
 ipc::receiver_set::create_and_destroy_set_of_1          10,261          10,104                  -157   -1.53%   x 1.02
 ipc::receiver_set::create_and_destroy_set_of_10         91,047          91,539                   492    0.54%   x 0.99
 ipc::receiver_set::create_and_destroy_set_of_100        1,370,846       1,372,504              1,658    0.12%   x 1.00
 ipc::receiver_set::send_on_100_of_100                   845,047         938,717               93,670   11.08%   x 0.90
 ipc::receiver_set::send_on_1_of_1                       3,015           3,206                    191    6.33%   x 0.94
 ipc::receiver_set::send_on_1_of_100                     3,075           3,042                    -33   -1.07%   x 1.01
 ipc::receiver_set::send_on_1_of_20                      2,964           3,255                    291    9.82%   x 0.91
 ipc::receiver_set::send_on_1_of_5                       2,975           3,184                    209    7.03%   x 0.93
 ipc::receiver_set::send_on_20_of_100                    86,980          84,308                -2,672   -3.07%   x 1.03
 ipc::receiver_set::send_on_20_of_20                     91,205          85,310                -5,895   -6.46%   x 1.07
 ipc::receiver_set::send_on_2_of_5                       7,455           7,027                   -428   -5.74%   x 1.06
 ipc::receiver_set::send_on_5_of_100                     20,428          18,301                -2,127  -10.41%   x 1.12
 ipc::receiver_set::send_on_5_of_20                      18,857          19,264                   407    2.16%   x 0.98
 ipc::receiver_set::send_on_5_of_5                       17,391          18,914                 1,523    8.76%   x 0.92
 ipc::transfer_empty                                     2,352           2,479                    127    5.40%   x 0.95
 ipc::transfer_receivers_00                              2,295           2,598                    303   13.20%   x 0.88
 ipc::transfer_receivers_01                              2,721           3,049                    328   12.05%   x 0.89
 ipc::transfer_receivers_08                              4,853           4,927                     74    1.52%   x 0.98
 ipc::transfer_receivers_64                              21,182          19,120                -2,062   -9.73%   x 1.11
 ipc::transfer_senders_00                                2,355           2,349                     -6   -0.25%   x 1.00
 ipc::transfer_senders_01                                2,515           2,672                    157    6.24%   x 0.94
 ipc::transfer_senders_08                                3,480           3,687                    207    5.95%   x 0.94
 ipc::transfer_senders_64                                9,680           7,567                 -2,113  -21.83%   x 1.28
 platform::create_channel                                7,155           6,535                   -620   -8.67%   x 1.09
 platform::transfer_data_00_1                            2,042           2,026                    -16   -0.78%   x 1.01
 platform::transfer_data_01_2                            2,309           2,041                   -268  -11.61%   x 1.13
 platform::transfer_data_02_4                            2,342           2,076                   -266  -11.36%   x 1.13
 platform::transfer_data_03_8                            2,273           2,030                   -243  -10.69%   x 1.12
 platform::transfer_data_04_16                           2,312           2,066                   -246  -10.64%   x 1.12
 platform::transfer_data_05_32                           2,255           2,004                   -251  -11.13%   x 1.13
 platform::transfer_data_06_64                           2,349           2,186                   -163   -6.94%   x 1.07
 platform::transfer_data_07_128                          2,366           2,467                    101    4.27%   x 0.96
 platform::transfer_data_08_256                          2,652           2,501                   -151   -5.69%   x 1.06
 platform::transfer_data_09_512                          2,658           2,498                   -160   -6.02%   x 1.06
 platform::transfer_data_10_1k                           2,638           2,496                   -142   -5.38%   x 1.06
 platform::transfer_data_11_2k                           2,578           2,580                      2    0.08%   x 1.00
 platform::transfer_data_12_4k                           3,671           3,692                     21    0.57%   x 0.99
 platform::transfer_data_13_8k                           10,799          13,335                 2,536   23.48%   x 0.81
 platform::transfer_data_14_16k                          23,440          21,213                -2,227   -9.50%   x 1.10
 platform::transfer_data_15_32k                          36,161          28,575                -7,586  -20.98%   x 1.27
 platform::transfer_data_16_64k                          46,384          45,990                  -394   -0.85%   x 1.01
 platform::transfer_data_17_128k                         86,297          88,275                 1,978    2.29%   x 0.98
 platform::transfer_data_18_256k                         228,817         140,290              -88,527  -38.69%   x 1.63
 platform::transfer_data_19_512k                         408,121         242,042             -166,079  -40.69%   x 1.69
 platform::transfer_data_20_1m                           1,012,572       513,054             -499,518  -49.33%   x 1.97
 platform::transfer_data_21_2m                           1,177,899       1,107,254            -70,645   -6.00%   x 1.06
 platform::transfer_data_22_4m                           4,684,950       4,374,613           -310,337   -6.62%   x 1.07
 platform::transfer_data_23_8m                           9,263,118       9,372,671            109,553    1.18%   x 0.99

@jdm force-pushed the mac-ool branch from 426524a to 49bcb32 on Jun 5, 2018

@jdm

Member

jdm commented Jun 5, 2018

The code is ready for consideration.

@pcwalton

Collaborator

pcwalton commented Jun 6, 2018

Nice, thanks for doing all this work @jdm!

@antrik

Contributor

antrik commented Jun 7, 2018

@jdm doesn't really look more even to me... Would need several runs to get any meaningful comparison though, since random fluctuations between runs are clearly much larger than any actual performance changes there might be :-(

(You might get somewhat more consistent results by temporarily increasing ITERATIONS in benches/bench.rs to 10 or 100... Only helps a little though in my experience. Most of the fluctuations are probably related to cache alignment differences between runs, or something like that.)

@antrik

Contributor

antrik commented Jun 7, 2018

Just to be clear: I don't think it's actually necessary to run more benchmarks, unless you are curious. The current PR shouldn't have any measurable performance impact really as far as I can tell.

let data_size_dest = data_dest as *mut usize;
*data_size_dest = data_size;
let is_inline_dest = data_dest as *mut usize;
*is_inline_dest = data.is_inline() as usize;

@antrik

antrik Jun 7, 2018

Contributor

AIUI, you are using usize now to work around a compiler bug? I'd say that deserves a comment...

@jdm

jdm Jun 7, 2018

Member

No, that was only for data alignment purposes.

@antrik

antrik Jun 7, 2018

Contributor

I think there was a bit of a misunderstanding here. I only brought up alignment changes as a possible explanation for differences in particular benchmark results -- I didn't mean to suggest that you should try to force alignment. For that, using usize for the flag field wouldn't be enough: the Mach port types use 32 bit values, and we have a variable amount of them in the header -- so while using bool does change alignment, whether the change is for worse or for better depends on the situation. To actually force optimal alignment, we'd have to do some sort of adaptive padding.

(Also, usize wouldn't preserve 8-byte alignment on 32 bit systems -- but that probably doesn't matter for the macos back-end?...)
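
For what it's worth, the "adaptive padding" mentioned here just means rounding the variable-sized header up to the payload's alignment before copying the data in. A minimal sketch, not part of this PR:

// Round `header_size` up to the next multiple of `align` (a power of two),
// so the payload that follows always starts at an aligned offset.
fn padded_header_size(header_size: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (header_size + align - 1) & !(align - 1)
}

// For example, padded_header_size(20, 8) == 24 and padded_header_size(24, 8) == 24.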

data_dest = data_dest.offset(mem::size_of::<usize>() as isize);
ptr::copy_nonoverlapping(data.as_ptr(), data_dest, data_size);
data_dest = data_dest.offset(mem::size_of::<usize>() as isize);

@antrik

antrik Jun 7, 2018

Contributor

While this is not wrong, I think it could be pretty hard to read, and even more fragile than necessary... Why not increment is_inline_dest before converting it to data_dest, thus keeping it consistent with the other offset operations?

Or at least add a comment pointing out that this increment is for the "is inline" flag, since that's somewhat non-obvious here...
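
As a self-contained illustration of that suggestion (not the PR's actual buffer layout), the header writes can all be done through the word-sized pointer, with a single cast to a byte pointer for the payload copy; the two-word header and the Vec<usize> backing buffer here exist purely for the example:

use std::mem;
use std::ptr;

// Write a two-word header (flag, payload length) followed by the payload.
// All header offsets stay on the *mut usize pointer; it is cast to *mut u8
// only once, for the byte copy.
unsafe fn write_message(dest: *mut usize, is_inline: bool, payload: &[u8]) {
    *dest = is_inline as usize;                 // "is inline" flag
    *dest.offset(1) = payload.len();            // payload size
    let data_dest = dest.offset(2) as *mut u8;  // payload starts after the header
    ptr::copy_nonoverlapping(payload.as_ptr(), data_dest, payload.len());
}

fn main() {
    let payload = b"out-of-line?";
    // Vec<usize> so the word-sized header writes are aligned.
    let words = 2 + (payload.len() + mem::size_of::<usize>() - 1) / mem::size_of::<usize>();
    let mut buf = vec![0usize; words];
    unsafe { write_message(buf.as_mut_ptr(), true, payload) };
    assert_eq!(buf[0], 1);
    assert_eq!(buf[1], payload.len());
}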

@@ -483,9 +538,19 @@ impl OsIpcSender {
MACH_MSG_TIMEOUT_NONE,
MACH_PORT_NULL);
libc::free(message as *mut _);
if os_result == MACH_SEND_TOO_LARGE && data.is_inline() {
MAX_INLINE_SIZE.store(data.inline_data().len(), Ordering::Relaxed);

@antrik

antrik Jun 7, 2018

Contributor

It seems to me there is a bit of a race condition here: if another thread stored a smaller size in the mean time, we would increase it again here...

(Which in turn might confuse the other thread while it's doing its resend.)
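
One way to make this robust (a sketch, not the PR's code) is to only ever lower the value, e.g. with a compare-exchange loop instead of an unconditional store:

use std::sync::atomic::{AtomicUsize, Ordering};

// Lower `max_inline_size` to `failed_len`, but never raise it: if another
// thread already recorded a smaller failing size, that value wins.
fn shrink_max_inline_size(max_inline_size: &AtomicUsize, failed_len: usize) {
    let mut current = max_inline_size.load(Ordering::Relaxed);
    while failed_len < current {
        match max_inline_size.compare_exchange_weak(
            current,
            failed_len,
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            Ok(_) => break,
            // Another thread changed the value; re-check against it.
            Err(observed) => current = observed,
        }
    }
}

(On Rust 1.45 and later this is simply AtomicUsize::fetch_min with Ordering::Relaxed.)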

if os_result != MACH_MSG_SUCCESS {
return Err(MachError::from(os_result))
}
for outgoing_port in ports {
mem::forget(outgoing_port);

@antrik

antrik Jun 7, 2018

Contributor

I believe this changes error handling behaviour: previously, mem::forget() was applied regardless whether the send was successful or not -- now, it's only done for a successful send; while an unsuccessful one will attempt to drop the ports.

If I'm reading the documentation right, neither approach is strictly correct, since depending on the exact error, the kernel might have either freed the ports or not... Yet attempting to free ports that have already been freed should be unproblematic I think, while the other option means potentially leaking ports -- so this change is probably for the better.

However, since it's orthogonal to the purpose of the commit, and it might have undesirable side effects, I don't think it's a good idea to silently do this along with other changes. It should go into a separate commit at least I'd say...

assert!(payload_size <= max_payload_size);
payload_ptr = payload_ptr.offset(mem::size_of::<usize>() as isize);
let payload = slice::from_raw_parts(payload_ptr, payload_size).to_vec();
let has_inline_data_ptr = shared_memory_descriptor as *mut u8;

@antrik

antrik Jun 7, 2018

Contributor

Similar remark as for the sender side: I think it would be cleaner to cast it to usize, and then cast it again after we are done with this field.

@jdm force-pushed the mac-ool branch from 49bcb32 to fe429b8 on Jul 19, 2018

@jdm

Member

jdm commented Jul 19, 2018

All comments addressed in e191a14, 62a9222, and fe429b8.

@jdm

Member

jdm commented Aug 15, 2018

@pcwalton Can you sign off on these changes?

@pcwalton

Collaborator

pcwalton commented Aug 15, 2018

Looks fine to me.

@jdm

Member

jdm commented Aug 15, 2018

@bors-servo r=pcwalton

@bors-servo

Contributor

bors-servo commented Aug 15, 2018

📌 Commit fe429b8 has been approved by pcwalton

@highfive assigned pcwalton and unassigned metajack on Aug 15, 2018

@bors-servo

Contributor

bors-servo commented Aug 15, 2018

⌛️ Testing commit fe429b8 with merge 92667fa...

bors-servo added a commit that referenced this pull request Aug 15, 2018

Auto merge of #199 - servo:mac-ool, r=pcwalton
Use OOL transfers for macOS payloads.

Fixes #98. This was much easier than figuring out how to calculate if a message would exceed the maximum limit, and all tests continue to pass.
@bors-servo

Contributor

bors-servo commented Aug 15, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: pcwalton
Pushing 92667fa to master...

@bors-servo merged commit fe429b8 into master on Aug 15, 2018

4 of 6 checks passed

continuous-integration/travis-ci/pr: The Travis CI build could not complete due to an error
continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error
Tidelift: Dependencies checked
continuous-integration/appveyor/branch: AppVeyor build succeeded
continuous-integration/appveyor/pr: AppVeyor build succeeded
homu: Test successful

@jdm referenced this pull request on Sep 28, 2018: Release minor version 0.11.1. #206 (closed)
