Remove message boxing for performance improvement#387

Draft
slawlor wants to merge 1 commit into main from messaging_perf

@slawlor
Owner

@slawlor slawlor commented Aug 20, 2025

This extends the message boxing approach proposed in PR #370 and continues those improvements by removing additional type information that isn't necessary.

This diff removes the `BoxedMessage` struct in favor of a crate-private `RactorMessage` trait, which adds the necessary extension methods only within the `ractor` crate. That privatizes the functionality, which was an unintentional leak anyway. This is a semver break because a public type is going away, but the type should have been unused in all contexts. Additionally, `BoxedDowncastErr` is renamed to `DowncastErr` since we are no longer boxing the message, but again this error variant shouldn't really be used publicly.
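For readers unfamiliar with the pattern, here is a rough sketch of a crate-private extension trait with a downcast error. Only the names `RactorMessage` and `DowncastErr` come from the description above; the method names `into_erased` and `from_erased`, the blanket impl, and the unit-struct error are assumptions for illustration. The sketch also still uses `Box<dyn Any + Send>` for type erasure, so it shows the visibility and downcast idea only, not the representation this PR actually uses (that isn't shown in this conversation).

```rust
use std::any::Any;

/// Error returned when an erased message is not of the expected type.
/// The name follows the PR description; the unit-struct shape is assumed.
#[derive(Debug, PartialEq)]
pub struct DowncastErr;

/// Hypothetical crate-private extension trait. `pub(crate)` keeps these
/// helpers out of the public API, so downstream crates cannot call them.
pub(crate) trait RactorMessage: Any + Send + Sized + 'static {
    /// Erase the concrete message type (no wrapper struct involved).
    fn into_erased(self) -> Box<dyn Any + Send> {
        Box::new(self)
    }

    /// Recover the concrete type, or report a type mismatch.
    fn from_erased(msg: Box<dyn Any + Send>) -> Result<Self, DowncastErr> {
        msg.downcast::<Self>()
            .map(|boxed| *boxed)
            .map_err(|_| DowncastErr)
    }
}

// Blanket impl (assumed): every sendable 'static type gets the helpers.
impl<T: Any + Send + 'static> RactorMessage for T {}
```

With this shape, `42u32.into_erased()` followed by `u32::from_erased(..)` round-trips the value, while asking for a different type than the one erased yields `Err(DowncastErr)`.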

Will re-run the performance benchmarks from #370 and post the results as a PR comment once ready.

@codecov

codecov Bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 92.90780% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.35%. Comparing base (e682651) to head (e11daa7).
⚠️ Report is 1 commits behind head on main.

Files with missing lines                Patch %    Lines
ractor_cluster_derive/src/lib.rs        95.38%     12 Missing ⚠️
ractor/src/actor/actor_properties.rs    92.30%     6 Missing ⚠️
ractor/src/factory/job.rs               45.45%     6 Missing ⚠️
ractor/src/factory.rs                   33.33%     2 Missing ⚠️
ractor/src/message.rs                   92.30%     2 Missing ⚠️
ractor/src/actor/tests.rs               88.88%     1 Missing ⚠️
ractor/src/thread_local/inner.rs        90.90%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #387      +/-   ##
==========================================
+ Coverage   85.99%   86.35%   +0.35%     
==========================================
  Files          73       71       -2     
  Lines       13564    13517      -47     
==========================================
+ Hits        11665    11673       +8     
+ Misses       1899     1844      -55     

@slawlor slawlor force-pushed the messaging_perf branch 2 times, most recently from 8dd6003 to 0ac9bc9 Compare August 20, 2025 21:35
@slawlor
Owner Author

slawlor commented Aug 20, 2025

Performance tests

cargo clean && cargo bench -p ractor --profile release --bench actor

Main branch

Excluding the output port tests, which take a very long time and should be isolated anyway (this PR isolates them properly).

Creation of 100 actors  time:   [227.75 µs 234.65 µs 240.96 µs]
Creation of 10000 actors
                        time:   [18.840 ms 19.632 ms 20.495 ms]
Waiting on 100 actors to process first message
                        time:   [865.42 µs 898.55 µs 937.11 µs]
Waiting on 1000 actors to process first message
                        time:   [10.389 ms 10.985 ms 11.657 ms]
Waiting on 100000 messages to be processed
                        time:   [13.157 ms 13.258 ms 13.372 ms]

This branch

Creation of 100 actors  time:   [271.34 µs 282.47 µs 293.46 µs]
Creation of 10000 actors
                        time:   [27.574 ms 28.371 ms 29.157 ms]
Waiting on 100 actors to process first message
                        time:   [940.67 µs 974.26 µs 1.0130 ms]
Waiting on 1000 actors to process first message
                        time:   [9.6253 ms 9.8824 ms 10.179 ms]
Waiting on 100000 messages to be processed
                        time:   [12.740 ms 13.294 ms 14.012 ms]

@slawlor
Owner Author

slawlor commented Aug 20, 2025

Added a new benchmark for message sending to assess if sending (but not receiving) messages is improved with this change.
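To make "sending but not receiving" concrete, here is a minimal stand-in sketch. It uses a `std::sync::mpsc` channel in place of an actor mailbox, so it is not ractor's API and not the benchmark added in this PR; the function name `time_send_only` is hypothetical. The point is only that the timed region covers enqueueing and nothing on the consumer side.

```rust
use std::sync::mpsc;
use std::time::{Duration, Instant};

/// Time how long it takes to enqueue `count` messages.
/// The receiver is returned un-drained so that no receive work
/// is included in the measured interval.
pub fn time_send_only(count: usize) -> (Duration, mpsc::Receiver<u64>) {
    // Stand-in for an actor mailbox (assumption: unbounded queue).
    let (tx, rx) = mpsc::channel::<u64>();

    let start = Instant::now();
    for i in 0..count as u64 {
        // Sending cannot fail here because `rx` is still alive.
        tx.send(i).expect("receiver is still alive");
    }
    let elapsed = start.elapsed();

    (elapsed, rx)
}
```

A real benchmark harness would repeat this many times and report a distribution rather than a single `Duration`.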

This branch

Sending of 1000 messages to an actor
                        time:   [46.212 µs 47.399 µs 48.730 µs]
Sending of 100000 messages to an actor
                        time:   [5.1381 ms 5.3625 ms 5.6080 ms]

main

Patching this benchmark suite on main and running there yields:

Sending of 1000 messages to an actor
                       time:   [70.656 µs 73.087 µs 75.863 µs]
Sending of 100000 messages to an actor
                       time:   [5.6418 ms 5.8865 ms 6.1540 ms]
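For reference, the relative change between the two runs can be worked out from the middle (point estimate) values above. The helper below is a small sketch, with `percent_change` being a name made up for this comment. Applied to the numbers above it gives roughly −35% for 1000 messages (73.087 µs to 47.399 µs) and roughly −9% for 100000 messages (5.8865 ms to 5.3625 ms). These are single local runs, so treat the figures as indicative only.

```rust
/// Relative change of `new` versus `base`, in percent.
/// A negative result means `new` took less time than `base`.
pub fn percent_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}
```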

slawlor added a commit that referenced this pull request Mar 9, 2026
This resolves conflicts between the main branch (with thread-local supervision fixes and version bumps) and PR #387 which refactors message handling to remove boxing complexity for better performance.
slawlor added a commit that referenced this pull request Mar 9, 2026
After merging PR #387 which removed message boxing, the send_message method in ActorRef needs to call send_message directly instead of the removed send_message_unchecked method.
@slawlor slawlor force-pushed the messaging_perf branch 6 times, most recently from 7b08703 to 20f4c26 Compare March 10, 2026 14:36
@github-actions

github-actions Bot commented Mar 10, 2026

Benchmark Comparison (PR vs main)

Results (collapse to hide)
group                                                                   main                                     new
-----                                                                   ----                                     ---
Creation of 100 actors                                                  1.00   478.5±40.54µs        ? ?/sec      1.17   558.5±41.01µs        ? ?/sec
Creation of 1000 actors                                                 1.00      4.1±0.46ms        ? ?/sec      1.33      5.5±0.34ms        ? ?/sec
Sending of 1000 messages to an actor                                                                             1.00   114.3±11.29µs        ? ?/sec
Sending of 100000 messages to an actor                                                                           1.00     11.5±2.80ms        ? ?/sec
Waiting on 100 actors to process first message                          1.22    98.6±23.27µs        ? ?/sec      1.00    80.6±26.21µs        ? ?/sec
Waiting on 100000 messages to be processed                              1.07     28.8±2.42ms        ? ?/sec      1.00     26.9±2.84ms        ? ?/sec
Waiting on 50 messages with large data in the Future to be processed    1.00     65.4±4.45µs        ? ?/sec      1.45    94.5±24.55µs        ? ?/sec
Waiting on 500 actors to process first message                          1.38   248.5±48.70µs        ? ?/sec      1.00   179.9±48.54µs        ? ?/sec
factory_queuer_dispatch/workers/10                                      1.10  1395.4±74.95µs 349.9 KElem/sec     1.00  1267.7±100.21µs 385.2 KElem/sec
factory_queuer_dispatch/workers/100                                     1.15  1423.5±49.59µs 343.0 KElem/sec     1.00  1238.1±28.54µs 394.4 KElem/sec
large_messages/256_byte_messages                                        1.00     57.8±3.21µs 1688.3 KElem/sec    1.13     65.6±5.99µs 1488.7 KElem/sec
large_messages/64_byte_messages                                         1.01     53.8±3.06µs 1816.2 KElem/sec    1.00     53.4±4.08µs 1829.1 KElem/sec
small_messages/16_byte_messages                                         1.03     52.1±2.70µs 1873.2 KElem/sec    1.00     50.7±5.69µs 1924.9 KElem/sec
small_messages/32_byte_messages                                         1.04     53.1±2.97µs 1837.8 KElem/sec    1.00     51.3±4.15µs 1904.0 KElem/sec
small_messages/8_byte_messages                                          1.05     54.3±2.40µs 1799.2 KElem/sec    1.00     51.6±4.42µs 1893.1 KElem/sec
spawn_throughput/10                                                     1.00     56.6±0.57µs 172.6 KElem/sec     1.17     66.2±3.17µs 147.5 KElem/sec
spawn_throughput/100                                                    1.00    432.2±8.97µs 226.0 KElem/sec     1.36   586.6±14.87µs 166.5 KElem/sec

Note: a negative delta indicates the PR is faster (an
improvement); a positive delta means the PR is slower (a
regression). Benchmarks are grouped (e.g., small_messages,
spawn_throughput); the flattened list below shows every
individual comparison for clearer deltas.

=== per-benchmark breakdown ===

Creation of 100 actors
----------------------
main     1.00    478.5±40.54µs       ? ?/sec
new      1.17    558.5±41.01µs       ? ?/sec

Creation of 1000 actors
-----------------------
main     1.00       4.1±0.46ms       ? ?/sec
new      1.33       5.5±0.34ms       ? ?/sec

Sending of 1000 messages to an actor
------------------------------------
new     1.00    114.3±11.29µs       ? ?/sec

Sending of 100000 messages to an actor
--------------------------------------
new     1.00      11.5±2.80ms       ? ?/sec

Waiting on 100 actors to process first message
----------------------------------------------
new      1.00     80.6±26.21µs       ? ?/sec
main     1.22     98.6±23.27µs       ? ?/sec

Waiting on 100000 messages to be processed
------------------------------------------
new      1.00      26.9±2.84ms       ? ?/sec
main     1.07      28.8±2.42ms       ? ?/sec

Waiting on 50 messages with large data in the Future to be processed
--------------------------------------------------------------------
main     1.00      65.4±4.45µs       ? ?/sec
new      1.45     94.5±24.55µs       ? ?/sec

Waiting on 500 actors to process first message
----------------------------------------------
new      1.00    179.9±48.54µs       ? ?/sec
main     1.38    248.5±48.70µs       ? ?/sec

factory_queuer_dispatch/workers/10
----------------------------------
new      1.00  1267.7±100.21µs  385.2 KElem/sec
main     1.10   1395.4±74.95µs  349.9 KElem/sec

factory_queuer_dispatch/workers/100
-----------------------------------
new      1.00   1238.1±28.54µs  394.4 KElem/sec
main     1.15   1423.5±49.59µs  343.0 KElem/sec

large_messages/256_byte_messages
--------------------------------
main     1.00      57.8±3.21µs  1688.3 KElem/sec
new      1.13      65.6±5.99µs  1488.7 KElem/sec

large_messages/64_byte_messages
-------------------------------
new      1.00      53.4±4.08µs  1829.1 KElem/sec
main     1.01      53.8±3.06µs  1816.2 KElem/sec

small_messages/16_byte_messages
-------------------------------
new      1.00      50.7±5.69µs  1924.9 KElem/sec
main     1.03      52.1±2.70µs  1873.2 KElem/sec

small_messages/32_byte_messages
-------------------------------
new      1.00      51.3±4.15µs  1904.0 KElem/sec
main     1.04      53.1±2.97µs  1837.8 KElem/sec

small_messages/8_byte_messages
------------------------------
new      1.00      51.6±4.42µs  1893.1 KElem/sec
main     1.05      54.3±2.40µs  1799.2 KElem/sec

spawn_throughput/10
-------------------
main     1.00      56.6±0.57µs  172.6 KElem/sec
new      1.17      66.2±3.17µs  147.5 KElem/sec

spawn_throughput/100
--------------------
main     1.00     432.2±8.97µs  226.0 KElem/sec
new      1.36    586.6±14.87µs  166.5 KElem/sec

Benchmarks run on shared CI hardware (ubuntu-latest).
Differences under ~5% may be noise.
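The leading number in each `main` / `new` column above appears to be a ratio against the faster of the two runs: the faster side shows 1.00 and the other side shows how many times slower it was. That reading is inferred from the table values, not from the tool's documentation. A small sketch (the function name `ratios` is hypothetical) that reproduces the column from the mean times:

```rust
/// Normalize two timings against the faster one.
/// Returns `(main_ratio, new_ratio)`; the faster side is exactly 1.0.
pub fn ratios(main: f64, new: f64) -> (f64, f64) {
    let fastest = main.min(new);
    (main / fastest, new / fastest)
}
```

For example, "Creation of 100 actors" (478.5 µs vs 558.5 µs) gives about (1.00, 1.17), and "Waiting on 100 actors to process first message" (98.6 µs vs 80.6 µs) gives about (1.22, 1.00), matching the table. Ratios computed from the rounded times shown may differ slightly in the last digit from the reported ones.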

@slawlor
Owner Author

slawlor commented Mar 10, 2026

Claude's new perf analysis

  Performance Comparison: messaging_perf vs main

  ┌─────────────────────────────────────────┬───────────┬────────────────┬───────────┐
  │                Benchmark                │   main    │ messaging_perf │  Change   │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Creation of 100 actors                  │ 142.15 µs │ 126.44 µs      │ −11.1% ✅ │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Creation of 10000 actors                │ 12.193 ms │ 11.210 ms      │ −8.1% ✅  │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Waiting on 100 actors (schedule)        │ 485.62 µs │ 587.53 µs      │ +21% ⚠️   │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Waiting on 1000 actors (schedule)       │ 4.9222 ms │ 5.1070 ms      │ +3.75%    │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Waiting on 100000 messages (throughput) │ 14.647 ms │ 13.387 ms      │ −8.6% ✅  │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Sending of 1000 messages                │ —         │ 32.35 µs       │ new       │
  ├─────────────────────────────────────────┼───────────┼────────────────┼───────────┤
  │ Sending of 100000 messages              │ —         │ 2.96 ms        │ new       │
  └─────────────────────────────────────────┴───────────┴────────────────┴───────────┘

@slawlor slawlor force-pushed the messaging_perf branch 2 times, most recently from b1b1494 to 933fd62 Compare March 10, 2026 17:12