Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Mempool] Transaction validation in parallel #12940

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

sitalkedia
Copy link
Contributor

@sitalkedia sitalkedia commented Apr 18, 2024

Description

Transaction validation is happening in serial today, which hurts the latency when validation is expensive for a certain type of transaction. This should help alleviate the issue.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Test with Forge. Average transaction processing latency goes down from 15 ms (https://aptoslabs.grafana.net/goto/3nJ9oVfIg?orgId=1) to 7 ms after this change (https://aptoslabs.grafana.net/goto/3nJ9oVfIg?orgId=1)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Apr 18, 2024

⏱️ 12h 2m total CI duration on this PR
Job Cumulative Duration Recent Runs
forge-e2e-test / forge 2h 14m 🟥🟥🟥🟩🟩 (+1 more)
rust-smoke-tests 1h 55m 🟩🟩🟥🟩
execution-performance / single-node-performance 1h 40m 🟩🟩🟩🟩
rust-images / rust-all 51m 🟩🟩🟩🟩
rust-targeted-unit-tests 49m 🟩🟩🟩🟩
forge-compat-test / forge 46m 🟩🟩🟩🟩
cli-e2e-tests / run-cli-tests 35m 🟥🟥🟥🟩
rust-lints 29m 🟩🟩🟩🟩
rust-unit-tests 27m 🟩
rust-move-tests 25m 🟩🟩🟩🟩
windows-build 25m 🟩
run-tests-main-branch 18m 🟩🟩🟩🟩
rust-build-cached-packages 17m 🟩🟩🟩🟩
check 16m 🟩🟩🟩🟩
check-dynamic-deps 10m 🟩🟩🟩🟩🟩
general-lints 8m 🟩🟩🟩🟩
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 7m 🟩🟩🟥🟩
node-api-compatibility-tests / node-api-compatibility-tests 3m 🟩🟩🟩🟩
semgrep/ci 2m 🟩🟩🟩🟩🟩
file_change_determinator 51s 🟩🟩🟩🟩🟩
file_change_determinator 51s 🟩🟩🟩🟩🟩
file_change_determinator 45s 🟩🟩🟩🟩
execution-performance / file_change_determinator 40s 🟩🟩🟩🟩
permission-check 23s 🟩🟩🟩🟩🟩
permission-check 16s 🟩🟩🟩🟩🟩
permission-check 15s 🟩🟩🟩🟩
permission-check 15s 🟩🟩🟩🟩🟩
determine-docker-build-metadata 12s 🟩🟩🟩🟩
permission-check 8s 🟩🟩🟩🟩

🚨 2 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
forge-compat-test / forge 10m 14m -32%
rust-move-tests 4m 11m -63%

settingsfeedbackdocs ⋅ learn more about trunk.io

@sitalkedia sitalkedia added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Apr 18, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@zekun000
Copy link
Contributor

hmm, it was parallel before, I don't remember whether I intentionally or accidentally changed it back in
4e2832e

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@sitalkedia
Copy link
Contributor Author

Average transaction processing latency goes down from 15 ms (https://aptoslabs.grafana.net/goto/3nJ9oVfIg?orgId=1) to 7 ms after this change (https://aptoslabs.grafana.net/goto/3nJ9oVfIg?orgId=1)

@sitalkedia sitalkedia enabled auto-merge (squash) April 22, 2024 20:24

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.10.1 ==> d12090bac4cac2d1827fa5258a6e2165edc4ab8c

Compatibility test results for aptos-node-v1.10.1 ==> d12090bac4cac2d1827fa5258a6e2165edc4ab8c (PR)
1. Check liveness of validators at old version: aptos-node-v1.10.1
compatibility::simple-validator-upgrade::liveness-check : committed: 6837 txn/s, latency: 4888 ms, (p50: 4800 ms, p90: 8100 ms, p99: 9900 ms), latency samples: 239300
2. Upgrading first Validator to new version: d12090bac4cac2d1827fa5258a6e2165edc4ab8c
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1751 txn/s, latency: 16217 ms, (p50: 19000 ms, p90: 22400 ms, p99: 22900 ms), latency samples: 91100
3. Upgrading rest of first batch to new version: d12090bac4cac2d1827fa5258a6e2165edc4ab8c
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1766 txn/s, latency: 16412 ms, (p50: 18900 ms, p90: 22500 ms, p99: 22800 ms), latency samples: 91840
4. upgrading second batch to new version: d12090bac4cac2d1827fa5258a6e2165edc4ab8c
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3235 txn/s, latency: 9694 ms, (p50: 10200 ms, p90: 12300 ms, p99: 12600 ms), latency samples: 129420
5. check swarm health
Compatibility test for aptos-node-v1.10.1 ==> d12090bac4cac2d1827fa5258a6e2165edc4ab8c passed
Test Ok

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on d12090bac4cac2d1827fa5258a6e2165edc4ab8c

two traffics test: inner traffic : committed: 7650 txn/s, latency: 5125 ms, (p50: 5100 ms, p90: 6000 ms, p99: 9900 ms), latency samples: 3305180
two traffics test : committed: 100 txn/s, latency: 1973 ms, (p50: 1900 ms, p90: 2200 ms, p99: 7500 ms), latency samples: 1720
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.209, avg: 0.203", "QsPosToProposal: max: 0.260, avg: 0.235", "ConsensusProposalToOrdered: max: 0.457, avg: 0.431", "ConsensusOrderedToCommit: max: 0.427, avg: 0.397", "ConsensusProposalToCommit: max: 0.849, avg: 0.828"]
Max round gap was 1 [limit 4] at version 1586319. Max no progress secs was 4.982531 [limit 15] at version 1586319.
Test Ok

@sitalkedia sitalkedia merged commit 43a0d92 into main Apr 22, 2024
94 of 97 checks passed
@sitalkedia sitalkedia deleted the mempool_parallel_validation branch April 22, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants