Design & implement stress test tool for aggregator - phase 1 #991

Closed
22 of 23 tasks
Tracked by #904
jpraynaud opened this issue Jun 16, 2023 · 2 comments
Labels
D-medium Difficulty: medium prototype 🛠️ Prototype/PoC of a feature task


jpraynaud commented Jun 16, 2023

Issue

We want to design a simple stress test tool for the aggregator and implement a first version of it.
The idea is to be able to verify that under a load of ~100 simulated signers, the aggregator:

  • Handles all Signer Key Registrations without error in a reasonable time (<30s per request)
  • Handles all Signer Signatures Registrations without error in a reasonable time (<30s per request)
  • Creates the associated certificates and artifacts for each signed entity type (<300s to retrieve the information)
  • Meets all of these expectations while ~10-100 simulated clients request the certificates and artifacts routes and receive responses in a reasonable time (<10s per request)
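
The time budgets listed above can be captured in one place so that every step of the tool checks against the same values. The following is only a minimal sketch: the `Operation` enum and the function names are hypothetical and do not come from the Mithril code base; only the durations are taken from the expectations above.

```rust
use std::time::Duration;

/// Kinds of operations measured by the stress test (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operation {
    SignerKeyRegistration,
    SignerSignatureRegistration,
    CertificateOrArtifactRetrieval,
    ClientRequest,
}

/// Time budget for each operation, as listed in the issue.
pub fn time_budget(op: Operation) -> Duration {
    match op {
        Operation::SignerKeyRegistration => Duration::from_secs(30),
        Operation::SignerSignatureRegistration => Duration::from_secs(30),
        Operation::CertificateOrArtifactRetrieval => Duration::from_secs(300),
        Operation::ClientRequest => Duration::from_secs(10),
    }
}

/// Returns true when the measured duration is strictly under the budget.
pub fn within_budget(op: Operation, elapsed: Duration) -> bool {
    elapsed < time_budget(op)
}
```

For example, `within_budget(Operation::ClientRequest, elapsed)` would be evaluated after each simulated client request.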

Note: The simulated signers and clients should be represented by the HTTP calls they make to the aggregator REST API

We also want to understand and keep track of the limitations of this design, which is expected to run on a single computer and will be enhanced in future issues. We will probably add retry strategies, among other improvements.

Nominal scenario

  • Phase 1: Run Without Clients

  1. Create the Signer Fixtures
  2. Launch aggregator
  3. Wait for the aggregator REST API to be up (/epoch-settings route is answering 200)
  4. Create and send the Signer Key Registrations payloads
  5. Move the aggregator 1 epoch forward and wait for the aggregator to be at new epoch (/epoch-settings route)
  6. Create and send the Signer Key Registrations payloads
  7. Compute genesis certificate (no need to stop the aggregator)
  8. Move the aggregator 1 epoch forward and wait for the aggregator to be at new epoch (/epoch-settings route)
  9. Create and send the Signer Key Registrations payloads
  10. Wait for pending certificate to be available for Mithril Stake Distribution (/certificate-pending route is answering 200)
  11. Create the Signer Signatures Registrations payloads
  12. Send the Signer Signatures Registrations payloads
  13. Wait for the certificate to be available for Mithril Stake Distribution (/certificates route is answering 200)
  14. Wait for the artifact to be available for Mithril Stake Distribution (/artifact/mithril-stake-distributions route is answering 200)
  15. Wait for pending certificate to be available for Cardano Immutable Files (/certificate-pending route is answering 200)
  16. Create the Signer Signatures Registrations payloads
  17. Send the Signer Signatures Registrations payloads
  18. Wait for the certificate to be available for Cardano Immutable Files (/certificates route is answering 200)
  19. Wait for the artifact to be available for Cardano Immutable Files (/artifact/snapshots route is answering 200)
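
Many of the Phase 1 steps are of the form "wait until a route answers 200". A minimal sketch of such a wait loop follows. It is an assumption, not the tool's actual code: `wait_for_ok` and `WaitTimeout` are hypothetical names, and the `probe` closure stands in for an HTTP GET on a route such as /epoch-settings so that the sketch stays free of any HTTP client dependency.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Error returned when the probe never reported HTTP 200 before the deadline.
#[derive(Debug, PartialEq, Eq)]
pub struct WaitTimeout {
    /// Number of probes performed before giving up.
    pub attempts: u32,
}

/// Polls `probe` until it returns status 200 or `timeout` elapses.
/// `probe` returns the HTTP status code it observed.
/// On success, returns the number of attempts that were needed.
pub fn wait_for_ok<F>(
    mut probe: F,
    interval: Duration,
    timeout: Duration,
) -> Result<u32, WaitTimeout>
where
    F: FnMut() -> u16,
{
    let deadline = Instant::now() + timeout;
    let mut attempts: u32 = 0;
    loop {
        attempts += 1;
        if probe() == 200 {
            return Ok(attempts);
        }
        // Give up if sleeping once more would take us past the deadline.
        if Instant::now() + interval > deadline {
            return Err(WaitTimeout { attempts });
        }
        sleep(interval);
    }
}
```

Returning an error on timeout rather than panicking lets the caller record the step as failed, which matches the failure criterion "any step fails with a timeout".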

  • Phase 2: Run With Clients

  1. Create the Clients payloads
  2. Send the Clients payloads in a separate thread at a pseudo-random pace
  3. Move the aggregator 1 epoch forward
  4. Create and send the Signer Key Registrations payloads
  5. Continue as in Phase 1, starting from step 4
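
Sending the client requests "in a separate thread with pseudo-random pace" could look roughly like the sketch below. Everything here is hypothetical: `Lcg` is a tiny linear congruential generator used only to avoid an external crate, `spawn_client_load` is an illustrative name, and `send_request` stands in for the real HTTP call on the certificates and artifacts routes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Tiny linear congruential generator (not suitable for cryptography).
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Returns the next pseudo-random delay in `[min_ms, max_ms]` milliseconds.
    /// Assumes `min_ms <= max_ms`.
    pub fn next_delay(&mut self, min_ms: u64, max_ms: u64) -> Duration {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let span = max_ms - min_ms + 1;
        Duration::from_millis(min_ms + (self.0 >> 33) % span)
    }
}

/// Spawns a thread that calls `send_request` at a pseudo-random pace
/// (1 to 5 ms between calls in this sketch) until `stop` is set.
/// The thread returns the number of requests it sent.
pub fn spawn_client_load<F>(
    mut send_request: F,
    stop: Arc<AtomicBool>,
    seed: u64,
) -> thread::JoinHandle<usize>
where
    F: FnMut() + Send + 'static,
{
    thread::spawn(move || {
        let mut rng = Lcg::new(seed);
        let mut sent = 0usize;
        while !stop.load(Ordering::Relaxed) {
            send_request();
            sent += 1;
            thread::sleep(rng.next_delay(1, 5));
        }
        sent
    })
}
```

The main thread would keep running the signer scenario, then set `stop` and join the handle once the last artifact has been retrieved.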

To do

  • Compute statistics for HTTP Client requests (min/max, avg, std, percentiles, ...) (e.g. with rewrk crate)
  • Compute summary statistics for the whole test
  • Measure performance of aggregator process and compute statistics
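
As an illustration of the statistics mentioned in this list (min/max, avg, std, percentiles), here is a minimal std-only sketch. It does not use the rewrk crate; the `Summary` type and function names are hypothetical, the standard deviation is the population one, and percentiles use the nearest-rank method, which is only one of several common definitions.

```rust
/// Summary statistics over request latencies, in milliseconds.
#[derive(Debug, Clone, PartialEq)]
pub struct Summary {
    pub count: usize,
    pub min: f64,
    pub max: f64,
    pub mean: f64,
    pub std_dev: f64,
    pub p50: f64,
    pub p95: f64,
    pub p99: f64,
}

/// Nearest-rank percentile on an ascending-sorted, non-empty slice.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Computes the summary, or `None` when there are no samples.
pub fn summarize(samples: &[f64]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let count = sorted.len();
    let mean = sorted.iter().sum::<f64>() / count as f64;
    // Population variance: average squared distance from the mean.
    let variance = sorted.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / count as f64;
    Some(Summary {
        count,
        min: sorted[0],
        max: sorted[count - 1],
        mean,
        std_dev: variance.sqrt(),
        p50: percentile(&sorted, 50.0),
        p95: percentile(&sorted, 95.0),
        p99: percentile(&sorted, 99.0),
    })
}
```

The same function could be applied per step (for example, to all Signer Key Registration latencies) and to the client requests as a whole.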

Other tasks

  • Update the README with the command to run the stress test

Definition of Success/Failure

  • The test fails if:
    • Any step fails with a timeout
    • The aggregator crashes
  • The test succeeds otherwise

We will display statistics for each eligible step (error rates, error distributions, min/max, ...) and for client requests (to be defined)
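
The success/failure rule above is simple enough to express directly. The sketch below is an assumption about how it might be encoded: `StepOutcome`, `Verdict` and `verdict` are hypothetical names, and crash detection is reduced to a boolean supplied by the caller.

```rust
/// Outcome of a single scenario step (illustrative type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StepOutcome {
    Completed,
    TimedOut,
}

/// Overall result of the stress test.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Success,
    Failure(String),
}

/// Applies the rule: fail if the aggregator crashed or any step timed out,
/// succeed otherwise. Steps are given as (name, outcome) pairs.
pub fn verdict(steps: &[(&str, StepOutcome)], aggregator_crashed: bool) -> Verdict {
    if aggregator_crashed {
        return Verdict::Failure("aggregator crashed".to_string());
    }
    for (name, outcome) in steps {
        if *outcome == StepOutcome::TimedOut {
            return Verdict::Failure(format!("step '{}' timed out", name));
        }
    }
    Verdict::Success
}
```

Reporting the first step that timed out keeps the failure message short; the per-step statistics would still be displayed separately.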

Design

Here is the design that we have created:
[Image: design diagram]

Parent issue

#904

@jpraynaud jpraynaud added dev 💪 prototype 🛠️ Prototype/PoC of a feature D-medium Difficulty: medium task labels Jun 16, 2023
@jpraynaud jpraynaud changed the title from "PoC Stress test tool with E2E test" to "Design & implement basic stress test tool for aggregator" Jun 19, 2023
jpraynaud (Member, Author) commented

A test branch has been pushed here with the work from our afternoon session:
https://github.com/input-output-hk/mithril/tree/ensemble/991-stress-test-aggregator
@abailly-iohk @Alenar @ghubertpalo

jpraynaud (Member, Author) commented

Identified bottlenecks

  1. During single signature registrations, performance is degraded because the multi-signer is recomputed constantly (and the time to create it is quadratic in the number of registered signers)
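
To give a rough sense of the scale of this bottleneck, the sketch below compares two abstract cost models. It rests on assumptions that are not stated in the issue: the build cost is taken as exactly n² units, "constantly" is read as one rebuild per signature registration, and the build-once alternative is purely hypothetical rather than a fix proposed here.

```rust
/// Abstract cost of building the multi-signer once for `n` registered signers,
/// assuming a purely quadratic model with a unit constant.
pub fn build_cost(n: u64) -> u64 {
    n * n
}

/// Total cost when the multi-signer is rebuilt for each of the `n`
/// signature registrations (assumed reading of "recomputed constantly").
pub fn cost_rebuild_every_time(n: u64) -> u64 {
    n * build_cost(n)
}

/// Total cost when the multi-signer is built once and reused
/// (hypothetical alternative, for comparison only).
pub fn cost_build_once(n: u64) -> u64 {
    build_cost(n)
}
```

Under this model, 100 signers cost 1,000,000 units when rebuilding on every registration versus 10,000 units when building once, a factor equal to the number of signers.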

@jpraynaud jpraynaud changed the title from "Design & implement basic stress test tool for aggregator" to "Design & implement stress test tool for aggregator" Jul 31, 2023
@jpraynaud jpraynaud changed the title from "Design & implement stress test tool for aggregator" to "Design & implement stress test tool for aggregator - phase 1" Aug 10, 2023