Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

benchmarks: create a ticket if zebrad mainnet sync time increases significantly #4554

Closed
dconnolly opened this issue Jun 1, 2022 · 5 comments
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles A-infrastructure Area: Infrastructure changes S-needs-design Status: Needs a design decision S-needs-triage Status: A bug report needs triage

Comments

@dconnolly
Copy link
Contributor

dconnolly commented Jun 1, 2022

Motivation

Sometimes changes (like dependency updates) can affect total sync time, and we want to know this as early in the cycle as possible, so that we don't merge a performance regression.

We currently do a full mainnet sync every week. We could automatically create a ticket if the time increases significantly.

However, this still keeps the slowdown signal late in the process. But it's still better than what we have now.

Possible Implementations

This GitHub Action supports Rust (cargo bench) and can store and render results on the gh-pages branch or a custom branch as a JSON blob in a .js file, or in a separate repo, or in an arbitrary file that gets populated however you like in the workflow, say by the cache Action. Supports failing a workflow + make a comment on a PR when benchmarks fall outside a configurable threshold: https://github.com/benchmark-action/github-action-benchmark/tree/master#alert-comment-on-commit-page

Security

Pushing the the GH pages (if used) on external PRs may be risky: https://github.com/benchmark-action/github-action-benchmark/tree/master#run-only-on-your-branches

@dconnolly dconnolly added C-enhancement Category: This is an improvement S-needs-triage Status: A bug report needs triage A-devops Area: Pipelines, CI/CD and Dockerfiles A-infrastructure Area: Infrastructure changes P-Medium ⚡ and removed C-enhancement Category: This is an improvement labels Jun 1, 2022
@dconnolly dconnolly changed the title Surface zebrad mainnet sync time in CI benchmarks: surface zebrad mainnet sync time in CI Jun 1, 2022
@teor2345 teor2345 added the S-needs-design Status: Needs a design decision label Jun 1, 2022
@teor2345
Copy link
Contributor

teor2345 commented Jun 1, 2022

We currently do a full sync if the other CI jobs pass and the changes get a passing review, and if the full sync is successful, Mergify will proceed with putting the PR into its respective queue. One thing we can do is track the timing results of these full syncs and fail a Mergify merge if the PR goes over the latest benchmark.

At the moment, we're only running a full sync after each PR merges to the main branch:

if: ${{ (github.event_name == 'push' && github.ref_name == 'main') || !fromJSON(needs.get-available-disks.outputs.zebra_tip_disk) || github.event.inputs.run-full-sync == 'true' }}

We decided not to run a full sync on each PR, because then each PR batch would take 6 hours to merge.

We could time the mandatory checkpoint sync instead, it only takes 3 hours.

@teor2345
Copy link
Contributor

teor2345 commented Jun 1, 2022

I've marked this ticket as needing a design, and being blocked by sync performance improvements.

If we manage to decrease sync time by a few hours, that will change the design decisions we make.

@teor2345
Copy link
Contributor

Mainnet zebrad + lightwalletd full sync times are currently available at:
https://github.com/ZcashFoundation/zebra/actions/workflows/continous-integration-docker.yml?query=branch%3Amain+event%3Aschedule

@teor2345 teor2345 changed the title benchmarks: surface zebrad mainnet sync time in CI benchmarks: create a ticket if zebrad mainnet sync time increases significantly Jun 19, 2023
@teor2345 teor2345 reopened this Jun 19, 2023
@teor2345
Copy link
Contributor

I rewrote the issue description instead of closing it, because we still need some way to find out when the sync is slow.

@teor2345
Copy link
Contributor

teor2345 commented Nov 1, 2023

This has been done already by automatically opening a ticket when the weekly scheduled full sync job fails in PR #7565.

We also have an open ticket for doing a sync with a few tens or hundreds of thousands of blocks, and timing it:

@teor2345 teor2345 closed this as completed Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles A-infrastructure Area: Infrastructure changes S-needs-design Status: Needs a design decision S-needs-triage Status: A bug report needs triage
Projects
Archived in project
Development

No branches or pull requests

3 participants