Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test_long_chain_with_restart_from_snapshot #10193

Merged
merged 5 commits into from Nov 20, 2023
Merged

Conversation

posvyatokum
Copy link
Member

@posvyatokum posvyatokum commented Nov 17, 2023

The main motivation for the test is to see how garbage collection will behave without caches if we restart from the middle of an epoch.

github-merge-queue bot pushed a commit that referenced this pull request Nov 17, 2023
… prev_epoch_id (#10195)

We had an issue in the `clear_resharding_data` while fetching the
`prev_epoch_id`. The implementation of `prev_epoch_id` relied on
fetching the block_info of the last block of prev_epoch to get the
epoch_id. This unfortunately failed for the case of GC as the block_info
was already garbage collected.

The new implementation here relies on using the block_header to get the
epoch_id instead of block_info.

This was unfortunately only caught in mocknet and not integration tests
as having a small enough epoch_length lead to the block_info being
cached in the epoch_manager (even though it was GC'd)

Zulip post:
https://near.zulipchat.com/#narrow/stream/295558-pagoda.2Fcore/topic/Master.20binary.20Can't.20clear.20old.20data/near/402240517

Pending: Figure out a way to include test from this PR:
#10193
We would need to enable no_cache feature for the test.
@posvyatokum
Copy link
Member Author

Adding test that should fail before the fix mentioned above. But the CI is green.
The fail is supposed to come from log_assert! here

log_assert!(result.is_ok(), "Can't clear old data, {:?}", result);

log_assert! only panics in debug mode, and our CI pipelines use --cargo-profile quick-release
@Ekleog-NEAR do you have ideas? I really want this to fail per-commit. Also, this test can be much shorter, if I can enable no_cache feature in it.

@Ekleog-NEAR
Copy link
Collaborator

Ekleog-NEAR commented Nov 17, 2023

@nagisa I recall we discussed quick-release vs. dev for CI. It seems like quick-release is running without debug assertions? If so, I think we should revert and switch back to using the dev profile for CI, even though it might be slower, because it could miss intentionally-placed debug assertions :/

(Or we could add debug-assertions for quick-release, which might make sense considering it’s not for production usage, but then it’d probably be better to rename it for something like dev-release to be clear it’s not intended for prod)

@Ekleog-NEAR
Copy link
Collaborator

Also, this test can be much shorter, if I can enable no_cache feature in it.

I guess you mean the compile-time no_cache? It’s something that we unfortunately can’t enable in a single test, not without making it a real snowflake that will become hard to maintain (unless we have a lot of tests that require this).

Would it be hard to remove the compile-time no_cache feature, and replace it with a runtime configuration, that could be turned on/off from the test itself?

@nagisa
Copy link
Collaborator

nagisa commented Nov 17, 2023

(Or we could add debug-assertions for quick-release, which might make sense considering it’s not for production usage, but then it’d probably be better to rename it for something like dev-release to be clear it’s not intended for prod)

I’m in favour of making it dev-release. My memory is that unoptimized builds will start taking too long to run to fit within even the timeouts we have right now, especially on MacOS.

@Ekleog-NEAR
Copy link
Collaborator

I just opened #10205 :)

@posvyatokum posvyatokum changed the title Add test_long_chain Add test_long_chain_with_restart_from_snapshot Nov 17, 2023
@posvyatokum posvyatokum marked this pull request as ready for review November 17, 2023 23:15
@posvyatokum posvyatokum requested a review from a team as a code owner November 17, 2023 23:15
Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

.archive(false)
.build();

env1.clients[0].config.gc.gc_blocks_limit = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you add a small comment on why this is needed?

}

let mut env2 = TestEnv::builder(chain_genesis)
.stores(vec![env1.clients[0].chain.store().store().clone()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat!

@posvyatokum posvyatokum added this pull request to the merge queue Nov 20, 2023
Merged via the queue into master with commit d0a771d Nov 20, 2023
16 checks passed
@posvyatokum posvyatokum deleted the long_chain_test branch November 20, 2023 12:32
github-merge-queue bot pushed a commit that referenced this pull request Nov 20, 2023
Tested on updated #10193

`is_next_block_epoch_start` reads `BlockInfo` of the first block of the
epoch, which is not correct in the case of garbage collection.
The previous version of the test was passing, because that `BlockInfo`
was still in the cache of the epoch manager (as we call
`is_next_block_epoch_start` on every block)

Implemented safer version of `is_next_block_epoch_start` --
`is_last_block_in_finished_epoch` (didn't think too much about the name,
open to suggestions).
`is_last_block_in_finished_epoch` works by relying on the fact that if
we processed block, and it was the last block of an epoch according to
`is_next_block_epoch_start`, then we wrote an `EpochInfo` with the hash
of that block, and `EpochInfo` is not garbage collectible.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants