Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Various block streamer issues #556

Merged
merged 7 commits into from Feb 14, 2024
Merged

Conversation

morgsmccauley
Copy link
Collaborator

@morgsmccauley morgsmccauley commented Feb 13, 2024

No description provided.

@morgsmccauley morgsmccauley changed the base branch from main to 521-handle-new-start-from-config-options February 13, 2024 20:14
@morgsmccauley morgsmccauley changed the title fix/block streamer issues fix: Various block streamer issues Feb 13, 2024
indexer,
redis_stream.clone(),
)
.await?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original had this swallow the error but have since removed - if we fail half way through flushing blocks, we'll end up double processing. Better to just fail hard than duplicate blocks.

.parse::<near_indexer_primitives::types::BlockHeight>()
.context("Failed to parse Delta Lake metadata")?;

if start_block_height >= last_indexed_block_from_metadata {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially for StartBlock::Latest, we dont need to check delta lake in this case.

@@ -246,6 +246,7 @@ impl DeltaLakeClientImpl {
}
})
.flat_map(|index_file| index_file.heights)
.filter(|block_height| *block_height >= start_block_height)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we would add all heights for the day the height falls within. This was especially an issue for StartBlock::Latest as we would add many blocks initially where there should be essentially none.

@morgsmccauley morgsmccauley marked this pull request as ready for review February 13, 2024 20:19
@morgsmccauley morgsmccauley requested a review from a team as a code owner February 13, 2024 20:19
@morgsmccauley morgsmccauley self-assigned this Feb 13, 2024
@morgsmccauley morgsmccauley added bug Something isn't working component: Streamer labels Feb 13, 2024
Copy link
Collaborator

@darunrs darunrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! I like the refactoring of the block processing.

lake_prefetch_size,
redis_client,
indexer,
redis_stream,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So previously in delta lake, we passed a clone of this. Here we pass it directly. Is the reason that we end the function afterward, and are ok with it consuming the ownership or the function, and the end of this function releases it?

Mainly just trying to understand Rust best practices.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's fine to take ownership here because redis_stream is no longer used from here. We'd need to .clone() if we used it later.

Base automatically changed from 521-handle-new-start-from-config-options to main February 14, 2024 00:48
@morgsmccauley morgsmccauley merged commit abcc48e into main Feb 14, 2024
4 checks passed
@morgsmccauley morgsmccauley deleted the fix/block-streamer-issues branch February 14, 2024 00:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working component: Streamer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants