Initial local loglet implementation #1302

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

Contributor

@AhmedSoliman AhmedSoliman commented Mar 22, 2024

Initial local loglet implementation

First take on implementing a local loglet on top of RocksDB. This also makes it the default loglet. Together with the previous PR in the stack, this restores single-node durability.

github-actions bot commented Mar 22, 2024

Test Results

 92 files  ±0   92 suites  ±0   9m 24s ⏱️ + 3m 40s
 79 tests ±0   74 ✅ ±0   5 💤 ±0  0 ❌ ±0 
205 runs  ±0  190 ✅ ±0  15 💤 ±0  0 ❌ ±0 

Results for commit fa11705. ± Comparison against base commit ba7893c.

♻️ This comment has been updated with latest results.

@AhmedSoliman
Contributor Author

I'll investigate e2e tests.

@AhmedSoliman
Contributor Author

@tillrohrmann the e2e tests were failing on timeout. I've reduced the default batching latency, and the PR is now ready for your review.

Contributor

@tillrohrmann tillrohrmann left a comment

Great work @AhmedSoliman. Can't wait to see the local loglet in action :-) My only substantive comment concerns the place where we decide on the offset index: it looks as if, in very rare cases, it could cause a read to miss a record.

Comment on lines 178 to 180
// Most of the changes are highly temporal, we try to delay flushing
// As much as we can to increase the chances to observe a deletion.
Contributor

Does this comment really apply to the log workload?

Contributor Author

No, copy pasta :) will remove.

crates/bifrost/src/loglets/local_loglet/log_store.rs (outdated, resolved)

pub struct Options {
    pub rocksdb_threads: usize,
    pub rocksdb_disable_statistics: bool,
    pub rocksdb_disable_wal: bool,
Contributor

Do we want to allow disabling the wal for the local loglet? This might be a lurking footgun?

Contributor Author

I wanted to leave the option available, but what matters more is that LogStore supports it (I'm planning to reuse parts of it later).

Contributor

👍

pub fn new(storage_path: &Path, raw_options: &serde_json::Value) -> Arc<Self> {
    let opts =
        serde_json::from_value(raw_options.clone()).expect("to be able to deserialize options");
    // todo: implement loglet loading error handling
Contributor

Once we have error handling for loglet providers, could we move the log_writer creation to the new function? Then we wouldn't need the OnceLock.

Contributor

I think the answer would be no since we need to start the log writer to obtain the handle which we set in the OnceLock.
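
To make the `OnceLock` point concrete, here is a minimal sketch of the pattern being discussed: a provider is constructed first, and the writer handle only exists once the writer has been started, so it is stored in a `OnceLock` afterwards. `WriterHandle`, `Provider`, and `start` are hypothetical stand-ins, not the types from this PR.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for the handle obtained when the log writer starts.
#[derive(Debug, Clone, PartialEq)]
struct WriterHandle {
    id: u32,
}

/// Hypothetical provider: it is constructed before the writer is running,
/// so the handle slot starts out empty.
struct Provider {
    writer: OnceLock<WriterHandle>,
}

impl Provider {
    fn new() -> Arc<Self> {
        Arc::new(Provider {
            writer: OnceLock::new(),
        })
    }

    /// Pretends to start the writer and stores its handle.
    /// Returns false if a handle was already stored by an earlier call.
    fn start(&self) -> bool {
        self.writer.set(WriterHandle { id: 1 }).is_ok()
    }

    /// Returns the handle if the writer has been started.
    fn writer(&self) -> Option<&WriterHandle> {
        self.writer.get()
    }
}
```

The trade-off is that every user of the handle has to cope with the "not started yet" case, which is what moving the creation into `new` would avoid if the handle were available at that point.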

Comment on lines +94 to +117
pub fn shutdown(&self) {
    let start = Instant::now();
    if let Err(e) = self.db.flush_wal(true) {
        warn!("Failed to flush local loglet rocksdb WAL: {}", e);
    }
    self.db.cancel_all_background_work(true);
    debug!(
        "Local loglet clean rocksdb shutdown took {:?}",
        start.elapsed(),
    );
}
Contributor

This is nice :-) Should we do something similar for the graceful shutdown of the PP state storage?

Contributor Author

Good call. I think potentially yes. I'm being extra defensive for the logstore part, not sure if this will bring immediate value to PP state storage at the moment though.

crates/bifrost/src/loglets/local_loglet/mod.rs (outdated, resolved)

receive: watch::Receiver<LogletOffset>,
}

impl OffsetWatch {
Contributor

Nice utility 🤩
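
The excerpt only shows that `OffsetWatch` wraps a `watch::Receiver<LogletOffset>`, so the following is a guess at the general idea rather than the PR's code: a reader blocks until the committed offset reaches a target. The sketch swaps the async watch channel for a `Mutex` plus `Condvar` from the standard library, uses a plain `u64` as the offset, and `StdOffsetWatch` is a hypothetical name.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical, std-only analogue of an offset watch.
/// The offset only ever moves forward.
#[derive(Clone)]
struct StdOffsetWatch {
    shared: Arc<(Mutex<u64>, Condvar)>,
}

impl StdOffsetWatch {
    fn new(initial: u64) -> Self {
        StdOffsetWatch {
            shared: Arc::new((Mutex::new(initial), Condvar::new())),
        }
    }

    /// Publishes a new offset; smaller or equal values are ignored.
    fn notify(&self, offset: u64) {
        let (lock, cvar) = &*self.shared;
        let mut current = lock.lock().unwrap();
        if offset > *current {
            *current = offset;
            cvar.notify_all();
        }
    }

    /// Blocks until the published offset is at least `target`,
    /// then returns the offset observed at that moment.
    fn wait_for(&self, target: u64) -> u64 {
        let (lock, cvar) = &*self.shared;
        let guard = lock.lock().unwrap();
        let guard = cvar
            .wait_while(guard, |current| *current < target)
            .unwrap();
        *guard
    }
}
```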

Comment on lines +30 to +32
/// SmallVec is used to avoid heap allocation for the common case of small
/// number of updates.
updates: SmallVec<[LogStateUpdate; 1]>,
Contributor

Nice :-)

Contributor

@tillrohrmann tillrohrmann left a comment

Thanks for updating this PR @AhmedSoliman. Making the offset generation and the enqueuing into the request channel atomic is a good temporary solution for the problem described before. +1 for merging this PR :-)
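
One plausible reading of the problem, which the thread does not spell out, is that if an offset is assigned first and the record is enqueued in a separate step, two concurrent appenders can enqueue out of offset order, and a reader following the writer's progress could skip the late record. The sketch below shows the "atomic" variant under that assumption: the offset counter and the channel sender sit behind one mutex, so assignment and enqueue form a single critical section. `Appender`, `Inner`, and `enqueue_concurrently` are hypothetical names, and the standard library channel stands in for the real request channel.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// State guarded by a single lock (hypothetical layout).
struct Inner {
    next_offset: u64,
    sender: Sender<(u64, String)>,
}

/// Hypothetical appender that assigns offsets and enqueues records.
struct Appender {
    inner: Mutex<Inner>,
}

impl Appender {
    fn new(sender: Sender<(u64, String)>) -> Self {
        Appender {
            inner: Mutex::new(Inner { next_offset: 1, sender }),
        }
    }

    /// Assigns the next offset and enqueues the record while holding the
    /// lock. Returns None if the receiving side has gone away.
    fn append(&self, payload: String) -> Option<u64> {
        let mut inner = self.inner.lock().unwrap();
        let offset = inner.next_offset;
        inner.sender.send((offset, payload)).ok()?;
        inner.next_offset += 1;
        Some(offset)
    }
}

/// Runs `threads` concurrent appenders and returns the offsets in the
/// order in which the writer side would dequeue them.
fn enqueue_concurrently(threads: usize, per_thread: usize) -> Vec<u64> {
    let (tx, rx) = channel();
    let appender = Arc::new(Appender::new(tx));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let appender = Arc::clone(&appender);
            thread::spawn(move || {
                for i in 0..per_thread {
                    appender
                        .append(format!("t{t}-r{i}"))
                        .expect("receiver is still alive");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Dropping the last Arc drops the sender, which ends the iteration below.
    drop(appender);
    rx.into_iter().map(|(offset, _)| offset).collect()
}
```

Because every send happens under the lock, the dequeue order always matches the offset order. The cost is that appenders serialize on the mutex, which fits the description of this as a temporary solution.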

Comment on lines 227 to 230
// receiver.await.unwrap_or_else(|_| {
// warn!("Unsure if the local loglet record was written, the ack channel was dropped");
// Err(Error::Shutdown(ShutdownError))
// })
Contributor

This can probably be removed.

pub struct Options {
    pub rocksdb_threads: usize,
    pub rocksdb_disable_statistics: bool,
    pub rocksdb_disable_wal: bool,
Contributor

👍

@AhmedSoliman AhmedSoliman merged commit dae1e7c into main Mar 25, 2024
8 checks passed
@AhmedSoliman AhmedSoliman deleted the pr1302 branch March 25, 2024 14:59
@tillrohrmann
Contributor

🥳
