Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TieredStorage] TieredStorageFile -> TieredReadonlyFile and TieredWritableFIle #260

Merged
merged 1 commit into from
Mar 18, 2024

Conversation

yhchiang-sol
Copy link

Problem

TieredStorageFile struct currently offers new_readonly() and new_writable()
to allow both read and write work-load to share the same struct. However,
as we need the writer to use BufWriter to improve performance as well as
enable Hasher on writes. There is a need to refactor TieredStorageFile to
split its usage for read-only and writable.

Summary of Changes

Refactor TieredStorageFile to TieredReadonlyFIle and TieredWritableFile.

Test Plan

Existing tiered-storage tests.

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 77.77778% with 10 lines in your changes are missing coverage. Please review.

Project coverage is 81.9%. Comparing base (794cb2f) to head (4cbe867).
Report is 11 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #260   +/-   ##
=======================================
  Coverage    81.9%    81.9%           
=======================================
  Files         836      836           
  Lines      226643   226651    +8     
=======================================
+ Hits       185636   185650   +14     
+ Misses      41007    41001    -6     

Copy link

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.
I wonder if it is better split the file.rs into two separate files.

@yhchiang-sol
Copy link
Author

I wonder if it is better split the file.rs into two separate files.

Thank you!

Could be. But let's split the struct into two first so that other PRs can proceed.

@yhchiang-sol yhchiang-sol merged commit 6441209 into anza-xyz:master Mar 18, 2024
33 checks passed
yhchiang-sol added a commit that referenced this pull request Mar 18, 2024
#### Problem
TieredWritableFile currently uses File instead of BufWriter.
This will introduce more syscall when doing file writes.

#### Summary of Changes
This PR makes TieredWritableFile uses BufWriter to allow the
write-call to be more optimized to reduce the number of syscalls.

#### Test Plan
Existing tiered-storage test.
Will run experiments to verify its performance improvement.

#### Dependency
#260
@@ -9,10 +9,10 @@ use {
};

#[derive(Debug)]
pub struct TieredStorageFile(pub File);
pub struct TieredReadableFile(pub File);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The name of the struct is now TieredReadableFile, but the PR title and description both use the incorrect TieredReadonlyFile. It would be preferable for all of these to remain in-sync and correct. It's now too late for the commit message, since the PR has been merged. Please consider this for future PRs.

impl TieredStorageFile {
pub fn new_readonly(file_path: impl AsRef<Path>) -> Self {
impl TieredReadableFile {
pub fn new(file_path: impl AsRef<Path>) -> Self {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github won't let me comment on line 30, which is my intent

I see below there's a TieredReableFile::new_writable() method. In what cases do we need a reader to create a new writable file?

Comment on lines +116 to +122
pub fn seek(&self, offset: u64) -> IoResult<u64> {
(&self.0).seek(SeekFrom::Start(offset))
}

pub fn seek_from_end(&self, offset: i64) -> IoResult<u64> {
(&self.0).seek(SeekFrom::End(offset))
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too bad these methods are duplicated. Do both the Readable and Writable have calls to these methods? I was under the impression that Writeable does everything in one pass, from top to bottom, and thus would not need to seek.

willhickey pushed a commit to solana-labs/solana that referenced this pull request Mar 20, 2024
#### Problem
TieredWritableFile currently uses File instead of BufWriter.
This will introduce more syscall when doing file writes.

#### Summary of Changes
This PR makes TieredWritableFile uses BufWriter to allow the
write-call to be more optimized to reduce the number of syscalls.

#### Test Plan
Existing tiered-storage test.
Will run experiments to verify its performance improvement.

#### Dependency
anza-xyz#260
crossdev24 pushed a commit to crossdev24/solana that referenced this pull request Jun 24, 2024
#### Problem
TieredWritableFile currently uses File instead of BufWriter.
This will introduce more syscall when doing file writes.

#### Summary of Changes
This PR makes TieredWritableFile uses BufWriter to allow the
write-call to be more optimized to reduce the number of syscalls.

#### Test Plan
Existing tiered-storage test.
Will run experiments to verify its performance improvement.

#### Dependency
anza-xyz/agave#260
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants