Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] #1149: Restrict block number to 1 million per dir #2194

Conversation

SamHSmith
Copy link
Contributor

Description of the Change

Take what was the previous file name, prepend with zeroes, then interpret that as dir + file name. So block files are ex 2/003, 0/032 et cetera.

Issue

#1149

Benefits

Most operating systems will freak out if there are too many files in a directory. This solves that.

Possible Drawbacks

We still iterate over all existing files to figure out which block file we want to edit/create. I think this needs to change in the long run. We should using a simple formula map from block number to block file number. That way you only touch the the file you want to operate on. That is a seperate issue from #1149 though, so the point is irrelevant for this PR.

@github-actions github-actions bot added the iroha2-dev The re-implementation of a BFT hyperledger in RUST label May 11, 2022
@appetrosyan appetrosyan changed the title draft [feature] #1149: Block amount. May 11, 2022
@codecov
Copy link

codecov bot commented May 11, 2022

Codecov Report

Merging #2194 (a395ba5) into iroha2-dev (49407ff) will increase coverage by 0.11%.
The diff coverage is 93.33%.

❗ Current head a395ba5 differs from pull request most recent head e47e4ec. Consider uploading reports for the commit e47e4ec to get more accurate results

@@              Coverage Diff               @@
##           iroha2-dev    #2194      +/-   ##
==============================================
+ Coverage       63.83%   63.95%   +0.11%     
==============================================
  Files             127      127              
  Lines           23592    23692     +100     
==============================================
+ Hits            15060    15152      +92     
- Misses           8532     8540       +8     
Impacted Files Coverage Δ
core/src/kura.rs 93.17% <93.33%> (-0.21%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 49407ff...e47e4ec. Read the comment docs.

@appetrosyan appetrosyan self-assigned this May 11, 2022
@appetrosyan appetrosyan deleted the branch hyperledger:iroha2-dev May 13, 2022 16:12
@appetrosyan appetrosyan reopened this May 16, 2022
@appetrosyan appetrosyan added the Optimization Something isn't working as well as it should label May 20, 2022
@SamHSmith SamHSmith force-pushed the less_than_million_block_files_per_dir branch from 92f2f30 to ffa458f Compare May 20, 2022 11:05
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
@@ -463,15 +505,39 @@ async fn storage_files_base_indices<IO: DiskIO>(
path: &Path,
io: &IO,
) -> Result<BTreeSet<NonZeroU64>> {
let bases = io
let dirs: Vec<(String, IO::Dir)> = io
.read_dir(path.to_path_buf())
.await?
.filter_map(|item| async {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider rewriting like this:

let dirs: Vec<(String, IO::Dir)> = io
        .read_dir(path.to_path_buf())
        .await?
        .filter(Result::ok)
        .map(|item| async {
            let dir_name = item.to_string_lossy().into_owned();
            match io.read_dir(path.join(dir_name.clone())).await {
                Ok(items_stream) => Some((dir_name, items_stream)),
                Err(_) => None,
            }
        })
        .collect()
        .await;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you think it looks prettier? I'm talking about using filter and map instead of filter_map.

AFAIK we should use filter_map when we will apply an operation to parameter regardless of its value. Here is not this situation, here we apply operation only on Ok variant.

core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated Show resolved Hide resolved
core/src/kura.rs Outdated
.map(|f_os| {
let f = f_os.to_str();
if f.is_some() && f.unwrap().starts_with(path_str) {
let path_buf : PathBuf = f.unwrap().strip_prefix(path_str).unwrap().into();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip_prefix() docs shows, that you don't need to add / sign before, it should work just fine without it. So you can remove path_srt manipulation above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not why it's important. It's important that the / get's removed so that it doesn't get treated as a root directory. If you start with dir/dir1/file2 and remove dir you get /dir1/file2 which is not what we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I'm talking about. Look at this playground

@Arjentix Arjentix self-assigned this May 20, 2022
@SamHSmith SamHSmith force-pushed the less_than_million_block_files_per_dir branch 2 times, most recently from 4040754 to 721f807 Compare May 20, 2022 18:42
@SamHSmith SamHSmith marked this pull request as ready for review May 20, 2022 18:53
…per directory

Signed-off-by: Sam H. Smith <sam.henning.smith@protonmail.com>
@SamHSmith SamHSmith force-pushed the less_than_million_block_files_per_dir branch from 721f807 to e47e4ec Compare May 20, 2022 18:53
Copy link
Contributor

@Arjentix Arjentix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! But can you, please, also place somewhere in the code documentation about new files structure?

.then(move |e| {
let io = io.clone();
async move { io.open(e.into_os_string()).await }
// async move { io.open(e.into_os_string()).await }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code

@@ -463,15 +505,39 @@ async fn storage_files_base_indices<IO: DiskIO>(
path: &Path,
io: &IO,
) -> Result<BTreeSet<NonZeroU64>> {
let bases = io
let dirs: Vec<(String, IO::Dir)> = io
.read_dir(path.to_path_buf())
.await?
.filter_map(|item| async {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you think it looks prettier? I'm talking about using filter and map instead of filter_map.

AFAIK we should use filter_map when we will apply an operation to parameter regardless of its value. Here is not this situation, here we apply operation only on Ok variant.

Comment on lines +932 to +936
fs::DirBuilder::new()
.recursive(true)
.create(&dir.path().join("0"))
.await
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_dir_all()? Should be fine here. But if you like this style, keep it

.map(move |e| dir_path.join(e.to_string()))
let mut files = Vec::new();
for index in &base_indices {
files.push(self.filename_to_path(&index.to_string()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming filename_to_path() function to block_index_to_path()?

Comment on lines 355 to 358
pub async fn get_block_path(&self, block_height: NonZeroU64) -> Result<PathBuf> {
let filename = self.get_block_filename(block_height).await?;
Ok(self.path.join(filename))
let file_string = self.get_block_filename(block_height).await?;
Ok(self.filename_to_path(&file_string))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little confusing for me that get_block_filename() does not return valid path and needs further processing... Maybe rename it to something else or refactor somehow?

@appetrosyan appetrosyan changed the title [feature] #1149: Block amount. [feature] #1149: Restrict block number to 1 million per dir May 21, 2022
@appetrosyan appetrosyan merged commit b25c82a into hyperledger:iroha2-dev May 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iroha2-dev The re-implementation of a BFT hyperledger in RUST Optimization Something isn't working as well as it should
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants