New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lost file data in filesystem store (with test) #439
Fix lost file data in filesystem store (with test) #439
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @aaronmondal and @adam-singer)
native-link-store/src/filesystem_store.rs
line 22 at r2 (raw file):
use std::time::SystemTime; use core::time::Duration;
nit: use std
instead of core
.
native-link-store/src/filesystem_store.rs
line 39 at r2 (raw file):
use tokio::task::spawn_blocking; use tokio::time::{sleep, Sleep}; use tokio::time::timeout;
nit: combine this with the line above.
native-link-store/tests/filesystem_store_test.rs
line 880 at r2 (raw file):
let temp_path = make_temp_path("temp_path"); fs::set_idle_file_descriptor_timeout(Duration::from_millis(100))
nit: This line is not needed for this test.
native-link-store/tests/filesystem_store_test.rs
line 883 at r2 (raw file):
.err_tip(|| "Error setting idle file descriptor timeout")?; assert_eq!(fs::idle_file_descriptor_timeout(), Duration::from_millis(100));
nit: ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I love a sleep less than 1 second.
Also, can you condense these into a single commit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 5 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @MarcusSorealheis)
native-link-store/tests/filesystem_store_test.rs
line 876 at r2 (raw file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
nit (should not hold up PR, because it is a test): I know this is only a test, I'm curious why this variable name was chosen. It's your code so I shouldn't dictate, but my intuition would've been to name this variable
many_x_digest
or something similar to represent the long string of x's for a few reasons.
digest
is a moderately common function name in Rust.digest1
is slightly non-descript. Not totally, it is a digest, and the first one.On the other hand, it's not that important because it doesn't really matter what the value is, as the important point is that it is a large string.
FYI: This PR was not flagged as ready for review, it was just so we could talk about it. It turns out we need a large digest anyway.
In other parts of the code it is usually digest1
then digest2
when there are two digests that are needed for the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @adam-singer and @MarcusSorealheis)
-- commits
line 5 at r2:
Remember to squash the commits.
native-link-store/src/filesystem_store.rs
line 695 at r1 (raw file):
continue; } res = writer.send(buf.split().freeze()) => {
Looks like we're still using the split/freeze inside the loop. Maybe it makes sense to combine this PR with #426.
Edit: Ok it's no longer in the loop 😆 Let's make sure to add @Jirixek as co-author.
This is true. In this case, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally digest1_clone
style name is used when you need to need to pass ownership of an item to another closure or similar.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @adam-singer and @Jirixek)
2b913f4
to
d4b6027
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged as single commit and hopefully gave @Jirixek proper annotation for the fix.
done : digest1_clone
&& digest1
renamed
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @aaronmondal and @Jirixek)
native-link-store/src/filesystem_store.rs
line 695 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Looks like we're still using the split/freeze inside the loop. Maybe it makes sense to combine this PR with #426.
Edit: Ok it's no longer in the loop 😆 Let's make sure to add @Jirixek as co-author.
done , ptal if it is correct in git log, not metadata I've often had to adjust
native-link-store/src/filesystem_store.rs
line 22 at r2 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: use
std
instead ofcore
.
done
native-link-store/src/filesystem_store.rs
line 39 at r2 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: combine this with the line above.
done
native-link-store/tests/filesystem_store_test.rs
line 876 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
The test case was graciously provided by @Jirixek, lifted as is, naming can be adjusted
done
native-link-store/tests/filesystem_store_test.rs
line 880 at r2 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: This line is not needed for this test.
done
native-link-store/tests/filesystem_store_test.rs
line 883 at r2 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: ditto.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @Jirixek)
native-link-store/src/filesystem_store.rs
line 695 at r1 (raw file):
Previously, adam-singer (Adam Singer) wrote…
done , ptal if it is correct in git log, not metadata I've often had to adjust
We could also flag this test as a "do-not-run" then have @Jirixek just enable the test in his PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @aaronmondal and @Jirixek)
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Remember to squash the commits.
Done.
native-link-store/src/filesystem_store.rs
line 695 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
We could also flag this test as a "do-not-run" then have @Jirixek just enable the test in his PR?
sgtm, or can just take it has a patch locally, apply and update the PR, since we have already reviewed turn around time would be low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 9 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @Jirixek)
native-link-store/tests/filesystem_store_test.rs
line 28 at r3 (raw file):
use std::time::SystemTime; use core::time::Duration;
nit: use std
instead of core
native-link-store/tests/filesystem_store_test.rs
line 49 at r3 (raw file):
use tokio::sync::Barrier; use tokio::time::sleep; use tokio::time::Sleep;
nit: Combine these into the same line.
native-link-store/tests/filesystem_store_test.rs
line 880 at r3 (raw file):
let temp_path = make_temp_path("temp_path"); fn sleep_fn(_: Duration) -> Sleep {
nit: Can you try and make this a lambda and inline it? I think rust does support it as long as it is a pure function (no-captures and not mutable).
native-link-store/tests/filesystem_store_test.rs
line 888 at r3 (raw file):
content_path: content_path.clone(), temp_path: temp_path.clone(), eviction_policy: Some(native_link_config::stores::EvictionPolicy {
nit: Just remove the eviction policy, we are not testing it.
native-link-store/tests/filesystem_store_test.rs
line 900 at r3 (raw file):
let store_pin = Pin::new(store.as_ref()); // Insert data into store.
nit: This comment is obvious, so i'd remove it.
native-link-store/tests/filesystem_store_test.rs
line 910 at r3 (raw file):
// Check to ensure our first byte has been received. The future should be stalled here. let first_byte = DropCloserReadHalf::take(&mut reader, 1)
nit: I don't think we need to do this for this test.
native-link-store/tests/filesystem_store_test.rs
line 915 at r3 (raw file):
assert_eq!(first_byte[0], large_value.as_bytes()[0], "Expected first byte to match"); sleep(Duration::from_millis(200)).await;
nit: We can't sleep in tests. You may need to sleep(0)
though, so it triggers the kernel's epoll
, and may also need to use tokio::task::yield_now().await
d18f83b
to
7039cd9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @aaronmondal and @Jirixek)
native-link-store/tests/filesystem_store_test.rs
line 28 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: use
std
instead ofcore
done.
native-link-store/tests/filesystem_store_test.rs
line 49 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Combine these into the same line.
done.
native-link-store/tests/filesystem_store_test.rs
line 880 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Can you try and make this a lambda and inline it? I think rust does support it as long as it is a pure function (no-captures and not mutable).
done.
native-link-store/tests/filesystem_store_test.rs
line 888 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Just remove the eviction policy, we are not testing it.
done.
native-link-store/tests/filesystem_store_test.rs
line 900 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: This comment is obvious, so i'd remove it.
done.
native-link-store/tests/filesystem_store_test.rs
line 910 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: I don't think we need to do this for this test.
done.
native-link-store/tests/filesystem_store_test.rs
line 915 at r3 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: We can't sleep in tests. You may need to
sleep(0)
though, so it triggers the kernel'sepoll
, and may also need to usetokio::task::yield_now().await
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed (commit messages unreviewed), 3 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @Jirixek)
native-link-store/tests/filesystem_store_test.rs
line 880 at r3 (raw file):
Previously, adam-singer (Adam Singer) wrote…
done.
last minor nit on this, can you inline this?
native-link-store/tests/filesystem_store_test.rs
line 901 at r4 (raw file):
let digest_clone = digest; tokio::spawn(async move { Pin::new(store_clone.as_ref()).get(digest_clone, writer).await });
nit: Can we wrap this spawn in a JoinHandleDropGuard::new()
and assign it to a dummy variable.
Doing this will ensure the spawn gets dropped when it goes out of scope.
Failing to do this might cause (depending on how tokio test runtime is implemented) the task to be stalled forever in the current runtime (which might be shared between tests). This wrapper simply calls cancel on the spawn if the join future gets dropped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you everyone :-)
Reviewable status: all files reviewed (commit messages unreviewed), 3 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @Jirixek)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @aaronmondal, @adam-singer, and @Jirixek)
Don't bother, just merge this MR as is. You've done the most of the work anyway, I'm happy with the credit you've game me :) |
Co-authored-by: Adam Singer <adam@tracemachina.com> Co-authored-by: Jiří Szkandera @Jirixek
7039cd9
to
8e0c440
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 3 unresolved discussions (waiting on @aaronmondal, @allada, and @MarcusSorealheis)
native-link-store/src/filesystem_store.rs
line 695 at r1 (raw file):
Previously, Jirixek (Jiří Szkandera) wrote…
Don't bother, just merge this MR as is. You've done the most of the work anyway, I'm happy with the credit you've game me :)
Done.
native-link-store/tests/filesystem_store_test.rs
line 872 at r4 (raw file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
yes, this is better.
Done.
native-link-store/tests/filesystem_store_test.rs
line 901 at r4 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Can we wrap this spawn in a
JoinHandleDropGuard::new()
and assign it to a dummy variable.Doing this will ensure the spawn gets dropped when it goes out of scope.
Failing to do this might cause (depending on how tokio test runtime is implemented) the task to be stalled forever in the current runtime (which might be shared between tests). This wrapper simply calls cancel on the spawn if the join future gets dropped.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 3 unresolved discussions (waiting on @aaronmondal, @allada, and @MarcusSorealheis)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @MarcusSorealheis)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5, all commit messages.
Dismissed @MarcusSorealheis from a discussion.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @adam-singer)
Description
Refactor
filesystem_store::FilesystemStore
to allow for controlling sleep function from outside of implementation.Hoist
BytesMut.split()
to a position before the loop to prevent subsequent calls that result in an empty buffer. Original patch: Fix lost file data in filesystem store #426 / @JirixekFixes # (issue)
#426
Type of change
How Has This Been Tested?
get_part_timeout_test
unit test against non-hoistedBytesMut.split()
will trigger the bug, once split is moved bug is no longer present.Checklist
bazel test //...
passes locallygit amend
see some docsThis change is