Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix windows build #140

Merged
merged 2 commits into from
Mar 21, 2021
Merged

fix windows build #140

merged 2 commits into from
Mar 21, 2021

Conversation

houqp
Copy link
Member

@houqp houqp commented Mar 20, 2021

Description

Fix windows build, which was blocking us from releasing a windows
python client. Added mac and window build to the build matrix.

Related Issue(s)

Also move rename temp file clean up logic into put_obj method in
preparation for moving the temp file creation logic into optimistic
delta commit loop. See #135.

@houqp
Copy link
Member Author

houqp commented Mar 20, 2021

cc @mosyp and @xianwill since this is related to #135.

@houqp houqp force-pushed the qp_win branch 8 times, most recently from b0baef5 to df29418 Compare March 20, 2021 07:24
@@ -74,7 +76,12 @@ impl StorageBackend for FileStorageBackend {
async fn put_obj(&self, path: &str, obj_bytes: &[u8]) -> Result<(), StorageError> {
let tmp_path = create_tmp_file_with_retry(self, path, obj_bytes).await?;

rename::rename(&tmp_path, path)
if let Err(e) = rename::atomic_rename(&tmp_path, path) {
fs::remove_file(tmp_path).await.unwrap_or(());
Copy link
Member Author

@houqp houqp Mar 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switched to async remove

@houqp houqp force-pushed the qp_win branch 2 times, most recently from 71912a3 to bea834e Compare March 20, 2021 08:10
also move rename temp file clean up logic into put_obj method in
preparation for moving the temp file creation logic into optimistic
delta commit loop.
r#"^*[/\\]_delta_log[/\\](\d{20})\.checkpoint\.\d{10}\.(\d{10})\.parquet$"#
)
.unwrap();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turns out we can't use join_path to build regex query in a cross-platform way because the separator / from windows is a regex escape character.

use super::*;
use std::ffi::CString;

#[cfg(target_os = "linux")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux specific code in the unix-specific implementation? I'm quite confused by why these libc functions need to be mapped in? There's a high likelihood this could break on BSDs, non-glibc implementations.

(I haven't tested it of course, so you may know something I don't)

Isn't there a crate that abstracts this behavior which doesn't require the unsafe below? E.g. https://docs.rs/crate/atomicwrites/0.3.0

Of course, if you've already gone down this path and this is the best possible option, then so be it, I would appreciate some commentary in the code explaining why "this must be"

Copy link
Member Author

@houqp houqp Mar 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platform specific atomic rename needs to be implemented per platform unfortunately, so BSD and non-glibc users will need to send PRs for the platforms they want to use, just like windows users.

I looked into atomicwrites before, their implementation is not ideal. It requires 3 system calls to achieve what can be done in 1 system call, which is the implementation we have here. I also don't think their implementation is correct, because their implementation started with a hardlink function call, which based on the std rust doc, it doesn't error out if destination path already exists.

The unsafe code we have here is scoped to only system call invocation, which is basically what happens when you call std::fs::rename. So even though atomicwrites has less unsafe code on the surface, it's just hidden under the std lib implementation. In short, I think our implementation doesn't have more unsafe code than atomicwrites. If anything, it might have less because we invoke way less system calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rtyler

why these libc functions need to be mapped in

Because they're missing in libc crate and PR for introducing them is still in open

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well rust-lang/libc#2116 (comment) seems like they're merged this a couple of days ago

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha nice, filed #143 as follow up

unsafe fn platform_specific_rename(from: *const libc::c_char, to: *const libc::c_char) -> i32 {
cfg_if::cfg_if! {
if #[cfg(target_os = "linux")] {
renameat2(libc::AT_FDCWD, from, libc::AT_FDCWD, to, RENAME_NOREPLACE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the manpage for this function this little bit jumped out to me

RENAME_NOREPLACE requires support from the underlying filesystem.

Emphasis mine of course. I don't have a good alternative suggestion, but just would like to express my worry that there couplings to underlying system implementation details that we cannot clearly communicate to users here.

Copy link
Member Author

@houqp houqp Mar 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, added rust doc for filesystem backend to clarify that.

Copy link
Member

@rtyler rtyler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, I think @fvaleye or @mosyp should take a look before we merge

Copy link
Contributor

@mosyp mosyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@houqp houqp merged commit ca40a6b into main Mar 21, 2021
@houqp houqp deleted the qp_win branch March 21, 2021 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants