RFC: add a put method on Client that mimics the put command in the hdfs dfs CLI tool #90

Closed
fMeow opened this issue Nov 30, 2022 · 2 comments

fMeow commented Nov 30, 2022

Putting files to HDFS with the Hadoop CLI client

When uploading a file, the Hadoop HDFS shell client first creates a temporary file with a ._COPY_ suffix and then renames it to the final name. This prevents readers from seeing an incomplete file.

The following command first creates the file /hdfs/path/local-file.log._COPY_ on HDFS and then renames it to /hdfs/path/local-file.log. As long as readers ignore files that end with ._COPY_, they only ever see completely uploaded files.

$HADOOP_HOME/bin/hdfs dfs -fs hdfs://default/ -put ./local-file.log /hdfs/path/
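
To make the pattern concrete, here is a minimal sketch of the same write-then-rename idea against the local filesystem, using only std. It is not how the Hadoop client or hdrs is implemented; the names put_file_local, tmp_path_for and is_complete are hypothetical and exist only for this illustration.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: build "<path>._COPY_" for a given target path.
fn tmp_path_for(path: &Path) -> PathBuf {
    let mut name: OsString = path.as_os_str().to_owned();
    name.push("._COPY_");
    PathBuf::from(name)
}

/// Hypothetical helper: readers skip files that are still being uploaded.
pub fn is_complete(path: &Path) -> bool {
    !path.to_string_lossy().ends_with("._COPY_")
}

/// Write `content` to "<path>._COPY_", then rename it to `path`.
/// Assumption: both paths are on the same local filesystem.
pub fn put_file_local(path: &Path, content: &[u8]) -> io::Result<()> {
    let tmp = tmp_path_for(path);
    {
        // The file is closed at the end of this scope, before the rename.
        let mut file = File::create(&tmp)?;
        file.write_all(content)?;
        file.sync_all()?;
    }
    fs::rename(&tmp, path)
}
```

A reader that lists a directory would filter its entries with is_complete, so a half-written local-file.log._COPY_ is never picked up.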

What about hdrs

hdrs can certainly mimic this behaviour, and I think it would be worthwhile to add a method to Client that implements it.

Here is a quick MVP:

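use std::io::{self, Write};

impl Client {
    pub fn put_file<T: AsRef<[u8]>>(&self, path: &str, content: T) -> io::Result<()> {
        // TODO: maybe additionally check that the file does not already exist
        // TODO: check that the path is not a directory

        // Write to a temporary path first so readers never observe a partial file.
        let tmp = format!("{}._COPY_", path);
        let mut file = self.open_file().append(false).write(true).open(&tmp)?;
        file.write_all(content.as_ref())?;
        file.flush()?;
        // Close the temporary file before renaming it into place.
        drop(file);
        self.rename_file(&tmp, path)?;

        Ok(())
    }
}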

Xuanwo commented Nov 30, 2022

First of all, thanks for the RFC!

Adding support for rename_file LGTM.

But hdrs intends to be the HDFS library that a tool like hdfs dfs is built on, which means put_file is not a good fit for it. I think a Rust port of hdfs dfs would be a good place for put_file.


Xuanwo commented Jan 10, 2023

Thanks again for the RFC. hdrs will not implement this feature.

Xuanwo closed this as not planned on Jan 10, 2023