put_block() with anything other than static data? #1219

Closed
nlfiedler opened this issue Feb 16, 2023 · 10 comments

@nlfiedler

It appears that put_block() can only be called with data that has a static lifetime. How can I upload a file that I am reading into a buffer? Is this simply not possible at this point?

let mut file_handle = File::open(input_file)?;
let mut buffer: Vec<u8> = Vec::new();
file_handle.read_to_end(&mut buffer)?;
let response = blob_client.put_block("block1", &buffer[..]).into_future().await?;

rustc complains that buffer does not live long enough.

@demoray
Contributor

demoray commented Feb 16, 2023

The BlobClient::put_block implementation requires that the payload remain available for the lifetime of the request so that the underlying layer can resend it on retries.

Currently, this requires the payload to be Into<Body>. Body is an enum of either a bytes::Bytes or a dyn trait that requires AsyncRead and the ability to seek.
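To illustrate why a retrying layer needs a payload it can rewind, here is a minimal std-only sketch. `FlakySink` and `send_with_retries` are hypothetical names invented for this illustration, not SDK types, and the sketch uses the synchronous `Read + Seek` traits in place of the async ones the SDK requires.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical stand-in for a transport (not an SDK type).
/// It fails the first `failures` attempts, then accepts the payload.
pub struct FlakySink {
    pub failures: u32,
    pub received: Vec<u8>,
}

impl FlakySink {
    /// Consumes the payload, then either fails (simulating a dropped
    /// connection) or stores the bytes it read.
    pub fn send<R: Read>(&mut self, payload: &mut R) -> io::Result<()> {
        let mut data = Vec::new();
        payload.read_to_end(&mut data)?;
        if self.failures > 0 {
            self.failures -= 1;
            return Err(io::Error::new(
                io::ErrorKind::ConnectionReset,
                "simulated failure",
            ));
        }
        self.received = data;
        Ok(())
    }
}

/// Sends `payload`, rewinding it before every attempt so that each retry
/// starts from the first byte. Returns the number of attempts made.
pub fn send_with_retries<R: Read + Seek>(
    sink: &mut FlakySink,
    payload: &mut R,
    max_attempts: u32,
) -> io::Result<u32> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        // Without this rewind, a retry would see an already-consumed reader
        // and send zero bytes.
        payload.seek(SeekFrom::Start(0))?;
        match sink.send(payload) {
            Ok(()) => return Ok(attempt),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => continue,
        }
    }
}
```

A plain one-shot stream cannot support this loop, which is presumably why the stream variant of Body asks for seeking in addition to reading.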

In your example, if you pass buffer (the owned Vec<u8>) instead of &buffer[..], BlobClient::put_block will turn the buffer into a Bytes.

For a larger example, AVML supports uploading large blobs to Azure Storage by calling BlobClient::put_block concurrently.
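The ownership point can be sketched without the SDK. The `Body` enum and `put_block` function below are simplified, hypothetical stand-ins (the real Body wraps bytes::Bytes and has a stream variant). The sketch assumes, as bytes::Bytes does, that a borrowed slice converts only when it is `'static`, while an owned `Vec<u8>` always converts.

```rust
use std::io::{self, Read};

/// Hypothetical, simplified stand-in for the SDK's `Body` (not the real type).
/// It owns its bytes, so it carries no borrowed lifetime.
#[derive(Debug)]
pub enum Body {
    Bytes(Vec<u8>),
}

/// An owned buffer can always be moved into the body.
impl From<Vec<u8>> for Body {
    fn from(data: Vec<u8>) -> Self {
        Body::Bytes(data)
    }
}

/// A borrowed slice is accepted only when it is `'static`,
/// mirroring `impl From<&'static [u8]> for bytes::Bytes`.
impl From<&'static [u8]> for Body {
    fn from(data: &'static [u8]) -> Self {
        Body::Bytes(data.to_vec())
    }
}

/// Hypothetical stand-in for a request builder: accepts anything that
/// converts into `Body` and returns what would be sent.
pub fn put_block(block_id: &str, body: impl Into<Body>) -> (String, Body) {
    (block_id.to_string(), body.into())
}

/// Reads a source into a buffer and hands the buffer over by value.
pub fn read_then_upload(mut source: impl Read) -> io::Result<(String, Body)> {
    let mut buffer = Vec::new();
    source.read_to_end(&mut buffer)?;
    // `put_block("block1", &buffer[..])` would not compile here: the borrow
    // of the local `buffer` is not `'static`. Moving the Vec avoids the borrow.
    Ok(put_block("block1", buffer))
}
```

Moving the buffer means the caller can no longer use it afterwards; clone it first if the data is still needed locally.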

@demoray
Contributor

demoray commented Feb 16, 2023

I forgot to include a reference to the AVML implementation, sorry about that. https://docs.rs/avml/latest/avml/struct.BlobUploader.html

@nlfiedler
Author

Okay, thank you. It would be fantastic to have a simpler example to follow in the docs. Also maybe one that showed how to get a seekable stream from something other than bytes in memory. But at least I have something to work with now.

@0xGlitchbyte

It appears the same applies to put_block_blob. The examples use arbitrary byte objects (strings as bytes) instead of showing how to go from a file to a Bytes object. I haven't found an example for implementing SeekableStream, and I tried using the BytesStream struct that maps a bytes::Bytes object to a stream. Passing in a Bytes object is great for smaller files, but the way to upload large blobs using the SDK is unclear.

For the file to bytes object, it can be as simple as:

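use std::{fs::File, io::{BufReader, Read}};

let file = File::open(file_path)?;
let mut reader = BufReader::new(file);
let mut buffer = vec![];
// Read the whole file into memory; without this step the uploaded blob would be empty.
reader.read_to_end(&mut buffer)?;

// *** //  Credential code goes here; examples for this exist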

let blob_client =
    ClientBuilder::new(account, storage_credentials).blob_client(&container, blob_name);

blob_client
    .put_block_blob(buffer)
    .content_type(content_type)
    .await?;

or a little more complex as:

use bytes::{BufMut, BytesMut};
use std::{fs::File, io::{BufReader, Read}};

let file = File::open(file_path)?;
let mut cursor = BufReader::new(file);
let mut fbytes = BytesMut::new();
// Read the file in 10 MiB chunks.
let mut chunk = vec![0; 10 * 1024 * 1024];

loop {
    // Propagate read errors instead of silently uploading a truncated file.
    let bytes_read = cursor.read(&mut chunk)?;
    if bytes_read == 0 {
        break; // end of file
    }

    fbytes.put(&chunk[..bytes_read]);
}

// *** //  Credential code goes here; examples for this exist

let blob_client =
    ClientBuilder::new(account, storage_credentials).blob_client(&container, blob_name);

blob_client
    .put_block_blob(fbytes)
    .content_type(content_type)
    .await?;

Both of these work for smaller files, but anything larger (gigabytes worth) will fail. SeekableStream may be the answer for larger files, but I can't say as I have no concrete example of how to implement it, and therefore could not test it myself. It would be nice to be able to upload large files with this SDK.
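Both snippets above hold the entire file in memory before uploading. One way around that, sketched here under assumptions, is to read one block at a time and hand each block to an upload step as an owned buffer. `for_each_block` and its `upload` callback are hypothetical names for this illustration; in real code the callback would call put_block, and the returned IDs would then be committed as a block list (put_block_list). The zero-padded IDs are only an illustration; Azure requires all block IDs within a blob to have the same length.

```rust
use std::io::{self, Read};

/// Reads up to `block_size` bytes. Loops because a single `read` call may
/// return fewer bytes than requested before end of input.
fn read_block<R: Read>(reader: &mut R, block_size: usize) -> io::Result<Vec<u8>> {
    let mut block = vec![0u8; block_size];
    let mut filled = 0;
    while filled < block_size {
        match reader.read(&mut block[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    block.truncate(filled);
    Ok(block)
}

/// Splits `reader` into blocks of at most `block_size` bytes and passes each
/// one, with a fixed-width block ID, to the hypothetical `upload` callback.
/// Returns the block IDs in upload order.
pub fn for_each_block<R, F>(
    mut reader: R,
    block_size: usize,
    mut upload: F,
) -> io::Result<Vec<String>>
where
    R: Read,
    F: FnMut(&str, Vec<u8>) -> io::Result<()>,
{
    assert!(block_size > 0, "block_size must be non-zero");
    let mut block_ids = Vec::new();
    loop {
        let block = read_block(&mut reader, block_size)?;
        if block.is_empty() {
            break;
        }
        // Fixed-width ID so that every block ID has the same length.
        let block_id = format!("{:08}", block_ids.len());
        // The block is an owned Vec<u8>, so it can be moved into the upload.
        upload(&block_id, block)?;
        block_ids.push(block_id);
    }
    Ok(block_ids)
}
```

Only one block is resident at a time, so memory use is bounded by `block_size` rather than by the file size.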

Any feedback on this would be wonderful. For now, I'll try out the avml solution.

@evbo

evbo commented Mar 28, 2023

@demoray Can we please reopen this issue for including an example implementing SeekableStream so that we may use this library for uploading larger files? I looked at AVML but I'd prefer using this repo if it has the built-in features already.

@demoray demoray reopened this Mar 28, 2023
@demoray demoray self-assigned this Mar 28, 2023
@Porges
Member

Porges commented May 16, 2023

> @demoray Can we please reopen this issue for including an example implementing SeekableStream so that we may use this library for uploading larger files? I looked at AVML but I'd prefer using this repo if it has the built-in features already.

The library should at least provide an implementation for (e.g.) tokio’s File to avoid everyone having to write their own. Or, SeekableStream could be replaced by AsyncSeek which tokio::fs::File already implements?

@thehydroimpulse

The current API to asynchronously write a blob is pretty clunky due to the combination of AsyncSeek and SeekableStream. I'd be happy to contribute our implementation on top of those, which allows any Stream<Item = io::Result<Bytes>> to be written to a blob, but it requires a few oddities to satisfy the requirements of those two traits.

@thehydroimpulse

A lot of the complexity here can now be removed by using Hyper's version of wrap_stream instead of reqwest's, because the latest version of Hyper removed the Sync bound.

I'd like to avoid having to issue our own custom PUT requests with SAS URLs just to make byte streams work.

@Porges thoughts about accepting changes to the SeekableStream trait?

@Porges
Member

Porges commented Sep 5, 2023

@thehydroimpulse I don't work on this project; @demoray?

demoray pushed a commit to demoray/azure-sdk-for-rust that referenced this issue Sep 7, 2023
demoray added a commit that referenced this issue Sep 12, 2023
@demoray
Contributor

demoray commented Sep 13, 2023

In addition to the Bytes implementation discussed earlier, the next release of the SDK will include a helper called FileStreamBuilder that enables uploading via tokio::fs::File.

For an example, see https://github.com/Azure/azure-sdk-for-rust/blob/main/sdk/storage_blobs/examples/stream_blob_02.rs

@demoray demoray closed this as completed Sep 13, 2023
@heaths heaths added the Storage Storage Service (Queues, Blobs, Files) label Jan 30, 2024