As of v12, what is the recommended API to use for downloading Block Blob files? #22022

Closed
tafs7 opened this issue Jun 21, 2021 · 8 comments
Labels: Client, customer-reported, needs-team-attention, question, Service Attention, Storage

Comments

tafs7 commented Jun 21, 2021

Query/Question
What is THE way to download files from Azure Blob Storage (block blobs) asynchronously using the .NET SDK as of v12.9, if the goal is to "stream" the blob through, for example, an ASP.NET controller action (REST endpoint) to a client such as a browser (NOT save it to a local file on the server)?

There seem to be several "download" APIs available on the BlobClient, but their docs are somewhat vague or ambiguous, and the MS Docs don't clarify any further:

  • DownloadAsync() - marked as not browsable, but de facto way, based on all samples/blogs
  • DownloadStreamingAsync()
  • DownloadContentAsync()
  • DownloadToAsync()
  • OpenReadAsync()

When I looked at the decompiled source for the BlobBaseClient.DownloadAsync() method, I saw that it is decorated with [EditorBrowsable(EditorBrowsableState.Never)]. Does that imply that this API is slowly on its way out, but without breaking existing code or being marked as Obsolete?

I have production code that uses the BlobClient.DownloadAsync() method to download a file from Azure Blob Storage using the Azure.Storage.Blobs NuGet package v12.8, and it seems to be working just fine. However, I just upgraded the NuGet package and was about to write some new code to extract zip files stored in Azure Blob Storage, and noticed the above-mentioned changes in the latest APIs of the Storage SDK.

Additionally, for an operation other than downloading to a browser client via a REST API (for example, unzipping a blob file where the extracted files also go into blob storage), would it be better to open the blob via OpenReadAsync() instead of downloading it?

I posted a similar question on Stack Overflow and was encouraged to raise it as a question/issue here:
https://stackoverflow.com/questions/68070143/azure-blob-storage-sdk-v12-blobclient-downloadasync-gone

Environment:

  • Name and version of the Library package used: Azure.Storage.Blobs 12.9.0

ghost commented Jun 22, 2021

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.


@amishra-dev
Contributor

Thanks for reaching out @tafs7 (Thiago). I do not think you should be worried that the API will be removed. We periodically move older APIs out of view when we have better overloads, but we have tests in place to make sure that we never regress, even though we marked the API as EditorBrowsable false.
Your question about why we marked it as EditorBrowsable false is fair; let me get back to you on that.

@kasobol-msft
Contributor

@tafs7 DownloadStreamingAsync is the replacement for DownloadAsync. Apart from a slightly different return type, please think of it as a rename or a new alias for the same functionality. We introduced DownloadContentAsync for scenarios where small blobs hold formats supported by the BinaryData type (e.g. JSON files), so we wanted to rename the existing API to make the download family less ambiguous.

The difference between DownloadStreamingAsync and OpenReadAsync is that the former gives you a network stream (wrapped in a few layers, but effectively think of it as a network stream) which holds on to a single connection. The latter, on the other hand, fetches the payload in chunks and buffers them, issuing multiple requests to fetch the content.
Picking one over the other depends on the scenario. If the consuming code is fast and you have a good, broad network link to the storage account, then the former might be the better choice, as you avoid multiple request-response exchanges. If the consumer is slow, then the latter might be a good idea, as it releases the connection back to the pool right after reading and buffering the next chunk. We recommend perf testing your app with both to reveal which is the best choice if it's not obvious.
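
As a rough illustration of the two options described above, here is a minimal sketch for the ASP.NET scenario from the original question. It is not an official sample: the controller name, route, and container name (`BlobsController`, `blobs`, `files`) are hypothetical, and it assumes Azure.Storage.Blobs 12.9 or later with a `BlobServiceClient` registered in dependency injection.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("blobs")] // hypothetical route
public class BlobsController : ControllerBase
{
    private readonly BlobContainerClient _container;

    // Assumes a BlobServiceClient was registered with the DI container.
    public BlobsController(BlobServiceClient service) =>
        _container = service.GetBlobContainerClient("files"); // hypothetical container name

    // Option 1: DownloadStreamingAsync. A single request; Content is
    // effectively a network stream that holds on to one connection.
    [HttpGet("streaming/{name}")]
    public async Task<IActionResult> GetStreaming(string name, CancellationToken ct)
    {
        BlobClient blob = _container.GetBlobClient(name);
        Response<BlobDownloadStreamingResult> download =
            await blob.DownloadStreamingAsync(cancellationToken: ct);

        // FileStreamResult copies the stream to the response and disposes it.
        return File(
            download.Value.Content,
            download.Value.Details.ContentType ?? "application/octet-stream");
    }

    // Option 2: OpenReadAsync. The returned stream fetches the blob in
    // chunks, issuing multiple requests and buffering each chunk.
    [HttpGet("openread/{name}")]
    public async Task<IActionResult> GetOpenRead(string name, CancellationToken ct)
    {
        BlobClient blob = _container.GetBlobClient(name);
        Stream stream = await blob.OpenReadAsync(cancellationToken: ct);
        return File(stream, "application/octet-stream");
    }
}
```

Both actions return the blob without saving it to a local file on the server; which one performs better would need to be measured, as suggested above.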

@kasobol-msft
Contributor

@tafs7 I went through the samples in this repo and couldn't find any usage of the hidden API. Could you please share where you saw it?

@tafs7
Author

tafs7 commented Jun 30, 2021

@kasobol-msft Thanks for the info. So to clarify, for net new code, we should look to use:

  • DownloadStreamingAsync instead of DownloadAsync, when you have a decent/fast connection
  • DownloadContentAsync instead of DownloadToAsync when you are dealing with small binary downloads (not sure how json applies to this)

Did I get that right? If so, it might be worth updating the XML docs on these different methods so it's clear what they do and when they should be used.

Regarding the "hidden" API, what I meant is that the BlobBaseClient.DownloadAsync() API is no longer available via IntelliSense, as it is decorated with the [EditorBrowsable(EditorBrowsableState.Never)] attribute - at least it is when I use a decompiler to step into that NuGet package from Visual Studio.

You can see it here:

[EditorBrowsable(EditorBrowsableState.Never)]

@kasobol-msft
Contributor

That sounds right. DownloadContentAsync returns a wrapped BinaryData instead of a Stream. BinaryData is a lightweight abstraction for payloads that can fit into memory, and it provides some utilities to deserialize them - see here. That's why I mentioned JSON - but it can be anything.
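
For illustration, a minimal sketch of reading a small JSON blob this way. The `AppSettings` type and `SmallBlobReader` helper are hypothetical names made up for this example, and the sketch assumes the blob holds a JSON document small enough to fit in memory.

```csharp
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Hypothetical payload shape, for illustration only.
public class AppSettings
{
    public string Name { get; set; }
    public int RetryCount { get; set; }
}

public static class SmallBlobReader
{
    public static async Task<AppSettings> ReadSettingsAsync(BlobClient blob)
    {
        // Downloads the whole blob into memory; Response<T> converts
        // implicitly to its value.
        BlobDownloadResult result = await blob.DownloadContentAsync();

        // Content is a BinaryData; deserialize it with System.Text.Json.
        return result.Content.ToObjectFromJson<AppSettings>();
    }
}
```

For non-JSON payloads, `BinaryData` also exposes the raw bytes (`ToArray()`), text (`ToString()`), or a stream (`ToStream()`).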

The hidden API remains part of the binary to ensure backwards compatibility, i.e. users who just bump the version number won't see a difference, but new users or users who modify code will get prompted to seek the new alternative. We used that strategy instead of marking it as obsolete as often users configure builds in such a way that using obsolete api makes it fail.
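
To make the trade-off concrete, here is a small self-contained sketch of the two attributes. `SampleClient` and its members are hypothetical stand-ins, not the real BlobBaseClient. Note that Visual Studio typically hides `EditorBrowsableState.Never` members only from consumers of the compiled assembly, not from code in the same assembly.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

// Hypothetical stand-in for an SDK client; not the real BlobBaseClient.
public class SampleClient
{
    // Hidden from IntelliSense for consumers of the compiled assembly,
    // but existing callers still compile without any warning.
    [EditorBrowsable(EditorBrowsableState.Never)]
    public string Download() => DownloadStreaming();

    // The replacement that new code is steered toward.
    public string DownloadStreaming() => "payload";

    // The alternative strategy: callers get compiler warning CS0618,
    // which fails builds that treat warnings as errors.
    [Obsolete("Use DownloadStreaming instead.")]
    public string DownloadObsolete() => DownloadStreaming();
}

public static class Program
{
    public static void Main()
    {
        var client = new SampleClient();

        // The hidden method is still callable; only tooling hides it.
        Console.WriteLine(client.Download());

        // The attribute is ordinary metadata, visible via reflection.
        MethodInfo hidden = typeof(SampleClient).GetMethod(nameof(SampleClient.Download));
        var attribute = hidden.GetCustomAttribute<EditorBrowsableAttribute>();
        Console.WriteLine(attribute.State);
    }
}
```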

@tafs7
Author

tafs7 commented Jul 1, 2021

> We used that strategy instead of marking it as obsolete as often users configure builds in such a way that using obsolete api makes it fail.

@kasobol-msft, that makes sense. It just seemed confusing that the build worked for existing code using DownloadAsync, but when I wrote new code and "dotted" into the blob client object, I no longer saw that method available to me. That is why I suggested updating the XML docs on that method to point people to the new DownloadStreamingAsync equivalent, and hopefully updating the MS Docs that still tell people to use DownloadAsync to download blob files.

Thanks again!

billti commented Feb 5, 2022

I just spent quite a while working through the blob APIs and options here to stream blobs through Azure Front Door (via ASP.NET Core), which requires range-request support. I wrote up what I think is the best solution at https://ticehurst.com/2022/01/30/blob-streaming.html (final code at the end of the article).

This did take way more research and experimenting than I expected, so if anyone can give it a quick read for any errors or misunderstandings I'd appreciate it. Thanks.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2023