This repository contains code to download and upload files of any size to the Azure Storage Blob service, using its REST API with `asyncio` and `aiohttp`.
This code was shared in relation to this thread on GitHub.
In this context, the official Python SDK is used only to generate shared access signatures.
```python
async with SSLClientSession() as http_client:
    client = BlobsClient(http_client, BlockBlobService(ACCOUNT_NAME, ACCOUNT_KEY))

    # NB: this code does not create containers automatically!
    destination_container_name = 'test'

    files = [
        r'one.ext',
        r'two.ext',
        r'three.ext',
        r'four.ext'
    ]

    await asyncio.gather(*[client.upload_file(file_path, destination_container_name) for file_path in files])
```
When uploading big files to the blob service, it is necessary to make several web requests for each file: one for each chunk and a final one to commit the file. My code intentionally doesn't start parallel web requests to upload chunks of the same file, because it was designed with a scenario in mind where many files are read from the file system and uploaded concurrently (concurrent uploads of different files, not concurrent uploads of the chunks of each single file!).
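The client-side chunking described above can be sketched as a generator that never holds more than one chunk of a file in memory at a time. The function name and the 4 MiB chunk size below are illustrative, not the repository's actual internals:

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per chunk; an illustrative value


def read_chunks(file_path, chunk_size=CHUNK_SIZE):
    """Yield a file's contents one chunk at a time, so only one
    chunk per file is ever held in memory."""
    with open(file_path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk would then be uploaded with its own web request, followed by one final request to commit the whole file.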
Moreover, concurrent chunk uploads, or concurrent file uploads without limits, could cause too many bytes to be held in memory at the same time, potentially defeating the purpose of chunking big files on the client side. For this reason, it is recommended to use a semaphore to limit the concurrency of upload operations. There is no perfect 'one-size-fits-all' solution.
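One way to apply the semaphore recommendation above is to wrap each file upload in an `asyncio.Semaphore`. This is a minimal sketch; the limit of 3 and the `upload_one` callable (e.g. `lambda p: client.upload_file(p, destination_container_name)`) are illustrative:

```python
import asyncio

MAX_CONCURRENT_UPLOADS = 3  # illustrative limit; tune for your workload


async def upload_with_limit(semaphore, upload_coro_factory):
    """Run one upload only when a semaphore slot is free, so at most
    MAX_CONCURRENT_UPLOADS uploads are in flight at any moment."""
    async with semaphore:
        return await upload_coro_factory()


async def upload_all(files, upload_one):
    """upload_one is a callable taking a file path and returning a
    coroutine; the semaphore bounds how many run concurrently."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_UPLOADS)
    await asyncio.gather(*[
        upload_with_limit(semaphore, lambda p=p: upload_one(p))
        for p in files
    ])
```

This keeps the total number of chunks buffered in memory proportional to the semaphore limit rather than to the number of files.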
If your scenario involves handling only a few files at a time, then you could benefit from changing the code to support parallel uploads of the chunks of each single file.
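If you change the code in that direction, the sequential per-chunk loop could be replaced with a bounded gather over the chunks of one file. Here `upload_chunk` is a hypothetical coroutine function standing in for a single chunk upload, not part of this repository:

```python
import asyncio


async def upload_chunks_in_parallel(chunks, upload_chunk, max_parallel=4):
    """Upload all chunks of a single file concurrently, while still
    bounding how many are in memory/in flight at once. `upload_chunk`
    is a hypothetical coroutine function taking (index, chunk)."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(i, chunk):
        async with semaphore:
            await upload_chunk(i, chunk)

    # chunk order must be preserved when committing the file,
    # so keep the indexes even though uploads complete out of order
    await asyncio.gather(*[bounded(i, c) for i, c in enumerate(chunks)])
```

The indexes matter because the final commit request must list the chunks in their original order, regardless of the order in which the uploads finished.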
To download a file saving it to file system:
```python
async with SSLClientSession() as http_client:
    client = BlobsClient(http_client, BlockBlobService(ACCOUNT_NAME, ACCOUNT_KEY))

    source_container_name = 'test'
    blob_name = 'some_blob_in_container.txt'

    await client.download_file(source_container_name, blob_name)
```
To download a file handling chunks in memory:
```python
async with SSLClientSession() as http_client:
    client = BlobsClient(http_client, BlockBlobService(ACCOUNT_NAME, ACCOUNT_KEY))

    source_container_name = 'test'
    blob_name = 'some_blob_in_container.txt'

    async for chunk in client.read_blob(source_container_name, blob_name):
        # handle chunk in memory
        pass
```
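As an example of handling chunks in memory, the async iterator could feed an incremental hash, so a blob of any size can be checksummed without buffering it whole. The hashing helper below is an illustration, not part of the repository; it accepts any async iterator of byte chunks, such as `client.read_blob(container, blob)`:

```python
import hashlib


async def sha256_of_chunks(chunks):
    """Compute a SHA-256 digest from an async iterator of byte
    chunks, updating the hash one chunk at a time."""
    digest = hashlib.sha256()
    async for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()
```

Because the digest is updated chunk by chunk, memory usage stays bounded by the chunk size regardless of the blob's total size.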