samjgalbraith/dlow

dlow

A library for downloading an S3 folder recursively and unzipping its contents. Extensible to other sources and post-processes.

Installation

This library is available from PyPI and can be installed with pip:

pip install dlow

Example use

import logging

from dlow import ResourceDownloadOrchestrator
from dlow.s3 import S3FolderDownloader, S3FolderResourceDescriptor
from dlow.post_processing import FileUnzipper

# Configure the root logger so that messages at INFO level or above are shown.
logging.basicConfig(level=logging.INFO)

# S3 bucket and folder to download.
s3_bucket_name = 'SOME_BUCKET_NAME'
s3_folder = '/some_folder'

# Download the folder into /resources and unzip the archives, deleting each one afterwards.
download_orchestrator = ResourceDownloadOrchestrator(
    dest_dir='/resources',
    resource_downloader=S3FolderDownloader(s3_bucket_name, s3_folder),
    resource_descriptor=S3FolderResourceDescriptor(s3_bucket_name, s3_folder),
    post_download_processors=[FileUnzipper(delete_archive=True)],
    clear_dest_dir=True,
)

download_orchestrator.ensure_resources_ready(logging.getLogger('someLogger'))
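
This README does not show how FileUnzipper works internally. As a rough illustration of what a post-download step with a delete_archive option might do, here is a minimal standard-library sketch. The function name unzip_all and its behaviour are assumptions made for illustration; they are not part of dlow and may differ from the library's actual implementation.

```python
import zipfile
from pathlib import Path


def unzip_all(dest_dir, delete_archive=False):
    """Hypothetical helper: extract every .zip under dest_dir beside the archive.

    Returns the list of archive paths that were processed.
    """
    processed = []
    # sorted() materialises the listing first, so archives that appear
    # during extraction are not picked up by this pass.
    for archive in sorted(Path(dest_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
        if delete_archive:
            # Remove the archive only after extraction has succeeded.
            archive.unlink()
        processed.append(archive)
    return processed
```

Calling unzip_all('/resources', delete_archive=True) on a directory tree would leave only the extracted files behind, which is the effect the example above appears to request from the library.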

AWS authentication for S3

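This library uses boto3 to access S3. boto3 reads the same credential and configuration files on the host filesystem as the AWS CLI (by default ~/.aws/credentials and ~/.aws/config), so a host already set up for the AWS CLI needs no extra configuration. See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html for how to create these files.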
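
Before running a download, it can help to check that credentials are present where boto3 will look for them. The sketch below uses only the standard library and inspects two common sources: the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and the shared credentials file (whose location can be overridden with AWS_SHARED_CREDENTIALS_FILE). The helper name find_aws_credential_sources is hypothetical and not part of dlow or boto3. It covers only these two sources, not the full boto3 credential chain (for example, instance roles).

```python
import configparser
import os
from pathlib import Path


def find_aws_credential_sources(env=None, home=None):
    """Hypothetical helper: list the places where AWS credentials appear to be set."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    sources = []

    # Environment variables are consulted before the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")

    # Shared credentials file, in the default or an overridden location.
    cred_path = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials"))
    if cred_path.is_file():
        # Disable interpolation so '%' characters in secrets are not interpreted.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(cred_path)
        for profile in parser.sections():
            if parser.has_option(profile, "aws_access_key_id"):
                sources.append(f"credentials file profile '{profile}'")
    return sources


if __name__ == "__main__":
    found = find_aws_credential_sources()
    print(found if found else "No credentials found in the checked locations")
```

An empty result does not prove that S3 access will fail, since boto3 can obtain credentials from other sources, but it is a quick first check when a download is rejected for authentication reasons.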
