wrobstory/redifest

.------..------..------..------..------..------..------..------.
|R.--. ||E.--. ||D.--. ||I.--. ||F.--. ||E.--. ||S.--. ||T.--. |
| :(): || (\/) || :/\: || (\/) || :(): || (\/) || :/\: || :/\: |
| ()() || :\/: || (__) || :\/: || ()() || :\/: || :\/: || (__) |
| '--'R|| '--'E|| '--'D|| '--'I|| '--'F|| '--'E|| '--'S|| '--'T|
`------'`------'`------'`------'`------'`------'`------'`------'

Does what it says on the tin: generates a Redshift .manifest file from a list of S3 buckets or bucket/folder paths. The manifest can be written to a local file or back to S3.

API

Create a manifest generator with your AWS creds:

>>> gen = ManifestGenerator('aws_access_key_id', 'aws_secret_access_key')

If your credentials are set in environment variables following the boto convention (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), you do not need to pass them in.

Generate a manifest:

>>> manifest = gen.generate_manifest(['mybucket/folder1/folder2',
                                      'mybucket/folder3'])
>>> manifest
{'entries': [{'mandatory': True,
              'url': u's3://mybucket/folder1/folder2/foo.json'},
             {'mandatory': True,
              'url': u's3://mybucket/folder1/folder2/bar.json'},
             {'mandatory': True,
              'url': u's3://mybucket/folder3/bar.json'}]}
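To make the shape of that dict concrete, here is a minimal sketch of how such a manifest could be assembled from a bucket name and a list of keys. The helper `build_manifest` is hypothetical and is not part of redifest; it only mirrors the `entries` / `url` / `mandatory` structure shown in the output above.

```python
def build_manifest(bucket, keys, mandatory=True):
    """Build a manifest dict in the shape shown above.

    Hypothetical helper for illustration only, not redifest's own code.
    """
    entries = [
        {"url": "s3://{}/{}".format(bucket, key), "mandatory": mandatory}
        for key in keys
    ]
    return {"entries": entries}


manifest = build_manifest("mybucket", ["folder1/folder2/foo.json",
                                       "folder3/bar.json"])
print(manifest)
```

Each entry pairs one S3 URL with a `mandatory` flag, so the manifest is just a list of files wrapped in a single top-level `entries` key.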

Generate a manifest, filtering on S3 keys (e.g. any file whose name contains .bzip2):

>>> manifest = gen.generate_manifest(['mybucket/folder1/folder2', 
                                      'mybucket/folder3'], filter=".bzip2")
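As a rough illustration of what filtering on keys could mean, the sketch below keeps only the keys that contain a given substring. Treating `filter` as a plain substring match that keeps matching keys is an assumption based on the description above, and `filter_keys` is a hypothetical helper, not a redifest function.

```python
def filter_keys(keys, substring=None):
    """Return the keys that contain `substring`, or all keys if it is None.

    Hypothetical helper; assumes a plain substring match.
    """
    if substring is None:
        return list(keys)
    return [key for key in keys if substring in key]


keys = ["folder3/a.json.bzip2", "folder3/b.json", "folder3/c.bzip2"]
kept = filter_keys(keys, ".bzip2")
print(kept)
```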

You can provide an optional target path to write the manifest back to S3 as part of the call that generates it:

>>> gen.generate_manifest(['mybucket/folder1/folder2', 'mybucket/folder3'],
                          target='mybucket/manifest_files/qux.manifest')

Or write a manifest dict returned by an earlier generate_manifest call:

>>> gen.write_manifest(manifest, 'mybucket/manifest_files/qux.manifest')
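The target in these calls is a single `bucket/key` string. A minimal sketch of how such a string could be split into its bucket and key parts follows; `split_s3_path` is a hypothetical helper for illustration, and how redifest parses the target internally is not shown in this README.

```python
def split_s3_path(path):
    """Split 'bucket/some/key' (optionally prefixed with 's3://') into parts.

    Hypothetical helper; raises ValueError if either part is missing.
    """
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError("expected 'bucket/key', got {!r}".format(path))
    return bucket, key


bucket, key = split_s3_path("mybucket/manifest_files/qux.manifest")
print(bucket, key)
```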

You can also write to a local file:

>>> gen.generate_manifest(['mybucket/folder1/folder2', 'mybucket/folder3'],
                          path='qux.manifest')
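Redshift manifest files are JSON, so writing one locally amounts to serializing the dict. The sketch below shows one way to do that with the standard library and reads the file back as a check. `write_manifest_to_file` is a hypothetical helper, and the indentation setting is an arbitrary choice, not necessarily what redifest does.

```python
import json
import os
import tempfile


def write_manifest_to_file(manifest, path):
    """Serialize a manifest dict to a local file as JSON.

    Hypothetical helper for illustration only.
    """
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)


manifest = {"entries": [{"url": "s3://mybucket/folder3/bar.json",
                         "mandatory": True}]}

# Write into a temporary directory so the example leaves no stray files behind.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "qux.manifest")
write_manifest_to_file(manifest, path)

# Read the file back to confirm it round-trips to the same dict.
with open(path) as f:
    loaded = json.load(f)
print(loaded == manifest)
```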
