how to restore Glacier to S3 given folder data #380

Closed
dduleep opened this issue Nov 26, 2015 · 5 comments
Labels
documentation This is a problem with documentation. feature-request This issue requests a feature. glacier s3

Comments

@dduleep

dduleep commented Nov 26, 2015

I created a bucket policy that moves data from S3 to Glacier after 1 day.
My bucket has more than 100K objects.
How can I restore Glacier objects to S3? And what is the most efficient way to download all objects in a given folder: directly from Glacier, or by moving the Glacier data to S3 and then downloading it?

The S3 Restoring Objects document only gives Java and .NET examples (http://docs.aws.amazon.com/AmazonS3/latest/dev/restoring-objects.html). If there is a method in boto3, please explain how to restore an S3 object.

buckt_ob = self.s3.Bucket('mybucket')
for obj in buckt_ob.objects.filter(Prefix='folder'):
    storage_class = obj.storage_class
    restore = obj.restore  # fails: no such attribute on these objects

This is the code I tried, but there is no such attribute as obj.restore.

After the restore completes, how can I get a notification that the whole folder is available for download?

@JordonPhillips JordonPhillips added the documentation This is a problem with documentation. label Nov 30, 2015
@kyleknap
Contributor

Yeah, this is a little tricky. The objects collection on the bucket resource returns ObjectSummary resources, which do not have the restore property. ObjectSummary and Object resources differ because they are loaded from different S3 operations: ObjectSummary is loaded from ListObjects, which returns less information than the HeadObject operation used to load the Object resource. HeadObject is the only API method that can get the restore attribute value, so you would need to do something like this:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('glacier_test')
for obj_sum in bucket.objects.all():
    obj = s3.Object(obj_sum.bucket_name, obj_sum.key)
    storage_class = obj.storage_class
    restore = obj.restore

Furthermore, the restore attribute only reports the status of a Glacier restoration; it does not actually restore the object. To initiate the restoration you need to call the client's restore_object method. Something like this should work:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('glacier_test')
for obj_sum in bucket.objects.all():
    resp = bucket.meta.client.restore_object(
        Bucket=obj_sum.bucket_name,
        Key=obj_sum.key,
        RestoreRequest={'Days': 1}
    )
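
For the original question (restoring only one "folder"), the loop above could be narrowed roughly as follows. This is a minimal sketch, not part of boto3: the helper name request_restores and its parameters are made up for illustration, and bucket is assumed to be a boto3 S3 Bucket resource. It relies on ListObjects already reporting the storage class, so objects still in S3 can be skipped without a HeadObject call.

```python
def request_restores(bucket, prefix, days=1):
    """Request a temporary restore for every GLACIER object under `prefix`.

    `bucket` is assumed to be a boto3 S3 Bucket resource, for example
    boto3.resource('s3').Bucket('mybucket'). Returns the keys requested.
    """
    requested = []
    for obj_sum in bucket.objects.filter(Prefix=prefix):
        # ObjectSummary carries storage_class from ListObjects, so objects
        # that are not archived can be skipped cheaply.
        if obj_sum.storage_class != 'GLACIER':
            continue
        bucket.meta.client.restore_object(
            Bucket=obj_sum.bucket_name,
            Key=obj_sum.key,
            RestoreRequest={'Days': days},
        )
        requested.append(obj_sum.key)
    return requested
```

It would be called as request_restores(boto3.resource('s3').Bucket('mybucket'), 'folder/'). Note that S3 rejects a second restore request for an object whose restore is still running (error code RestoreAlreadyInProgress), which this sketch does not handle.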

Once you actually send the restore request, you will notice that the restore attribute is no longer None:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('glacier_test')
for obj_sum in bucket.objects.all():
    obj = s3.Object(obj_sum.bucket_name, obj_sum.key)
    storage_class = obj.storage_class
    restore = obj.restore
    print(obj.key, obj.storage_class, obj.restore)

Output:
test.txt GLACIER ongoing-request="true"
testfile GLACIER ongoing-request="true"

The object is ready to download once the value changes to ongoing-request="false" (it then also includes an expiry-date for the restored copy). Attributes are cached on the resource, so you will need to reload() the object to see if it is ready to download.
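
As a rough illustration of that check: S3 reports a finished restore as ongoing-request="false" followed by an expiry-date, so a polling loop could look like the sketch below. The function names, the 15-minute interval, and the poll limit are assumptions for the example, not anything boto3 provides; obj is assumed to be a boto3 s3.Object.

```python
import time


def restore_is_complete(restore):
    """Return True if an Object.restore value says the restored copy is ready."""
    # None means no restore was requested; ongoing-request="true" means
    # the restore is still running.
    return restore is not None and 'ongoing-request="false"' in restore


def wait_until_restored(obj, poll_seconds=900, max_polls=40, sleep=time.sleep):
    """Poll `obj` (assumed to be a boto3 s3.Object) until its restore finishes.

    Returns True once the object is ready, False if `max_polls` runs out.
    """
    for _ in range(max_polls):
        obj.reload()  # issues a fresh HeadObject request
        if restore_is_complete(obj.restore):
            return True
        sleep(poll_seconds)
    return False
```

Each poll costs one HeadObject request per object, so for a large number of objects a long interval is the sensible choice.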

As a note to myself, based on my walkthrough, it seems that it would be good if we added the following to boto3:

  1. Documentation on how to restore Glacier objects
  2. Waiters to poll for when a Glacier object is ready to be downloaded
  3. Possibly add the restore() method to the Object resource so we do not have to drop down to the
    client to restore the object.

@dduleep Let me know what you think of these suggestions.

@kyleknap kyleknap added feature-request This issue requests a feature. response-requested Waiting on additional information or feedback. labels Nov 30, 2015
@dduleep
Author

dduleep commented Dec 4, 2015

Thanks for the complete description and examples.
Your three suggestions above are good enough for restoring objects.
In the meantime, please add some examples to the documentation.

It would help if there were a method like s3.object.download.force().
That method would do the following:

  1. If the object is in S3, download it directly.
  2. If the object is in Glacier, restore it (keeping it in S3 until the download completes) and then download it.

A method like that would hide all the low-level client calls and would be easy to use and understand.

But I don't have a clear idea whether this is possible with the current boto3.
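
To make the idea concrete, here is a minimal sketch of such a helper built only from the calls discussed earlier in this thread. It is hypothetical: download_or_request_restore is not a boto3 method, and obj is assumed to be a boto3 s3.Object. Because a Glacier restore takes hours, the sketch does not block; it either downloads immediately or reports that a restore was requested or is still running.

```python
def download_or_request_restore(obj, filename, days=1):
    """Download `obj` if it is readable now; otherwise request a restore.

    `obj` is assumed to be a boto3 s3.Object. Returns 'downloaded',
    'restore-requested', or 'restore-in-progress'.
    """
    restore = obj.restore  # loaded through HeadObject on first access
    archived = obj.storage_class == 'GLACIER'
    if not archived or (restore and 'ongoing-request="false"' in restore):
        # Either the object never left S3 or a restored copy is available.
        obj.download_file(filename)
        return 'downloaded'
    if restore is None:
        # Archived and no restore requested yet: start one.
        obj.meta.client.restore_object(
            Bucket=obj.bucket_name,
            Key=obj.key,
            RestoreRequest={'Days': days},
        )
        return 'restore-requested'
    return 'restore-in-progress'
```

A caller would invoke it again later (after obj.reload()) until it returns 'downloaded'.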

@kyleknap kyleknap added needs-sample and removed response-requested Waiting on additional information or feedback. labels Dec 8, 2015
@kyleknap kyleknap self-assigned this Dec 14, 2015
@kyleknap
Contributor

Yeah, I think documentation would be good for this. I'm not sure a force() method is the way to go here. I would much rather have a waiter exposed for this (on the resource as well), as it would be more explicit about what is going on.

@kyleknap
Contributor

So I did some research on implementing a waiter that waits until an S3 object is restored, and I came to the conclusion that adding a waiter is not the best approach. It would require polling state for roughly 3 to 5 hours to see if a restoration is complete. In addition, each poll would need a HeadObject request, and the waiter would have to sleep for tens of minutes between calls. This makes the implementation pretty inefficient in terms of time and cost.

The better implementation would be to expose an SNS notification event for Glacier restorations. S3 already has a similar concept in bucket notifications: http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html. We will talk with S3 to see if they could expose such a feature.
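
For readers arriving later: S3 bucket notifications have since gained restore event types, including s3:ObjectRestore:Completed, so this approach can be sketched as below. The helper names are made up for illustration, client is assumed to be a boto3 S3 client, and the SNS topic is assumed to already exist with a policy that lets S3 publish to it.

```python
def restore_completed_notification(topic_arn):
    """Build a notification configuration for completed Glacier restores."""
    return {
        'TopicConfigurations': [
            {
                'TopicArn': topic_arn,
                'Events': ['s3:ObjectRestore:Completed'],
            }
        ]
    }


def enable_restore_notifications(client, bucket_name, topic_arn):
    """Apply the configuration; `client` is assumed to be boto3.client('s3')."""
    # This call replaces the bucket's whole notification configuration,
    # so any existing entries would need to be merged in first.
    client.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=restore_completed_notification(topic_arn),
    )
```

With this in place, a subscriber to the topic receives one message per restored object instead of polling HeadObject.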

Otherwise, I think my PR addresses the other points that I listed.

@inv-senchuthomas

Hi kyleknap,

I am creating a PHP-based web application using Amazon's S3 and Glacier services. I want to give my site's users a feature that lets them choose any file and archive it (move the file from S3 to Glacier) or unarchive it (move the file from Glacier to S3). Can you provide an example demonstrating these actions, or any APIs for doing this?
Thanks in advance.


5 participants