New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S3: Implement restore object with select job type #7504
S3: Implement restore object with select job type #7504
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7504 +/- ##
==========================================
- Coverage 95.88% 94.06% -1.82%
==========================================
Files 843 1070 +227
Lines 82578 91829 +9251
==========================================
+ Hits 79178 86381 +7203
- Misses 3400 5448 +2048
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsugumi-sys I think using put_object
makes most sense here, as opposed to deepcopy
.
Do you know what the default filename is that AWS uses? If I read the docs correctly, it sounds like they create a new file with some sort of standardized name, maybe something like {optional_prefix_}output_of_restore_request_{uuid}
. But it would be nice to validate that.
I'd be happy to test this in AWS as well at some point and try it out, just to validate the theory - but that may take me some time.
restored_key.set_storage_class(_storage) | ||
|
||
# set value | ||
contents = self.select_object_content( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The select_object_content
should be done first, or at least after the validation. This operation can fail because of an invalid query, so we should only try to create new S3 objects after we know that succeeds
moto/s3/models.py
Outdated
_bucket_name = output_location_s3.get("BucketName") | ||
if _bucket_name is not None: | ||
# check if bucket is exists here. | ||
bucket = self.get_bucket(_bucket_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check should be done as early as possible as well
Hummm, I cannot execute a request for restore object with select job type :( First, I putted the following json file into "Glacier Deep Archive" storage class. {
"employee": {
"name": "sonoo",
"salary": 56000,
"married": true
}
} Then, I execute the following script; import boto3
client = boto3.client("s3")
bucket_name = "mybacket"
filekey = "sample.json"
resp = client.restore_object(
Bucket=bucket_name,
Key=filekey,
RestoreRequest={
"Type": "SELECT",
"SelectParameters": {
"ExpressionType": "SQL",
"Expression": "SELECT * FROM S3Object",
"InputSerialization": {
"JSON": {"Type": "Document"},
},
"OutputSerialization": {"JSON": {}},
},
"OutputLocation": {
"S3": {
"BucketName": bucket_name,
"Prefix": "restored.json",
"StorageClass": "STANDARD",
}
},
},
)
print(resp) Now I got botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the RestoreObject operation: The XML you provided was not well-formed or did not validate against our published schema I couldn't figure out why this causes error.... memo: resp = client.select_object_content(
Bucket=bucket_name,
Key=filekey,
ExpressionType="SQL",
Expression="SELECT * FROM S3Object",
InputSerialization={"JSON": {"Type": "Document"}},
OutputSerialization={"JSON": {}},
)
print(resp) |
@tsugumi-sys I've contacted AWS support to figure out why it's throwing the MalformedXML-exception. TBC |
@tsugumi-sys AWS confirmed this is currently broken, but do not have a timeline to fix this. Reading between the lines, it sounds like they don't intend to fix this at all. So I don't think we should add this to Moto either. It's unfortunate, because it is a really cool feature! |
@bblommers
Thats' true |
boto3 documantation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/restore_object.html
I'm not sure we should use
put_object
or deep copy when create an object to restore. What do you think @?