using Boto3 in s3.py #21529
Conversation
CI failure due to PEP 8 issue:
This means you've resolved the outstanding PEP 8 issues in your PR and the file no longer needs to be listed in the legacy files list. Just remove it from that file as part of your PR. You can run the PEP 8 tests locally with make pep8.
endpoint=walrus,
**aws_connect_kwargs
)
CI failure due to PEP 8 issue:
2017-02-27 20:21:42 ERROR: PEP 8: lib/ansible/modules/cloud/amazon/s3.py:807:1: W293 blank line contains whitespace (current)
The PEP 8 tests can be run locally with make pep8.
try:
    import boto
Since this module no longer depends on boto, you should update the unit tests to reflect that. You'll also want to remove boto from the unit test requirements.
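For illustration only, a minimal sketch of a unit test that exercises key-listing logic against a mocked boto3 client; the list_bucket_keys helper and the test layout are assumptions, not the PR's actual tests:

import unittest
from unittest.mock import MagicMock

def list_bucket_keys(s3, bucket):
    # stand-in for the module's boto3-based key listing
    paginator = s3.get_paginator('list_objects_v2')
    return [data['Key'] for page in paginator.paginate(Bucket=bucket) for data in page.get('Contents', [])]

class TestListBucketKeys(unittest.TestCase):
    def test_collects_keys_across_pages(self):
        # mock the boto3 client so no AWS credentials or network access are needed
        s3 = MagicMock()
        s3.get_paginator.return_value.paginate.return_value = [
            {'Contents': [{'Key': 'a'}, {'Key': 'b'}]},
            {'Contents': [{'Key': 'c'}]},
        ]
        self.assertEqual(list_bucket_keys(s3, 'test-bucket'), ['a', 'b', 'c'])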
@mattclay Fixed!
Force-pushed from ad226ab to 89facbc
def list_keys(module, s3, bucket, prefix, marker, max_keys):
    paginator = s3.get_paginator('list_objects')
Good catch!!
def list_keys(module, s3, bucket, prefix, marker, max_keys):
    paginator = s3.get_paginator('list_objects')
    all_keys = [page for page in paginator.paginate(Bucket=bucket)][0].get('Contents', [])
These two lines can be simplified (untested, but should work):
keys = [data['Key'] for page in paginator.paginate(Bucket=bucket) for data in page.get('Contents', [])]
As @ryansb appeared to mention in a comment I can't currently see, this seems only to get the first page.
Edit: I can see the comment in the conversation tab, but not in the code tab.
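To make the first-page problem concrete, a short illustration (not from the PR itself; variable names assumed):

# Indexing [0] keeps only the first page the paginator returns:
first_page_contents = [page for page in paginator.paginate(Bucket=bucket)][0].get('Contents', [])

# Iterating over every page collects the keys for the whole bucket:
all_keys = [data['Key'] for page in paginator.paginate(Bucket=bucket) for data in page.get('Contents', [])]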
module.fail_json(msg=str(e), exception=traceback.format_exc())
try:
    bucket.delete()
    paginator = s3.get_paginator('list_objects')
Comments as for list_keys
make s3 pep8 and remove from legacy files
fix s3 unit tests
…umentation fix incorrectly documented defaults
Fix logic and use head_object instead of get_object for efficiency. Fix typo in unit test.
Fix incorrect conditional. Remove redundant variable assignment. Fix s3 list_object pagination to return all pages
ready_for_review
This can be merged without further changes, but it might be worth considering retries at the very least.
all_keys = bucket_object.get_all_keys(prefix=prefix, marker=marker, max_keys=max_keys)

def paginated_list(s3, bucket):
    pg = s3.get_paginator('list_objects_v2')
    for page in pg.paginate(Bucket=bucket):
You can do return pg.paginate(Bucket=bucket).build_full_result()
Not a blocker though
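For reference, an untested sketch of what that suggestion would look like (the function shape is assumed from the diff above):

def paginated_list(s3, bucket):
    # build_full_result() walks every page and merges them into a single response dict
    pg = s3.get_paginator('list_objects_v2')
    result = pg.paginate(Bucket=bucket).build_full_result()
    return [obj['Key'] for obj in result.get('Contents', [])]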
Do we need an AWSRetry wrapper around this method?
I think I'm going to keep the pagination here as-is so I don't have to filter the results in both places where I'm calling this function (since all I want are the key names). Good idea about AWSRetry! I'm making a follow-up PR for that since I don't want to overload this one.
def list_keys(module, s3, bucket, prefix, marker, max_keys):
    keys = [key for key in paginated_list(s3, bucket)]
Should this have some exception handling? (I suggest here rather than paginated_list as paginated_list might not be able to handle exceptions if it does the retry)
Totally missed that - good catch.
I also remembered to allow marker/prefix/max_keys to modify which keys are listed.
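As an illustration of that, a hedged sketch (not the PR's actual implementation; the parameter handling is assumed) of feeding those options into the list_objects_v2 paginator:

def paginated_keys(s3, bucket, prefix=None, marker=None, max_keys=None):
    # Only pass the filters the caller actually set; StartAfter plays the role of marker for list_objects_v2
    params = {'Bucket': bucket}
    if prefix:
        params['Prefix'] = prefix
    if marker:
        params['StartAfter'] = marker
    pagination_config = {'MaxItems': max_keys} if max_keys else {}
    pg = s3.get_paginator('list_objects_v2')
    for page in pg.paginate(PaginationConfig=pagination_config, **params):
        for obj in page.get('Contents', []):
            yield obj['Key']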
try:
    bucket.delete()
    # if there are contents then we need to delete them before we can delete the bucket
    keys = [{'Key': key} for key in paginated_list(s3, **{'Bucket': bucket})]
**{'Bucket': bucket} is equivalent to Bucket=bucket. Please use the latter :)
Fixed :)
module.exit_json(msg="LIST operation complete", s3_keys=keys)

def list_keys(module, s3, bucket, prefix, marker, max_keys):
This function seems to be much more complicated than it needs to be. Does anything call this function with non-trivial values for prefix, marker or max_keys? (I'm guessing previously the function called itself to get the next page.)
I would argue for using a paginator with build_full_result in list_keys_with_backoff, and then the calling functions (delete_keys etc.) can just use that directly rather than having to manage the page combination themselves.
Hrm... I'm not sure if I understand what I should be changing.
The user can specify whatever they want for marker, prefix or max_keys, right? I'm not sure what a non-trivial value would be. Previously this function did not call itself to get the next page - pagination wasn't supported at all. So it's true this function has become more complicated. If there's a more elegant way, though, I'd definitely like to understand it.
If I use build_full_result() then I'll have to iterate through that anyway and pull out all the keys, so it doesn't seem very different from what I'm doing now. I will implement that if you have a strong preference, but I'm not sure what the benefit is.
I'm a little confused by the comment about delete_keys. list_keys() and delete_bucket() are calling a function that does the pagination. Is the issue that I'm only getting all the keys rather than all the contents?
By non-trivial I just mean values that aren't None or empty strings. I'm not sure how much user control we expect over those settings, but I might not have read the parameters carefully enough.
The following untested pseudocode illustrates the simpler approach:
@AWSRetry.backoff(**backoff_params)
def list_keys_with_backoff(connection, bucket):
    pg = connection.get_paginator('list_objects_v2')
    # build_full_result() merges every page, so the keys come back in one list
    return [obj['Key'] for obj in pg.paginate(Bucket=bucket).build_full_result().get('Contents', [])]

def list_keys(connection, bucket):
    try:
        return list_keys_with_backoff(connection, bucket)
    except botocore.exceptions.ClientError as e:
        etc...
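To make the second point concrete, a hedged sketch (helper and names assumed, not the PR's code) of a caller consuming that list directly when emptying a bucket before deletion:

def delete_bucket_and_contents(connection, bucket):
    # empty the bucket first; delete_objects accepts at most 1000 keys per call, so chunk the list
    keys = [{'Key': key} for key in list_keys_with_backoff(connection, bucket)]
    for i in range(0, len(keys), 1000):
        connection.delete_objects(Bucket=bucket, Delete={'Objects': keys[i:i + 1000]})
    connection.delete_bucket(Bucket=bucket)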
ISSUE TYPE
COMPONENT NAME
lib/ansible/modules/cloud/amazon/s3.py
ANSIBLE VERSION
SUMMARY
Updating S3 since all new AWS module pull requests are expected to use boto3. This will also fix signature version bugs (e.g. #21200).
Closes #23757
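For context on the signature version point, a minimal sketch (an assumption for illustration, not part of this module) of how boto3 lets you pin Signature Version 4, which some regions require:

import boto3
from botocore.client import Config

# Regions such as eu-central-1 only accept Signature Version 4; boto3 can pin it explicitly
s3 = boto3.client('s3', region_name='eu-central-1', config=Config(signature_version='s3v4'))
print([b['Name'] for b in s3.list_buckets()['Buckets']])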