feat(streaming): add new s3 streaming utility #1719

Merged
rubenfonseca merged 43 commits into aws-powertools:develop from feat/s3-streaming on Nov 24, 2022

@rubenfonseca rubenfonseca commented Nov 15, 2022

Issue number: #1692

Summary

This PR adds a new utility to stream data from S3.

Changes

Please provide a summary of what's being changed

Adds a new utility called streaming and a new S3Object class. The utility ships with common data transformations
(this PR adds gzip, zip, and CSV modules) and lets users implement custom transformations and add them to the data pipeline.
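
To make the idea of chaining transformations concrete, here is a minimal sketch using only the Python standard library. It is not the API added by this PR: the names gunzip, as_csv, and apply_pipeline are hypothetical, and an in-memory buffer stands in for the S3 object body. It only illustrates the general pattern in which each transformation wraps the stream produced by the previous one.

```python
import csv
import gzip
import io
from typing import IO, Any, Callable, Iterable

# Hypothetical type for illustration: a transformation takes the previous
# stage's output and returns a new object wrapping it.
Transform = Callable[[Any], Any]


def gunzip(stream: IO[bytes]) -> IO[bytes]:
    """Wrap a compressed byte stream; data is decompressed as it is read."""
    return gzip.GzipFile(fileobj=stream, mode="rb")


def as_csv(stream: IO[bytes]) -> csv.DictReader:
    """Decode bytes as UTF-8 text and parse CSV rows on demand."""
    return csv.DictReader(io.TextIOWrapper(stream, encoding="utf-8", newline=""))


def apply_pipeline(stream: IO[bytes], transforms: Iterable[Transform]) -> Any:
    """Feed the output of each transformation into the next one, in order."""
    result: Any = stream
    for transform in transforms:
        result = transform(result)
    return result


# In-memory stand-in for an S3 object body holding gzip-compressed CSV.
raw = io.BytesIO(gzip.compress(b"id,name\n1,alice\n2,bob\n"))
rows = list(apply_pipeline(raw, [gunzip, as_csv]))
```

In this sketch a custom transformation is just another callable appended to the list, so ordering matters: decompression has to run before CSV parsing.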

User experience

Please share what the user experience looks like before and after this change

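from typing import Dict

# Module path follows the s3_object.py file added in this PR
from aws_lambda_powertools.utilities.streaming.s3_object import S3Object
from aws_lambda_powertools.utilities.typing import LambdaContext


def lambda_handler(event: Dict[str, str], context: LambdaContext):
    # is_gzip=True asks the utility to decompress the object as it is read
    s3 = S3Object(bucket=event["bucket"], key=event["key"], is_gzip=True)
    # Iterate over the object line by line as it streams from S3
    for line in s3:
        print(line)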
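
For trying the loop above without AWS access, a rough local stand-in can help. The sketch below assumes the object behaves like an ordinary binary stream once decompressed, which is an assumption rather than something this PR description states. The helper iter_lines and the in-memory fake_body are hypothetical and use only the standard library.

```python
import gzip
import io
from typing import IO, Iterator


def iter_lines(body: IO[bytes]) -> Iterator[bytes]:
    """Yield decompressed lines one at a time from a gzip-compressed stream."""
    with gzip.GzipFile(fileobj=body, mode="rb") as stream:
        for line in stream:
            yield line


# Stand-in for the S3 object body; a real handler would read from S3 instead.
fake_body = io.BytesIO(gzip.compress(b"first\nsecond\nthird\n"))
lines = list(iter_lines(fake_body))
```

Each item is a bytes object that keeps its trailing newline, so a handler that needs text would typically decode and strip each line before using it.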

Checklist

If an item doesn't apply to your change, please leave it unchecked.

Is this a breaking change?

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.



@boring-cyborg boring-cyborg bot added the documentation label Nov 15, 2022
@pull-request-size pull-request-size bot added the size/L label Nov 15, 2022
@rubenfonseca rubenfonseca linked an issue Nov 15, 2022 that may be closed by this pull request
@github-actions github-actions bot added the feature label Nov 15, 2022
@boring-cyborg boring-cyborg bot added the github-actions label Nov 15, 2022
@pull-request-size pull-request-size bot added the size/XL label and removed the size/L label Nov 15, 2022
codecov-commenter commented Nov 15, 2022

Codecov Report

Base: 99.27% // Head: 98.22% // Decreases project coverage by -1.05% ⚠️

Coverage data is based on head (e1b7e05) compared to base (d7fc2b8).
Patch coverage: 68.80% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1719      +/-   ##
===========================================
- Coverage    99.27%   98.22%   -1.06%     
===========================================
  Files          129      137       +8     
  Lines         6086     6304     +218     
  Branches       407      427      +20     
===========================================
+ Hits          6042     6192     +150     
- Misses          20       81      +61     
- Partials        24       31       +7     
Impacted Files Coverage Δ
...lambda_powertools/utilities/streaming/s3_object.py 42.50% <42.50%> (ø)
...ertools/utilities/streaming/transformations/csv.py 63.63% <63.63%> (ø)
...rtools/utilities/streaming/transformations/base.py 70.00% <70.00%> (ø)
...rtools/utilities/streaming/transformations/gzip.py 83.33% <83.33%> (ø)
...ertools/utilities/streaming/transformations/zip.py 83.33% <83.33%> (ø)
..._powertools/utilities/streaming/_s3_seekable_io.py 86.73% <86.73%> (ø)
..._lambda_powertools/utilities/streaming/__init__.py 100.00% <100.00%> (ø)
...ls/utilities/streaming/transformations/__init__.py 100.00% <100.00%> (ø)


@boring-cyborg boring-cyborg bot added the tests label Nov 16, 2022
@heitorlessa heitorlessa changed the title from "feat: add s3 streaming utility" to "feat(streaming): add s3 streaming utility" Nov 17, 2022
@boring-cyborg boring-cyborg bot added the dependencies and typing labels Nov 17, 2022
@heitorlessa heitorlessa left a comment

Superb! I've added some initial comments for now; a full review will follow once I have more bandwidth.

Review threads (all resolved):

  • mkdocs.yml: 1 thread (outdated)
  • tests/functional/streaming/test_s3_object.py: 3 threads (2 outdated)
  • aws_lambda_powertools/utilities/streaming/s3_object.py: 4 threads (all outdated)
@pull-request-size pull-request-size bot added the size/XXL label and removed the size/XL label Nov 18, 2022
@rubenfonseca rubenfonseca marked this pull request as ready for review November 23, 2022 16:44
@rubenfonseca rubenfonseca requested a review from a team as a code owner November 23, 2022 16:44
@rubenfonseca rubenfonseca requested review from heitorlessa and removed request for a team November 23, 2022 16:44
@heitorlessa

Pushed the final docs changes, including updates to the docs index and the project README to highlight this as a new feature.

@heitorlessa heitorlessa changed the title from "feat(streaming): add s3 streaming utility" to "feat(streaming): add new s3 streaming utility" Nov 23, 2022
@rubenfonseca rubenfonseca merged commit 7af50e5 into aws-powertools:develop Nov 24, 2022
@rubenfonseca rubenfonseca deleted the feat/s3-streaming branch November 24, 2022 09:21
Labels

  • dependencies: Pull requests that update a dependency file
  • documentation: Improvements or additions to documentation
  • feature: New feature or functionality
  • github-actions: Pull requests that update Github_actions code
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
  • tests
  • typing: Static typing definition related issues (mypy, pyright, etc.)

Development

Successfully merging this pull request may close these issues.

RFC: S3 Streaming utility
3 participants