Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support aws s3 storage #720

Merged
merged 2 commits into from
Aug 29, 2024
Merged

support aws s3 storage #720

merged 2 commits into from
Aug 29, 2024

Conversation

wintonzheng
Copy link
Contributor

@wintonzheng wintonzheng commented Aug 23, 2024

TODOs:
[ ] add configurations so that you can configure s3 as the storage

For now, if you want to use s3:

  1. set your AWS_ACCESS_KEY_ID, AWS_DEFAULT_REGION, and AWS_SECRET_ACCESS_KEY in your environment.
  2. make sure you have two s3 buckets created under your aws account: skyvern-artifacts and skyvern-screenshots. If you want to use customized bucket names, make sure they're set for these two environment variables: AWS_S3_BUCKET_ARTIFACTS and AWS_S3_BUCKET_SCREENSHOTS
  3. add StorageFactory.set_storage(S3Storage()) before https://github.com/Skyvern-AI/skyvern-cloud/blame/f9b7a50a9b83244d68226d27c0e8b5cd582bc763/skyvern/forge/app.py#L25

🚀 This description was created by Ellipsis for commit e5f2ce1

Summary:

This PR adds AWS S3 storage support to Skyvern, introducing the S3Storage class, configuration updates, and setup instructions.

Key points:

  • Adds AWS S3 storage support to Skyvern.
  • Introduces S3Storage class for S3 operations.
  • Implements methods: build_uri, store_artifact, retrieve_artifact, get_share_link, get_share_links, store_artifact_from_path, save_streaming_file, get_streaming_file.
  • Utilizes AsyncAWSClient for async S3 operations.
  • Updates skyvern/config.py with SKYVERN_STORAGE_TYPE and S3 bucket settings.
  • Modifies skyvern/forge/app.py to set storage to S3 if SKYVERN_STORAGE_TYPE is s3.
  • Adds AWS_S3_BUCKET_ARTIFACTS and AWS_S3_BUCKET_SCREENSHOTS settings.
  • Provides instructions for configuring S3 storage and setting environment variables.
  • Updates documentation in docs/running-tasks/visualizing-results.mdx for S3 configuration.
  • TODO: Add configurations for S3 as storage.
  • Setup instructions for using S3 with environment variables and bucket names.

Generated with ❤️ by ellipsis.dev

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 426a71c in 11 seconds

More details
  • Looked at 63 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. skyvern/forge/sdk/artifact/storage/s3.py:17
  • Draft comment:
    Consider using a more concise timestamp format instead of datetime.utcnow().isoformat() to avoid long S3 keys. For example, datetime.utcnow().strftime('%Y%m%dT%H%M%S').
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The method build_uri uses datetime.utcnow().isoformat() which includes microseconds. This can lead to very long and potentially unwieldy S3 keys. It's often better to use a more concise timestamp format.
2. skyvern/forge/sdk/artifact/storage/s3.py:19
  • Draft comment:
    Consider adding validation or error handling to ensure artifact.uri is valid before uploading in store_artifact and store_artifact_from_path. This can prevent potential issues with incorrect URIs.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The store_artifact and store_artifact_from_path methods assume that the artifact.uri is always valid and correctly formatted. It might be beneficial to add validation or error handling to ensure the URI is correct before attempting to upload.
3. skyvern/forge/sdk/artifact/storage/s3.py:29
  • Draft comment:
    Consider returning an empty list instead of None when no share links are created in get_share_links for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The get_share_links method returns None if no URLs are created, which might not be the expected behavior. It could be more consistent to return an empty list instead.

Workflow ID: wflow_doQzJtUBNXUJElZh


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 93a2822 in 28 seconds

More details
  • Looked at 62 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. skyvern/forge/sdk/artifact/storage/s3.py:29
  • Draft comment:
    Consider returning an empty list instead of None when there are no share URLs to ensure consistent return types.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is about a change in the diff, specifically the get_share_links method. It suggests a code quality improvement by ensuring consistent return types, which is actionable and clear. This aligns with the rules for review comments.
    The comment could be seen as unimportant if the current implementation is intentional and handled correctly elsewhere in the code. However, ensuring consistent return types is generally a good practice.
    Even if the current implementation is intentional, suggesting a consistent return type is a valid code quality improvement. It can prevent potential issues in the future.
    The comment is valid as it suggests a clear and actionable code quality improvement related to a change in the diff. It should be kept.
2. skyvern/forge/sdk/artifact/storage/s3.py:35
  • Draft comment:
    Consider adding exception handling for the file upload process to ensure robustness.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is about a change made in the diff, specifically the save_streaming_file method. Adding exception handling is a valid suggestion for improving code robustness, as it can prevent the application from crashing due to unhandled exceptions during file uploads.
    The comment could be seen as obvious since exception handling is a common practice. However, it is still a valid suggestion for improving the robustness of the code.
    While exception handling is a common practice, it is not always implemented, and suggesting it can be beneficial for code quality. The comment is actionable and clear, making it a useful suggestion.
    Keep the comment as it provides a valid and actionable suggestion for improving code robustness by adding exception handling.
3. skyvern/forge/sdk/artifact/storage/s3.py:40
  • Draft comment:
    Consider adding exception handling for the file download process to ensure robustness.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is about a potential improvement in error handling, which is a valid concern for robustness. Since the file is new, the comment is relevant to the changes made. Exception handling is not something that would be caught by a build, and it is a logical concern rather than a UI or styling issue.
    The comment might be considered speculative if the download_file method already handles exceptions internally. Without knowing the implementation of download_file, it's hard to be certain.
    Even if download_file handles exceptions, explicitly handling them in get_streaming_file could provide more control over error responses, making the suggestion still valuable.
    Keep the comment as it suggests a valid improvement in error handling for the new method get_streaming_file.

Workflow ID: wflow_L7mVzjkN1Cn5QmSu


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on bf2ad76 in 16 seconds

More details
  • Looked at 53 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. docs/running-tasks/visualizing-results.mdx:50
  • Draft comment:
    Typo in 'artifacts'. It should be 'artifacts' instead of 'atrifacts'.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The PR introduces a new configuration for S3 storage, but the documentation in visualizing-results.mdx contains a typo in the word 'artifacts'.

Workflow ID: wflow_z1Dgkh1Yh2sGrTJ8


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 886a561 in 14 seconds

More details
  • Looked at 53 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. docs/running-tasks/visualizing-results.mdx:50
  • Draft comment:
    Typo in 'artifacts'. It should be 'artifacts' instead of 'atrifacts'.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new feature to support AWS S3 storage. However, the documentation in docs/running-tasks/visualizing-results.mdx contains a typo in the word 'artifacts'. This should be corrected for clarity and professionalism.

Workflow ID: wflow_4rQMm2Q9sxwhHL4m


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on eb15e36 in 18 seconds

More details
  • Looked at 55 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. docs/running-tasks/visualizing-results.mdx:50
  • Draft comment:
    Typo in 'artifacts'. It should be 'artifacts' instead of 'atrifacts'.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The PR introduces a new feature to support AWS S3 storage. However, the documentation in visualizing-results.mdx contains a typo in the word 'artifacts'. This should be corrected for clarity and professionalism.

Workflow ID: wflow_zmDtr8MtznfCa0Uk


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@wintonzheng wintonzheng merged commit ffc4b35 into main Aug 29, 2024
2 checks passed
@wintonzheng wintonzheng deleted the shu/s3_storage_support branch August 29, 2024 22:01
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on e5f2ce1 in 15 seconds

More details
  • Looked at 55 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. docs/running-tasks/visualizing-results.mdx:50
  • Draft comment:
    Typo in 'artifacts'. It should be 'artifacts' instead of 'atrifacts'. This typo is also present in line 49.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The documentation in visualizing-results.mdx contains a typo in the word 'artifacts'. This typo is repeated in the file.

Workflow ID: wflow_sb9hajgi7DB6GDBW


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant