(refactor): Make room for additional distributed engines #1646

Merged
jaidisido merged 8 commits into release-3.0.0 from refactor/enable-additional-engines
Oct 5, 2022

jaidisido
Contributor

Feature or Bugfix

  • Refactoring

Detail

Currently, the codebase assumes a single distributed execution engine, referred to by the distributed keyword. This is highly restrictive, as it closes the door on adding new execution engines (e.g. pyspark, dask) in the future.

A major change in this PR is splitting the distributed dependency installation and configuration into two parts (modin AND ray instead of distributed only). I believe this has two benefits: 1) it's explicit, i.e. the user knows exactly what they are installing; 2) it's flexible, allowing more combinations in the future, such as modin on dask or mars on ray.

This change includes:

  • Modify the extra dependency installation from pip install "awswrangler[distributed]" to pip install "awswrangler[modin,ray]"
  • Modify the configuration to use two items (execution_engine and memory_format)
  • Modify the conditionals across the codebase as a result
  • Move the distributed modules under the subdirectory distributed/ray
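
To illustrate the two-item configuration described above, here is a minimal sketch of how an execution engine and a memory format could be detected and checked independently. This is not the code from this PR: the names ExecutionEngine, MemoryFormat, Config, detect_config and describe are hypothetical, and the detection rule (pick ray or modin when the package is importable) is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from importlib.util import find_spec
from typing import Callable


class ExecutionEngine(Enum):
    """Hypothetical list of engines that run the work."""
    PYTHON = "python"
    RAY = "ray"


class MemoryFormat(Enum):
    """Hypothetical list of data frame libraries that hold the data."""
    PANDAS = "pandas"
    MODIN = "modin"


@dataclass(frozen=True)
class Config:
    """Two independent items instead of a single 'distributed' flag."""
    execution_engine: ExecutionEngine
    memory_format: MemoryFormat


def _is_installed(package: str) -> bool:
    # find_spec returns None when a top-level package cannot be imported.
    return find_spec(package) is not None


def detect_config(is_installed: Callable[[str], bool] = _is_installed) -> Config:
    """Pick each item on its own, based on which optional extras are present."""
    engine = ExecutionEngine.RAY if is_installed("ray") else ExecutionEngine.PYTHON
    memory = MemoryFormat.MODIN if is_installed("modin") else MemoryFormat.PANDAS
    return Config(execution_engine=engine, memory_format=memory)


def describe(config: Config) -> str:
    """Example of a conditional that checks both items separately."""
    distributed_engine = config.execution_engine is ExecutionEngine.RAY
    distributed_frames = config.memory_format is MemoryFormat.MODIN
    if distributed_engine and distributed_frames:
        return "modin on ray"
    if distributed_engine:
        return "pandas on ray"
    if distributed_frames:
        return "modin on python"
    return "pandas on python"


if __name__ == "__main__":
    print(describe(detect_config()))
```

Keeping the two items separate means a new engine or data frame library only adds one enum member and the branches that care about it, rather than another meaning for a single distributed switch.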

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jaidisido jaidisido self-assigned this Sep 30, 2022
@jaidisido jaidisido requested a review from a team September 30, 2022 13:11
@jaidisido jaidisido added this to the 3.0.0 milestone Sep 30, 2022
@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 226a0e5
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@jaidisido jaidisido added the enhancement, major release, and dependencies labels Sep 30, 2022
@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 226a0e5
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 226a0e5
  • Result: FAILED
  • Build Logs (available for 30 days)

awswrangler/_config.py (outdated, resolved)
@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: b6d65e6
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: b6d65e6
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: b6d65e6
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: d033ce5
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: aa8ad15
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: aa8ad15
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: aa8ad15
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Contributor

@malachi-constant malachi-constant left a comment

LGTM, besides the failing load test. The failure seems to be caused by changes in this PR, as runs against release-3.0.0 do not reproduce the error.

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 910d682
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 910d682
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 910d682
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 9825359
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

awswrangler/__init__.py (outdated, resolved)
@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 9825359
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 9825359
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 98467f1
  • Result: FAILED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 98467f1
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 98467f1
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: db352a2
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: db352a2
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@malachi-constant
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: db352a2
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@jaidisido jaidisido merged commit cfc5548 into release-3.0.0 Oct 5, 2022
@jaidisido jaidisido deleted the refactor/enable-additional-engines branch October 5, 2022 20:42
@malachi-constant malachi-constant moved this from In Review to Done in AWS SDK for pandas roadmap Oct 11, 2022
Labels
dependencies, enhancement, major release

4 participants