Add ability to parameterize S3 Saves #1354

Merged
merged 10 commits into from
May 23, 2023

Conversation

kenxu95
Contributor

@kenxu95 kenxu95 commented May 23, 2023

Describe your changes and why you are making these changes

This PR allows a user to parameterize the filepath of an S3 save like this (as shown in the demo):

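# Assumes `client` is an already-constructed SDK client.
demo = client.resource("Demo")
s3 = client.resource("test_s3")

table = demo.sql("Select * from hotel_reviews")
# Explicit parameters; the save path below refers to them by name.
dir_name_param = client.create_param("dir_name", default="test")
file_name_param = client.create_param("file_name", default="hotel_reviews")
# Each {name} placeholder in the filepath is filled from the parameter with that name.
s3.save(table, "{dir_name}/{file_name}", format="parquet")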

client.publish_flow("Test S3 Save", artifacts=[table])

Considerations:

  • Interpolating by name with {} is a bit tricky, because multiple explicit parameters can share the same name. I decided to resolve this by picking the most recently created one. Implicit parameters are excluded from this search.
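
The resolution rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Param` class, the `implicit` flag, and `resolve_path` are hypothetical names, and the parameter list is assumed to be ordered from oldest to newest.

```python
from dataclasses import dataclass
from string import Formatter
from typing import Dict, List


@dataclass
class Param:
    # Hypothetical stand-in for a workflow parameter; not the SDK's real type.
    name: str
    value: str
    implicit: bool = False


def resolve_path(template: str, params: List[Param]) -> str:
    """Fill each {name} placeholder from the newest explicit parameter with that name.

    Assumes `params` is ordered oldest to newest by creation time.
    """
    latest: Dict[str, str] = {}
    for p in params:
        if not p.implicit:
            # A later (newer) parameter overwrites an older one with the same name.
            latest[p.name] = p.value

    # Collect placeholder names first so a missing one fails with a clear message.
    names = [field for _, field, _, _ in Formatter().parse(template) if field]
    missing = [n for n in names if n not in latest]
    if missing:
        raise KeyError(f"No explicit parameter named: {', '.join(missing)}")
    return template.format(**latest)


params = [
    Param("dir_name", "old"),                      # shadowed by the newer "dir_name"
    Param("dir_name", "test"),
    Param("file_name", "hotel_reviews"),
    Param("file_name", "ignored", implicit=True),  # implicit, so never considered
]
print(resolve_path("{dir_name}/{file_name}", params))  # test/hotel_reviews
```

With the defaults from the demo, the path resolves to test/hotel_reviews. Raising on an unknown placeholder is a choice made for this sketch only; the PR does not state how that case is handled.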

Limitations and bugs:

Related issue number (if any)

ENG-2998

Loom demo (if any)

https://www.loom.com/share/1e616042edd843e9a7845bd619f696c7

Checklist before requesting a review

  • I have created a descriptive PR title. The PR title should complete the sentence "This PR...".
  • I have performed a self-review of my code.
  • I have included a small demo of the changes. For the UI, this would be a screenshot or a Loom video.
  • If this is a new feature, I have added unit tests and integration tests.
  • I have run the integration tests locally and they are passing.
  • I have run the linter script locally (See python3 scripts/run_linters.py -h for usage).
  • All features on the UI continue to work correctly.
  • Added one of the following CI labels:
    • run_integration_test: Runs integration tests
    • skip_integration_test: Skips integration tests (Should be used when changes are ONLY documentation/UI)

@kenxu95 kenxu95 requested review from likawind and cw75 May 23, 2023 00:19
Contributor

@likawind likawind left a comment

LGTM. Since this will soon have a customer use case, how hard would it be to include an integration test for it?

assert (
    len(spec.parameters.table) == 0
), "A parameterized relational save spec should have an empty table name."
spec.parameters.table = param_input_vals[0]
Contributor

Just confirming: we are not supporting parameterizing only a portion of the table name here?

Contributor Author

correct

@kenxu95
Contributor Author

kenxu95 commented May 23, 2023

@likawind added a data resource test for it.

@kenxu95 kenxu95 added the run_integration_test Triggers integration tests label May 23, 2023
@kenxu95 kenxu95 merged commit 9303dcc into main May 23, 2023
19 checks passed
@kenxu95 kenxu95 deleted the eng-2998-parameterize-table-name-for-s3-saves branch May 23, 2023 08:05
Labels
run_integration_test Triggers integration tests
2 participants