[FEATURE] TupleGCSStoreBackend::get_all #9703

Merged
merged 7 commits into from Apr 4, 2024

Contributor

@tyler-hoffman tyler-hoffman commented Apr 3, 2024

Implements TupleGCSStoreBackend::get_all. Note that we could improve the performance if/when we bump google-cloud-storage to 2.12, using the transfer_manager.
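
For readers unfamiliar with the shape of the problem, here is a minimal sketch of what "get everything under a prefix" can look like against a GCS-style client. It is not the code merged in this PR. The function names, `FakeBlob`, and `FakeClient` are hypothetical; the only real google-cloud-storage calls it assumes are `Client.list_blobs(bucket_name, prefix=...)` and `Blob.download_as_bytes()`. The concurrent variant uses a plain thread pool as a generic stand-in, not the `transfer_manager` mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Optional


def get_all_sequential(client: Any, bucket_name: str, prefix: str) -> List[bytes]:
    """Download every object under `prefix`, one request at a time."""
    blobs = client.list_blobs(bucket_name, prefix=prefix)
    return [blob.download_as_bytes() for blob in blobs]


def get_all_concurrent(
    client: Any, bucket_name: str, prefix: str, max_workers: int = 8
) -> List[bytes]:
    """Same result as above, but downloads run on a thread pool."""
    blobs = list(client.list_blobs(bucket_name, prefix=prefix))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its input.
        return list(pool.map(lambda blob: blob.download_as_bytes(), blobs))


class FakeBlob:
    """Hypothetical in-memory stand-in for a GCS blob."""

    def __init__(self, name: str, data: bytes) -> None:
        self.name = name
        self._data = data

    def download_as_bytes(self) -> bytes:
        return self._data


class FakeClient:
    """Hypothetical in-memory stand-in for a GCS client."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def list_blobs(self, bucket_name: str, prefix: Optional[str] = None) -> List[FakeBlob]:
        # Return blobs in name order, filtered by prefix.
        return [
            FakeBlob(name, data)
            for name, data in sorted(self._objects.items())
            if name.startswith(prefix or "")
        ]
```

Because both functions take the client as an argument, they can be exercised against `FakeClient` without network access or credentials.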

The tests use fakes that mimic GCS behavior, inferred from the preexisting code, and I also did some manual testing. To reproduce:

  • get set up with GCS (we have it in a sandbox)
  • add a bucket
  • follow Google's docs to get authorization working
  • follow our docs to host a validation_result store in that bucket
  • run a checkpoint
  • run the following code and make sure things look reasonable:
store = context.stores["validations_store"].store_backend
all_of_them = store.get_all()  # check that this looks good

# Optionally, fetch a single validation result and compare it with the first item in the list above.
keys = store.list_keys()
first = store.get(keys[0])
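
The manual check above could also be scripted. The following is a rough sketch that assumes only the three methods used in the snippet (`list_keys`, `get`, `get_all`). `InMemoryStore` and `get_all_matches_individual_gets` are hypothetical names for illustration, and the comparison sorts both sides on the assumption that `get_all` does not guarantee any particular order.

```python
from typing import Any, Dict, List, Tuple


def get_all_matches_individual_gets(store: Any) -> bool:
    """Return True if get_all() yields the same values as calling get() per key."""
    individually = [store.get(key) for key in store.list_keys()]
    # Sort both sides: ordering of get_all() is assumed to be unspecified.
    return sorted(store.get_all()) == sorted(individually)


class InMemoryStore:
    """Hypothetical stand-in for a store backend, keyed by tuples."""

    def __init__(self, items: Dict[Tuple[str, ...], str]) -> None:
        self._items = dict(items)

    def list_keys(self) -> List[Tuple[str, ...]]:
        return sorted(self._items)

    def get(self, key: Tuple[str, ...]) -> str:
        return self._items[key]

    def get_all(self) -> List[str]:
        return [self._items[key] for key in sorted(self._items)]
```

Against a real backend, the same function could be called with `context.stores["validations_store"].store_backend` in place of the in-memory store.
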
  • Description of PR changes above includes a link to an existing GitHub issue
  • PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
  • Code is linted - run invoke lint (uses ruff format + ruff check)
  • Appropriate tests and docs have been updated

For more information about contributing, see Contribute.

After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!

netlify bot commented Apr 3, 2024

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit 6072a99
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/660ebfa6567b1f0008b6485b

@tyler-hoffman tyler-hoffman marked this pull request as ready for review April 3, 2024 21:31

codecov bot commented Apr 3, 2024

Codecov Report

Attention: Patch coverage is 45.45455%, with 6 lines in your changes missing coverage. Please review.

Project coverage is 82.54%. Comparing base (045d3e1) to head (6072a99).

Files Patch % Lines
...ctations/data_context/store/tuple_store_backend.py 45.45% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9703      +/-   ##
===========================================
- Coverage    82.56%   82.54%   -0.03%     
===========================================
  Files          511      511              
  Lines        46450    46458       +8     
===========================================
- Hits         38353    38347       -6     
- Misses        8097     8111      +14     
Flag Coverage Δ
3.10 64.61% <27.27%> (-0.01%) ⬇️
3.10 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds ?
3.10 aws_deps ?
3.10 big ?
3.10 databricks ?
3.10 filesystem ?
3.10 mssql ?
3.10 mysql ?
3.10 postgresql ?
3.10 snowflake ?
3.10 spark ?
3.10 trino ?
3.11 64.61% <27.27%> (-0.01%) ⬇️
3.11 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds 53.95% <27.27%> (-0.01%) ⬇️
3.11 aws_deps 48.97% <45.45%> (-0.02%) ⬇️
3.11 big 63.94% <45.45%> (-0.02%) ⬇️
3.11 databricks 48.19% <27.27%> (-0.01%) ⬇️
3.11 filesystem 63.84% <27.27%> (-0.02%) ⬇️
3.11 mssql 47.41% <27.27%> (-0.01%) ⬇️
3.11 mysql 47.47% <27.27%> (-0.01%) ⬇️
3.11 postgresql 54.24% <27.27%> (-0.01%) ⬇️
3.11 snowflake 48.72% <27.27%> (-0.01%) ⬇️
3.11 spark 60.63% <27.27%> (-0.01%) ⬇️
3.11 trino 53.87% <27.27%> (-0.01%) ⬇️
3.8 64.62% <27.27%> (-0.01%) ⬇️
3.8 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds 53.95% <27.27%> (-0.01%) ⬇️
3.8 aws_deps 48.98% <45.45%> (-0.02%) ⬇️
3.8 big 63.95% <45.45%> (-0.02%) ⬇️
3.8 databricks 48.21% <27.27%> (-0.01%) ⬇️
3.8 filesystem 63.84% <27.27%> (-0.02%) ⬇️
3.8 mssql 47.40% <27.27%> (-0.01%) ⬇️
3.8 mysql 47.45% <27.27%> (-0.01%) ⬇️
3.8 postgresql 54.23% <27.27%> (-0.01%) ⬇️
3.8 snowflake 48.74% <27.27%> (-0.01%) ⬇️
3.8 spark 60.59% <27.27%> (-0.01%) ⬇️
3.8 trino 53.86% <27.27%> (-0.01%) ⬇️
3.9 64.60% <27.27%> (-0.03%) ⬇️
3.9 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds ?
3.9 aws_deps ?
3.9 big ?
3.9 databricks ?
3.9 filesystem ?
3.9 mssql ?
3.9 mysql ?
3.9 postgresql ?
3.9 snowflake ?
3.9 spark ?
3.9 trino ?
cloud 0.00% <0.00%> (ø)
docs-basic 54.47% <27.27%> (-0.01%) ⬇️
docs-creds-needed 55.04% <27.27%> (-0.01%) ⬇️
docs-spark 54.57% <27.27%> (-0.01%) ⬇️

@tyler-hoffman tyler-hoffman changed the title from "F/v1 231/tuple gcs store backend get all" to "[FEATURE] TupleGCSStoreBackend::get_all" Apr 4, 2024
Member

@joshua-stauffer joshua-stauffer left a comment

given the current pattern, this looks good 👍 I would prefer to see the google client injected as a dependency to the store backend so we could just pass in a mock and assert against it, but definitely out of scope for this PR.
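
To illustrate the suggestion, here is a sketch of what injecting the client might look like. `InjectedClientBackend` is a hypothetical class, not the real `TupleGCSStoreBackend`, and its constructor arguments are assumptions made for the example. The test side uses only the standard library's `unittest.mock`.

```python
from typing import Any, List
from unittest import mock


class InjectedClientBackend:
    """Hypothetical backend that receives its storage client from the caller."""

    def __init__(self, client: Any, bucket: str, prefix: str) -> None:
        self._client = client
        self._bucket = bucket
        self._prefix = prefix

    def get_all(self) -> List[str]:
        blobs = self._client.list_blobs(self._bucket, prefix=self._prefix)
        return [blob.download_as_bytes().decode("utf-8") for blob in blobs]


def demo() -> List[str]:
    # Build a mock client whose list_blobs returns a single mock blob.
    blob = mock.Mock()
    blob.download_as_bytes.return_value = b'{"success": true}'
    client = mock.Mock()
    client.list_blobs.return_value = [blob]

    backend = InjectedClientBackend(client, bucket="my-bucket", prefix="validations")
    result = backend.get_all()

    # With an injected client, the test can assert on the exact call made.
    client.list_blobs.assert_called_once_with("my-bucket", prefix="validations")
    return result
```

With this shape, the test asserts directly on the mock and no hand-written fake of GCS behavior is needed.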

@tyler-hoffman
Contributor Author

> given the current pattern, this looks good 👍 I would prefer to see the google client injected as a dependency to the store backend so we could just pass in a mock and assert against it, but definitely out of scope for this PR.

@joshua-stauffer yeah, totally agree. Unfortunately, I think we're going to be in the same place with azure blob stores

@tyler-hoffman tyler-hoffman marked this pull request as draft April 4, 2024 14:57
auto-merge was automatically disabled April 4, 2024 14:57

Pull request was converted to draft

@tyler-hoffman tyler-hoffman marked this pull request as ready for review April 4, 2024 14:57
@tyler-hoffman tyler-hoffman added this pull request to the merge queue Apr 4, 2024
Merged via the queue into develop with commit 5707d2c Apr 4, 2024
69 of 91 checks passed
@tyler-hoffman tyler-hoffman deleted the f/v1-231/TupleGCSStoreBackend-get_all branch April 4, 2024 15:49