Test loading S3 parquet folder in dataset format #3235

Closed
mzwiessele opened this issue Sep 20, 2022 · 1 comment

Comments

@mzwiessele
Contributor

mzwiessele commented Sep 20, 2022

Test that a parquet dataset saved in dataset format (a folder of parquet files rather than a single file) can be loaded from S3 and accessed using get_historical_features.

Steps:

  • upload a parquet dataset to the test s3 folder
  • point a source to the parquet dataset s3 folder (not the file)
  • load the source features using get_historical_features
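
The first two steps depend on the difference between a dataset folder and a single file, so here is a rough sketch of how a local dataset directory could map to S3 object keys, and which URI the source would then point to. Everything named here is hypothetical: the helper `plan_dataset_upload`, the bucket `example-bucket`, the prefix `test/driver_stats`, and the `date=...` partition folders are illustration only, and the part files are empty placeholders rather than real parquet data. The upload itself and the Feast source definition are left out.

```python
import tempfile
from pathlib import Path


def plan_dataset_upload(local_dir, bucket, prefix):
    """Map each parquet file under local_dir to the S3 key it would get.

    Returns (folder_uri, {local_path: key}). The folder URI, not any
    single key, is what the source would point to.
    """
    root = Path(local_dir)
    prefix = prefix.strip("/")
    keys = {}
    for path in sorted(root.rglob("*.parquet")):
        # Keep the directory layout below the dataset root in the key.
        rel = path.relative_to(root).as_posix()
        keys[str(path)] = f"{prefix}/{rel}"
    folder_uri = f"s3://{bucket}/{prefix}/"
    return folder_uri, keys


# Build a fake two-partition dataset locally (assumed layout, empty files).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for part in ("date=2022-09-19/part-0.parquet", "date=2022-09-20/part-0.parquet"):
        part_file = root / part
        part_file.parent.mkdir(parents=True)
        part_file.write_bytes(b"")  # placeholder, not valid parquet
    folder_uri, keys = plan_dataset_upload(root, "example-bucket", "test/driver_stats")
    uploaded_keys = sorted(keys.values())

print(folder_uri)
for key in uploaded_keys:
    print(key)
```

The source path in the test would be `folder_uri` (the prefix), never one of the individual part-file keys.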

Ideally, there'd be a test for this running, but you'd be blocked on making a file within the Feast integration tests S3 bucket. Mind making an issue for that separately? @mzwiessele

Originally posted by @adchia in #3217 (comment)

@mzwiessele mzwiessele changed the title from "Ideally, there'd be a test for this running, but you'd be blocked on making a file within the Feast integration tests S3 bucket. Mind making an issue for that separately? @mzwiessele" to "Test loading S3 parquet folder in dataset format" Sep 20, 2022
@felixwang9817
Collaborator

felixwang9817 commented Sep 24, 2022

I think there are probably two ways to do this:

  • there is an S3FileDataSourceCreator in sdk/python/tests/integration/universal/feature_repos/universal/data_sources/file.py, which uses minio - it hasn't been used in a while, but it could be updated and used
  • instead of using minio, we could also just create a dataset and upload data directly to an S3 bucket - we could just use the feast-integration-tests S3 bucket, which is used in universal/data_sources/redshift.py

@mzwiessele if you want to take this on, I'd be happy to help you and review any PRs!
