Feat/s3 ingestion enhancement to update schema from latest partition #7410
Conversation
lgtm
Tests seem to be failing.
Hey Tamas!
lgtm
Checklist
This PR enhances S3 ingestion for datasets stored as partitions. Currently, even if newer files exist in the latest partition (day, month, or year), schema inference picks the first file from the oldest partition. Because of this limitation, schema updates that appear only in the latest files are not visible on DataHub. This PR adds the capability to infer the schema from the latest partition instead.
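To illustrate the idea, here is a minimal sketch of selecting a file from the latest partition rather than the oldest one. It assumes hive-style, zero-padded date partitions (`year=/month=/day=`) embedded in the S3 keys, so lexicographic order matches chronological order; the function name and the listing input are hypothetical and do not reflect DataHub's actual implementation.

```python
from typing import List


def pick_latest_partition_file(keys: List[str]) -> str:
    """Return a key from the most recent partition.

    Assumes zero-padded hive-style partition values, so sorting the
    full keys lexicographically places the newest partition last.
    This is an illustrative sketch, not DataHub's real code path.
    """
    if not keys:
        raise ValueError("no files found in bucket prefix")
    # max() on strings gives the lexicographically largest key,
    # which for zero-padded date partitions is the newest one.
    return max(keys)


# Hypothetical listing of keys, as might come from an S3 prefix scan.
keys = [
    "data/year=2021/month=01/day=05/part-000.parquet",
    "data/year=2023/month=11/day=30/part-000.parquet",
    "data/year=2022/month=07/day=14/part-000.parquet",
]
print(pick_latest_partition_file(keys))
```

With this selection, the schema would be inferred from the 2023-11-30 file, so column additions or type changes introduced in the latest partition become visible, whereas picking the first (oldest) key would surface the 2021 schema.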