Fix silent skip for folder outside of a stage #62

Closed
chuwy opened this issue Sep 3, 2018 · 2 comments

@chuwy
Collaborator

commented Sep 3, 2018

If we specify that the data resides in snowplow_stage, but the data is in fact outside it, the warehouse simply skips the folder and the loader marks the folder as loaded.

chuwy added the bug label Sep 3, 2018

@alexanderdean

Member

commented Sep 3, 2018

What does it mean for data to be "outside" it, @chuwy?

@chuwy

Collaborator Author

commented Sep 3, 2018

If the stage is s3://snowplow-data/snowflake-stage/, but the folders are in s3://snowplow-data/other-folder/, they are considered outside of the stage.

This bug should not affect normally operating pipelines, but it can cause "blank loads" in complex recovery or historical load scenarios.
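
To make the "outside of the stage" condition concrete, here is a minimal Python sketch of a prefix check that fails loudly instead of skipping. It is only an illustration and not the loader's actual fix: the function names `is_inside_stage` and `check_folder` are hypothetical, and it assumes that a folder counts as inside the stage when it shares the stage's bucket and its key starts with the stage's key prefix.

```python
from urllib.parse import urlparse


def is_inside_stage(stage_url: str, folder_url: str) -> bool:
    """Return True if folder_url is in the same bucket and under the stage's key prefix."""
    stage = urlparse(stage_url)
    folder = urlparse(folder_url)
    # Different scheme or bucket means the folder cannot be inside the stage.
    if stage.scheme != folder.scheme or stage.netloc != folder.netloc:
        return False
    # Normalise both paths to end with "/" so that "snowflake-stage"
    # does not accidentally match "snowflake-stage-old".
    stage_path = stage.path.rstrip("/") + "/"
    folder_path = folder.path.rstrip("/") + "/"
    return folder_path.startswith(stage_path)


def check_folder(stage_url: str, folder_url: str) -> None:
    """Hypothetical guard: raise instead of silently marking the folder as loaded."""
    if not is_inside_stage(stage_url, folder_url):
        raise ValueError(f"Folder {folder_url} is outside of stage {stage_url}")
```

With the URLs from this comment, `check_folder("s3://snowplow-data/snowflake-stage/", "s3://snowplow-data/other-folder/")` would raise a `ValueError` rather than produce a blank load.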

chuwy added this to the Version 0.4.0 milestone Oct 19, 2018

chuwy closed this in 10957d7 Jan 17, 2019
