Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP-1011 archive_load_files feature #178

Merged
merged 3 commits into from
Jun 14, 2021
Merged

Conversation

llehtinen
Copy link
Contributor

@llehtinen llehtinen commented Jun 11, 2021

Problem

https://transferwise.atlassian.net/browse/AP-1011

Proposed changes

When enabled, this feature will result in PipelineWise load files being copied to a separate folder for archival.

When used in conjunction with incremental replication, the replication key and its min/max value in each archived file are included in S3 metadata.

Original grooming document (slightly outdated now):
https://docs.google.com/document/d/11UTlmWVJS9aGickmyxpXOUhrv4RTPnm8zip_IqeR2J0/edit?ts=609e4106

(Old WIP PR: #174)

Types of changes

What types of changes does your code introduce to PipelineWise?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • Description above provides context of the change
  • I have added tests that prove my fix is effective or that my feature works
  • Unit tests for changes (not needed for documentation changes)
  • CI checks pass with my changes
  • Bumping version in setup.py is an individual PR and not mixed with feature or bugfix PRs
  • Commit message/PR title starts with [AP-NNNN] (if applicable. AP-NNNN = JIRA ID)
  • Branch name starts with AP-NNN (if applicable. AP-NNN = JIRA ID)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions

@llehtinen llehtinen added the enhancement New feature or request label Jun 11, 2021
@llehtinen llehtinen force-pushed the ap1011_archive_load_files_feature branch from e75665d to 3552f07 Compare June 11, 2021 10:31
@llehtinen llehtinen changed the title Ap1011 archive load files feature AP-1011 archive_load_files feature Jun 11, 2021
target_snowflake/db_sync.py Outdated Show resolved Hide resolved
tests/unit/test_db_sync.py Outdated Show resolved Hide resolved
Copy link
Contributor

@koszti koszti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please change the multi level archive_load_files.enabled option to be simply archive_load_files? Every option in the config.json is single level and keeping it consistent would make life easier. I added suggestions to the PR above.

Also, can you please document the new archive_load_files optional parameter in README.md in the Config settings section?

@llehtinen
Copy link
Contributor Author

@koszti Requested changes have been made. In addition, as discussed on Slack, the archive-load-files-primary-column term was replaced with incremental-key in metadata, variable names, and stream_utils method name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants