[DOP-22133] Implement increment for transfers with file sources #209

Merged
IlyasDevelopment merged 1 commit into develop from feature/DOP-22133 on Feb 27, 2025

Conversation

@IlyasDevelopment (Contributor) commented Feb 24, 2025

Change Summary

  • Implemented incremental transfers using FileDownloader with FileModifiedTimeHWM. The HWM gets a unique name composed of transfer.id, transfer.source_connection.name, and transfer.source_params.directory_path
  • Added a check in HWMStore for an existing HWM before writing data to the target. If no HWM exists or strategy=full, FileDFWriter is initialized with if_exists="replace_overlapping_partitions" or if_exists="replace_entire_table". Otherwise, if_exists="append" is used to prevent duplicates
  • Wrapped the entire transfer logic (reading and writing) within the IncrementalStrategy() context
  • Added worker configuration parameters for Horizon, including its URL, authentication credentials, and a custom namespace
  • Added a container with Horizon
  • Developed integration tests to verify incremental reading: the first run fetches all source data, the next run fetches only a new subset
  • Used Strategy.increment_by=file_name with FileListHWM for FTP/FTPS
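
The write-mode check and the two increment styles in the summary can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this PR: `TransferInfo`, `build_hwm_name`, `choose_if_exists`, `new_files_by_mtime`, and `new_files_by_name` are hypothetical names, the `"_"` separator in the HWM name is a guess, and the strict "modified after the HWM" comparison is an assumed reading of how a modified-time HWM filters files.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class TransferInfo:
    """Hypothetical stand-in for the transfer fields named in the summary."""

    id: int
    source_connection_name: str
    directory_path: str
    strategy: str  # "full" or "incremental"


def build_hwm_name(transfer: TransferInfo) -> str:
    # Assumption: fields are joined with "_". The summary lists the fields
    # but does not say how they are combined.
    parts = [str(transfer.id), transfer.source_connection_name, transfer.directory_path]
    return "_".join(parts)


def choose_if_exists(transfer: TransferInfo, hwm_exists: bool, replace_mode: str) -> str:
    """Replace on a full run or when no HWM is stored yet; append otherwise."""
    if transfer.strategy == "full" or not hwm_exists:
        return replace_mode
    return "append"


def new_files_by_mtime(files: Dict[str, datetime], hwm: Optional[datetime]) -> List[str]:
    """Modified-time increment: keep files changed strictly after the HWM."""
    return sorted(path for path, mtime in files.items() if hwm is None or mtime > hwm)


def new_files_by_name(files: Dict[str, datetime], seen: Set[str]) -> List[str]:
    """File-name increment: keep files whose paths were not handled before."""
    return sorted(path for path in files if path not in seen)


# Two simulated runs, mirroring the "all data first, then a new subset" tests.
transfer = TransferInfo(1, "source_conn", "/data/in", "incremental")
files = {"/data/in/a.csv": datetime(2025, 2, 1), "/data/in/b.csv": datetime(2025, 2, 2)}

first_mode = choose_if_exists(transfer, hwm_exists=False, replace_mode="replace_overlapping_partitions")
first_batch = new_files_by_mtime(files, hwm=None)
hwm = max(files.values())  # value that would be saved under build_hwm_name(transfer)

files["/data/in/c.csv"] = datetime(2025, 2, 3)
second_mode = choose_if_exists(transfer, hwm_exists=True, replace_mode="replace_overlapping_partitions")
second_batch = new_files_by_mtime(files, hwm=hwm)
```

In this sketch the first run replaces the target and reads both files, while the second run appends and reads only the file added after the stored HWM. Which of the two replace modes applies is left to the caller because the summary does not say how that choice is made.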

Checklist

  • Commit message and PR title are comprehensive
  • Keep the change as small as possible
  • Unit and integration tests for the changes exist
  • Tests pass on CI and coverage does not decrease
  • Documentation reflects the changes where applicable
  • docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change
    (see CONTRIBUTING.rst for details.)
  • My PR is ready to review.

codecov bot commented Feb 24, 2025

Codecov Report

Attention: Patch coverage is 92.18750% with 10 lines in your changes missing coverage. Please review.

Project coverage is 91.97%. Comparing base (c3473e0) to head (b9c0c72).
Report is 155 commits behind head on develop.

Files with missing lines                        Patch %   Lines
syncmaster/worker/handlers/file/remote_df.py    85.18%    2 Missing and 2 partials ⚠️
syncmaster/dto/transfers_strategy.py            88.88%    1 Missing and 1 partial ⚠️
syncmaster/worker/handlers/db/clickhouse.py     33.33%    1 Missing and 1 partial ⚠️
syncmaster/dto/transfers.py                     92.85%    0 Missing and 1 partial ⚠️
syncmaster/worker/handlers/file/local_df.py     88.88%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #209      +/-   ##
===========================================
+ Coverage    91.95%   91.97%   +0.01%     
===========================================
  Files          189      192       +3     
  Lines         4611     4695      +84     
  Branches       335      345      +10     
===========================================
+ Hits          4240     4318      +78     
- Misses         293      295       +2     
- Partials        78       82       +4     
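
The figures in this report can be cross-checked with a few lines of Python. This is a rough sanity check, assuming project coverage is hits divided by total lines and that missing and partial lines both count as uncovered in the patch figure. The patch size of 128 lines is derived from the stated percentage, not quoted from the report.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals like the report."""
    return round(100 * hits / lines, 2)


# Numbers copied from the coverage diff for base (develop) and head (#209).
base = {"lines": 4611, "hits": 4240, "misses": 293, "partials": 78}
head = {"lines": 4695, "hits": 4318, "misses": 295, "partials": 82}

# Every line is counted exactly once as a hit, a miss, or a partial.
for report in (base, head):
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])

# Patch coverage: 10 uncovered lines at 92.1875% implies the patch size.
uncovered = 10
patch_lines = round(uncovered / (1 - 92.1875 / 100))
patch_pct = 100 * (patch_lines - uncovered) / patch_lines
```

Under these assumptions `base_pct` and `head_pct` match the 91.95% and 91.97% shown in the diff, and the 10 uncovered lines correspond to a 128-line patch.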

@dolfinus (Member)

LGTM, but SFTP tests are failing, please check

@IlyasDevelopment merged commit c56f3ce into develop on Feb 27, 2025
26 checks passed
@IlyasDevelopment deleted the feature/DOP-22133 branch on February 27, 2025 at 06:52