feat(tap): Utilize Joblib to run parallel streams during sync_all #2295

Draft
wants to merge 8 commits into base: main

BuzzCutNorman (Contributor) commented Mar 7, 2024

Overview:
This PR attempts to use Joblib so that sync_all can run streams in parallel. A new Tap class method, sync_one, gives the parallel loop a per-stream target. A new property, max_parallelism, takes an integer value that is passed to the n_jobs argument of parallel_config. The default value of max_parallelism is None, and a tap only attempts a parallel run when max_parallelism has a value. The TAP_MAX_PARALLELISM_CONFIG capability was added to the Tap class so that a tap can be given a max_parallelism value via meltano.yml.

Examples:

max_parallelism: -1 # All CPU cores
max_parallelism: Null # Null, 0, and 1 all mean no parallel run

Comments:
I need help with ideas on how to write pytest tests that cover these changes. Also, if you run pytest with parallelism enabled, a lot of tests fail, especially the mapper tests. They seem to get only the state message and then nothing else.

Resources:


📚 Documentation preview 📚: https://meltano-sdk--2295.org.readthedocs.build/en/2295/

codspeed-hq bot commented Mar 7, 2024

CodSpeed Performance Report

Merging #2295 will not alter performance

Comparing BuzzCutNorman:feat-tap-parallel-streams (3323b45) with main (b6fa56a)

Summary

✅ 6 untouched benchmarks

codecov bot commented Mar 7, 2024

Codecov Report

Attention: Patch coverage is 46.51163%, with 23 lines in your changes missing coverage. Please review.

Project coverage is 88.84%. Comparing base (9d0c08b) to head (73342d4).

| Files | Patch % | Lines |
|---|---|---|
| singer_sdk/tap_base.py | 45.23% | 19 Missing and 4 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2295      +/-   ##
==========================================
- Coverage   89.18%   88.84%   -0.34%     
==========================================
  Files          54       54              
  Lines        4788     4822      +34     
  Branches      936      944       +8     
==========================================
+ Hits         4270     4284      +14     
- Misses        361      378      +17     
- Partials      157      160       +3     

