Ingestion: Replace multiprocessing.Pool with concurrent.futures.ProcessPoolExecutor #202

Merged: tasansal merged 2 commits into main from enh/ingestion_exception_handling on Apr 28, 2023

@tasansal (Collaborator) commented Apr 28, 2023

This PR replaces multiprocessing.Pool with concurrent.futures.ProcessPoolExecutor to improve exception handling and simplify the code. The main issue it addresses is the ingestion process hanging when a subprocess fails with an unhandled exception.
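To illustrate the failure mode this targets, here is a minimal sketch (not the code in this PR) of how a worker exception surfaces in the parent with concurrent.futures. The names `flaky_worker` and `ingest_blocks` are hypothetical stand-ins, not functions from this repository. The executor is passed in as a parameter so the same function can be exercised with any `concurrent.futures.Executor`.

```python
from concurrent.futures import Executor, ProcessPoolExecutor


def flaky_worker(block_id: int) -> int:
    """Hypothetical stand-in for an ingestion worker; fails on one block."""
    if block_id == 3:
        raise ValueError(f"bad block {block_id}")
    return block_id * 10


def ingest_blocks(executor: Executor, block_ids: list[int]) -> list[int]:
    """Run the worker over all blocks and collect results in order."""
    # Executor.map returns a lazy iterator. If a task raised in the worker,
    # that exception is re-raised here, in the parent, when its result is
    # retrieved, so the caller can fail fast instead of waiting.
    return list(executor.map(flaky_worker, block_ids))


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        try:
            ingest_blocks(executor, [1, 2, 3, 4])
        except ValueError as exc:
            print(f"ingestion failed cleanly: {exc}")
```

Separately, if a worker process terminates abruptly, ProcessPoolExecutor raises `BrokenProcessPool` in the parent rather than leaving it waiting on a result.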

The changes include:

  1. Removal of trace_worker_map and header_scan_worker_map wrapper functions in _workers.py, as they are no longer necessary.
  2. Refactoring of blocked_io.py to use ProcessPoolExecutor.map instead of Pool.starmap. The parallel_inputs iterator has been moved directly into the map call.
  3. Updating parsers.py to use ProcessPoolExecutor instead of Pool. The executor.map method is now used, with the header_scan_worker function and the input iterators passed directly as arguments.
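The shape of the starmap-to-map migration described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual diff: the signature of `header_scan_worker` here (a path plus a start and stop index) and the helper `scan_headers` are hypothetical. The key difference is that `Pool.starmap` takes one iterable of argument tuples, while `Executor.map` takes one iterable per positional parameter, which is why a tuple-unpacking wrapper is no longer needed.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from itertools import repeat


def header_scan_worker(path: str, start: int, stop: int) -> tuple[str, int]:
    """Hypothetical stand-in: report how many traces a block covers."""
    return path, stop - start


def scan_headers(
    executor: Executor, path: str, bounds: list[tuple[int, int]]
) -> list[tuple[str, int]]:
    """Scan every (start, stop) block of one file using the given executor."""
    if not bounds:
        return []
    starts, stops = zip(*bounds)
    # Old style, for comparison (one iterable of argument tuples):
    #     pool.starmap(header_scan_worker, zip(repeat(path), starts, stops))
    # New style: one iterable per parameter. map() stops at the shortest
    # iterable, so the infinite repeat(path) is safe here.
    return list(executor.map(header_scan_worker, repeat(path), starts, stops))


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        for result in scan_headers(executor, "example.segy", [(0, 100), (100, 250)]):
            print(result)
```

Results come back in input order, as with `Pool.starmap`, so downstream code that relies on ordering should not need to change.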

@tasansal added the enhancement (New feature or request) label Apr 28, 2023
@tasansal self-assigned this Apr 28, 2023
@tasansal changed the title from "Improve Ingestion Multiprocessing Logic" to "Ingestion: Replace multiprocessing.Pool with concurrent.futures.ProcessPoolExecutor" Apr 28, 2023
@tasansal marked this pull request as ready for review April 28, 2023 17:45
@tasansal merged commit 793e968 into main Apr 28, 2023
18 checks passed
@tasansal deleted the enh/ingestion_exception_handling branch April 28, 2023 17:46