@rvhonorato rvhonorato commented Sep 2, 2025

===autogenerated===

This pull request adds parallel processing to the CLI workflow in src/prodigy_prot/cli.py so that multiple structure models can be processed concurrently. It adds a command-line argument that controls how many processors are used, refactors the execution logic to run on a process pool, and moves per-model processing into a dedicated function. The goal is better performance and scalability when handling multiple input files or models.

Parallelization and CLI enhancements:

  • Added a new command-line argument --number-of-processors (-np) to allow users to specify how many processors to use for parallel execution.
  • Refactored the main execution logic to collect tasks and run them in parallel using ProcessPoolExecutor, dynamically adjusting the number of workers based on available tasks.
  • Introduced a new process_model function to encapsulate the processing of a single model, capturing and returning its output for sequential printing after parallel execution.
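
As a rough illustration of the first two points, the flag and the worker adjustment could look like the sketch below. Only the flag names --number-of-processors and -np come from this description; the parser name, the default of 1, the help text, and the effective_workers helper are assumptions made for the example, not the code merged in this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser exposing only the new flag."""
    parser = argparse.ArgumentParser(prog="example-cli")  # placeholder program name
    parser.add_argument(
        "-np",
        "--number-of-processors",
        type=int,
        default=1,  # assumed default, not confirmed by the PR description
        help="Number of processors to use for parallel execution.",
    )
    return parser


def effective_workers(requested: int, n_tasks: int) -> int:
    """Never start more workers than there are tasks, and always at least one."""
    return max(1, min(requested, n_tasks))


args = build_parser().parse_args(["-np", "4"])
# With 4 requested processors and only 2 tasks, 2 workers are enough.
workers = effective_workers(args.number_of_processors, n_tasks=2)
```

Capping the worker count at the number of tasks avoids spawning idle processes when fewer models are supplied than processors requested.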

Imports and setup for parallel execution:

  • Added imports for ProcessPoolExecutor, as_completed, and StringIO to support parallel processing and output capture.
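
The three imports named above fit together roughly as in the following sketch. The body of process_model here is a placeholder, and the use of contextlib.redirect_stdout, the run_parallel helper, and its signature are assumptions for illustration; the actual function in cli.py may capture output differently.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
from contextlib import redirect_stdout
from io import StringIO


def process_model(identifier: str) -> tuple[str, str]:
    """Hypothetical worker: run one model and return its captured output."""
    buffer = StringIO()
    with redirect_stdout(buffer):
        # Placeholder for the real per-model computation.
        print(f"processed {identifier}")
    return identifier, buffer.getvalue()


def run_parallel(identifiers: list[str], max_workers: int) -> dict[str, str]:
    """Run process_model over all identifiers and collect the captured text."""
    if not identifiers:
        return {}
    outputs: dict[str, str] = {}
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_model, name) for name in identifiers]
        # Results arrive in completion order, so key them by identifier.
        for future in as_completed(futures):
            name, text = future.result()
            outputs[name] = text
    return outputs
```

A caller would typically invoke run_parallel from under an `if __name__ == "__main__":` guard and then print the collected text in input order, for example with `for name in identifiers: print(outputs[name], end="")`, so that output from concurrent workers is not interleaved.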

@rvhonorato rvhonorato linked an issue Sep 2, 2025 that may be closed by this pull request
@rvhonorato rvhonorato self-assigned this Sep 2, 2025
@rvhonorato rvhonorato marked this pull request as ready for review September 2, 2025 11:14
@rvhonorato rvhonorato merged commit 905d3dd into main Sep 2, 2025
8 checks passed
@rvhonorato rvhonorato deleted the 48-parallelize-execution branch September 2, 2025 11:15
Development

Successfully merging this pull request may close these issues.

parallelize execution

2 participants