Our pipeline scripts and their library code currently reside in src/. Nextflow has a feature called bundled scripts, which adds a repository's bin/ directory to the PATH before executing the workflow. Moving the scripts into bin/ would shorten code paths within the workflow, integrate better with the caching/resuming mechanism, and enforce a cleaner separation between user-facing functions/scripts and reusable library code.
In addition to this relocation, the individual scripts should be refactored to extract as much common functionality as possible. In particular, Parquet transcoding and SQL template rendering are each implemented in several places. Investigate whether Nextflow's lib/ directory can be used for this purpose.
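As a rough illustration of the kind of extraction meant here, the sketch below puts SQL template rendering behind a single shared function that every script could import. The function name `render_sql`, the `EXPORT_TEMPLATE` string, and the choice of the standard library's `string.Template` are assumptions for illustration only; the existing scripts may use a different templating mechanism.

```python
from string import Template


def render_sql(template_text, **params):
    """Fill $name placeholders in a SQL template (hypothetical shared helper).

    Template.substitute raises KeyError when a placeholder has no value,
    so an incomplete render fails loudly instead of producing broken SQL.
    """
    return Template(template_text).substitute(**params)


# Hypothetical template; a real one would likely be read from a .sql file.
EXPORT_TEMPLATE = "COPY (SELECT * FROM $table) TO '$output' (FORMAT PARQUET);"

# Example call with made-up table and file names.
rendered = render_sql(EXPORT_TEMPLATE, table="hits", output="hits.parquet")
```

With one helper like this, each script would call `render_sql` rather than carrying its own copy of the substitution logic. Plain string substitution is only appropriate for trusted values such as table and file names chosen by the pipeline itself.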
Finally, reconsider the use of pip as the primary dependency manager. The KBase version already uses conda, and Nextflow supports specifying conda environments at the process level. One idea is to use one package manager for the development environment and the other for the run environment.
- Status: Open. #53 in EnzymeFunctionInitiative/EST