generated pipelines incompatible with nextflow's pipeline sharing #194

Closed
cimendes opened this issue Feb 18, 2019 · 4 comments


cimendes commented Feb 18, 2019

Nextflow seamlessly integrates with BitBucket, GitHub, and GitLab hosted code repositories and sharing platforms. In theory, a Nextflow pipeline generated with flowcraft could be hosted in one of these code repos and then pulled directly through this Nextflow feature. Unfortunately, I've identified a couple of issues that make this impossible with our pipelines:

1). When the pipeline script isn't called "main.nf", Nextflow only parses the nextflow.config file for the manifest field and ignores any files pulled in through includeConfig. I've opened an issue on Nextflow's repository (nextflow-io/nextflow#1032), but so far it's flagged as low priority. A workaround is to append the manifest data to the current nextflow.config instead of writing it to a separate config file (manifest.config). I don't like this solution, as the nextflow.config file shouldn't change from repository to repository.

2). Even if the manifest information is included directly in the nextflow.config file, or the script is saved as main.nf, Nextflow will pull all the files into the $HOME/.nextflow/assets directory. The pipeline will then break upon execution because ".forkTree.json" and ".treeDag.json" aren't in the execution directory.

I would like to know your opinion on how we can support the sharing functionality of Nextflow. Ideally, a pipeline generated with flowcraft could be hosted on GitHub, for example, and then pulled directly with Nextflow.

cimendes added the enhancement, help wanted, and discussion labels Feb 18, 2019
cimendes self-assigned this Feb 18, 2019

cimendes commented Jun 8, 2019

A way to solve the first part of the issue is to append the manifest information (currently in the manifest.config) to the beginning of the nextflow.config file.
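
To make the idea concrete, here is a minimal Python sketch of prepending a manifest scope to the text of an existing nextflow.config. The function names (`render_manifest`, `prepend_manifest`) and the example values are hypothetical and are not taken from flowcraft's code base; `name` and `mainScript` are fields of Nextflow's `manifest` scope.

```python
from typing import Mapping


def render_manifest(fields: Mapping[str, str]) -> str:
    """Render a Nextflow `manifest { ... }` config scope from key/value pairs."""
    lines = ["manifest {"]
    for key, value in fields.items():
        # Escape backslashes and single quotes so the value stays a valid string literal.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        lines.append(f"    {key} = '{escaped}'")
    lines.append("}")
    return "\n".join(lines) + "\n"


def prepend_manifest(config_text: str, fields: Mapping[str, str]) -> str:
    """Return config_text with the manifest scope placed at the top."""
    return render_manifest(fields) + "\n" + config_text


if __name__ == "__main__":
    # Hypothetical existing config and pipeline names, for illustration only.
    existing = "params {\n    fastq = 'data/*.fastq.gz'\n}\n"
    print(prepend_manifest(existing, {"name": "my_pipeline", "mainScript": "my_pipeline.nf"}))
```

Keeping the manifest at the top of nextflow.config means Nextflow can find `mainScript` without following any included config file.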

cimendes added a commit that referenced this issue Jun 11, 2019
… remote execution (#204) - Partial solve to #194 issue

- Deprecation of the `manifest.config` file
- Add the manifest information to the `nextflow.config` file
@cimendes

One source of issues here is that hidden files (like ".forkTree.json" and ".treeDag.json") can be left unstaged by git (for example, `git add *` skips dotfiles) unless you explicitly name them in the git add command. I suggest making these files visible and moving them to the resources folder, which already contains the "main.js.zip" file for the report generation (offline mode). What do you think, @tiagofilipe12?
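
A rough Python sketch of that move, under stated assumptions: the visible target names (`forkTree.json`, `treeDag.json`) and the helper name `expose_dag_files` are hypothetical, and the `resources` directory is assumed to sit directly inside the pipeline directory.

```python
import shutil
from pathlib import Path
from typing import List

# Hidden file names come from the issue; the visible names are an assumption.
DAG_FILES = {
    ".forkTree.json": "forkTree.json",
    ".treeDag.json": "treeDag.json",
}


def expose_dag_files(pipeline_dir: Path) -> List[Path]:
    """Move hidden DAG JSON files into <pipeline_dir>/resources under visible names."""
    resources = pipeline_dir / "resources"
    resources.mkdir(exist_ok=True)
    moved = []
    for hidden, visible in DAG_FILES.items():
        source = pipeline_dir / hidden
        if source.is_file():
            target = resources / visible
            shutil.move(str(source), str(target))
            moved.append(target)
    return moved
```

Once the files are visible and live under resources, a plain `git add resources` picks them up along with everything else in that folder.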

@tiagofilipe12

Just keep in mind that the report and inspect modes need the file location in order to send the DAG to the web service.
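
One way to keep those modes working after a move would be a small lookup that checks the new location first and falls back to the old hidden file. This is only a sketch: `locate_dag_file`, `load_dag`, and the `resources/<name>` layout are assumptions for illustration, not flowcraft's actual implementation.

```python
import json
from pathlib import Path
from typing import Any


def locate_dag_file(pipeline_dir: Path, name: str) -> Path:
    """Find a DAG JSON file, preferring resources/<name> over the hidden .<name>."""
    candidates = [
        pipeline_dir / "resources" / name,  # assumed new, visible location
        pipeline_dir / f".{name}",          # hidden file next to the pipeline
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"{name} not found; looked in: {searched}")


def load_dag(pipeline_dir: Path, name: str = "treeDag.json") -> Any:
    """Load the DAG JSON from wherever locate_dag_file finds it."""
    with locate_dag_file(pipeline_dir, name).open() as handle:
        return json.load(handle)
```

With a lookup like this, pipelines generated before the move would still resolve their DAG files, and a missing file fails with a message listing every path that was tried.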

cimendes added a commit that referenced this issue Jun 18, 2019
* remove submodule from dev install

* fix typo

* Added bwa component

* Added cpus to bwa command

* added manifest information to the `nextflow.config` file to allow for remote execution (#204) - Partial solve to #194 issue

- Deprecation of the `manifest.config` file
- Add the manifest information to the `nextflow.config` file

* Added component for haplotypecaller

* Added merge vcfs to haplotypecaller component

* Added mark duplicates component

* Added bam index to mark duplicates

* Added base_recalibrator component

* Removed publishDir for haplotypecaller

* Added apply_bqsr process to base_recalibrator component

* Updated changelog

* Added description to haplotypecaller

* Add check for the location of specific dot files

* Updated changelog

* Updated version
cimendes mentioned this issue Jun 18, 2019
cimendes added a commit that referenced this issue Jul 4, 2019
* move DAG JSON files to the resources directory

* added manifest information to the `nextflow.config` file to allow for remote execution (#204) - Partial solve to #194 issue
- Deprecation of the `manifest.config` file

cimendes commented Jul 5, 2019

I'm closing this issue as the two latest PRs (#204 and #209) have addressed the problems discussed here. I've been able to run flowcraft's pipelines remotely with Nextflow without any issue, both on local machines and in the cloud. Feel free to re-open this issue if you run into this problem in any particular circumstance.

cimendes closed this as completed Jul 5, 2019
cimendes added a commit that referenced this issue Sep 16, 2019
* Dag files (#209)

* move DAG JSON files to the resources directory

* added manifest information to the `nextflow.config` file to allow for remote execution (#204) - Partial solve to #194 issue
- Deprecation of the `manifest.config` file

* Set phred encoding when it fails to be determined - trimmomatic (#211)

* fix bug publishdir (downsample_fastq component)

* add phred33 when encoding fails to be determined; if it still fails, retry with phred64 encoding (trimmomatic component)

* Fix downsample (#222)

* edited file names for downsample fastqs
* stringified depth for file name