Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding viloca to the workflow #150

Merged
merged 89 commits into from
Jun 7, 2024
Merged

adding viloca to the workflow #150

merged 89 commits into from
Jun 7, 2024

Conversation

LaraFuhrmann
Copy link
Collaborator

No description provided.

Copy link

github-actions bot commented Feb 5, 2024

🦙 MegaLinter status: ✅ SUCCESS

Descriptor Linter Files Fixed Errors Elapsed time
✅ BASH shellcheck 11 0 0.51s
✅ DOCKERFILE hadolint 1 0 0.32s
✅ JUPYTER jupyfmt 11 8 0 6.36s
✅ MARKDOWN markdownlint 15 2 0 1.29s
✅ PERL perlcritic 1 0 1.27s
✅ PYTHON black 52 0 0 1.92s
✅ SNAKEMAKE snakefmt 25 0 0 9.83s

See detailed report in MegaLinter reports

MegaLinter is graciously provided by OX Security

@LaraFuhrmann
Copy link
Collaborator Author

Tested VILOCA on the hiv-tutorial data CAP188 4 using the following config files to test uniform-tiling strategy and amplicon tiling strategy.

general:
    virus_base_config: ""
    aligner: bwa
    snv_caller: viloca

input:
    reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
    metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
    gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
    # NOTE: this input datadir isn't standard
    datadir: resources/samples/
    read_length: 301
    samples_file: samples.tsv
    paired: true

viloca:
    consensus: false

output:
    snv: true
    local: false
    global: false
    visualization: false
    QA: false
general:
    virus_base_config: ""
    aligner: bwa
    snv_caller: viloca

input:
    reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
    metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
    gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
    # NOTE: this input datadir isn't standard
    datadir: resources/samples/
    read_length: 301
    samples_file: samples.tsv
    paired: true

viloca:
    consensus: false
    insert_bedfile: resources/samples/insert.bed

output:
    snv: true
    local: false
    global: false
    visualization: false
    QA: false

@LaraFuhrmann LaraFuhrmann marked this pull request as ready for review March 5, 2024 12:01
@LaraFuhrmann
Copy link
Collaborator Author

Now we are adding the smallgenomeutilities scritp to merge the paried-end reads into one read. This is since viloca can then be applied to the merged reads with longer local regions / windows.

An example config file could look like this:

general:
    virus_base_config: ""
    aligner: bwa
    snv_caller: viloca

input:
    reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
    metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
    gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
    # NOTE: this input datadir isn't standard
    datadir: resources/samples/
    read_length: 301
    samples_file: samples.tsv
    paired: true

viloca:
    merge_paired_end_reads: true    
    consensus: false

output:
    snv: true
    local: false
    global: false
    visualization: false
    QA: false

@LaraFuhrmann
Copy link
Collaborator Author

We are also adding the paired_end_read_merger from smallgenomeutilities to the workflow.

The workflow as tested with a ww SARS-CoV-2 sample using the following config file:
(note we cheated by providing already an aligned bam file to V-pipe here, but the paired-end-read-merger and viloca were executed.)

general:
    virus_base_config: ""
    aligner: bwa
    snv_caller: viloca

input:
    reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
    metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
    gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
    # NOTE: this input datadir isn't standard
    datadir: resources/samples/
    read_length: 301
    samples_file: samples.tsv
    paired: true

applications:
    paired_end_read_merger: /Users/lfuhrmann/Downloads/smallgenomeutilities-dev/scripts/paired_end_read_merger

viloca:
    merge_paired_end_reads: true
    consensus: false

output:
    snv: true
    local: false
    global: false
    visualization: false
    QA: false

There is a bug in the paired_end_read_merger as VILOCA is reporting ValueError: Soft clipping only possible on the edges of a read.

@LaraFuhrmann
Copy link
Collaborator Author

For completeness, VILOCA also runs when we don't choose to the read merging.

general:
    virus_base_config: ""
    aligner: bwa
    snv_caller: viloca

input:
    reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
    metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
    gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
    # NOTE: this input datadir isn't standard
    datadir: resources/samples/
    read_length: 301
    samples_file: samples.tsv
    paired: true

applications:
    paired_end_read_merger: /Users/lfuhrmann/Downloads/smallgenomeutilities-dev/scripts/paired_end_read_merger

viloca:
    merge_paired_end_reads: false
    consensus: false

output:
    snv: true
    local: false
    global: false
    visualization: false
    QA: false

@LaraFuhrmann LaraFuhrmann requested a review from DrYak June 6, 2024 13:54
@DrYak DrYak merged commit c948d2e into master Jun 7, 2024
27 of 30 checks passed
@LaraFuhrmann LaraFuhrmann deleted the add_viloca branch June 10, 2024 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants