Busco subworkflow #28

Closed
wants to merge 143 commits into from
143 commits
4bfbbe4
subworkflow: BUSCO_DIAMOND; includes: GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
b9958fd
added script description header
alxndrdiaz Aug 31, 2022
81350bf
added local module GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
c593f27
added BUSCO_DIAMOND, including GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
3e16ed9
added params for GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
5ccd52f
removed uncommented line
alxndrdiaz Aug 31, 2022
bdc4795
updated fasta file input: INPUT_CHECK.out.genome
alxndrdiaz Aug 31, 2022
3f2d23a
def name = fasta.simpleName
alxndrdiaz Aug 31, 2022
156673e
updated name = fasta.simpleName
alxndrdiaz Aug 31, 2022
3ce4a04
updated name = fasta.getSimpleName()
alxndrdiaz Aug 31, 2022
cbd4068
updated name = fasta.map { f -> f.simpleName }
alxndrdiaz Aug 31, 2022
39f58da
added ext.args for GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
cd82840
single quotes in goat_taxon_search params
alxndrdiaz Aug 31, 2022
e4b2cfb
double quotes for input within script
alxndrdiaz Aug 31, 2022
110bb40
installed busco nf-core module
alxndrdiaz Sep 1, 2022
043dbab
added BUSCO module
alxndrdiaz Sep 1, 2022
621136f
added BUSCO ext.args
alxndrdiaz Sep 1, 2022
fcbb381
defined new channel lineages_list
alxndrdiaz Sep 1, 2022
4545974
added variable for busco meta.id
alxndrdiaz Sep 1, 2022
a97172e
used toString method to convert channel emission to string
alxndrdiaz Sep 1, 2022
85f82ab
use simple string for id for BUSCO meta.id
alxndrdiaz Sep 1, 2022
bc41e77
updated meta using flatten and map
alxndrdiaz Sep 1, 2022
7599aa9
updated BUSCO input using fasta.map
alxndrdiaz Sep 2, 2022
3a4f97c
removed unused variable
alxndrdiaz Sep 2, 2022
64be240
added ch_lineages for busco
alxndrdiaz Sep 5, 2022
ae83fb7
added spaces in Channel.fromList
alxndrdiaz Sep 5, 2022
092724f
added nextflow.enable.dsl = 2
alxndrdiaz Sep 5, 2022
8fa2602
updated ch_lineages = Channel.of( lineages_list )
alxndrdiaz Sep 5, 2022
ffe31ce
removed ch_lineages variable
alxndrdiaz Sep 5, 2022
f554e82
updated nf-core modules
alxndrdiaz Sep 5, 2022
11c7447
restored comment line below CONFIG FILES
alxndrdiaz Sep 6, 2022
e791849
lineage input: lineages_list.join(,)
alxndrdiaz Sep 6, 2022
401a572
updated lineage input: {lineages_list.join()}
alxndrdiaz Sep 6, 2022
edc91c2
added channel to use with join
alxndrdiaz Sep 6, 2022
df34fe2
use lineages.join()
alxndrdiaz Sep 6, 2022
1421d16
reverted lineage input
alxndrdiaz Sep 6, 2022
1f1cd3c
removed lineages channel
alxndrdiaz Sep 6, 2022
050c2eb
reverted lineage input: lineages_list.join
alxndrdiaz Sep 6, 2022
ae30a7d
restored lineage input : lineages_list
alxndrdiaz Sep 6, 2022
ad8efe4
used join() to define lineages list
alxndrdiaz Sep 8, 2022
0e1b1df
restored busco_subworkflow branch
alxndrdiaz Sep 8, 2022
8fbef6f
fixed conflicts
alxndrdiaz Sep 8, 2022
7b4e7e3
removed samtools/viewmodule
alxndrdiaz Sep 9, 2022
178a8f2
Fixed running BUSCO
muffato Sep 11, 2022
afd328b
removed samtools/view module
alxndrdiaz Sep 12, 2022
905139b
converted lineages path to channel
alxndrdiaz Sep 12, 2022
d97d32b
converted params.taxa_file to channel
alxndrdiaz Sep 14, 2022
a129609
converted params.taxon to channel
alxndrdiaz Sep 14, 2022
10f0933
restored params.taxon and params.taxa_file
alxndrdiaz Sep 14, 2022
548de2e
updated lineages_path channel
alxndrdiaz Sep 14, 2022
5df7f6e
updated busco_lineages_path
alxndrdiaz Sep 14, 2022
b7128ff
max_cpus=4, max_mem=12GB
alxndrdiaz Sep 14, 2022
dc68ff2
restored lineages_path to params.busco_lineages_path
alxndrdiaz Sep 14, 2022
1baa58a
updated test data: gfLaeSulp1, chicken of the woods
alxndrdiaz Sep 15, 2022
b07355d
restored mMelMel3 test dataset
alxndrdiaz Sep 15, 2022
2f8bd02
updated test set: gfLaeSulp1
alxndrdiaz Sep 16, 2022
2ace6b3
updated results directory
alxndrdiaz Sep 16, 2022
efe3398
restored results folder
alxndrdiaz Sep 16, 2022
4fac860
restored test data set Meles meles, full busco lineages path
alxndrdiaz Sep 16, 2022
0a89c32
added EXTRACT_BUSCO_GENES module
alxndrdiaz Sep 20, 2022
ff9b2ae
removed unused channel for busco_lineages_path
alxndrdiaz Sep 20, 2022
db0bf9a
added script for module EXTRACT_BUSCO_GENES
alxndrdiaz Sep 20, 2022
95778be
updated tag to meta.id
alxndrdiaz Sep 20, 2022
e0f9d39
updated input tuple: meta, tables
alxndrdiaz Sep 20, 2022
9225b02
updated container version
alxndrdiaz Sep 20, 2022
1e2c570
updated output tuple: meta, busco_genes.fasta
alxndrdiaz Sep 20, 2022
45b424e
added meta.id prefix
alxndrdiaz Sep 20, 2022
993f358
removed unused log file
alxndrdiaz Sep 20, 2022
c2da4ba
fixed btk version
alxndrdiaz Sep 20, 2022
dd50ee4
updated input for EXTRACT_BUSCO_GENES module
alxndrdiaz Sep 20, 2022
1538041
converted dir to a list of BUSCO output paths
alxndrdiaz Sep 23, 2022
75ba460
added description for filtering BUSCO tables
alxndrdiaz Sep 23, 2022
48550d2
removed nf-core modules folder
alxndrdiaz Oct 11, 2022
684effd
installed new busco nf-core module
alxndrdiaz Oct 11, 2022
0870bf4
installed new version of nf-core module multiqc
alxndrdiaz Oct 11, 2022
69a1f44
installed new version of nf-core module dumpsoftwareversions
alxndrdiaz Oct 11, 2022
9957d8d
installed new version of nf-core module goat/taxonsearch
alxndrdiaz Oct 11, 2022
9878ab4
installed new version of nf-core module diamond/blastp
alxndrdiaz Oct 11, 2022
80fff6b
include nf-core module diamond/blastp
alxndrdiaz Oct 11, 2022
efb4345
updated path to nf-core modules
alxndrdiaz Oct 11, 2022
e0660a4
removed map transformation from busco_dir input channel
alxndrdiaz Oct 12, 2022
230f57c
updated busco results paths for basal lineages
alxndrdiaz Oct 12, 2022
5d8fc53
updated transformation for busco full tables
alxndrdiaz Oct 12, 2022
4ff30de
removed [] inside busco tables channel transformation
alxndrdiaz Oct 13, 2022
58cb0e2
used toList instead of collect for busco_tables channel
alxndrdiaz Oct 13, 2022
6bc1772
filter and transform paths to busco tables
alxndrdiaz Oct 14, 2022
16c042c
added meta to map transformation for busco tables paths
alxndrdiaz Oct 14, 2022
301fa8d
updated input to tuple of meta and paths to busco tables
alxndrdiaz Oct 15, 2022
1e6307d
EXTRACT_BUSCO_GENES input is not a tuple with meta and paths to full …
alxndrdiaz Oct 15, 2022
cc9a5d7
changed id for meta in paths to full tables
alxndrdiaz Oct 19, 2022
2d0146f
added new channel: [meta, file_name]
alxndrdiaz Oct 19, 2022
35f96db
updated fasta_filename channel
alxndrdiaz Oct 19, 2022
88495ff
removed double quotes from paths to busco tables
alxndrdiaz Oct 19, 2022
a29a8a9
updated channels for paths to busco tables
alxndrdiaz Oct 19, 2022
939e99e
rename full busco tables to avoid filename collision
alxndrdiaz Oct 19, 2022
dae3427
removed blank spaces within script
alxndrdiaz Oct 19, 2022
a017a26
fixed output path: *_busco_genes.fasta
alxndrdiaz Oct 19, 2022
3a1be38
removed file renaming, should be done before passing files as input
alxndrdiaz Oct 19, 2022
9360d60
rename busco tables before passing them to EXTRACT_BUSCO_GENES
alxndrdiaz Oct 20, 2022
01f1f59
use meta instead of id
alxndrdiaz Oct 20, 2022
0e19534
added transformation for busco tables paths
alxndrdiaz Oct 21, 2022
386aefd
removed variable parent from paths to busco tables
alxndrdiaz Oct 21, 2022
ab65ae0
removed misplaced - preceding map operator
alxndrdiaz Oct 24, 2022
3722aad
Sep 20 - mosdepth module
zb32 Sep 20, 2022
9802d97
created subworkflow coverage_stats with mosdepth
zb32 Oct 5, 2022
05e3c8b
pipeline cleanup (new samtools module)
zb32 Oct 10, 2022
b4f92e2
nf-core modules
zb32 Oct 11, 2022
07fc3db
new json file
zb32 Oct 11, 2022
8376b56
new nf-core modules
zb32 Oct 11, 2022
f38104c
remove local samtools view
zb32 Oct 11, 2022
91fd40d
test profile updated to S3 and requested changes
zb32 Oct 13, 2022
208e5c3
updated to show versions
zb32 Oct 13, 2022
eed44c7
updated samplesheet
zb32 Oct 13, 2022
fe5402c
input to pipeline now project directory
zb32 Oct 14, 2022
d2aa6f2
modules json fix
zb32 Oct 14, 2022
d9a36ec
prettier
zb32 Oct 14, 2022
ce30e21
schema changes
zb32 Oct 18, 2022
123e1c6
Update subworkflows/local/input_check.nf
zb32 Oct 19, 2022
30e0b3b
fastawindows nf-core module installed
zb32 Oct 19, 2022
10be5ad
module to create bed from fastawindows
zb32 Oct 20, 2022
a210a28
update to YAML file
zb32 Oct 20, 2022
ee0d388
mosdepth
zb32 Oct 20, 2022
fbd2409
add versions file
zb32 Oct 21, 2022
cbca827
added args to command
zb32 Oct 21, 2022
e84ec13
fix
zb32 Oct 21, 2022
2ea4ddc
added BUSCO_DIAMOND, including GOAT_TAXONSEARCH
alxndrdiaz Aug 31, 2022
72b4b23
installed busco nf-core module
alxndrdiaz Sep 1, 2022
6aa8065
updated nf-core modules
alxndrdiaz Sep 5, 2022
d2b330c
restored busco_subworkflow branch
alxndrdiaz Sep 8, 2022
ed6a1b4
removed samtools/viewmodule
alxndrdiaz Sep 9, 2022
8658908
removed samtools/view module
alxndrdiaz Sep 12, 2022
12b5cee
converted params.taxa_file to channel
alxndrdiaz Sep 14, 2022
d84dc3d
updated test data: gfLaeSulp1, chicken of the woods
alxndrdiaz Sep 15, 2022
d0c04b4
restored mMelMel3 test dataset
alxndrdiaz Sep 15, 2022
e559e86
updated test set: gfLaeSulp1
alxndrdiaz Sep 16, 2022
adfbbd1
restored test data set Meles meles, full busco lineages path
alxndrdiaz Sep 16, 2022
63a2a5c
installed new busco nf-core module
alxndrdiaz Oct 11, 2022
ce3ee0f
installed new version of nf-core module multiqc
alxndrdiaz Oct 11, 2022
d01f4ef
Main Workflow Fix
zb32 Oct 25, 2022
b5e73f6
main workflow fix 2
zb32 Oct 25, 2022
ce59104
main workflow fix
zb32 Oct 25, 2022
0f0cc17
lineage path reduced busco dataset
zb32 Oct 25, 2022
fdc2072
fixed input for EXTRACT_BUSCO_GENES module
alxndrdiaz Oct 26, 2022
8 changes: 5 additions & 3 deletions assets/samplesheet.csv
@@ -1,3 +1,5 @@
-sample,fastq_1,fastq_2
-SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz
-SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,
+sample,datatype,datafile
+mMelMel3,hic,https://tolit.cog.sanger.ac.uk/test-data/Meles_meles/analysis/mMelMel3.2_paternal_haplotype/read_mapping/hic/GCA_922984935.2.subset.unmasked.hic.mMelMel3.cram
+mMelMel1,illumina,https://tolit.cog.sanger.ac.uk/test-data/Meles_meles/analysis/mMelMel3.2_paternal_haplotype/read_mapping/illumina/GCA_922984935.2.subset.unmasked.illumina.mMelMel1.cram
+mMelMel2,illumina,https://tolit.cog.sanger.ac.uk/test-data/Meles_meles/analysis/mMelMel3.2_paternal_haplotype/read_mapping/illumina/GCA_922984935.2.subset.unmasked.illumina.mMelMel2.cram
+mMelMel3,ont,https://tolit.cog.sanger.ac.uk/test-data/Meles_meles/analysis/mMelMel3.2_paternal_haplotype/read_mapping/ont/GCA_922984935.2.subset.unmasked.ont.mMelMel3.cram
26 changes: 10 additions & 16 deletions assets/schema_input.json
@@ -9,28 +9,22 @@
         "properties": {
             "sample": {
                 "type": "string",
+                "description": "Sample Name",
                 "pattern": "^\\S+$",
                 "errorMessage": "Sample name must be provided and cannot contain spaces"
             },
-            "fastq_1": {
+            "datatype": {
                 "type": "string",
-                "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-                "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
+                "pattern": "^\\S+$",
+                "enum": ["hic", "illumina", "ont", "pacbio"],
+                "errorMessage": "Data type, and must be one of: 'hic' or 'illumina' or 'ont' or 'pacbio'"
             },
-            "fastq_2": {
-                "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'",
-                "anyOf": [
-                    {
-                        "type": "string",
-                        "pattern": "^\\S+\\.f(ast)?q\\.gz$"
-                    },
-                    {
-                        "type": "string",
-                        "maxLength": 0
-                    }
-                ]
+            "datafile": {
+                "type": "string",
+                "pattern": "^\\S+\\.cram$",
+                "errorMessage": "Data file for reads cannot contain spaces and must have extension 'cram'"
             }
         },
-        "required": ["sample", "fastq_1"]
+        "required": ["datafile", "datatype", "sample"]
     }
 }
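The constraints in the new schema can be sketched as a plain Nextflow channel check. This is only an illustration, not how the pipeline validates its samplesheet: it assumes `params.input` points at the samplesheet (as in `conf/test.config`), and the meta map layout is hypothetical.

```nextflow
nextflow.enable.dsl = 2

workflow {
    // Data types allowed by the "enum" in assets/schema_input.json
    def datatypes = ['hic', 'illumina', 'ont', 'pacbio']

    Channel
        .fromPath( params.input, checkIfExists: true )
        .splitCsv( header: true )                       // columns: sample, datatype, datafile
        .map { row ->
            if ( !(row.datatype in datatypes) ) {
                throw new IllegalArgumentException("Unknown datatype '${row.datatype}' for sample '${row.sample}'")
            }
            if ( !row.datafile.endsWith('.cram') ) {
                throw new IllegalArgumentException("Data file for sample '${row.sample}' must have extension '.cram'")
            }
            // Hypothetical meta map plus the path or URL of the CRAM file
            [ [ id: row.sample, datatype: row.datatype ], row.datafile ]
        }
        .view()
}
```

Note that the example samplesheet lists `mMelMel3` twice (once for `hic`, once for `ont`), so the sample name alone would not be a unique key; that is why the sketch keeps `datatype` in the meta map.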
9 changes: 3 additions & 6 deletions bin/tol_input.sh
@@ -1,16 +1,13 @@
 #!/bin/bash

-PROJECT_BASEDIR=/lustre/scratch123/tol/projects
-
-if [ $# -ne 2 ]; then echo -e "Script to create a samplesheet for a species.\nUsage: $0 <tol_id> <tol_project>.\nVersion: 1.0"; exit 1; fi
+if [ $# -ne 2 ]; then echo -e "Script to create a samplesheet for a species.\nUsage: $0 <tol_id> <tol_projectDir>.\nVersion: 1.0"; exit 1; fi

 id="$1"
-project="$2"
-data="$PROJECT_BASEDIR/$project/data"
+data="$2/data"

 if [[ ! -d "$data" ]]
 then
-    echo "Project "$project" cannot be found under $PROJECT_BASEDIR"
+    echo "Project directory "$data" does not exist"
     exit 1
 fi

10 changes: 10 additions & 0 deletions conf/modules.config
@@ -29,6 +29,13 @@ process {
     withName: FASTQC {
         ext.args = '--quiet'
     }
+    withName: GOAT_TAXONSEARCH {
+        ext.args = '-l -b'
+    }
+
+    withName: BUSCO {
+        ext.args = '--mode genome'
+    }
Contributor

Consider the --tar option for BUSCO
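
A minimal sketch of that suggestion, assuming the installed BUSCO version supports the `--tar` flag (which compresses output subdirectories that contain many files) and that any step reading BUSCO output can cope with the compressed directories; neither is confirmed in this PR:

```nextflow
process {
    withName: BUSCO {
        // '--tar' is appended to the arguments already set in this PR
        ext.args = '--mode genome --tar'
    }
}
```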


     withName: CUSTOM_DUMPSOFTWAREVERSIONS {
         publishDir = [
@@ -38,4 +45,7 @@ process {
         ]
     }

+    withName: 'SAMTOOLS_VIEW' {
+        ext.args = "--output-fmt bam --write-index"
+    }
 }
28 changes: 9 additions & 19 deletions conf/test.config
@@ -12,33 +12,23 @@
 // Resource settings - normally provided through pipeline `nextflow.config`
 params {
     // Limit resources so that this can run on GitHub Actions
-    max_cpus = 2
-    max_memory = '6.GB'
+    max_cpus = 4
+    max_memory = '12.GB'
     max_time = '6.h'

     // Set default values
     outdir = './results'
     tracedir = "${params.outdir}/pipeline_info"

     // Input test data
-    input = "mMelMel3"
-    project = ".sandbox"
-}
+    input = "${projectDir}/assets/samplesheet.csv"
+    fasta = "https://tolit.cog.sanger.ac.uk/test-data/Meles_meles/assembly/release/mMelMel3.1_paternal_haplotype/GCA_922984935.2.subset.fasta.gz"

-// Singularity settings - normally provided through `conf/sanger.config`
-singularity.enabled = true
-singularity.autoMounts = true
-singularity.runOptions = '--bind /lustre --bind /nfs'
+    // goat_taxon_search
+    taxon = 'Meles meles'
+    taxa_file = ''

-// Publishing settings - normally provided through pipeline `conf/modules.config`
-process {
-    publishDir = { "${params.outdir}/pipeline_info" }
+    // BUSCO
+    busco_lineages_path = '/lustre/scratch123/tol/resources/nextflow/test-data/busco_2021_06_reduced'

-    withName: SAMTOOLS_VIEW {
-        publishDir = [
-            path: { "${params.outdir}/testing" },
-            mode: 'copy',
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
-        ]
-    }
 }
40 changes: 34 additions & 6 deletions modules.json
@@ -2,15 +2,43 @@
     "name": "nf-core/blobtoolkit",
     "homePage": "https://github.com/nf-core/blobtoolkit",
     "repos": {
+<<<<<<< HEAD
         "nf-core/modules": {
-            "custom/dumpsoftwareversions": {
-                "git_sha": "e745e167c1020928ef20ea1397b6b4d230681b4d"
+<<<<<<< HEAD
+            "nf-core/custom/dumpsoftwareversions": {
+                "git_sha": "8022c68e7403eecbd8ba9c49496f69f8c49d50f0"
             },
-            "multiqc": {
-                "git_sha": "5138acca0985ca01c38a1c4fba917d83772b1106"
+            "busco": {
+                "git_sha": "89a84538bede7c6919f7c042fdb4c79e5e2d9d2a"
             },
-            "samtools/view": {
-                "git_sha": "6b64f9cb6c3dd3577931cc3cd032d6fb730000ce"
+            "nf-core/fastawindows": {
+                "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
+            },
+            "nf-core/gunzip": {
+                "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
+            },
+            "nf-core/mosdepth": {
+                "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
+            },
+            "nf-core/samtools/view": {
+                "git_sha": "202683bfc98ddecdd456ea73268e330bca2e5c5a"
+=======
+            "git_url": "https://github.com/nf-core/modules.git",
+=======
+        "https://github.com/nf-core/modules.git": {
+>>>>>>> installed new busco nf-core module
+            "modules": {
+                "nf-core": {
+                    "busco": {
+                        "branch": "master",
+                        "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
+                    },
+                    "multiqc": {
+                        "branch": "master",
+                        "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
+                    }
+                }
+>>>>>>> removed samtools/viewmodule
             }
         }
     }
31 changes: 31 additions & 0 deletions modules/local/create_bed.nf
@@ -0,0 +1,31 @@
+process CREATE_BED {
+    tag "$meta.id"
+    label 'process_single'
+
+    conda (params.enable_conda ? "conda-forge::gawk=5.1.0" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/gawk:5.1.0' :
+        'quay.io/biocontainers/gawk:5.1.0' }"
+
+    input:
+    tuple val(meta), path(tsv) //path to tsv output from fasta windows
+
+    output:
+    path '*.bed'       , emit: bed
+    path "versions.yml", emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    """
+    cut -f 1,2,3 $tsv | sed '1d' $args > ${prefix}.bed
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        create_bed: 1.01
+    END_VERSIONS
+    """
+}
25 changes: 25 additions & 0 deletions modules/local/extract_busco_genes.nf
@@ -0,0 +1,25 @@
+process EXTRACT_BUSCO_GENES {
+    tag "$meta.id"
+
+    container "genomehubs/blobtoolkit-blobtools:3.3.4"
+
+    input:
+    tuple val(meta), path(arc), path(bac), path(euk)
+
+    output:
+    tuple val(meta), path('*_busco_genes.fasta') , emit: fasta
+    path "versions.yml"                          , emit: versions
+
+    script:
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def tables = [arc, bac, euk].collect { "--busco \"${it}\"" }.join(' ') // one --busco flag per table, not a Groovy list literal
+    """
+    btk pipeline extract-busco-genes \\
+        $tables \\
+        --out ${prefix}_busco_genes.fasta
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        blobtoolkit: \$(btk --version | cut -d' ' -f2 | sed 's/v//')
+    END_VERSIONS
+    """
+}
40 changes: 40 additions & 0 deletions modules/local/goat_taxon_search.nf
@@ -0,0 +1,40 @@
+
+process GOAT_TAXONSEARCH {
+    tag "$meta.id"
+
+    conda (params.enable_conda ? "bioconda::goat=0.2.0" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/goat:0.2.0--h92d785c_0':
+        'quay.io/biocontainers/goat:0.2.0--h92d785c_0' }"
+
+    input:
+    tuple val(meta), val(taxon), path(taxa_file)
+
+    output:
+    path "*.tsv"                  , emit: taxonsearch
+    tuple val(meta), path("*.txt"), emit: busco_lineages
+    path "versions.yml"           , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+    def input = taxa_file ? "-f ${taxa_file}" : "-t ${taxon}"
+    if (!taxon && !taxa_file) error "No input. Valid input: single taxon identifier or a .txt file with identifiers"
+    if (taxon && taxa_file ) error "Only one input is required: a single taxon identifier or a .txt file with identifiers"
+    // ${prefix}.txt contains the list of BUSCO (odb10) lineages, one lineage per line without empty lines
+    """
+    goat-cli taxon search \\
+        $args \\
+        "$input" > ${prefix}.tsv
+    cat ${prefix}.tsv | cut -f5 | sed '1d' | grep . > ${prefix}.txt
+    echo "bacteria_odb10" >> ${prefix}.txt
+    echo "archaea_odb10" >> ${prefix}.txt
Comment on lines +32 to +34
Member

I can see how convenient that is, and that it is simpler to do in shell, but I'd like this to eventually move to pure Nextflow channel operations, so that the nf-core goat/taxonsearch module can be used.

Collaborator Author

Understood. I don't know exactly how to make the same transformations in Nextflow, but I think a good first step would be to take the .tsv output file from goat_taxonsearch and convert it to .csv, so that the splitCsv operator can be used to read the values in the lineage column.

Contributor

You can use splitCsv with TSV files as well; just change the default value of sep.
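
As a rough sketch of that idea, the shell steps commented on above (`cut -f5 | sed '1d' | grep .`, plus the two appended basal lineages) might be expressed with channel operators along these lines. The parameter `params.goat_tsv` is hypothetical and stands in for the TSV emitted by GOAT_TAXONSEARCH; the column is picked by position because the header name is not shown in this PR.

```nextflow
nextflow.enable.dsl = 2

workflow {
    // Hypothetical input: the TSV written by `goat-cli taxon search`
    ch_tsv = Channel.fromPath( params.goat_tsv )

    ch_lineages = ch_tsv
        .splitCsv( sep: '\t', skip: 1 )                  // tab-separated, skip the header row (sed '1d')
        .map { row -> row.size() > 4 ? row[4] : '' }     // fifth column (cut -f5)
        .filter { it }                                   // drop empty values (grep .)
        .concat( Channel.of( 'bacteria_odb10', 'archaea_odb10' ) ) // basal lineages appended in the shell version
        .unique()                                        // avoid duplicates if a basal lineage is already listed

    ch_lineages.view()
}
```

Unlike the shell version, this emits one channel item per lineage instead of writing a `.txt` file, so the downstream BUSCO input would need to be adapted accordingly.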

+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        goat: \$(goat-cli --version | cut -d' ' -f2)
+    END_VERSIONS
+    """
+}
39 changes: 0 additions & 39 deletions modules/local/samtools_view.nf

This file was deleted.
