Error #8

mvtejesvi · 2021-09-27T11:24:52Z

snakemake: error: unrecognized arguments: --threads=8 --mem=60 --large_mem=250 --large_threads=8 --assembly_threads=8 --assembly_memory=250 --tmpdir=/local_scratch/student198 --database_dir=/scratch/project_2004930/databases --data_type=metagenome --interleaved_fastqs=False --deduplicate=True --duplicates_only_optical=False --duplicates_allow_substitutions=2 --preprocess_adapters=/scratch/project_2004930/databases/adapters.fa --preprocess_minimum_base_quality=10 --preprocess_minimum_passing_read_length=51 --preprocess_minimum_base_frequency=0.05 --preprocess_adapter_min_k=8 --preprocess_allowable_kmer_mismatches=1 --preprocess_reference_kmer_match_length=27 --error_correction_overlapping_pairs=True --contaminant_max_indel=20 --contaminant_min_ratio=0.65 --contaminant_kmer_length=13 --contaminant_minimum_hits=1 --contaminant_ambiguous=best --error_correction_before_assembly=True --merge_pairs_before_assembly=True --merging_k=62 --merging_extend2=40 --merging_flags=ecct iterations=5 --assembler=spades --megahit_min_count=2 --megahit_k_min=21 --megahit_k_max=121 --megahit_k_step=20 --megahit_merge_level=20,0.98 --megahit_prune_level=2 --megahit_low_local_ratio=0.2 --megahit_preset=default --spades_skip_BayesHammer=True --spades_use_scaffolds=True --spades_k=auto --spades_preset=meta --spades_extra= --longread_type=none --filter_contigs=True --prefilter_minimum_contig_length=300 --contig_trim_bp=0 --minimum_average_coverage=1 --minimum_percent_covered_bases=20 --minimum_mapped_reads=0 --minimum_contig_length=500 --contig_min_id=0.9 --contig_map_paired_only=True --contig_max_distance_between_pairs=1000 --maximum_counted_map_sites=10 --final_binner=DASTool --binner=metabat --binner=maxbin --metabat={'sensitivity': 'sensitive', 'min_contig_length': 1500} --maxbin={'max_iteration': 50, 'prob_threshold': 0.9, 'min_contig_length': 1000} --DASTool={'search_engine': 'diamond', 'score_threshold': 0.5} --genome_dereplication={'ANI': 0.95, 'overlap': 0.6, 'opt_parameters': '', 
'filter': {'noFilter': False, 'length': 5000, 'completeness': 50, 'contamination': 10}, 'score': {'completeness': 1, 'contamination': 5, 'N50': 0.5, 'length': 0}} --rename_mags_contigs=True --annotations=gtdb_tree --annotations=gtdb_taxonomy --annotations=genes --genecatalog={'source': 'contigs', 'clustermethod': 'linclust', 'minlength_nt': 100, 'minid': 0.95, 'coverage': 0.9, 'extra': '', 'SubsetSize': 500000} --eggNOG_use_virtual_disk=False --virtual_disk=/dev/shm assembly
[2021-09-27 14:20 CRITICAL] Command 'snakemake --snakefile /scratch/project_2004930/miniconda3/envs/atlasenv/lib/python3.8/site-packages/atlas/Snakefile --directory /users/student198/First_Run --jobs 40 --rerun-incomplete --configfile '/users/student198/First_Run/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /scratch/project_2004930/databases/conda_envs --scheduler greedy assembly ' returned non-zero exit status 2.
(atlasenv) [student198@puhti-login1 First_Run]$

SilasK · 2021-09-27T11:28:46Z

Can you send me the contents of ~/.config/snakemake/cluster/config.yaml and ~/.config/snakemake/cluster/cluster_config.yaml?

mvtejesvi · 2021-09-27T11:29:46Z

###################################################################
# [ASCII-art banner: ATLAS]
###################################################################

# For more details about the config values see:
# https://metagenome-atlas.rtfd.io

########################
# Execution parameters
########################
# threads and memory (GB) for most jobs especially from BBtools, which are memory demanding
threads: 8
mem: 60

# threads and memory for jobs needing high amount of memory. e.g GTDB-tk,checkm or assembly
large_mem: 250
large_threads: 8
assembly_threads: 8
assembly_memory: 250

#Runtime only for cluster execution
runtime: #in h
  default: 5
  assembly: 48
  long: 24

# Local directory for temp files, useful for cluster execution without shared file system
tmpdir: /local_scratch/student198

# directory where databases are downloaded with 'atlas download'
database_dir: /scratch/project_2004930/databases

########################
# Quality control
########################
data_type: metagenome # metagenome or metatranscriptome
interleaved_fastqs: false

# remove (PCR)-duplicated reads using clumpify
deduplicate: true
duplicates_only_optical: false
duplicates_allow_substitutions: 2

# used to trim adapters from reads and read ends
preprocess_adapters: /scratch/project_2004930/databases/adapters.fa
preprocess_minimum_base_quality: 10
preprocess_minimum_passing_read_length: 51

# 0.05 requires at least 5 percent of each nucleotide per sequence
preprocess_minimum_base_frequency: 0.05
preprocess_adapter_min_k: 8
preprocess_allowable_kmer_mismatches: 1
preprocess_reference_kmer_match_length: 27

# error correction where PE reads overlap
error_correction_overlapping_pairs: true
#contamination references can be added such that -- key: /path/to/fasta
contaminant_references:
  PhiX: /scratch/project_2004930/databases/phiX174_virus.fa
  human: /scratch/project_2004930/databases/human_genome.fasta
contaminant_max_indel: 20
contaminant_min_ratio: 0.65
contaminant_kmer_length: 13
contaminant_minimum_hits: 1
contaminant_ambiguous: best

########################
# Pre-assembly-processing
########################

error_correction_before_assembly: true

# join R1 and R2 at overlap; unjoined reads are still utilized
merge_pairs_before_assembly: true
merging_k: 62

# extend reads while merging to this many nucleotides
merging_extend2: 40

# Iterations are performed until extend2 x iterations
merging_flags: ecct iterations=5

########################
# Assembly
########################
# megahit OR spades
assembler: spades

# Megahit
#-----------
# 2 is for metagenomes, 3 for genomes with 30x coverage
megahit_min_count: 2
megahit_k_min: 21
megahit_k_max: 121
megahit_k_step: 20
megahit_merge_level: 20,0.98
megahit_prune_level: 2
megahit_low_local_ratio: 0.2
# ['default','meta-large','meta-sensitive']
megahit_preset: default

# Spades
#------------
spades_skip_BayesHammer: true
spades_use_scaffolds: true # if false use contigs
#Comma-separated list of k-mer sizes to be used (all values must be odd, less than 128 and listed in ascending order).
spades_k: auto
spades_preset: meta # meta, normal, rna; single-end libraries don't work with metaspades
spades_extra: ''
longread_type: none # [none,"pacbio", "nanopore", "sanger", "trusted-contigs", "untrusted-contigs"]
# Preprocessed long reads can be defined in the sample table with 'longreads'; for more info see the spades manual

# Filtering
#------------
# filter out assembled noise
# this is more important for assemblies from megahit
filter_contigs: true
prefilter_minimum_contig_length: 300
# trim contig tips
contig_trim_bp: 0
# require contigs to have read support
minimum_average_coverage: 1
minimum_percent_covered_bases: 20
minimum_mapped_reads: 0
# after filtering
minimum_contig_length: 500

########################
# Quantification
########################

# Mapping reads to contigs
#--------------------------
contig_min_id: 0.9
contig_map_paired_only: true
contig_max_distance_between_pairs: 1000
maximum_counted_map_sites: 10

########################
# Binning
########################

final_binner: DASTool # [DASTool or one of the binners, e.g. maxbin]

binner: # If DASTool is used as final_binner, use predictions of these binners
  - metabat
  - maxbin

metabat:
  sensitivity: sensitive
  min_contig_length: 1500 # metabat needs >1500

maxbin:
  max_iteration: 50
  prob_threshold: 0.9
  min_contig_length: 1000

DASTool:
  search_engine: diamond
  score_threshold: 0.5 #Score threshold until selection algorithm will keep selecting bins [0..1].

genome_dereplication:
  ANI: 0.95
  overlap: 0.6
  opt_parameters: ''
  filter:
    noFilter: false
    length: 5000
    completeness: 50
    contamination: 10
  score:
    completeness: 1
    contamination: 5
    N50: 0.5
    length: 0

rename_mags_contigs: true #Rename contigs of representative MAGs

########################
# Annotations
#######################

annotations:
  - gtdb_tree
  - gtdb_taxonomy
  - genes
  # - checkm_taxonomy
  # - checkm_tree

########################
# Gene catalog
#######################
genecatalog:
  source: contigs # [contigs, genomes] Predict genes from all contigs or only from the representative genomes
  clustermethod: linclust # [cd-hit-est or mmseqs or linclust] see mmseqs for more details
  minlength_nt: 100
  minid: 0.95 # min id for gene clustering for the main gene catalog used for annotation
  coverage: 0.9
  extra: ''
  SubsetSize: 500000

eggNOG_use_virtual_disk: false # copying the eggNOG DB to a virtual disk can speed up the annotation
virtual_disk: /dev/shm # But you need 37G extra RAM
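
The rejected arguments in the error line up with the keys of this file: each top-level key reappears as `--key=value`, and list entries such as `binner` appear once per element. That is the pattern one would expect if Snakemake reads this file as a profile, because a profile's config.yaml supplies default values for Snakemake's own command-line options. The Python sketch below reproduces the pattern for a few keys. The function name `profile_to_cli_args` is hypothetical, and this only illustrates the mapping; it is not Snakemake's actual parsing code.

```python
def profile_to_cli_args(profile):
    """Expand profile keys into --key=value options (illustration only)."""
    args = []
    for key, value in profile.items():
        # A YAML list yields one repeated option per element.
        values = value if isinstance(value, list) else [value]
        for item in values:
            args.append(f"--{key}={item}")
    return args


# A few keys taken from the file above.
atlas_keys = {"threads": 8, "mem": 60, "binner": ["metabat", "maxbin"]}
cli_args = profile_to_cli_args(atlas_keys)
print(" ".join(cli_args))
```

If that reading is right, the ATLAS settings belong in the project's config.yaml (the one passed with `--configfile`), and the profile's config.yaml should contain only options that Snakemake itself accepts.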

mvtejesvi · 2021-09-27T11:31:43Z

# This is a yaml file, defining options for specific rules or by default.
# The '#' defines a comment.
# the two spaces at the beginning of rows below rulenames are important.
## For more information see https://snakemake.readthedocs.io/en/stable/executing/cluster-cloud.html#cluster-execution

# Overwrite/Define arguments for all rules
default:
  account: project_2004930

rename_contigs:
  threads: 2

# queue: normal

# You can overwrite values for specific rules
rulename:
  queue: long
  account: ""
  time_min: # min
  threads:
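
The file's comments describe two layers: arguments defined for all rules, and values overwritten for specific rules. A minimal Python sketch of that lookup follows, assuming rule-specific values simply replace the shared ones. The helper `resolve_rule_options` is hypothetical, and the nesting of the sample dictionary is an assumption based on the comments in the paste. The key is spelled `default` here because that is how it appears in the paste; Snakemake's cluster-config convention spells it `__default__`, and the underscores may have been lost to Markdown rendering.

```python
def resolve_rule_options(cluster_config, rule, default_key="default"):
    """Merge the shared defaults with one rule's overrides (rule values win)."""
    options = dict(cluster_config.get(default_key) or {})
    options.update(cluster_config.get(rule) or {})
    return options


# Values copied from the file above; the nesting is assumed.
cluster_config = {
    "default": {"account": "project_2004930"},
    "rename_contigs": {"threads": 2},
    "rulename": {"queue": "long", "account": ""},
}

rename_options = resolve_rule_options(cluster_config, "rename_contigs")
rulename_options = resolve_rule_options(cluster_config, "rulename")
print(rename_options)
print(rulename_options)
```

Under this assumption, a rule block that sets `account: ""` would blank the account inherited from the defaults, so a placeholder entry like `rulename` is worth a second look.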

preckrasna · 2021-09-27T12:15:57Z

What is this error about?

snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. In case of YAML, make sure to not mix whitespace and tab indentation.
[2021-09-27 15:12 CRITICAL] Command 'snakemake --snakefile /scratch/project_2004930/miniconda3/envs/atlasenv/lib/python3.8/site-packages/atlas/Snakefile --directory /users/student215/First_Run --jobs 40 --rerun-incomplete --configfile '/users/student215/First_Run/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /scratch/project_2004930/databases/conda_envs --scheduler greedy assembly ' returned non-zero exit status 1.
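
The error message itself names one common cause: tabs mixed into the indentation. A small Python sketch that lists lines whose leading whitespace contains a tab is shown below. The function name `find_tab_indented_lines` and the sample text are made up for illustration, and a clean result does not prove the file is valid YAML, since other syntax problems raise the same error.

```python
import re


def find_tab_indented_lines(text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = re.match(r"[ \t]*", line).group(0)
        if "\t" in indent:
            bad.append(number)
    return bad


# Hypothetical sample: the third line is indented with a tab.
sample = "runtime:\n  default: 5\n\tassembly: 48\n"
bad_lines = find_tab_indented_lines(sample)
print(bad_lines)
```

To check a real file, pass its text instead of `sample`, for example `pathlib.Path("config.yaml").read_text()`.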

SilasK mentioned this issue Sep 27, 2021

What is this error about? #12

Closed

SilasK closed this as completed Sep 27, 2021
