## Lab 9: Sequence Assembly

This report and the data used for it can be found at https://github.com/abondrn/compbio-labs/tree/master/assembly.

Using SPAdes, you will assemble a bacterial genome de novo using a combination of long PacBio reads and short Illumina reads. Next week, you will analyze your genome to determine the species and obtain an overview of its metabolism. You are expected to keep a thorough record of everything you did in your notebook.

References
 - [Sequence assembly](https://en.wikipedia.org/wiki/Sequence_assembly)
 - [SPAdes](http://cab.spbu.ru/files/release3.12.0/manual.html)
 - [PacBio SMRT Sequencing](https://en.wikipedia.org/wiki/Single_molecule_real_time_sequencing)
 - [Illumina sequencing](https://en.wikipedia.org/wiki/Illumina_dye_sequencing)

## Background

Genome sequencing and assembly are common techniques in biology. To obtain the sequence of a long genome, DNA must be chopped into small pieces that can be read by a sequencer. These short reads must then be stitched back together to form a complete genome. Often, the genome cannot be fully assembled because there are multiple equally plausible ways of stitching the reads together. Ideally, each chromosome is assembled into a single, long sequence. In practice, chromosomes are often assembled into multiple “contigs,” or contiguous sequences. A genome assembly is generally considered complete only when all (or nearly all) the sequences are accounted for. Otherwise, it is considered a draft genome.
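To make the stitching step concrete, here is a minimal Python sketch of greedy overlap merging on a toy sequence. It is an illustration only, not how SPAdes works internally; the function names, the `min_len` threshold, and the toy reads are all made up for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping toy reads drawn from the sequence ATGGCGTGCAATCC
reads = ["ATGGCGT", "CGTGCAA", "CAATCC"]
print(greedy_assemble(reads))  # prints ['ATGGCGTGCAATCC']
```

When two different reads overlap a third equally well, the greedy choice becomes arbitrary, which is one way the "multiple equally plausible" stitchings described above arise.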

In this lab, you will assemble and analyze a bacterial genome using Illumina and PacBio reads. This week, you will take the reads and assemble them into a complete genome. Next week, you will analyze the contents of your genome.

## Locating the data

DNA from an unknown bacterium was sequenced using PacBio and Illumina technologies. The resulting reads have been uploaded to bCourses. Download these files; if you are using DataHub, you will also need to upload them to your DataHub instance.

`illumina_reads_R1.fastq` – first paired-end read \
`illumina_reads_R2.fastq` – second paired-end read \
`pacbio_reads.fastq` – long PacBio reads 
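Before assembling, it can help to check how many reads each file holds and how long they are. The sketch below assumes the common four-lines-per-record FASTQ layout (header, sequence, `+`, qualities) with no wrapped sequence lines; `fastq_stats` is a hypothetical helper name, and the two records in the demo are invented.

```python
import io
from itertools import islice


def fastq_stats(handle):
    """Return (number of reads, mean read length) for a FASTQ file handle.

    Assumes exactly four lines per record; an incomplete trailing record
    is ignored.
    """
    n_reads = 0
    total_bases = 0
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            break
        n_reads += 1
        total_bases += len(record[1].strip())  # second line is the sequence
    mean_length = total_bases / n_reads if n_reads else 0.0
    return n_reads, mean_length


# Demo on two invented records held in memory
demo = io.StringIO("@r1\nACGTA\n+\nIIIII\n@r2\nACG\n+\nIII\n")
print(fastq_stats(demo))  # prints (2, 4.0)
```

For the lab files, the same function could be called as `fastq_stats(open("reads/pacbio_reads.fastq"))`; the PacBio file should show a much larger mean read length than the Illumina files.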

In [7]:
!bzip2 -dk reads/*.fastq.bz2

bzip2: Output file reads/illumina_reads_R1.fastq already exists.
bzip2: Output file reads/illumina_reads_R2.fastq already exists.
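The messages above only mean that some files had already been decompressed. As an alternative to the shell command, a small Python sketch using the standard-library `bz2` module can skip files that already exist; `decompress_missing` is a hypothetical helper name, and the `reads` directory is taken from the command above.

```python
import bz2
import shutil
from pathlib import Path


def decompress_missing(directory):
    """Decompress each *.fastq.bz2 in `directory` unless its .fastq exists.

    The compressed files are kept (like `bzip2 -dk`). Returns the list of
    newly created paths.
    """
    created = []
    for archive in sorted(Path(directory).glob("*.fastq.bz2")):
        target = archive.with_suffix("")  # strips only the trailing ".bz2"
        if target.exists():
            continue  # already decompressed, nothing to do
        with bz2.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        created.append(target)
    return created
```

Calling `decompress_missing("reads")` a second time returns an empty list instead of printing "already exists" messages.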


## Running SPAdes

SPAdes is a hybrid genome assembler, meaning that it takes multiple sources of information as input and combines them to produce an optimal assembly. Assemblies using only short reads tend to be highly fragmented (i.e., many contigs). Assemblies that combine a high-quality short-read set with a long-read set that has a higher error rate (like PacBio) tend to be the best.

**Why do we expect short reads to produce a more fragmented assembly than long reads?**

> We would expect short reads to produce a more fragmented assembly because each read spans less of the genome. Short reads are less likely to bridge repeats and poorly covered regions, so gaps between reads are more likely, and each gap or unresolved repeat splits the assembly into more contigs. Long reads are more likely to span these regions.
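The effect of read length can be illustrated with a toy de Bruijn graph, the kind of graph SPAdes builds from k-mers. Since k cannot exceed the read length, short reads force a small k. The sketch below counts nodes with more than one possible successor, which is where an assembler may have to break a contig. The function name and the 16-base toy genome (with an invented 5-base repeat `ATCGA`) are assumptions for illustration, and the real SPAdes graph construction is far more involved.

```python
from collections import defaultdict


def branching_nodes(genome, k):
    """Count (k-1)-mer nodes with more than one distinct successor.

    Each k-mer in `genome` contributes an edge from its first k-1 bases
    to its last k-1 bases.
    """
    successors = defaultdict(set)
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        successors[kmer[:-1]].add(kmer[1:])
    return sum(1 for nxt in successors.values() if len(nxt) > 1)


# Toy genome containing the repeat ATCGA twice
genome = "GG" + "ATCGA" + "TT" + "ATCGA" + "CC"

for k in (4, 6, 8):
    print(k, branching_nodes(genome, k))
# prints:
# 4 2
# 6 1
# 8 0
```

Once k is longer than the repeat, every node has a single successor and the toy genome can be read off as one path. With a small k, the repeat creates ambiguous branches.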

**Why does a single-molecule sequencing like PacBio have a higher error rate than Illumina?**

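> Illumina reads each base from a cluster of many identical copies of a fragment, so random errors in individual molecules are averaged out of the signal, whereas SMRT sequencing observes a single long molecule with no such averaging. Illumina's many short, overlapping reads also give greater coverage of each position than the equivalent PacBio sequencing, so the inevitable misreads can be outvoted by the other reads, reducing error.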
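The benefit of observing the same position many times can be sketched with a small simulation: several noisy copies of a sequence are combined by majority vote. This is a simplified model with assumed numbers (15% error rate, 15 copies) and substitution errors only, whereas real PacBio errors are dominated by insertions and deletions. All names here are invented for the example.

```python
import random
from collections import Counter


def noisy_copy(seq, error_rate, rng):
    """Copy `seq`, replacing each base with a different one at `error_rate`."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def consensus(reads):
    """Most common base at each position across equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


def error_fraction(seq, truth):
    """Fraction of positions where `seq` differs from `truth`."""
    return sum(a != b for a, b in zip(seq, truth)) / len(truth)


rng = random.Random(0)  # fixed seed so the run is repeatable
truth = "".join(rng.choice("ACGT") for _ in range(2000))
reads = [noisy_copy(truth, 0.15, rng) for _ in range(15)]

single_err = error_fraction(reads[0], truth)
consensus_err = error_fraction(consensus(reads), truth)
print(f"single read error: {single_err:.3f}")
print(f"consensus error:   {consensus_err:.3f}")
```

The consensus error should come out far below the single-read error, which is the same reason high coverage lets accurate short reads correct noisier long reads.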

We need to come up with a SPAdes command. At a minimum, you will need to specify the output directory with `-o`, the path to the first Illumina read with `-1`, the path to the second Illumina read with `-2`, and the path to your PacBio reads with `--pacbio`. Note: SPAdes must be run from the command line, and can take a while.

Genome assembly requires a relatively large amount of computer memory, sometimes up to 1 TB. DataHub instances have 64 GB of memory, so we have significantly subsampled the reads in order to run the analysis on DataHub.

SPAdes typically uses multi-threading to speed up assembly. Each thread requires memory. You may need to add `-t 4` to your command so that it uses only 4 threads on the system rather than 16. If the program crashes, reduce the number of threads.
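The flags described above can also be put together programmatically, which makes it easy to change the thread count after a crash. This is a sketch only: `build_spades_command` is a hypothetical helper, it uses just the flags named in this section, and it builds the command without running it.

```python
import shlex


def build_spades_command(r1, r2, pacbio, out_dir, threads=4):
    """Return the spades.py argument list for a hybrid assembly."""
    return [
        "spades.py",
        "-1", r1,            # first Illumina paired-end read file
        "-2", r2,            # second Illumina paired-end read file
        "--pacbio", pacbio,  # long PacBio reads
        "-o", out_dir,       # output directory
        "-t", str(threads),  # number of threads
    ]


cmd = build_spades_command(
    "reads/illumina_reads_R1.fastq",
    "reads/illumina_reads_R2.fastq",
    "reads/pacbio_reads.fastq",
    "spades",
    threads=4,
)
print(shlex.join(cmd))
```

To launch the assembly from Python, the list could be passed to `subprocess.run(cmd, check=True)`; lowering `threads` and rebuilding the command is then a one-argument change.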

In [10]:
!spades.py -1 reads/illumina_reads_R1.fastq -2 reads/illumina_reads_R2.fastq --pacbio reads/pacbio_reads.fastq -o spades -t 4





Command line: /opt/conda/bin/spades.py	-1	/home/jovyan/compbio-labs/assembly/reads/illumina_reads_R1.fastq	-2	/home/jovyan/compbio-labs/assembly/reads/illumina_reads_R2.fastq	--pacbio	/home/jovyan/compbio-labs/assembly/reads/pacbio_reads.fastq	-o	/home/jovyan/compbio-labs/assembly/spades	-t	4	

System information:
  SPAdes version: 3.14.1
  Python version: 3.8.6
  OS: Linux-5.3.0-1036-gke-x86_64-with-glibc2.10

Output dir: /home/jovyan/compbio-labs/assembly/spades
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Standard mode
  For multi-cell/isolate data we recommend to use '--isolate' option; for single-cell MDA data use '--sc'; for metagenomic data use '--meta'; for RNA-Seq use '--rna'.
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/home/jovyan/compbio-labs/assembly/reads/illumina_reads_R1.fastq']
      right reads: ['/home/jovyan/compbio-labs/assembly/reads/illumina_reads_R2.fastq']
 

  0:07:34.353     2G / 3G    INFO    General                 (main.cpp                  : 178)   Finished clustering.
  0:07:34.353     2G / 3G    INFO    General                 (main.cpp                  : 197)   Starting solid k-mers expansion in 4 threads.
  0:07:38.559     2G / 3G    INFO    General                 (main.cpp                  : 218)   Solid k-mers iteration 0 produced 1909 new k-mers.
  0:07:42.595     2G / 3G    INFO    General                 (main.cpp                  : 218)   Solid k-mers iteration 1 produced 0 new k-mers.
  0:07:42.595     2G / 3G    INFO    General                 (main.cpp                  : 222)   Solid k-mers finalized
  0:07:42.595     2G / 3G    INFO    General                 (hammer_tools.cpp          : 222)   Starting read correction in 4 threads.
  0:07:42.596     2G / 3G    INFO    General                 (hammer_tools.cpp          : 235)   Correcting pair of reads: /home/jovyan/compbio-labs/assembly/reads/illumina_reads_R1.fastq an

  0:00:07.483   111M / 405M  INFO    General                 (kmer_index_builder.hpp    : 127)   K-mer counting done. There are 21443728 kmers in total.
  0:00:07.483   111M / 405M  INFO    General                 (kmer_index_builder.hpp    : 133)   Merging temporary buckets.
  0:00:08.697   111M / 405M  INFO    General                 (stage.cpp                 : 113)   PROCEDURE == Extension index construction
  0:00:08.700   111M / 405M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:00:08.700   111M / 405M  INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kmer instances into 64 files using 4 threads. This might take a while.
  0:00:08.703   111M / 405M  INFO    General                 (file_limit.hpp            :  32)   Open file limit set to 1048576
  0:00:08.703   111M / 405M  INFO    General                 (kmer_splitters.hpp        :  89)   Memory available for splitting buffers: 4.24993 Gb
  0:00:08.703

  0:00:39.024   203M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 4117 times
  0:00:39.024   203M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:01:25.345   201M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 16994 times
  0:01:25.345   201M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:30.145   199M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 287960 times
  0:01:30.145   199M / 621M  INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 2
  0:01:30.145   199M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:01:30.206   198M / 621M  INFO   Simplification           (parallel_processin

  0:01:31.408   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 3 times
  0:01:31.408   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:31.416   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:01:31.416   190M / 621M  INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 12
  0:01:31.416   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:01:31.417   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 0 times
  0:01:31.417   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:01:31.417   190M / 621M  INFO   Simplification           (parallel_processing.hpp   : 1

  0:00:08.593    11M / 645M  INFO    General                 (stage.cpp                 : 113)   PROCEDURE == Extension index construction
  0:00:08.596    11M / 645M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:00:08.596    11M / 645M  INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kmer instances into 64 files using 4 threads. This might take a while.
  0:00:08.598    11M / 645M  INFO    General                 (file_limit.hpp            :  32)   Open file limit set to 1048576
  0:00:08.598    11M / 645M  INFO    General                 (kmer_splitters.hpp        :  89)   Memory available for splitting buffers: 4.24992 Gb
  0:00:08.598    11M / 645M  INFO    General                 (kmer_splitters.hpp        :  97)   Using cell size of 524288
  0:00:13.509     3G / 3G    INFO    General                 (kmer_splitters.hpp        : 364)   Processed 22694253 kmers
  0:00:13.509     3G / 3G    INFO    General 

  0:00:44.437    85M / 838M  INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 20353 times
  0:00:44.437    85M / 838M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:00:47.110    73M / 838M  INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 7956 times
  0:00:47.110    73M / 838M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:00:47.745    73M / 838M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 42079 times
  0:00:47.745    73M / 838M  INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 2
  0:00:47.745    73M / 838M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:00:47.774    73M / 838M  INFO   Simplification           (parallel_processing

  0:00:48.217    71M / 838M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:00:48.220    71M / 838M  INFO   StageManager             (stage.cpp                 : 166)   STAGE == Simplification Cleanup
  0:00:48.220    71M / 838M  INFO    General                 (simplification.cpp        : 196)   PROCEDURE == Post simplification
  0:00:48.220    71M / 838M  INFO    General                 (graph_simplification.hpp  : 456)   Disconnection of relatively low covered edges disabled
  0:00:48.220    71M / 838M  INFO    General                 (graph_simplification.hpp  : 493)   Complex tip clipping disabled
  0:00:48.220    71M / 838M  INFO    General                 (graph_simplification.hpp  : 643)   Creating parallel br instance
  0:00:48.220    71M / 838M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:00:48.224    71M / 838M  INFO   Simplification           (parallel_proces

  0:00:13.750    11M / 862M  INFO    General                 (kmer_index_builder.hpp    : 120)   Starting k-mer counting.
  0:00:16.116    11M / 862M  INFO    General                 (kmer_index_builder.hpp    : 127)   K-mer counting done. There are 23215403 kmers in total.
  0:00:16.116    11M / 862M  INFO    General                 (kmer_index_builder.hpp    : 133)   Merging temporary buckets.
  0:00:18.699    11M / 862M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 314)   Building perfect hash indices
  0:00:20.377    29M / 862M  INFO    General                 (kmer_index_builder.hpp    : 150)   Merging final buckets.
  0:00:22.997    29M / 862M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 336)   Index built. Total 10773464 bytes occupied (3.71252 bits per kmer).
  0:00:23.018    53M / 862M  INFO   DeBruijnExtensionIndexBu (kmer_extension_index_build:  99)   Building k-mer extensions from k+1-mers
  0:00:27.089    53M / 862M  INFO   DeBruijnExtensio

  0:01:04.196   130M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 100114 times
  0:01:04.196   130M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:01:04.634   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 2241 times
  0:01:04.634   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:04.758   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 6495 times
  0:01:04.758   131M / 862M  INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 2
  0:01:04.758   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:01:04.777   131M / 862M  INFO   Simplification           (parallel_processing

  0:01:04.805   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 14 times
  0:01:04.805   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:01:04.809   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 0 times
  0:01:04.810   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:04.814   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:01:04.814   131M / 862M  INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 12
  0:01:04.814   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:01:04.814   131M / 862M  INFO   Simplification           (parallel_processing.hpp   : 

  0:00:04.472     3G / 3G    INFO    General                 (kmer_splitters.hpp        : 287)   Processed 234454 reads
  0:00:04.518    13M / 809M  INFO    General                 (kmer_splitters.hpp        : 293)   Used 234454 reads
  0:00:04.518    13M / 809M  INFO    General                 (kmer_index_builder.hpp    : 120)   Starting k-mer counting.
  0:00:08.270    13M / 809M  INFO    General                 (kmer_index_builder.hpp    : 127)   K-mer counting done. There are 22872915 kmers in total.
  0:00:08.270    13M / 809M  INFO    General                 (kmer_index_builder.hpp    : 133)   Merging temporary buckets.
  0:00:12.955    13M / 809M  INFO    General                 (stage.cpp                 : 113)   PROCEDURE == Extension index construction
  0:00:12.958    13M / 809M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:00:12.958    13M / 809M  INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kme

  0:01:10.862   102M / 2G    INFO    General                 (kmer_index_builder.hpp    : 150)   Merging final buckets.
  0:01:14.623   102M / 2G    INFO   K-mer Index Building     (kmer_index_builder.hpp    : 336)   Index built. Total 10614296 bytes occupied (3.71244 bits per kmer).
  0:01:14.847   486M / 2G    INFO    General                 (edge_index_builders.hpp   : 107)   Collecting edge information from graph, this takes a while.
  0:01:17.132   486M / 2G    INFO    General                 (edge_index.hpp            :  92)   Index refilled
  0:01:17.138   486M / 2G    INFO    General                 (gap_closer.cpp            : 147)   Preparing shift maps
  0:01:17.474   511M / 2G    INFO    General                 (gap_closer.cpp            : 107)   Processing paired reads (takes a while)
  0:01:17.826   510M / 2G    INFO    General                 (gap_closer.cpp            : 126)   Used 55070 paired reads
  0:01:17.827   510M / 2G    INFO    General                 (gap_clos

  0:01:19.256   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Tip clipper triggered 58 times
  0:01:19.256   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 165)   Running Bulge remover
  0:01:19.265   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 0 times
  0:01:19.266   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:19.275   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:01:19.275   127M / 2G    INFO    General                 (simplification.cpp        : 388)   PROCEDURE == Simplification cycle, iteration 12
  0:01:19.275   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 165)   Running Tip clipper
  0:01:19.275   127M / 2G    INFO   Simplification           (parallel_processing.hpp   : 

  0:00:00.012     4M / 12M   INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kmer instances into 16 files using 4 threads. This might take a while.
  0:00:00.014     4M / 12M   INFO    General                 (file_limit.hpp            :  32)   Open file limit set to 1048576
  0:00:00.014     4M / 12M   INFO    General                 (kmer_splitters.hpp        :  89)   Memory available for splitting buffers: 4.24995 Gb
  0:00:00.014     4M / 12M   INFO    General                 (kmer_splitters.hpp        :  97)   Using cell size of 1048576
  0:00:04.570     3G / 3G    INFO    General                 (kmer_splitters.hpp        : 287)   Processed 233864 reads
  0:00:04.625    11M / 974M  INFO    General                 (kmer_splitters.hpp        : 293)   Used 233864 reads
  0:00:04.625    11M / 974M  INFO    General                 (kmer_index_builder.hpp    : 120)   Starting k-mer counting.
  0:00:09.617    11M / 974M  INFO    General                 (kme

  0:01:02.949    93M / 2G    INFO    General                 (kmer_index_builder.hpp    : 120)   Starting k-mer counting.
  0:01:06.679    93M / 2G    INFO    General                 (kmer_index_builder.hpp    : 127)   K-mer counting done. There are 21406569 kmers in total.
  0:01:06.679    93M / 2G    INFO    General                 (kmer_index_builder.hpp    : 133)   Merging temporary buckets.
  0:01:11.856    93M / 2G    INFO   K-mer Index Building     (kmer_index_builder.hpp    : 314)   Building perfect hash indices
  0:01:13.855    93M / 2G    INFO    General                 (kmer_index_builder.hpp    : 150)   Merging final buckets.
  0:01:18.148    93M / 2G    INFO   K-mer Index Building     (kmer_index_builder.hpp    : 336)   Index built. Total 9934536 bytes occupied (3.71271 bits per kmer).
  0:01:18.355   477M / 2G    INFO    General                 (edge_index_builders.hpp   : 107)   Collecting edge information from graph, this takes a while.
  0:01:20.431   477M / 2G    INFO

  0:01:21.977   123M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:01:21.983   123M / 2G    INFO   StageManager             (stage.cpp                 : 166)   STAGE == Gap Closer
  0:01:21.984   123M / 2G    INFO    General                 (graph_pack.hpp            : 105)   Index refill
  0:01:21.988   123M / 2G    INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:01:21.988   123M / 2G    INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kmer instances into 64 files using 4 threads. This might take a while.
  0:01:21.990   123M / 2G    INFO    General                 (file_limit.hpp            :  32)   Open file limit set to 1048576
  0:01:21.990   123M / 2G    INFO    General                 (kmer_splitters.hpp        :  89)   Memory available for splitting buffers: 4.24553 Gb
  0:01:21.990   123M / 2G    INFO    General                 

  0:00:03.953     3G / 3G    INFO    General                 (kmer_splitters.hpp        : 287)   Processed 233270 reads
  0:00:04.000    11M / 847M  INFO    General                 (kmer_splitters.hpp        : 293)   Used 233270 reads
  0:00:04.000    11M / 847M  INFO    General                 (kmer_index_builder.hpp    : 120)   Starting k-mer counting.
  0:00:08.235    11M / 847M  INFO    General                 (kmer_index_builder.hpp    : 127)   K-mer counting done. There are 18943425 kmers in total.
  0:00:08.235    11M / 847M  INFO    General                 (kmer_index_builder.hpp    : 133)   Merging temporary buckets.
  0:00:13.209    11M / 847M  INFO    General                 (stage.cpp                 : 113)   PROCEDURE == Extension index construction
  0:00:13.213    11M / 847M  INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:00:13.213    11M / 847M  INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kme

  0:01:07.365   408M / 2G    INFO    General                 (edge_index_builders.hpp   : 107)   Collecting edge information from graph, this takes a while.
  0:01:09.174   408M / 2G    INFO    General                 (edge_index.hpp            :  92)   Index refilled
  0:01:09.175   408M / 2G    INFO    General                 (gap_closer.cpp            : 147)   Preparing shift maps
  0:01:09.439   433M / 2G    INFO    General                 (gap_closer.cpp            : 107)   Processing paired reads (takes a while)
  0:01:09.689   435M / 2G    INFO    General                 (gap_closer.cpp            : 126)   Used 55070 paired reads
  0:01:09.689   435M / 2G    INFO    General                 (gap_closer.cpp            : 128)   Merging paired indices
  0:01:09.834   440M / 2G    INFO   GapCloser                (gap_closer.cpp            : 332)   Closing short gaps
  0:01:10.184   440M / 2G    INFO   GapCloser                (gap_closer.cpp            : 366)   Closing short gaps com

  0:01:10.644   120M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Bulge remover triggered 0 times
  0:01:10.644   120M / 2G    INFO   Simplification           (parallel_processing.hpp   : 165)   Running Low coverage edge remover
  0:01:10.667   120M / 2G    INFO   Simplification           (parallel_processing.hpp   : 167)   Low coverage edge remover triggered 0 times
  0:01:10.673   120M / 2G    INFO   StageManager             (stage.cpp                 : 166)   STAGE == Gap Closer
  0:01:10.673   120M / 2G    INFO    General                 (graph_pack.hpp            : 105)   Index refill
  0:01:10.677   120M / 2G    INFO   K-mer Index Building     (kmer_index_builder.hpp    : 301)   Building kmer index
  0:01:10.677   120M / 2G    INFO    General                 (kmer_index_builder.hpp    : 117)   Splitting kmer instances into 64 files using 4 threads. This might take a while.
  0:01:10.680   120M / 2G    INFO    General                 (file_limit.hpp 

  0:03:13.010     2G / 4G    INFO   MultiGapJoiner           (hybrid_gap_closer.hpp     : 531)   Closed 3576 gaps
  0:03:20.503     2G / 4G    INFO    General                 (hybrid_aligning.cpp       : 166)   Closing gaps with long reads finished
  0:03:20.507     2G / 4G    INFO   StageManager             (stage.cpp                 : 166)   STAGE == Paired Information Counting
  0:03:20.516     2G / 4G    INFO    General                 (graph_pack.hpp            : 113)   Normalizing k-mer map. Total 3596 kmers to process
  0:03:20.517     2G / 4G    INFO    General                 (graph_pack.hpp            : 115)   Normalizing done
  0:03:20.518     2G / 4G    INFO    General                 (pair_info_count.cpp       : 322)   Min edge length for estimation: 17255
  0:03:20.518     2G / 4G    INFO    General                 (pair_info_count.cpp       : 333)   Estimating insert size for library #0
  0:03:20.519     2G / 4G    INFO    General                 (pair_info_count.cpp    

  0:03:26.965     2G / 4G    INFO    General                 (extenders_logic.cpp       : 396)   Creating scaffolding extender for lib 1
  0:03:26.965     2G / 4G    INFO   ExtensionChooser2015     (extension_chooser2015.hpp :  51)   ExtensionChooser2015 created
  0:03:26.967     2G / 4G    INFO    General                 (extenders_logic.cpp       : 422)   Using 1 long reads scaffolding library
  0:03:26.967     2G / 4G    INFO    General                 (launcher.cpp              : 432)   Total number of extenders is 5
  0:03:26.968     2G / 4G    INFO    General                 (path_extender.hpp         : 894)   Processed 0 paths from 916 (0%)
  0:03:26.979     2G / 4G    INFO    General                 (path_extender.hpp         : 894)   Processed 92 paths from 916 (10%)
  0:03:26.984     2G / 4G    INFO    General                 (path_extender.hpp         : 892)   Processed 128 paths from 916 (13%)
  0:03:26.991     2G / 4G    INFO    General                 (path_extender.hpp  


===== Copy files finished. 


===== Assembling finished. 


===== Breaking scaffolds started. 


== Running: /opt/conda/bin/python /opt/conda/share/spades/spades_pipeline/scripts/breaking_scaffolds_script.py --result_scaffolds_filename /home/jovyan/compbio-labs/assembly/spades/scaffolds.fasta --misc_dir /home/jovyan/compbio-labs/assembly/spades/misc --threshold_for_breaking_scaffolds 3


===== Breaking scaffolds finished. 


===== Terminate started. 


===== Terminate finished. 

 * Corrected reads are in /home/jovyan/compbio-labs/assembly/spades/corrected/
 * Assembled contigs are in /home/jovyan/compbio-labs/assembly/spades/contigs.fasta
 * Assembled scaffolds are in /home/jovyan/compbio-labs/assembly/spades/scaffolds.fasta
 * Paths in the assembly graph corresponding to the contigs are in /home/jovyan/compbio-labs/assembly/spades/contigs.paths
 * Paths in the assembly graph corresponding to the scaffolds are in /home/jovyan/compbio-labs/assembly/spades/scaffolds.paths
 * Assembly g
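
Once `contigs.fasta` exists, a quick way to judge how fragmented the assembly is would be to count the contigs and compute the N50 (the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly). The sketch below is a plain-Python illustration with hypothetical helper names and an invented four-contig FASTA; it is not part of the SPAdes output.

```python
import io


def contig_lengths(handle):
    """Return the length of each record in a FASTA file handle."""
    lengths = []
    current = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0  # start a new record
        elif line and current is not None:
            current += len(line)  # sequences may wrap over several lines
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Contig length at which the sorted running total reaches half the sum."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


# Invented FASTA with contigs of length 8, 6, 4 and 2
demo = io.StringIO(">c1\nACGTACGT\n>c2\nACGT\nAC\n>c3\nACGT\n>c4\nAC\n")
lengths = contig_lengths(demo)
print(len(lengths), n50(lengths))  # prints 4 6
```

For the real assembly, the handle would be `open("spades/contigs.fasta")`. Fewer contigs and a larger N50 indicate a less fragmented assembly.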