This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

Configuring Unzip

Zev Kronenberg edited this page Apr 22, 2019 · 6 revisions

  • Falcon-wide options go in the [General] section.
  • Unzip-wide options go in the [Unzip] section.
  • Job-submission details go in the [job.*] sections.
    • Defaults go in [job.defaults].
    • Overrides go in [job.step.*].

(If you use JSON as the file format, then drop the brackets ([]) around section names.)
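To make the JSON note concrete, here is a minimal sketch that reads an INI-style config with Python's standard configparser and prints the same sections as JSON, keyed by section name without brackets. The helper name ini_to_json and the sample values are hypothetical, and the exact JSON layout Falcon accepts is an assumption to check against the source code.

```python
import configparser
import json

# Hypothetical sample config; only the section names come from this page.
INI_TEXT = """
[General]
# Falcon-wide options would go here.

[Unzip]
# Unzip-wide options would go here.

[job.defaults]
NPROC=4
njobs=10
"""


def ini_to_json(text):
    """Return the INI sections as a JSON string keyed by bare section name."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keep option names such as NPROC case-sensitive
    parser.read_string(text)
    as_dict = {name: dict(parser[name]) for name in parser.sections()}
    return json.dumps(as_dict, indent=2)


print(ini_to_json(INI_TEXT))
```

Note that configparser returns every value as a string; whether the JSON form should carry numbers or strings is not stated on this page.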

Typically, you will override NPROC and njobs in specific sections of the workflow. For example:

[job.step.unzip.quiver]
NPROC=48  # 48 procs/job
njobs=30  # up to 30 simultaneous jobs in the submission queue (aka job concurrency)

[job.defaults]
NPROC=4
njobs=10
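The sketch below illustrates how the example above is meant to be read: a step starts from [job.defaults] and any keys in its own [job.step.*] section win. This is not pypeFLOW's actual resolution code, only an illustration with Python's configparser; the helper name effective_job_settings is hypothetical. It also multiplies NPROC by njobs to estimate the peak number of processor slots a step could request, assuming the meanings given in the comments above.

```python
import configparser

CONFIG_TEXT = """
[job.step.unzip.quiver]
NPROC=48  # 48 procs/job
njobs=30  # up to 30 simultaneous jobs

[job.defaults]
NPROC=4
njobs=10
"""


def effective_job_settings(text, step_section):
    """Merge [job.defaults] with the step section; step values take priority."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keep NPROC upper-case
    parser.read_string(text)
    settings = dict(parser["job.defaults"])
    if parser.has_section(step_section):
        settings.update(dict(parser[step_section]))
    return settings


quiver = effective_job_settings(CONFIG_TEXT, "job.step.unzip.quiver")
hasm = effective_job_settings(CONFIG_TEXT, "job.step.unzip.hasm")

print(quiver)  # overridden values for the quiver step
print(hasm)    # no override section, so the defaults apply
print(int(quiver["NPROC"]) * int(quiver["njobs"]))  # peak processor slots
```

With these numbers the quiver step could ask for up to 48 × 30 = 1440 processor slots at once, so it is worth sizing njobs against what the cluster queue actually allows.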

There are several specific sections available:

  • job.step.unzip.phasing (aka unzip.blasr_aln)
  • job.step.unzip.hasm
  • job.step.unzip.track_reads
  • job.step.unzip.quiver

Sections specific to CCS assembly:

  • job.step.unzip.minimap
  • job.step.unzip.polish (polish replaces the quiver section)
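A misspelled section name may simply go unused, so a small check can help before launching a run. The sketch below flags any job.step.unzip.* section whose suffix is not among the names listed above. The helper name unknown_unzip_sections is hypothetical, and the set of known names is taken only from this page, so it may be incomplete.

```python
import configparser

# Step names listed on this page (blasr_aln is the alias for phasing).
KNOWN_UNZIP_STEPS = {
    "phasing", "blasr_aln", "hasm", "track_reads",
    "quiver", "minimap", "polish",
}


def unknown_unzip_sections(text):
    """Return job.step.unzip.* section names that this page does not list."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str
    parser.read_string(text)
    prefix = "job.step.unzip."
    return sorted(
        name for name in parser.sections()
        if name.startswith(prefix)
        and name[len(prefix):] not in KNOWN_UNZIP_STEPS
    )


# Hypothetical config with one deliberate typo ("quivr").
SAMPLE = """
[job.defaults]
NPROC=4

[job.step.unzip.quiver]
NPROC=48

[job.step.unzip.quivr]
NPROC=48
"""

print(unknown_unzip_sections(SAMPLE))
```

Sections outside the job.step.unzip. prefix are ignored on purpose, since Falcon itself defines other sections not covered here.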

For more information, see the pypeFLOW wiki.

Distributed tasks in each section

(This is current as of April 2018, but check the source code to be sure.)

Some, but not all, of these tasks run as parallel jobs.


Section job.step.unzip.phasing

(An alias is job.step.unzip.blasr_aln, but we might someday stop using blasr there.)

Mainly, this controls a script that runs (roughly):

blasr
samtools sort/index
python -m falcon_unzip.mains.phasing_make_het_call
python -m falcon_unzip.mains.phasing_generate_association_table
python -m falcon_unzip.mains.phasing_get_phased_blocks
python -m falcon_unzip.mains.phasing_get_phased_reads
python -m falcon_unzip.mains.phasing_readmap
python -m falcon_unzip.proto.extract_phased_preads
python -m falcon_unzip.proto.main_augment_pb

And also

python -m falcon_unzip.mains.graphs_to_h_tigs_2 apply

Section job.step.unzip.hasm

python -m falcon_unzip.mains.ovlp_filter_with_phase_strict
python -m falcon_unzip.mains.phased_ovlp_to_graph
python -m falcon_kit.mains.graph_to_contig

And also

# GFA generation

Section job.step.unzip.track_reads

python -m falcon_unzip.mains.rr_ctg_track --base-dir={params.topdir} --output={output.rawread_to_contigs}
python -m falcon_unzip.mains.pr_ctg_track --base-dir={params.topdir} --output={output.pread_to_contigs}
# Those outputs are used only by fetch_reads.
python -m falcon_unzip.mains.fetch_reads --base-dir={params.topdir} --fofn={input.fofn} --ctg-list={output.ctg_list_file}

And also, part of 4-quiver:

python -m falcon_unzip.mains.get_read_hctg_map
fc_rr_hctg_track.py
fc_rr_hctg_track2.exe

And

python -m falcon_unzip.mains.get_read2ctg

And

python -m falcon_unzip.mains.bam_partition_and_merge

And

python -m falcon_unzip.mains.bam_segregate

Section job.step.unzip.quiver

samtools faidx
pbalign
variantCaller