# NanoCount command line usage

### Activate virtual environment

```bash
conda activate nanocount
```

### Running NanoCount

```bash
NanoCount --help
```

```text
usage: NanoCount [-h] [--version] -i ALIGNMENT_FILE [-o COUNT_FILE]
                 [-b FILTER_BAM_OUT] [-l MIN_READ_LENGTH]
                 [-f MIN_QUERY_FRACTION_ALIGNED] [-t EQUIVALENT_THRESHOLD]
                 [-s SCORING_VALUE] [-c CONVERGENCE_TARGET] [-e MAX_EM_ROUNDS]
                 [-x] [-p PRIMARY_SCORE] [-a] [-3 MAX_DIST_3_PRIME]
                 [-5 MAX_DIST_5_PRIME] [-v] [-q]

NanoCount estimates transcripts abundance from Oxford Nanopore *direct-RNA
sequencing* datasets, using an expectation-maximization approach like RSEM,
Kallisto, salmon, etc to handle the uncertainty of multi-mapping reads

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Input/Output options:
  -i ALIGNMENT_FILE, --alignment_file ALIGNMENT_FILE
                        BAM or SAM file containing aligned ONT dRNA-Seq reads
                        including secondary and supplementary alignment
```

#### Basic command

```bash
NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv -3 50
head ./output/tx_counts.tsv
```

```text
## Checking options and input files ##
## Initialise Nanocount ##
	Parse Bam file and filter low quality alignments
	Summary of alignments parsed in input bam file
		Valid alignments: 150,779
		Discarded unmapped alignments: 9,545
		Discarded alignment with invalid 3 prime end: 6,205
		Discarded negative strand alignments: 4,515
	Summary of reads filtered
		Reads with valid best alignment: 85,200
		Invalid secondary alignments: 58,464
		Valid secondary alignments: 4,330
		Reads with low query fraction aligned: 1,544
		Reads too short: 817
	Generate initial read/transcript compatibility index
## Start EM abundance estimate ##
	Progress: 3.00 rounds [00:00, 7.90 rounds/s]
	Exit EM loop after 3 rounds
	Convergence value: 0.0019414029372732597
## Summarize data ##
	Convert results to dataframe
	C
```

#### Adding extra transcripts information

The `--extra_tx_info` option adds a column with the transcript lengths and also includes all zero-coverage transcripts in the results.

```bash
NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv --extra_tx_info
head ./output/tx_counts.tsv
```

```text
## Checking options and input files ##
## Initialise Nanocount ##
	Parse Bam file and filter low quality alignments
	Summary of alignments parsed in input bam file
		Valid alignments: 153,201
		Discarded unmapped alignments: 9,545
		Discarded negative strand alignments: 4,515
		Discarded alignment with invalid 3 prime end: 3,783
	Summary of reads filtered
		Reads with valid best alignment: 85,899
		Invalid secondary alignments: 60,112
		Valid secondary alignments: 4,375
		Reads with low query fraction aligned: 1,580
		Reads too short: 798
	Generate initial read/transcript compatibility index
## Start EM abundance estimate ##
	Progress: 3.00 rounds [00:00, 7.64 rounds/s]
	Exit EM loop after 3 rounds
	Convergence value: 0.0019319339005414125
## Summarize data ##
	Convert results to dataframe
	C
```
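Because `--extra_tx_info` keeps zero-coverage transcripts in the count file, it can be handy to count them. The following is a minimal sketch, assuming the file is tab-separated with a header row and contains an `est_count` column. The column names are assumptions, since the table itself is not shown here, so check the header of your own file. `count_zero_coverage` is a hypothetical helper, and the toy table (transcript names and values) is invented for illustration.

```shell
# count_zero_coverage FILE COLUMN
# Print how many data rows of a tab-separated FILE have 0 in COLUMN.
# The column is looked up by name in the header row.
count_zero_coverage() {
    awk -F'\t' -v col="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
        c && $c == 0 { n++ }
        END { print n + 0 }
    ' "$1"
}

# Toy stand-in for ./output/tx_counts.tsv (invented names and values;
# "est_count" and "transcript_length" are assumed column names).
toy_counts=$(mktemp)
printf 'transcript_name\test_count\ttranscript_length\n' > "$toy_counts"
printf 'tx_A\t12.5\t1500\n' >> "$toy_counts"
printf 'tx_B\t0.0\t900\n' >> "$toy_counts"
printf 'tx_C\t0.0\t2100\n' >> "$toy_counts"

# Count the transcripts with no estimated coverage in the toy table.
count_zero_coverage "$toy_counts" est_count
rm -f "$toy_counts"
```

On a real run the same function could be pointed at `./output/tx_counts.tsv`, provided the column name matches the file's header.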

#### Write selected alignments to a BAM file

```bash
NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv -b ./output/aligned_reads_selected.bam --extra_tx_info
head ./output/tx_counts.tsv
```

```text
## Checking options and input files ##
## Initialise Nanocount ##
	Parse Bam file and filter low quality alignments
	Summary of alignments parsed in input bam file
		Valid alignments: 153,201
		Discarded unmapped alignments: 9,545
		Discarded negative strand alignments: 4,515
		Discarded alignment with invalid 3 prime end: 3,783
	Summary of reads filtered
		Reads with valid best alignment: 85,899
		Invalid secondary alignments: 60,112
		Valid secondary alignments: 4,375
		Reads with low query fraction aligned: 1,580
		Reads too short: 798
	Write selected alignments to BAM file
	Summary of alignments written to bam
		Alignments to select: 90,274
		Alignments written: 90,274
		Alignments skipped: 80,770
	Generate initial read/transcript compatibility index
## Start EM abundance estimate ##
	Pr
```

#### Relaxing the equivalence threshold

The `--equivalent_threshold` option sets how close the score of a secondary alignment must be to the score of the primary alignment for it to be kept. The default value is 0.9 (90% of the alignment score of the primary alignment), but it can be lowered to include more secondary alignments in the uncertainty calculation.
Lowering the value below 0.75 might not be relevant and will considerably increase the computation time.
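
To make the threshold concrete, here is a toy sketch. It assumes the rule is a simple ratio test: a secondary alignment is kept when its score is at least the threshold times the score of the same read's primary alignment. The read name and scores are invented, and `keep_equivalent` is a hypothetical helper, not part of NanoCount.

```shell
# Toy data only: columns are read id, alignment type, alignment score.
# The primary alignment of a read must come before its secondaries.
alignments='read1 primary 1000
read1 secondary 950
read1 secondary 850
read1 secondary 700'

# keep_equivalent THRESHOLD
# Read alignments on stdin and print the primary alignments plus the
# secondary alignments scoring at least THRESHOLD x the primary score.
keep_equivalent() {
    awk -v t="$1" '
        $2 == "primary" { best[$1] = $3; print; next }
        $3 >= t * best[$1] { print }
    '
}

echo "threshold 0.9:"
printf '%s\n' "$alignments" | keep_equivalent 0.9
echo "threshold 0.8:"
printf '%s\n' "$alignments" | keep_equivalent 0.8
```

With these made-up scores, lowering the threshold from 0.9 to 0.8 lets the alignment scoring 850 through in addition to the one scoring 950, which mirrors the jump in valid secondary alignments reported in the log of the real run.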

```bash
NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv --equivalent_threshold 0.8
head ./output/tx_counts.tsv
```

```text
## Checking options and input files ##
## Initialise Nanocount ##
	Parse Bam file and filter low quality alignments
	Summary of alignments parsed in input bam file
		Valid alignments: 153,201
		Discarded unmapped alignments: 9,545
		Discarded negative strand alignments: 4,515
		Discarded alignment with invalid 3 prime end: 3,783
	Summary of reads filtered
		Reads with valid best alignment: 85,899
		Valid secondary alignments: 50,019
		Invalid secondary alignments: 14,468
		Reads with low query fraction aligned: 1,580
		Reads too short: 798
	Generate initial read/transcript compatibility index
## Start EM abundance estimate ##
	Progress: 18.0 rounds [00:03, 4.75 rounds/s]
	Exit EM loop after 18 rounds
	Convergence value: 0.00462674973518923
## Summarize data ##
	Convert results to dataframe
	C
```

#### Verbose mode

The `--verbose` option prints additional information for QC and debugging, starting with a summary of all option values.

```bash
NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv --equivalent_threshold 0.8 --verbose
```

```text
## Checking options and input files ##
	[DEBUG]: Options summary
	[DEBUG]: 	Package name: NanoCount
	[DEBUG]: 	Package version: 0.2.4.post1
	[DEBUG]: 	Timestamp: 2021-06-27 15:29:53.995009
	[DEBUG]: 	alignment_file: ./data/aligned_reads_sorted.bam
	[DEBUG]: 	count_file: ./output/tx_counts.tsv
	[DEBUG]: 	filter_bam_out:
	[DEBUG]: 	min_read_length: 50
	[DEBUG]: 	discard_suplementary: False
	[DEBUG]: 	min_query_fraction_aligned: 0.5
	[DEBUG]: 	equivalent_threshold: 0.8
	[DEBUG]: 	scoring_value: alignment_score
	[DEBUG]: 	convergence_target: 0.005
	[DEBUG]: 	max_em_rounds: 100
	[DEBUG]: 	extra_tx_info: False
	[DEBUG]: 	primary_score: primary
	[DEBUG]: 	max_dist_3_prime: 100
	[DEBUG]: 	max_dist_5_prime: -1
	[DEBUG]: 	verbose: True
	[DEBUG]: 	quiet: False
## Initialise Nanocount ##
	Pa
```
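
When the verbose log is kept for QC, the terminal colour codes can get in the way of reading or searching it. The sketch below shows one way to strip them, assuming the log uses standard ANSI colour sequences (ESC followed by `[`, digits and semicolons, then `m`). `strip_ansi` is a hypothetical helper, not part of NanoCount, and the sample line is a hand-made stand-in for real log output.

```shell
# strip_ansi
# Read text on stdin and remove ANSI colour sequences such as ESC[32m
# and ESC[0m, leaving the rest of each line untouched.
strip_ansi() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Hand-made example of a coloured log line (green text, then reset).
printf '\033[32m\tExit EM loop after 3 rounds\033[0m\n' | strip_ansi
```

For a real run, something like `NanoCount -i ./data/aligned_reads_sorted.bam -o ./output/tx_counts.tsv --verbose 2>&1 | strip_ansi > nanocount.log` should capture both output streams in a plain-text file; the log file name here is only an example.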