This nextflow pipeline can be used to requantify 10x experiments (processed with cellranger) with kallisto.
nextflow run mkfastq_and_kallisto.nf \
--bamfile <cellranger_bamfile> \
--kallisto_index <...> \
--kallisto_gene_map <...> \
--chemistry <10xv2|10xv3> \
--barcode_whitelist <...> \
--outdir <...>
The pipeline runs the following steps:
mkfastq: to turn the bam-file into fastqs
kallisto bus: to pseudo-align the reads
bustools correct: to correct the cell-barcodes and sort them
bustools count: for gene-wise and equivalence-wise counting of expression
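The steps above can be sketched as plain shell commands. This is only an illustration under stated assumptions, not the pipeline's actual code: the flags are the usual kallisto and bustools command-line options, the bam-to-fastq step is assumed to use 10x's `bamtofastq` tool, and the function names `run` and `requantify_sketch`, the `DRY_RUN` switch and all file names are hypothetical.

```shell
# Sketch only: mirrors the four steps with common kallisto/bustools flags.
# All function names, variables and paths are hypothetical placeholders.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: bam, kallisto index, gene map, chemistry, whitelist, outdir,
# plus the R1/R2 fastq files produced by step 1 (locating them is not
# part of this sketch).
requantify_sketch() {
    bam=$1; index=$2; t2g=$3; chemistry=$4; whitelist=$5; out=$6
    r1=$7; r2=$8
    bus="$out/bus_output"

    # 1. bam -> fastq (assumed here to be 10x's bamtofastq)
    run bamtofastq "$bam" "$out/fastq"

    # 2. pseudo-align the reads
    run kallisto bus -i "$index" -o "$bus" -x "$chemistry" "$r1" "$r2"

    # 3. correct the cell barcodes against the whitelist, then sort
    run bustools correct -w "$whitelist" -o "$bus/output.corrected.bus" "$bus/output.bus"
    run bustools sort -o "$bus/output.corrected.sort.bus" "$bus/output.corrected.bus"

    # 4. count per gene (--genecounts) and per equivalence class (default)
    run mkdir -p "$out/bus_output_genecount" "$out/bus_output_eqcount"
    run bustools count -o "$out/bus_output_genecount/gene" -g "$t2g" \
        -e "$bus/matrix.ec" -t "$bus/transcripts.txt" --genecounts \
        "$bus/output.corrected.sort.bus"
    run bustools count -o "$out/bus_output_eqcount/tcc" -g "$t2g" \
        -e "$bus/matrix.ec" -t "$bus/transcripts.txt" \
        "$bus/output.corrected.sort.bus"
}
```

Calling the function with `DRY_RUN=1` set only prints the commands, which makes it easy to compare them against what the pipeline really executes.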
Here's what will be contained in the output folder:
.
└── kallisto
    ├── bustools_counts
    │   ├── bus_output_eqcount
    │   │   ├── tcc.barcodes.txt
    │   │   ├── tcc.ec.txt
    │   │   └── tcc.mtx
    │   └── bus_output_genecount
    │       ├── gene.barcodes.txt
    │       ├── gene.genes.txt
    │       └── gene.mtx
    ├── bustools_metrics
    │   └── bus_output.json
    └── sort_bus
        └── bus_output
            ├── matrix.ec
            ├── output.corrected.sort.bus
            ├── run_info.json
            └── transcripts.txt
Note that kallisto/sort_bus/bus_output/output.corrected.sort.bus
will be quite large (a few GB).
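A quick sanity check of the count matrices is possible without loading them. Assuming the `.mtx` files follow the standard Matrix Market coordinate format, the first non-comment line holds the two dimensions and the number of non-zero entries. The function name `mtx_dims` below is hypothetical, and the sketch makes no assumption about which axis holds barcodes and which holds genes.

```shell
# mtx_dims FILE: print "rows cols nonzero" from a Matrix Market file.
# Lines starting with '%' are header or comment lines; the first line
# after them carries the three size fields.
mtx_dims() {
    awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}
```

For example, `mtx_dims kallisto/bustools_counts/bus_output_genecount/gene.mtx` prints three numbers; the first two should agree with the line counts of `gene.barcodes.txt` and `gene.genes.txt` (for instance via `wc -l`).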
Similar to mkfastq_and_kallisto.nf, but starts from fastq files that already exist.
nextflow run fastq_and_kallisto.nf \
--readsglob <something_L00*_R{1,2}.fastq.gz> \
--outdir <directory> \
--chemistry <10xv2|10xv3>
Same as mkfastq_and_kallisto.nf
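Since the reads glob has to match both mates of every lane, it can help to check beforehand that each R1 file has an R2 partner. The sketch below does this in plain shell, assuming the file names end in `_R1.fastq.gz` and `_R2.fastq.gz` as in the example glob; the function name `check_pairs` is hypothetical. When the pattern is passed to nextflow, quoting it (for example in single quotes) keeps the shell from expanding it first.

```shell
# check_pairs PREFIX: for every PREFIX*_R1.fastq.gz, report whether the
# matching _R2.fastq.gz exists. Returns non-zero if any mate is missing.
check_pairs() {
    rc=0
    for r1 in "$1"*_R1.fastq.gz; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -e "$r1" ] || continue
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ -e "$r2" ]; then
            echo "ok $r1"
        else
            echo "missing mate for $r1"
            rc=1
        fi
    done
    return $rc
}
```

A call such as `check_pairs something_L00` lists every lane and fails if a mate is missing. Note that it reports nothing (and succeeds) when no R1 file matches at all.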
Only converts a 10x/cellranger bam file into fastq files.
nextflow run mkfastq.nf \
--bamfile <.bam> \
--outdir <directory> \
--publish_mode <copy|symlink>
--publish_mode copy creates copies of the files in outdir;
--publish_mode symlink only creates links in outdir that point to the files
in the nextflow tmp-directory (to save space).
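One consequence of symlink mode is that the published files stay readable only as long as their targets in the nextflow tmp-directory exist. A minimal way to check an output directory for broken links, using only standard `find` (the function name `find_broken_links` is hypothetical):

```shell
# find_broken_links DIR: list symlinks under DIR whose target no longer exists.
# `test -e` follows the link, so it fails exactly for dangling links.
find_broken_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Running `find_broken_links <directory>` before cleaning up the tmp-directory prints nothing as long as all links resolve; if the fastqs are meant to outlive the run, `--publish_mode copy` avoids the issue.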