Segmentation fault with kallisto quant #432

bbimber · 2024-04-04T01:22:12Z

Hello,

I'm trying to run kallisto quant with the pseudobam option, and it is consistently running into a segfault error. Do you have suggestions on debugging that I could try? Below is the command and output:

kallisto quant -i <IDX> -o <OUTPUT> --gtf <GTF> --pseudobam -t 8 --verbose <10X_FQ1> <10X_FQ2>

and the end of the output:

[quant] done
[quant] processed 176,394,303 reads, 909,387 reads pseudoaligned
[quant] estimated average fragment length: 273.2
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 97 rounds
[  bam] writing pseudoalignments to BAM format .. /var/spool/slurmd/job39236197/slurm_script: line 118: 33371 Segmentation fault      (core dumped) $KALLISTO quant -i $RhSup_IDX -o $OUT --gtf $RhSup_GTF --pseudobam -t $THREADS --verbose $FQ1 $FQ2

thanks for any help.


Yenaled · 2024-04-04T01:24:55Z

What version of kallisto are you running?

One possibility I can think of is that your GTF file is malformed (i.e. the transcript IDs in your GTF file don't match the transcript IDs in your index).
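A rough way to test for that kind of mismatch, as a sketch only: the helper below (the name `missing_transcript_ids` is hypothetical) lists transcript IDs that appear in the GTF but not among the sequence names of the FASTA the index was built from. It assumes that FASTA is still available and that sequence names are the first whitespace-delimited token of each header line.

```shell
# Hypothetical helper: print transcript IDs found in the GTF but not in the FASTA.
# Usage: missing_transcript_ids <gtf> <fasta>
missing_transcript_ids() {
    gtf="$1"; fasta="$2"
    tmpdir=$(mktemp -d) || return 1

    # Transcript IDs from the attribute column (column 9) of "transcript" features.
    # The pattern tolerates both `transcript_id "X"` and `transcript_id = "X"`.
    awk -F'\t' '$3 == "transcript" { print $9 }' "$gtf" \
        | sed -n 's/.*transcript_id[ =]*"\([^"]*\)".*/\1/p' \
        | sort -u > "$tmpdir/gtf_ids"

    # Sequence names: first whitespace-delimited token of each FASTA header
    # (an assumption about how the index names its targets).
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$fasta" | sort -u > "$tmpdir/fasta_ids"

    # comm -23 keeps only lines unique to the first (GTF) list.
    comm -23 "$tmpdir/gtf_ids" "$tmpdir/fasta_ids"
    rm -rf "$tmpdir"
}
```

Empty output means every transcript ID in the GTF has a matching FASTA record; any ID printed is a candidate mismatch.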

bbimber · 2024-04-04T01:39:24Z

The version is the latest, 0.50.1.

The GTF is a bit contrived, but I think it's valid. My reference space is about 30 coding sequences. I created the GTF by making one gene feature and one transcript feature for each reference, where each feature extends from 1 to the length of the reference. I'm not sure if this is helpful, but this is the actual code, which iterates over the FASTA FAI index and writes out the records:

	while IFS=$'\t' read -r -a myArray
	do
		SEQ_NAME="${myArray[0]}"
		SEQ_LEN="${myArray[1]}"
		echo -e $SEQ_NAME"\tnimble\tgene\t1\t"$SEQ_LEN"\t.\t+\t.\tgene_id "\"$SEQ_NAME"\"; gene_name \""$SEQ_NAME"\"; gene_biotype = \"protein_coding\";" >> $GTF
		echo -e $SEQ_NAME"\tnimble\ttranscript\t1\t"$SEQ_LEN"\t.\t+\t.\tgene_id "\"$SEQ_NAME"\"; gene_name \""$SEQ_NAME"\"; transcript_id = \""$SEQ_NAME"\"; gene_biotype = \"protein_coding\";" >> $GTF
	done < $FAI_FILE

the result looks something like this:

CCR7_NM_001032884	nimble	gene	1	1137	.	+	.	gene_id "CCR7_NM_001032884"; gene_name "CCR7_NM_001032884"; gene_biotype = "protein_coding";
CCR7_NM_001032884	nimble	transcript	1	1137	.	+	.	gene_id "CCR7_NM_001032884"; gene_name "CCR7_NM_001032884"; transcript_id = "CCR7_NM_001032884"; gene_biotype = "protein_coding";
CD3D_XM_015115817	nimble	gene	1	2264	.	+	.	gene_id "CD3D_XM_015115817"; gene_name "CD3D_XM_015115817"; gene_biotype = "protein_coding";
CD3D_XM_015115817	nimble	transcript	1	2264	.	+	.	gene_id "CD3D_XM_015115817"; gene_name "CD3D_XM_015115817"; transcript_id = "CD3D_XM_015115817"; gene_biotype = "protein_coding";
CD3D_XM_015115818	nimble	gene	1	2249	.	+	.	gene_id "CD3D_XM_015115818"; gene_name "CD3D_XM_015115818"; gene_biotype = "protein_coding";
CD3D_XM_015115818	nimble	transcript	1	2249	.	+	.	gene_id "CD3D_XM_015115818"; gene_name "CD3D_XM_015115818"; transcript_id = "CD3D_XM_015115818"; gene_biotype = "protein_coding";
CD3E_XM_015115816	nimble	gene	1	1549	.	+	.	gene_id "CD3E_XM_015115816"; gene_name "CD3E_XM_015115816"; gene_biotype = "protein_coding";
CD3E_XM_015115816	nimble	transcript	1	1549	.	+	.	gene_id "CD3E_XM_015115816"; gene_name "CD3E_XM_015115816"; transcript_id = "CD3E_XM_015115816"; gene_biotype = "protein_coding";
CD3E_XM_028834033	nimble	gene	1	1258	.	+	.	gene_id "CD3E_XM_028834033"; gene_name "CD3E_XM_028834033"; gene_biotype = "protein_coding";
CD3E_XM_028834033	nimble	transcript	1	1258	.	+	.	gene_id "CD3E_XM_028834033"; gene_name "CD3E_XM_028834033"; transcript_id = "CD3E_XM_028834033"; gene_biotype = "protein_coding";
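For comparison, GTF attributes are conventionally written as `key "value";` pairs with no `=` between key and value (the `key=value` form belongs to GFF3). Below is a minimal sketch of the same generator in that style, using awk instead of a read loop. The function name `write_gtf` is hypothetical, and whether the `=` form has any effect on kallisto is not established in this thread.

```shell
# Sketch: write one gene and one transcript record per FASTA index (.fai) entry.
# Usage: write_gtf <fai> <gtf>
# "nimble" is the source label used in the original loop.
write_gtf() {
    awk -F'\t' -v OFS='\t' -v src="nimble" '{
        # Column 1 of the .fai is the sequence name, column 2 its length.
        ids = "gene_id \"" $1 "\"; gene_name \"" $1 "\";"
        biotype = "gene_biotype \"protein_coding\";"
        print $1, src, "gene",       1, $2, ".", "+", ".", ids " " biotype
        print $1, src, "transcript", 1, $2, ".", "+", ".", ids " transcript_id \"" $1 "\"; " biotype
    }' "$1" > "$2"    # overwrite rather than append, so reruns do not duplicate records
}
```

A call such as `write_gtf "$FAI_FILE" "$GTF"` would replace the loop above; unlike `>>`, it truncates the output file on each run.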

Yenaled · 2024-04-04T01:59:08Z

OK, the latest version does not support BAM files. The last version to support BAM files is version 0.48.0.

bbimber · 2024-04-04T15:49:49Z

> OK, the latest version does not support BAM files. The last version to support BAM files is version 0.48.0.

Got it. I'm running 0.48.0 to try it.
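To avoid hitting this again in a job script, a small guard could check the version before passing `--pseudobam`. This is only a sketch: the function name `within_bam_support` is hypothetical, the 0.48.0 cutoff is taken from the maintainer's comment above, and it checks the upper bound only. It relies on `sort -V`, which is available in GNU coreutils.

```shell
# Sketch: succeed if the given version string is no newer than 0.48.0,
# the last release stated above to support BAM output. Upper bound only.
# Usage: within_bam_support <version>
within_bam_support() {
    version="$1"
    last_ok="0.48.0"
    # sort -V orders version strings; the check passes when $version
    # sorts first or ties with the cutoff.
    [ "$(printf '%s\n%s\n' "$version" "$last_ok" | sort -V | head -n 1)" = "$version" ]
}
```

In a job script this might be used as `within_bam_support "$VERSION" || { echo "kallisto $VERSION: --pseudobam not supported" >&2; exit 1; }`, where `VERSION` is a hypothetical variable holding the version string parsed from the output of `kallisto version`.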

bbimber closed this as completed Apr 4, 2024

bbimber mentioned this issue Apr 4, 2024

Read-level pseudoalignment data that includes cellbarcode and UMI? #430

Closed
