terminate called after throwing an instance of 'std::invalid_argument' #917

Shicheng-Guo · 2021-05-24T20:08:09Z

Any idea what is causing this issue?

terminate called after throwing an instance of 'std::invalid_argument'

(base) [sguo2@login02 regulomedb]$ bedtools sort -i hg38_RegulomeDB.bed > hg38_RegulomeDB.sort.bed
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoll
Aborted (core dumped)

(base) [sguo2@login02 regulomedb]$ bedtools
bedtools is a powerful toolset for genome arithmetic.

Version: v2.30.0
About: developed in the quinlanlab.org and by many contributors worldwide.
Docs: http://bedtools.readthedocs.io/
Code: https://github.com/arq5x/bedtools2
Mail: https://groups.google.com/forum/#!forum/bedtools-discuss

Usage: bedtools [options]

The bedtools sub-commands include:

arq5x · 2021-05-24T20:39:19Z

could you share the first few lines of the input file?

Shicheng-Guo · 2021-05-24T21:38:23Z

Thanks Aaron. Here are the top 10 lines:

(base) [sguo2@login01 regulomedb]$ head hg38_RegulomeDB.bed
chr1    11011   11012   rs544419019     False   False   True    True    False   False   False   6
chr1    13109   13110   rs540538026     False   False   False   False   False   False   False   7
chr1    13115   13116   rs62635286      False   False   False   False   False   False   False   7
chr1    13117   13118   rs62028691      False   False   False   False   False   False   False   7
chr1    13272   13273   rs531730856     False   False   False   False   False   False   False   7
chr1    13667   13668   rs2691328       False   False   True    False   False   False   False   6
chr1    14463   14464   rs546169444     False   False   True    False   False   False   False   6
chr1    14929   14930   rs6682385       False   False   False   False   False   False   False   7
chr1    14932   14933   rs199856693     False   False   False   False   False   False   False   7
chr1    15773   15774   rs374029747     False   False   False   False   False   False   False   7

arq5x · 2021-05-24T21:55:41Z

The error is caused by the "False" and "True" values in the file. Because the file has exactly 12 columns, bedtools interprets it as BED12, where some columns strictly expect a number, so False and True are invalid there. I would recommend replacing True with 1 and False with 0:

sed -e 's/False/0/g' -e 's/True/1/g' hg38_RegulomeDB.bed > hg38_RegulomeDB.new.bed
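Note that the sed command above replaces every occurrence of the two words, including inside longer strings. If that is a concern, a field-aware variant is possible. The following is only a minimal sketch, assuming tab-separated input; the function name `bool_to_int` is made up for illustration and is not part of bedtools.

```shell
# Hypothetical helper: convert True/False to 1/0 only when the word
# fills an entire tab-separated field, leaving other text untouched.
bool_to_int() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        for (i = 1; i <= NF; i++) {
            if ($i == "True") $i = 1
            else if ($i == "False") $i = 0
        }
        print
    }' "$@"
}

# Example on one record in the same layout as the file above.
printf 'chr1\t11011\t11012\trs544419019\tFalse\tFalse\tTrue\tTrue\tFalse\tFalse\tFalse\t6\n' | bool_to_int
```

With the function defined, something like `bool_to_int hg38_RegulomeDB.bed > hg38_RegulomeDB.new.bed` would play the same role as the sed command.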

Shicheng-Guo · 2021-05-25T03:37:18Z

Thanks Aaron! The solution works perfectly.

hepcat72 · 2023-07-17T20:55:28Z

I'm running into this same issue, though I'm using the output of a bedtools intersect with -loj on 2 bed6 files... I didn't get this error on a different dataset and I'm not sure why... So my input looks like this:

chr1	259023	259024	ENST00000450734.1:ENSG00000228463	.	-	chr1	259023	259024	ENST00000450734.1:ENSG00000228463	.	-
chr1	264732	264733	ENST00000442116.1:ENSG00000228463	.	-	chr1	264732	264733	ENST00000442116.1:ENSG00000228463	.	-
chr1	266854	266855	ENST00000669836.1:ENSG00000286448	.	+	chr1	266854	266855	ENST00000669836.1:ENSG00000286448	.	+
chr1	268815	268816	ENST00000448958.2:ENSG00000228463,ENST00000634344.2:ENSG00000228463	.	-	chr1	268815	268816	ENST00000448958.2:ENSG00000228463,ENST00000634344.2:ENSG00000228463	.	-
chr1	297501	297502	ENST00000424587.7:ENSG00000228463	.	-	chr1	297501	297502	ENST00000424587.7:ENSG00000228463	.	-
chr1	348365	348366	ENST00000458203.2:RPL23AP24	.	-	chr1	348365	348366	ENST00000458203.2:RPL23AP24	.	-
chr1	358856	358857	ENST00000450983.1:ENSG00000236601	.	+	chr1	358856	358857	ENST00000450983.1:ENSG00000236601	.	+
chr1	358871	358872	ENST00000412666.1:ENSG00000236601	.	+	chr1	358871	358872	ENST00000412666.1:ENSG00000236601	.	+
chr1	359680	359681	ENST00000441866.2:ENSG00000228463	.	-	chr1	359680	359681	ENST00000441866.2:ENSG00000228463	.	-
chr1	360056	360057	ENST00000635159.1:ENSG00000236601	.	+	chr1	360056	360057	ENST00000635159.1:ENSG00000236601	.	+

And the series of commands are (in a snakemake rule):

        # Duplicate the columns to retain the actual tss coordinates
        intersectBed -a {input.tss:q} -b {input.tss:q} -loj 2> {log.dupe:q} | \
        # Expand the size of the tss position by the promoter distance
        slopBed -b {params.promoter_distance:q} -g {input.lengths:q} -i stdin 2> {log.slop:q} | \
        # Find all tss records that are within promoter_distance of each peak
        intersectBed -a {input.peaks:q} -b stdin -loj 2> {log.get_tss:q} | \
        # Calculate the distance between each peak and its tss records
        awk -v promoter_distance={params.promoter_distance:q} -F $'\\t' -f {input.awk_tss_distance_script:q} 2> {log.awk:q} | \
        # Group all tss distances together
        groupBy -grp 1-5 -c 6,7,8,9 -o collapse 2> {log.group_tss:q} 1> {output:q}

It's the slopBed command that issues the error.

hepcat72 · 2023-07-17T21:25:16Z

Well, it looks like if I append a 13th column, it works...

intersectBed -a {input.tss:q} -b {input.tss:q} -loj 2> {log.dupe:q} | \
perl -e 'while(<>){chomp;print;print("\t13thcolumn\n")}' | \
slopBed -b {params.promoter_distance:q} -g {input.lengths:q} -i stdin 2> {log.slop:q} | \
...

Still, I'm not sure what the issue is. I just ran my test data by manually executing this same series of commands, and that doesn't produce the error, even without the added 13th column. The output of intersect looks exactly the same to me: the data clearly differ, but the types of the data are the same. I would just like it to make sense. Is there a way to know which row/value is causing the error? Could it have to do with the size of the data?
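On the question of finding the offending row: the error message itself does not name one, but the input can be scanned before it reaches bedtools. The sketch below rests on assumptions. It takes the earlier BED12 explanation at face value and checks columns 2, 3, 7 and 8 (start, end, thickStart and thickEnd in the UCSC BED12 layout) of every 12-column row. Which columns bedtools actually parses as integers is not confirmed in this thread, so the column list is adjustable through `BED12_INT_COLS`, a name invented here.

```shell
# Hypothetical helper: report the first value that is not a plain integer
# in the listed columns of any 12-column row. The default column list is
# an assumption, not taken from the bedtools source.
find_bad_bed12_value() {
    awk -v cols="${BED12_INT_COLS:-2,3,7,8}" 'BEGIN { FS = "\t"; n = split(cols, c, ",") }
    NF == 12 {
        for (k = 1; k <= n; k++) {
            if ($(c[k]) !~ /^-?[0-9]+$/) {
                printf "line %d, column %d: %s\n", NR, c[k], $(c[k])
                exit
            }
        }
    }' "$@"
}

# Two records shaped like the -loj output above: the first uses "19" as
# the chromosome name, the second uses "chr1".
printf '19\t100\t101\ta\t.\t-\t19\t100\t101\ta\t.\t-\nchr1\t259023\t259024\tx\t.\t-\tchr1\t259023\t259024\tx\t.\t-\n' | find_bad_bed12_value
# prints: line 2, column 7: chr1
```

In the pipeline above, the output of the first intersectBed could be written to a temporary file and passed to this function before slopBed runs.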

hepcat72 · 2023-07-17T21:26:58Z

Oh wait. The chromosome names are different. The test data is just "19" (smallest chromosome) and the full data is "chr", e.g. "chr19". I bet that's it.
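One way to test this guess, offered as a sketch and not a confirmed diagnosis: in the -loj output of two BED6 files, column 7 is the chromosome name of the B record, and under a BED12 reading that position would be thickStart. A name like "19" passes as an integer, while "chr19" does not. The helper below (the name `summarize_col7` is invented) tallies rows by column count and by whether column 7 looks like an integer, so the test and full datasets can be compared.

```shell
# Hypothetical helper: count rows by number of columns and by whether
# column 7 is a plain integer. Assumes tab-separated input.
summarize_col7() {
    awk 'BEGIN { FS = "\t" }
    {
        kind = ($7 ~ /^[0-9]+$/) ? "integer" : "non-integer"
        count[NF " columns, column 7 " kind]++
    }
    END { for (key in count) print count[key] "\t" key }' "$@" | sort
}

# One record with a bare chromosome name and one with a "chr" prefix.
printf '19\t5\t6\ta\t.\t+\t19\t5\t6\ta\t.\t+\nchr19\t5\t6\ta\t.\t+\tchr19\t5\t6\ta\t.\t+\n' | summarize_col7
```

If the full dataset reports only "non-integer" for its 12-column rows and the test dataset reports only "integer", that would be consistent with the chromosome-name explanation.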

arq5x closed this as completed Jun 3, 2021

msbentsen mentioned this issue Jul 26, 2022

Error with TOBIAS PlotTrack loosolab/TOBIAS#159

Closed

tgstoecker added a commit to ncbench/ncbench-workflow that referenced this issue Mar 14, 2023

updated 75M cropbio with deduplicated version

69bb158

- workflow ran into this error arq5x/bedtools2#917