Using labels in combination with length filter seems to cause problems #3

claczny · 2014-09-25T15:15:38Z

When I load a FASTA file that contains sequences shorter than the length threshold (e.g., 1,000 nt) together with a matching labels file, I get the following error:

2014-09-25 16:56:35,995 DEBUG [TSNERunner] (DataSetUtils.java:318) - TSNE: Fitting performed in 0.00 seconds.
2014-09-25 16:56:36,426 DEBUG [TSNERunner] (DataSetUtils.java:318) - TSNE: Wrote the 60079 x 2 data matrix successfully!
2014-09-25 16:56:36,426 DEBUG [TSNERunner] (DataSetUtils.java:318) - TSNE:
2014-09-25 16:56:36,525 DEBUG [Thread-0] (ProcessInput.java:88) - Points created.
java.lang.IndexOutOfBoundsException: Index: 60079, Size: 60079
    at java.util.ArrayList.rangeCheck(ArrayList.java:638)
    at java.util.ArrayList.get(ArrayList.java:414)
    at lcsb.vizbin.service.DataSetFactory.createDataSetFromPointFile(DataSetFactory.java:75)
    at lu.uni.lcsb.vizbin.ProcessInput$2.run(ProcessInput.java:184)
2014-09-25 16:56:36,759 DEBUG [Thread-0] (ProcessInput.java:88) - Error! Check the logs.
2014-09-25 16:56:36,761 ERROR [Thread-0] (ProcessInput.java:250) - Index: 60079, Size: 60079
java.lang.IndexOutOfBoundsException: Index: 60079, Size: 60079
    at java.util.ArrayList.rangeCheck(ArrayList.java:638)
    at java.util.ArrayList.get(ArrayList.java:414)
    at lcsb.vizbin.service.DataSetFactory.createDataSetFromPointFile(DataSetFactory.java:75)
    at lu.uni.lcsb.vizbin.ProcessInput$2.run(ProcessInput.java:184)

When I do the length filtering beforehand and include a matching labels file, VizBin runs through without errors.
This is probably a bug in how the labels are connected to the sequences: the labels are not properly matched to the length-selected sequences.
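To make the suspected failure mode concrete, here is a toy sketch in Python. It is not VizBin's actual code (VizBin is written in Java), and all names in it are hypothetical. It only shows how indexing length-filtered points by the position of unfiltered labels can run past the end of the list, and how filtering both lists together avoids that.

```python
# Toy illustration only: this is NOT VizBin's code; all names are hypothetical.
MIN_LENGTH = 1000

sequences = ["A" * 1500, "C" * 200, "G" * 3000]  # the second one is too short
labels = ["bin1", "bin2", "bin3"]                # one label per input sequence

# The length filter drops the short sequence, so only 2 points remain.
points = [s for s in sequences if len(s) >= MIN_LENGTH]


def attach_labels_naively(points, labels):
    """Index points by label position; fails once labels outnumber points."""
    return [(points[i], label) for i, label in enumerate(labels)]


def attach_labels_aligned(sequences, labels, min_length):
    """Filter sequences and labels together so that pairs stay matched."""
    return [(s, l) for s, l in zip(sequences, labels) if len(s) >= min_length]


try:
    attach_labels_naively(points, labels)
    naive_failed = False
except IndexError:
    # Python's analogue of the Java IndexOutOfBoundsException in the log above.
    naive_failed = True

aligned = attach_labels_aligned(sequences, labels, MIN_LENGTH)
```

In this toy case the naive pairing fails on the third label, while the aligned pairing keeps only the labels of the sequences that pass the filter.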

fwhelan · 2015-08-25T18:51:52Z

I might be running into this issue as well:

grep ">" genus_idtxt.fna | wc -l
20282
wc -l genus_idtxt.ann
20283

less confirms that the first line of genus_idtxt.ann is "label" and that each following line is the name of the bin for the corresponding sequence in the fna file.
When I load the fna and ann files as the File to visualize and Annotation file, respectively, I do not get an error, but the annotation file is ignored (all dots are blue). I do have some contigs shorter than 1,000 nt. When I restart VizBin and change the minimal contig length to 10, everything works as expected.
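The same sanity check can be scripted. The sketch below is a hypothetical helper, not part of VizBin. It assumes the annotation file has exactly one header line followed by one label per sequence, as described above, and it ignores blank lines. It reports the number of sequences, the number of labels, and how many sequences fall below the length threshold.

```python
def read_fasta_lengths(lines):
    """Return the sequence length of each FASTA record, in file order."""
    lengths = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)          # a new record starts
        elif line and lengths:
            lengths[-1] += len(line)   # sequence may span several lines
    return lengths


def check_inputs(fasta_lines, annotation_lines, min_length=1000):
    """Summarise whether sequences and labels line up (hypothetical helper)."""
    lengths = read_fasta_lengths(fasta_lines)
    rows = [line.strip() for line in annotation_lines if line.strip()]
    labels = rows[1:]                  # assumption: first row is the header
    return {
        "sequences": len(lengths),
        "labels": len(labels),
        "below_threshold": sum(1 for n in lengths if n < min_length),
    }
```

With real files, the two arguments could come from `open("genus_idtxt.fna").readlines()` and `open("genus_idtxt.ann").readlines()`. If `sequences` equals `labels` but `below_threshold` is greater than zero, the input matches the situation described in this issue.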

claczny · 2015-08-26T05:39:10Z

Thank you for confirming this 👍

It is easy to work around this issue by ensuring that all input sequences are >= 1,000 nt and that the annotation file matches them.
However, the size filtering integrated into VizBin then becomes ineffective, and IMO this integrated filtering is often a very convenient feature ;)
Still, if the user already creates an annotation file, size-filtering the sequences before using VizBin shouldn't be a problem.
Nevertheless, it is inconvenient and will be corrected in a future release. :)
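For anyone who needs the workaround in the meantime, here is a minimal Python sketch of the pre-filtering step. It is not part of VizBin. The function names are hypothetical, and it assumes the annotation file has one header line followed by one row per FASTA record, in the same order as the sequences.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs in file order."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_together(fasta_lines, annotation_lines, min_length=1000):
    """Drop short sequences and their annotation rows in one pass."""
    records = list(parse_fasta(fasta_lines))
    rows = [line.rstrip("\n") for line in annotation_lines]
    column_header, labels = rows[0], rows[1:]  # assumption: one header row
    if len(labels) != len(records):
        raise ValueError(
            f"{len(records)} sequences but {len(labels)} annotation rows"
        )
    kept_fasta, kept_rows = [], [column_header]
    for (header, seq), label in zip(records, labels):
        if len(seq) >= min_length:
            kept_fasta.extend([header, seq])   # sequence written on one line
            kept_rows.append(label)
    return kept_fasta, kept_rows
```

Writing each returned list to a new file, one entry per line, gives a FASTA file and an annotation file that already agree, so the outcome no longer depends on how the integrated filter handles labels.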

claczny added the bug label Sep 25, 2014

claczny mentioned this issue Nov 18, 2014

Integration of JFreeChart #5

Closed

claczny assigned piotr-gawron Apr 14, 2015

piotr-gawron added a commit that referenced this issue Apr 16, 2015

bug fix: #3 (commit 6ae35c2)

claczny mentioned this issue Sep 24, 2015

where to find the log file? #31

Closed

claczny mentioned this issue May 10, 2016

Unknonw column header: 67 #36

Closed

claczny closed this as completed in 59d9615 Jan 7, 2020
