flatfile-to-json.pl - insufficient data yields incomplete histograms #612

Closed
enuggetry opened this Issue Jun 26, 2015 · 11 comments

@enuggetry
Contributor

enuggetry commented Jun 26, 2015

When loading flat genome data through “flatfile-to-json.pl”, JBrowse uses some form of heuristic method to generate histograms. Unfortunately, in the absence of sufficient data, the method outputs either incomplete or non-existent histograms. In the worst-case scenario, users of the application will wind up at an “infinite” loading box for feature histograms or may even encounter large blank spaces on a given chromosome within the JBrowse genome viewer, which [falsely] suggests that a given track does not have data.

Submitted by Mary Shimoyama

@enuggetry enuggetry added the MCW label Jun 30, 2015

@cmdcolin

cmdcolin Jul 10, 2015

Contributor

I discussed this bug via email with Aurash a long time ago. Essentially, I think they just had a bad track configuration.

They said in their email that they also had this error:

From the console
Warning: "Unable to determine an appropriate data store to use with track 'undefined', please explicitly specify a storeClass in the configuration." (dojo.js:1050)

From the JavaScript console
Error: "TypeError: a is not a constructor" (dojo.js:32)

If you search for this warning in the JBrowse source code, it looks like it indicates a track config problem. I think it would probably also lead to the TypeError after that, and a fatal JavaScript TypeError halts script execution, so you get weird things like "infinite loading bars".

Therefore, I think the issue with the histograms is just a red herring for the bad track config
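
For anyone hitting the same warning, one way to spot this kind of track config problem is to scan the track list for entries that lack a `label` or an explicit `storeClass`. This is only a sketch: `findIncompleteTracks` is a hypothetical helper, not part of JBrowse, and it assumes the configuration has already been parsed into an object with a `tracks` array. A missing `storeClass` is not always an error, since JBrowse can often infer one, so treat the result as a list of things to check.

```javascript
// Hypothetical helper (not part of JBrowse): report tracks in a parsed
// trackList-style object that lack a `label` or an explicit `storeClass`.
function findIncompleteTracks(trackList) {
  const tracks = Array.isArray(trackList.tracks) ? trackList.tracks : [];
  return tracks
    .map((track, index) => ({
      index,
      label: track.label,
      // Keys that are absent or empty on this track entry.
      missing: ["label", "storeClass"].filter((key) => !track[key]),
    }))
    .filter((report) => report.missing.length > 0);
}

// Example: the second entry has neither a label nor a storeClass.
const report = findIncompleteTracks({
  tracks: [
    { label: "genes", storeClass: "JBrowse/Store/SeqFeature/NCList" },
    { urlTemplate: "tracks/DEBUG_25/{refseq}/trackData.json" },
  ],
});
console.log(report);
```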

@cmdcolin

cmdcolin Jul 10, 2015

Contributor

Full text of emails: http://pastebin.com/eR6iuKvf
If they have a live example showing the bug, that would help.

@enuggetry

enuggetry Jul 10, 2015

Contributor

Thanks. Noted.

@halfwayBraindead

halfwayBraindead Jul 21, 2015

Gents,

I do not think this issue stems from an allegedly faulty track configuration, as that would [only] hamper actual display of track features within the genome browser. Even when JBrowse attempts to write the actual histogram files following track insertion via "flatfile-to-json.pl", it fails to do so--see below:

-bash-3.2$ echo && ls -lhv ./tracks/DEBUG_25/*

./tracks/DEBUG_25/Chr1:
total 664K
-rw-r--r-- 1 rgdpub rgdpub 120 Jul 20 18:18 hist-5000000-0.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-1.json
-rw-r--r-- 1 rgdpub rgdpub 44K Jul 20 18:18 lf-2.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-3.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-4.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-5.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-6.json
-rw-r--r-- 1 rgdpub rgdpub 40K Jul 20 18:18 lf-7.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-8.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-9.json
-rw-r--r-- 1 rgdpub rgdpub 41K Jul 20 18:18 lf-10.json
-rw-r--r-- 1 rgdpub rgdpub 32K Jul 20 18:18 lf-11.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-12.json
-rw-r--r-- 1 rgdpub rgdpub 39K Jul 20 18:18 lf-13.json
-rw-r--r-- 1 rgdpub rgdpub 44K Jul 20 18:18 lf-14.json
-rw-r--r-- 1 rgdpub rgdpub 19K Jul 20 18:18 names.txt
-rw-r--r-- 1 rgdpub rgdpub 2.6K Jul 20 18:18 trackData.json

./tracks/DEBUG_25/Chr2:
total 340K
-rw-r--r-- 1 rgdpub rgdpub 116 Jul 20 18:18 hist-5000000-0.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-1.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-2.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-3.json
-rw-r--r-- 1 rgdpub rgdpub 40K Jul 20 18:18 lf-4.json
-rw-r--r-- 1 rgdpub rgdpub 16K Jul 20 18:18 lf-5.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-6.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-7.json
-rw-r--r-- 1 rgdpub rgdpub 22K Jul 20 18:18 lf-8.json
-rw-r--r-- 1 rgdpub rgdpub 10K Jul 20 18:18 names.txt
-rw-r--r-- 1 rgdpub rgdpub 2.5K Jul 20 18:18 trackData.json

./tracks/DEBUG_25/Chr3:
total 404K
-rw-r--r-- 1 rgdpub rgdpub 29K Jul 20 18:18 lf-1.json
-rw-r--r-- 1 rgdpub rgdpub 77K Jul 20 18:18 lf-2.json
-rw-r--r-- 1 rgdpub rgdpub 35K Jul 20 18:18 lf-3.json
-rw-r--r-- 1 rgdpub rgdpub 32K Jul 20 18:18 lf-4.json
-rw-r--r-- 1 rgdpub rgdpub 29K Jul 20 18:18 lf-5.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-6.json
-rw-r--r-- 1 rgdpub rgdpub 48K Jul 20 18:18 lf-7.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-8.json
-rw-r--r-- 1 rgdpub rgdpub 30K Jul 20 18:18 lf-9.json
-rw-r--r-- 1 rgdpub rgdpub 7.7K Jul 20 18:18 names.txt
-rw-r--r-- 1 rgdpub rgdpub 2.3K Jul 20 18:18 trackData.json

./tracks/DEBUG_25/Chr4:
total 324K
-rw-r--r-- 1 rgdpub rgdpub 46K Jul 20 18:18 lf-1.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-2.json
-rw-r--r-- 1 rgdpub rgdpub 33K Jul 20 18:18 lf-3.json
-rw-r--r-- 1 rgdpub rgdpub 39K Jul 20 18:18 lf-4.json
-rw-r--r-- 1 rgdpub rgdpub 44K Jul 20 18:18 lf-5.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-6.json
-rw-r--r-- 1 rgdpub rgdpub 40K Jul 20 18:18 lf-7.json
-rw-r--r-- 1 rgdpub rgdpub 6.9K Jul 20 18:18 names.txt
-rw-r--r-- 1 rgdpub rgdpub 2.3K Jul 20 18:18 trackData.json

./tracks/DEBUG_25/Chr5:
total 356K
-rw-r--r-- 1 rgdpub rgdpub 38K Jul 20 18:18 lf-1.json
-rw-r--r-- 1 rgdpub rgdpub 46K Jul 20 18:18 lf-2.json
-rw-r--r-- 1 rgdpub rgdpub 40K Jul 20 18:18 lf-3.json
-rw-r--r-- 1 rgdpub rgdpub 43K Jul 20 18:18 lf-4.json
-rw-r--r-- 1 rgdpub rgdpub 49K Jul 20 18:18 lf-5.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-6.json
-rw-r--r-- 1 rgdpub rgdpub 47K Jul 20 18:18 lf-7.json
-rw-r--r-- 1 rgdpub rgdpub 20K Jul 20 18:18 lf-8.json
-rw-r--r-- 1 rgdpub rgdpub 8.9K Jul 20 18:18 names.txt
-rw-r--r-- 1 rgdpub rgdpub 2.3K Jul 20 18:18 trackData.json

Please take special note of how chromosomes 3 through 5 lack histograms entirely. Our team suspects this stems from a logic issue (or set of issues) within the "flatfile-to-json.pl" script: it is somehow failing to generate histograms on those chromosomes, despite actual features being present.

Further, "flatfile-to-json.pl" is not throwing any warnings or errors while loading these tracks--it makes every indication of a successful track loading process.
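
Since the loader reports nothing, the gap can be checked mechanically instead of by reading `ls` output. The following is a minimal Node.js sketch, assuming the layout shown in the listing above (one subdirectory per reference sequence, holding `lf-*.json` chunks and, when present, `hist-*.json` files). `findRefSeqsMissingHistograms` is a hypothetical name, not a JBrowse utility.

```javascript
const fs = require("fs");
const path = require("path");

// Return the reference-sequence subdirectories of `trackDir` that contain
// feature chunks (lf-*.json) but no histogram file (hist-*.json).
function findRefSeqsMissingHistograms(trackDir) {
  return fs
    .readdirSync(trackDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .filter((entry) => {
      const files = fs.readdirSync(path.join(trackDir, entry.name));
      const hasFeatures = files.some((f) => /^lf-\d+\.json$/.test(f));
      const hasHistogram = files.some((f) => /^hist-\d+-\d+\.json$/.test(f));
      return hasFeatures && !hasHistogram;
    })
    .map((entry) => entry.name)
    .sort();
}

// Example usage, with the path taken from the listing above:
// console.log(findRefSeqsMissingHistograms("./tracks/DEBUG_25"));
```

Run against the DEBUG_25 directory listed above, this should flag the Chr3, Chr4 and Chr5 directories, which contain feature chunks but no histogram file.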

So, what does this actually look like in our development instance of JBrowse?

[Screenshot: missinghistograms]

The top-most track is the genes and transcripts track for the rn5 genome, at full feature density. The "DEBUG_10/25/50" tracks represent a 10x, 25x, and 50x reduction in feature density relative to the top-most track, respectively, so the bottom-most track retains only 2% of the original feature density.

As can be seen, the histograms predictably grow sparser and coarser as feature density diminishes, until they drop off completely in "DEBUG_50", but is that track truly devoid of gene features?

[Screenshot: missinghistograms_2]

No, it is not empty, and JBrowse should ideally still generate histograms for this track.
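
To make that expectation concrete: a fixed-width density histogram only needs per-bin feature counts, so even a sparse track yields non-zero bins wherever features exist. The sketch below illustrates that idea only. It is not the algorithm "flatfile-to-json.pl" uses, `featureDensities` and its parameters are hypothetical, and the 5,000,000 bp bin width is merely inferred from the `hist-5000000-0.json` file name in the listing.

```javascript
// Count how many features start in each fixed-width bin of a reference
// sequence. Illustration only; not taken from the JBrowse source.
function featureDensities(featureStarts, refLength, basesPerBin) {
  const bins = new Array(Math.ceil(refLength / basesPerBin)).fill(0);
  for (const start of featureStarts) {
    // Ignore coordinates that fall outside the reference sequence.
    if (start >= 0 && start < refLength) {
      bins[Math.floor(start / basesPerBin)] += 1;
    }
  }
  return bins;
}

// Four features on a 15 Mbp sequence with 5 Mbp bins.
const densities = featureDensities([10, 4999999, 5000000, 12000000], 15e6, 5e6);
console.log(densities); // [ 2, 1, 1 ]
```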

GZipped copies of the GFF3s used in this test can be found below:

DEBUG_10
DEBUG_25
DEBUG_50

The genome assembly used as the reference sequence for this test was Rnor v5.0 (i.e. Rat 5) from NCBI.

Finally, please note that this test was performed on the latest public release of JBrowse, version 1.11.6.

@enuggetry

enuggetry Jul 21, 2015

Contributor

Thanks for the clear illustration, @halfwayBraindead, I see what you're saying.

@cmdcolin

cmdcolin Jul 21, 2015

Contributor

I suppose this is indeed a confirmed bug. Good test data. Let me know if I can help with a fix; I think I can reproduce the failure on my end.

@halfwayBraindead

halfwayBraindead Jul 24, 2015

I've been spending some time trying to work around this histogram problem, and I generated BigWigs from the GFF3s in question--but the NCList storeClass of JBrowse does not appear (from my use cases in 1.11.6) to support custom histogram specification.

It is well-known that the BAM storeClass allows for custom user specification of histograms, such as BigWigs, and to my delight, the GFF3 storeClass (enabling direct reading of GFF3s without needing to run "flatfile-to-json.pl" a priori) also supports custom histogram specification!

Unfortunately, after attempting to load a production-grade GFF3 (on the order of hundreds of Megabytes), the genome browser crashed even more rapidly than it does with poorly configured BAM data--in other words, the GFF3 parser of JBrowse is non-ideal. It's pretty slow, and even the source commenting of its main method declares that it requires significant refactoring.

In any case, here's an idea that could enable a workaround solution without needing too much effort from the JBrowse team: why not enable custom histogram specification for the NCList storeClass? This sort of code already exists for BAM and GFF3 storeClasses, so why not port this code over to the NCList storeClass as an optional user definition?

That way, users can generate their own Bedgraphs and Wiggles; they can define and take responsibility for their own histograms without requiring an extensive examination and/or re-write of the existing histogram generation method(s) in "flatfile-to-json.pl".

This suggestion is made with the knowledge that JBrowse development resources are limited. What do you think?
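
As a rough picture of what the proposal would amount to, here is a track entry written as a JavaScript object, assuming NCList tracks accepted the same `histograms` sub-configuration used with BAM tracks. The label, paths, and BigWig file name are placeholders, and the key names reflect my reading of the JBrowse configuration guide, so they should be checked against it before use.

```javascript
// Hypothetical track entry: NCList features with a user-supplied BigWig
// histogram. All paths and the label are placeholders.
const track = {
  label: "DEBUG_50",
  type: "JBrowse/View/Track/CanvasFeatures",
  storeClass: "JBrowse/Store/SeqFeature/NCList",
  urlTemplate: "tracks/DEBUG_50/{refseq}/trackData.json",
  histograms: {
    // Assumed keys, mirroring the histogram options used with BAM tracks.
    storeClass: "JBrowse/Store/SeqFeature/BigWig",
    urlTemplate: "bigwig/DEBUG_50.bw",
  },
};

// Serialize in the form it would take inside a JSON track list.
console.log(JSON.stringify(track, null, 2));
```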

@selewis

selewis Jul 24, 2015

+1

@cmdcolin

cmdcolin Jul 27, 2015

Contributor

I think I found a patch that makes the histogram section of the config file usable with tracks that were run with flatfile-to-json. You can check it out on the master jbrowse branch

The diff:

diff --git a/src/JBrowse/View/Track/CanvasFeatures.js b/src/JBrowse/View/Track/CanvasFeatures.js
index 66f7775..dcd8a70 100644
--- a/src/JBrowse/View/Track/CanvasFeatures.js
+++ b/src/JBrowse/View/Track/CanvasFeatures.js
@@ -362,7 +362,7 @@ return declare(
             basesPerBin: basesPerBin
         };

-        if( this.store.getRegionFeatureDensities ) {
+        if( !this.config.histograms.store&&this.store.getRegionFeatureDensities ) {
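
Read in isolation, the changed condition says: use the store's own feature densities only when the track config does not name a separate histogram store. The sketch below restates that as a standalone predicate. `usesBuiltInDensities` is a hypothetical name, and the guard against a missing `histograms` section is an addition for safety that is not in the patch.

```javascript
// Standalone restatement of the patched condition (illustrative only).
// Returns true when the track should fall back to the feature store's
// own density data rather than a user-configured histogram store.
function usesBuiltInDensities(trackConfig, store) {
  const histograms = (trackConfig && trackConfig.histograms) || {};
  const hasCustomStore = Boolean(histograms.store);
  const storeHasDensities =
    typeof store.getRegionFeatureDensities === "function";
  return !hasCustomStore && storeHasDensities;
}

// A store object exposing the density method, standing in for an NCList store.
const ncListLikeStore = { getRegionFeatureDensities() {} };

console.log(usesBuiltInDensities({ histograms: {} }, ncListLikeStore));
console.log(usesBuiltInDensities({ histograms: { store: "store123" } }, ncListLikeStore));
```

The first call takes the built-in path; the second does not, because a custom histogram store is configured.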
@halfwayBraindead

halfwayBraindead Jul 30, 2015

Checked out and tested the "Master" branch, and the histogram storeClass works well now within NCList tracks! A definite plus.

Another significant issue with this approach cropped up, though:

[Screenshot: histogram_scaling]

As can be seen, the scaling of BigWig histogram bars seems "off", and even when inserting a basic BigWig track (whole track dedicated to BigWig file) from the configuration guide, the same issue appears:

[Screenshot: histogram_scaling2]

Turns out that the "autoscale" parameter resolves this scaling issue for BigWig tracks, though--the configuration guide claims that it's set to the value of "local" by default, but it appears not to be in the "Master" branch (default appears to be the "global" value):
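
For reference, the sketch below shows why the two settings can look so different, followed by a track entry with `autoscale` set explicitly. `yAxisMax` is a hypothetical illustration of local versus global scaling, not JBrowse code. The label and file path are placeholders, and the `type` and `storeClass` values are assumptions based on the configuration guide.

```javascript
// Illustration only: pick the y-axis maximum either from the bins that
// overlap the visible region ("local") or from every bin ("global").
function yAxisMax(bins, visibleStart, visibleEnd, autoscale) {
  const pool =
    autoscale === "local"
      ? bins.filter((b) => b.end > visibleStart && b.start < visibleEnd)
      : bins;
  return pool.reduce((max, b) => Math.max(max, b.score), 0);
}

const bins = [
  { start: 0, end: 100, score: 5 },
  { start: 100, end: 200, score: 500 },
];
// Viewing 0..100: local scaling fits the visible bars, while global scaling
// is dominated by the tall bin that is off screen.
console.log(yAxisMax(bins, 0, 100, "local"), yAxisMax(bins, 0, 100, "global"));

// Hypothetical BigWig track entry with the setting made explicit.
const bigwigTrack = {
  label: "DEBUG_50_density",
  type: "JBrowse/View/Track/Wiggle/XYPlot",
  storeClass: "JBrowse/Store/SeqFeature/BigWig",
  urlTemplate: "bigwig/DEBUG_50.bw",
  autoscale: "local",
};
```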

[Screenshot: histogram_scaling3]

Unfortunately, no counterpart method appears to exist for the histogram storeClass within CanvasFeatures tracks such as GFF3/BAM/NCList - would be great to port that sort of code over!

(And, also, to greatly expand the feature capability of the histogram storeClass within CanvasFeatures tracks: as Rob wrote previously, "all you can really change about how it looks is its color". Such a venture would likely tie-in closely with #624.)

@cmdcolin

cmdcolin Jan 27, 2017

Contributor

I think the original issue that this thread was about got fixed! The configuration of bigwigs and bigwig summaries on feature tracks is probably a separate issue.

@cmdcolin cmdcolin closed this Jan 27, 2017
