[BDG-FORMATS-29] Re-organize the Feature schema #30

tdanford · 2014-09-15T10:28:32Z

This is attempting to re-organize the Feature schema, along the
discussions that we (Timothy, Uri, Matt, Frank) have had in email and on
the phone. The main requirements are:

* less file-format dependence in the field choice ('qValue'-like fields could be relegated to the 'attributes' field)
* fewer fields to improve the memory footprint

tdanford · 2014-09-15T10:31:25Z

This PR reflects a fix for Issue #29

laserson · 2014-09-15T17:32:03Z

src/main/resources/avro/bdg.avdl

+
+   Key is database name and value is the accession.
+   */
+  map<string> dbxrefs = null;


Last time, you mentioned the possibility of multiple accessions in a single database. Does this reflect your resolution of this potential issue?

Yup! This is meant to be an ID->DB map, but I haven't done any downstream testing yet.

Is it ever possible that an object can have one accession in multiple databases? I don't know if we're talking about a 7-sigma issue here, but is it worth going back to your original proposal of having an array of Dbxref objects?

Well, an ID->DB (as opposed to DB->ID) mapping would handle "multiple acc's in one DB," right? I'm agnostic either way.

No, I mean, what if the same accession is used in multiple databases? Does this ever happen?
On Sep 15, 2014 12:39 PM, "Timothy Danford" notifications@github.com wrote:

In src/main/resources/avro/bdg.avdl:

    union { null, double } signalValue = null;
    union { null, double } pValue = null;
    union { null, double } qValue = null;
    union { null, long } peak = null;
    /**
       The value associated with this feature (if double)
     */
    union { null, double } value = null;
    /**
       Cross-references into other databases.
       Key is database name and value is the accession.
     */
    map<string> dbxrefs = null;

> Well, an ID->DB (as opposed to DB->ID) mapping would handle "multiple acc's in one DB," right? I'm agnostic either way.


Oh. Um. Yeah, maybe. "7-sigma," like you said, but we should probably plan for it.

Should we go back to your initial suggestion then? Of an array of Dbxref objects?
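
To make the trade-off in this thread concrete, here is a minimal Python sketch. It is not part of the PR: the `Dbxref` tuple is a hypothetical stand-in for the "array of Dbxref objects" idea, and the database names and accessions are made up. It only illustrates why either map direction can silently drop a cross-reference while a list of (database, accession) pairs cannot.

```python
from collections import namedtuple

# Hypothetical stand-in for a Dbxref record: one (database, accession) pair.
Dbxref = namedtuple("Dbxref", ["db", "accession"])

# Illustrative cross-references; names and accessions are invented.
xrefs = [
    Dbxref("dbA", "ACC1"),
    Dbxref("dbA", "ACC2"),  # two accessions in the same database
    Dbxref("dbB", "ACC1"),  # the same accession reused in another database
]

# DB -> accession map: the second dbA entry overwrites the first.
db_to_acc = {x.db: x.accession for x in xrefs}

# Accession -> DB map: ACC1 in dbB overwrites ACC1 in dbA.
acc_to_db = {x.accession: x.db for x in xrefs}

# Each map keeps only two of the three pairs; the list keeps all three.
print(len(db_to_acc), len(acc_to_db), len(xrefs))
```

Under these assumptions, a list of pairs is the only one of the three shapes that survives both "multiple accessions in one database" and "one accession in multiple databases", at the cost of a linear scan when looking up by database.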

laserson · 2014-09-15T17:33:48Z

+1 overall

fnothaft · 2014-09-18T14:17:12Z

@tdanford would you like to get this in before I cut a new bdg-formats release?

tdanford · 2014-09-18T14:28:17Z

I don't think it's necessary, no.

fnothaft · 2014-09-18T14:29:59Z

OK, thanks. I'll cut the release now.

AmplabJenkins · 2014-09-30T11:12:15Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/bdg-formats-prb/44/

tdanford · 2014-10-07T14:11:22Z

Should this be rebased down before merging?

ANSWER: YES.

Let me do that, real quick.

fnothaft · 2014-10-07T14:11:45Z

Yes, please squash.

AmplabJenkins · 2014-10-07T14:17:18Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/bdg-formats-prb/45/

fnothaft · 2014-10-07T14:17:57Z

Merged! Thanks @tdanford!

tdanford mentioned this pull request Sep 15, 2014

Trying out (in Feature2) a different approach to the Feature schema. #28

Closed

tdanford force-pushed the revised-feature branch from 8794ee9 to cee80b7 Compare September 15, 2014 10:30

tdanford changed the title ~~[FORMATS-92] Re-organize the Feature schema~~ [FORMATS-29] Re-organize the Feature schema Sep 15, 2014

laserson reviewed Sep 15, 2014

tdanford force-pushed the revised-feature branch 3 times, most recently from c39a1c2 to 9020236 Compare September 15, 2014 18:04

tdanford changed the title ~~[FORMATS-29] Re-organize the Feature schema~~ [BDG-FORMATS-29] Re-organize the Feature schema Sep 15, 2014

tdanford force-pushed the revised-feature branch from 9020236 to c766065 Compare September 17, 2014 17:52

fnothaft force-pushed the master branch from e1019d3 to 8de8cf3 Compare September 18, 2014 14:33

tdanford force-pushed the revised-feature branch from c766065 to 60c2748 Compare September 22, 2014 18:07

fnothaft mentioned this pull request Oct 7, 2014

[ADAM-327] Adding gene, transcript, and exon models. bigdatagenomics/adam#404

Merged

tdanford force-pushed the revised-feature branch from 23fdcf2 to 886e273 Compare October 7, 2014 14:13

fnothaft added a commit that referenced this pull request Oct 7, 2014

Merge pull request #30 from tdanford/revised-feature

4f86ad1

[BDG-FORMATS-29] Re-organize the Feature schema

fnothaft merged commit 4f86ad1 into bigdatagenomics:master Oct 7, 2014

tdanford deleted the revised-feature branch October 9, 2014 10:04

laserson mentioned this pull request Jan 10, 2015

Re-organize the Feature schema #29

Closed
