`Symbol` case class is nucleotide-centric #672

Closed
laserson opened this Issue May 4, 2015 · 9 comments

Contributor

laserson commented May 4, 2015

Not sure if this really matters, but I wanted to point out that the `Symbol` class in the new Alphabet machinery includes a "complement" field, which doesn't make sense for proteins or other alphabets (e.g., secondary structure). Perhaps it's necessary to subclass Alphabet? Thoughts?

Member

massie commented May 4, 2015

Agree. Should we consider incorporating BioScala into ADAM?

https://github.com/bioscala/bioscala

It's BSD licensed.

Member

fnothaft commented May 4, 2015

I don't know of anyone using ADAM for proteomic or RNA secondary structure work, so I'm inclined to not make another change for now.

BioScala does look like an attractive package if we were to go that route, but it doesn't look like it is being actively developed. CCing @antonkulaga, who I see has committed there a lot and might know more about the status of the BioScala project.

Contributor

laserson commented May 4, 2015

I'm not familiar with that project, but I'm not opposed. Also, what about the issue of persisting Alphabet information?

Member

fnothaft commented May 4, 2015

> Also, what about the issue of persisting Alphabet information?

Sorry, I didn't follow here. What do you mean by that?

Contributor

laserson commented May 4, 2015

Sorry, just thinking aloud. Wondering whether it'd be useful to model Alphabets at the Avro level. But that's probably unnecessary complication...

Member

fnothaft commented May 4, 2015

@laserson we used to have a Base enum at the Avro level, but it wound up not being terribly useful, so we scrapped it in bigdatagenomics/bdg-formats#46. If I could TL;DR my experiences with enum-based alphabets: strings are reasonably efficient and a fair bit easier to work with, so the enum doesn't buy much.

Member

fnothaft commented May 5, 2015

Also, I would say that it would probably be sufficient to just break the "reverse complement-able" part of alphabets out into another trait. E.g., you'd have `Alphabet` and `ComplementableAlphabet` or something like that.
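
A minimal sketch of what that split might look like, for illustration only. Everything here is an assumption: the field and method names (`label`, `symbols`, `complements`, `reverseComplement`) and the `DNA`/`Protein` objects are hypothetical and are not taken from ADAM's actual Alphabet code. The idea is just that `Symbol` carries no complement, and only alphabets that mix in `ComplementableAlphabet` expose complement operations.

```scala
// Hypothetical sketch; none of these definitions are ADAM's actual API.

// A symbol in any alphabet: only a label, no complement field.
case class Symbol(label: Char)

// Base trait: usable for nucleotides, amino acids, secondary structure, etc.
trait Alphabet {
  def symbols: Seq[Symbol]

  // Case-insensitive membership check.
  def contains(c: Char): Boolean =
    symbols.exists(_.label == c.toUpper)
}

// Only alphabets with a meaningful complement mix this trait in.
trait ComplementableAlphabet extends Alphabet {
  // Maps each symbol label to the label of its complement.
  def complements: Map[Char, Char]

  // Case-insensitive lookup; fails loudly on symbols outside the alphabet.
  def complement(c: Char): Char =
    complements.getOrElse(
      c.toUpper,
      throw new IllegalArgumentException(s"No complement for symbol '$c'"))

  // Reverse the sequence, then complement each symbol.
  def reverseComplement(s: String): String =
    s.reverse.map(c => complement(c))
}

// Nucleotide alphabet: supports complementing.
object DNA extends ComplementableAlphabet {
  val symbols: Seq[Symbol] = "ACGT".map(c => Symbol(c))
  val complements: Map[Char, Char] =
    Map('A' -> 'T', 'C' -> 'G', 'G' -> 'C', 'T' -> 'A')
}

// Protein alphabet (20 standard amino acids): no complement at all.
object Protein extends Alphabet {
  val symbols: Seq[Symbol] = "ACDEFGHIKLMNPQRSTVWY".map(c => Symbol(c))
}
```

With this layout, `DNA.reverseComplement("AACG")` is available, while `Protein.reverseComplement(...)` simply does not compile, so the nucleotide-only behavior is enforced by the type system rather than by a dummy complement value.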

Member

fnothaft commented Jun 24, 2015

Do we want to keep this open or close it? IMO, this is not a problem for now. I think we should close it and reopen it later if we decide to work with protein sequences, etc.

@fnothaft fnothaft added the wontfix label Jul 6, 2016

Member

fnothaft commented Jul 6, 2016

Closing as won't fix.

@fnothaft fnothaft closed this Jul 6, 2016
