New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using reference IDs and use reference names instead #217

Merged
merged 1 commit into from Apr 22, 2014

Conversation

Projects
None yet
3 participants
@nealsid
Contributor

nealsid commented Apr 18, 2014

I have a few more tests to fix but this is definitely ready for more eyeballs on it. Hopefully I'll be able to fix the remaining ones tomorrow.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins commented Apr 18, 2014

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/280/

currentOffsetFromEnd += 1
if (currentOffsetFromEnd > 10000) {
currentOffsetFromEnd = 0
}
Int.MaxValue - currentOffsetFromEnd
// NB : this is really ugly - any better way to manufacture

This comment has been minimized.

@fnothaft

fnothaft Apr 18, 2014

Member

I don't think the relative position is important (e.g., last partition, first, etc.), it's only important that all unmapped reads are distributed evenly. Perhaps we could go with "unmapped" instead of "zzzzz" as that'll likely be at the end (assuming most assemblies start with "chr[om]") anyways, and is descriptive.

@fnothaft

fnothaft Apr 18, 2014

Member

I don't think the relative position is important (e.g., last partition, first, etc.), it's only important that all unmapped reads are distributed evenly. Perhaps we could go with "unmapped" instead of "zzzzz" as that'll likely be at the end (assuming most assemblies start with "chr[om]") anyways, and is descriptive.

This comment has been minimized.

@nealsid

nealsid Apr 18, 2014

Contributor

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

    assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.

@nealsid

nealsid Apr 18, 2014

Contributor

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

    assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.

This comment has been minimized.

@fnothaft

fnothaft Apr 18, 2014

Member

I'll let @massie comment but that seems like a unit test artifact, not a real constraint.

Neal notifications@github.com wrote:

     currentOffsetFromEnd += 1
     if (currentOffsetFromEnd > 10000) {
       currentOffsetFromEnd = 0
     }
  •    Int.MaxValue - currentOffsetFromEnd
    
  •    // NB : this is really ugly - any better way to manufacture
    

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

   assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.


Reply to this email directly or view it on GitHub:
https://github.com/bigdatagenomics/adam/pull/217/files#r11774179

In adam-core/src/main/scala/org/bdgenomics/adam/rdd/ADAMRDDFunctions.scala:

>          currentOffsetFromEnd += 1
>          if (currentOffsetFromEnd > 10000) {
>            currentOffsetFromEnd = 0
>          }
> -        Int.MaxValue - currentOffsetFromEnd
> +        // NB : this is really ugly - any better way to manufacture

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

    assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.


Reply to this email directly or view it on GitHub.

@fnothaft

fnothaft Apr 18, 2014

Member

I'll let @massie comment but that seems like a unit test artifact, not a real constraint.

Neal notifications@github.com wrote:

     currentOffsetFromEnd += 1
     if (currentOffsetFromEnd > 10000) {
       currentOffsetFromEnd = 0
     }
  •    Int.MaxValue - currentOffsetFromEnd
    
  •    // NB : this is really ugly - any better way to manufacture
    

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

   assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.


Reply to this email directly or view it on GitHub:
https://github.com/bigdatagenomics/adam/pull/217/files#r11774179

In adam-core/src/main/scala/org/bdgenomics/adam/rdd/ADAMRDDFunctions.scala:

>          currentOffsetFromEnd += 1
>          if (currentOffsetFromEnd > 10000) {
>            currentOffsetFromEnd = 0
>          }
> -        Int.MaxValue - currentOffsetFromEnd
> +        // NB : this is really ugly - any better way to manufacture

"unmapped" is fine, although there is a dependency in the unit tests on the unmapped reads being at the end:

    assert(unmapped.forall(p => p._2 > mapped.takeRight(1)(0)._2))

Not sure if that reflects something we need to require on production data or just a unit test issue.


Reply to this email directly or view it on GitHub.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Apr 18, 2014

Member

Thanks for putting this together @nealsid! This looks great; I had a few small comments inline. I look forward to the finishing tests, and having this merged in.

Member

fnothaft commented Apr 18, 2014

Thanks for putting this together @nealsid! This looks great; I had a few small comments inline. I look forward to the finishing tests, and having this merged in.

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 18, 2014

Contributor

Thanks for the comments, Frank!

Contributor

nealsid commented Apr 18, 2014

Thanks for the comments, Frank!

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins commented Apr 18, 2014

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/282/

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins commented Apr 18, 2014

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/283/

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins commented Apr 18, 2014

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/284/

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins commented Apr 22, 2014

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/288/

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/293/

AmplabJenkins commented Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/293/

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

Jenkins, test this please.

Contributor

nealsid commented Apr 22, 2014

Jenkins, test this please.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/294/

AmplabJenkins commented Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/294/

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

Ready for merging! Thanks @massie, @fnothaft

Contributor

nealsid commented Apr 22, 2014

Ready for merging! Thanks @massie, @fnothaft

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

Oops, actually hold off. I'll rebase & squash.

Contributor

nealsid commented Apr 22, 2014

Oops, actually hold off. I'll rebase & squash.

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

Jenkins, test this place

Contributor

nealsid commented Apr 22, 2014

Jenkins, test this place

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

Jenkins, test this please.

Contributor

nealsid commented Apr 22, 2014

Jenkins, test this please.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/295/

AmplabJenkins commented Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/295/

Migrate from reference ids to reference names
We decided that using RefSeq names for our contigs was better practice
than minting our own IDs and handling complicated sequence dictionary
merges.
@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/296/

AmplabJenkins commented Apr 22, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/296/

@nealsid

This comment has been minimized.

Show comment
Hide comment
@nealsid

nealsid Apr 22, 2014

Contributor

This is good to go now. Thanks!

Contributor

nealsid commented Apr 22, 2014

This is good to go now. Thanks!

fnothaft added a commit that referenced this pull request Apr 22, 2014

Merge pull request #217 from hammerlab/ref-id-to-name
Stop using reference IDs and use reference names instead

@fnothaft fnothaft merged commit b777945 into bigdatagenomics:master Apr 22, 2014

1 check passed

default Merged build finished.
Details
@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Apr 22, 2014

Member

Merged! Thanks @nealsid!

Member

fnothaft commented Apr 22, 2014

Merged! Thanks @nealsid!

@nealsid nealsid deleted the hammerlab:ref-id-to-name branch Apr 22, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment