Order contig & position correctly #407

Closed
ihodes opened this issue Dec 17, 2014 · 5 comments · Fixed by #434 or #678

Comments

@ihodes
Member

ihodes commented Dec 17, 2014

We're ALMOST there:

We sort by contig length, and then lexicographically. We need to do a little better…

@danvk
Contributor

danvk commented Dec 17, 2014

We'd get the desired behavior if we sorted a zero-padded version of the
contig name. You'd have to be careful to add the right # of zeros to just
the numbers, though (e.g. "2"→"02" but "20"→"20" and "X"→"X"). Not sure how
easy this is in SQLAlchemy/Postgres.
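
To make the zero-padding idea concrete, here is a minimal Python sketch. It is illustrative only and is not the SQLAlchemy/Postgres code used by the project; the helper name `padded_contig` and the fixed `width=2` are assumptions, chosen because human chromosome numbers have at most two digits.

```python
def padded_contig(contig, width=2):
    """Return a sort key: zero-pad purely numeric contig names only.

    "2" -> "02", "20" -> "20", "X" -> "X" (hypothetical helper).
    """
    return contig.zfill(width) if contig.isdigit() else contig


contigs = ["10", "X", "2", "20", "1", "MT", "Y"]

# Sort on the padded form, but keep the original names in the result.
sorted_contigs = sorted(contigs, key=padded_contig)
print(sorted_contigs)
```

Because ASCII digits sort before uppercase letters, the numeric contigs come first in numeric order, followed by the named ones in lexicographic order.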


@arahuja
Contributor

arahuja commented Jan 6, 2015

Not sure if this is the same issue or a new one, but it looks like positions are not sorted correctly now either:

image

From this run

@ihodes ihodes changed the title Order contigs correctly Order contig & position correctly Jan 16, 2015
@ihodes
Member Author

ihodes commented Jan 23, 2015

FWIW, while I work on the issue + translate this into SQLAlchemy, the SQL to sort contig/position nicely looks like:

order by COALESCE(SUBSTRING(contig FROM '^\d+')::INTEGER, 1000), length(contig), contig, position;
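
For readers who want to see what that ORDER BY does, the same ordering can be sketched as a plain Python sort key. This is only a rough equivalent for illustration, not the SQLAlchemy translation mentioned above; the function name `contig_position_key` and the sample rows are made up.

```python
import re


def contig_position_key(row):
    """Mirror the ORDER BY: leading number (or 1000), length, name, position."""
    contig, position = row
    match = re.match(r"\d+", contig)
    # COALESCE(SUBSTRING(contig FROM '^\d+')::INTEGER, 1000)
    number = int(match.group()) if match else 1000
    return (number, len(contig), contig, position)


# Hypothetical (contig, position) rows, with position kept as an integer.
rows = [("X", 5), ("10", 3), ("2", 200), ("2", 15), ("MT", 1), ("1", 7)]
ordered = sorted(rows, key=contig_position_key)
print(ordered)
```

Contigs with no leading number fall back to 1000, so they sort after the numbered ones, and then by length and name. Keeping position numeric is what makes 15 sort before 200 within the same contig.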

ihodes added a commit that referenced this issue Jan 23, 2015
Sort contigs first by number, then length, then lexicographically

Fixes #407
ihodes added a commit that referenced this issue Jan 23, 2015
Position was no longer being sorted correctly, due to casting its type
to a string.

As noted by @arahuja in issue #407

This does not fix the issue, though; that requires a more complex sort
by contig (in the works).
@ihodes ihodes added this to the Spring Forward milestone Mar 2, 2015
@ihodes
Member Author

ihodes commented Mar 2, 2015

This regressed somehow and passed tests: http://cycledash.demeter.hpc.mssm.edu/runs/153/examine

Need to fix and add better regression testing.

@ihodes ihodes reopened this Mar 2, 2015
@ihodes ihodes modified the milestones: Stabilize , Spring Forward May 22, 2015
@ihodes
Member Author

ihodes commented Jun 1, 2015

This appears to only be a problem when contigs are prefixed with chr.
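
A likely explanation is that a pattern anchored on leading digits (`^\d+`) never matches names such as chr1, so every contig gets the same fallback number and the sort degrades to length, then lexicographic order. The sketch below shows one possible way to tolerate an optional chr prefix. It is an assumption for illustration and not necessarily the fix that landed in #434 or #678; the names `_CONTIG_RE` and `contig_key` are hypothetical.

```python
import re

# Optional "chr" prefix, then the leading chromosome number (assumed pattern).
_CONTIG_RE = re.compile(r"^(?:chr)?(\d+)")


def contig_key(contig):
    """Sort key: number after an optional 'chr' prefix (or 1000), length, name."""
    match = _CONTIG_RE.match(contig)
    number = int(match.group(1)) if match else 1000
    return (number, len(contig), contig)


contigs = ["chr10", "chrX", "chr2", "chr1", "chrY", "chr20"]
ordered = sorted(contigs, key=contig_key)
print(ordered)
```

The same key still handles unprefixed names, since the prefix group is optional.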
