Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utilities for whole-genome coordinate. #103

Merged
merged 3 commits into from Sep 1, 2017
Merged

Conversation

alumi
Copy link
Member

@alumi alumi commented Aug 31, 2017

Summary

This PR adds some utilities for conversion between chromosomal positions and positions in whole-genome coordinate like this one and this one.

Tests

  • lein test :all 馃啑

@codecov
Copy link

codecov bot commented Aug 31, 2017

Codecov Report

Merging #103 into master will increase coverage by 0.27%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #103      +/-   ##
==========================================
+ Coverage   84.94%   85.22%   +0.27%     
==========================================
  Files          60       61       +1     
  Lines        3966     4019      +53     
  Branches      400      403       +3     
==========================================
+ Hits         3369     3425      +56     
  Misses        197      197              
+ Partials      400      397       -3
Impacted Files Coverage 螖
src/cljam/util/whole_genome.clj 100% <100%> (酶)
src/cljam/algo/depth.clj 80.8% <0%> (+6.99%) 猬嗭笍

Continue to review full report at Codecov.

Legend - Click here to learn more
螖 = absolute <relative> (impact), 酶 = not affected, ? = missing data
Powered by Codecov. Last update 579df0d...0a56354. Read the comment docs.

Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the enhancement! Implementation seems good. I commented about code style and naming. Please reply to me if you cannot agree on the naming convention.

(testing "whole-genome -> chr"
(are [?wg-pos ?chr-and-pos]
(= (wg/to-chr-and-pos idx ?wg-pos) ?chr-and-pos)
0 nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect indentations

[e-offset {e-name :name e-len :len}] (first (rsubseq offset->ref <= (dec end')))]
(if (= s-offset e-offset)
[{:chr s-name, :start (- start' s-offset), :end (min (- end' s-offset) s-len)}]
(concat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to-regions returns a vector when the region is in a ref, returning a lazy sequence when not. I think it should return a vector in both cases.

(map (fn [r offset] [(:name r) {:offset offset, :len (:len r)}]) refs)
(into {})))

(defn to-whole-genome-coord
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> is used for conversion functions in cljam. I prefer ->whole-genome-coord.

(reductions + 0))
refs)))

(defn to-chr-and-pos
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer ->chr-and-pos.

(when (<= 1 (- wg-pos offset) len)
[name (- wg-pos offset)])))

(defn to-regions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer ->regions.

@@ -0,0 +1,53 @@
(ns cljam.util.whole-genome
"Utilities for conversions between chromosomal positions and whole-genome positions.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wrap a line at 80 characters if possible. cf. clojure-style-guide#80-character-limits

(ns cljam.util.whole-genome
"Utilities for conversions between chromosomal positions and whole-genome positions.")

(defn create-chr-to-whole-genome-index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cljam, create-* is used for file-generating functions such as bam-indexer/create-index and dict/create-dict. I think make-* is better for this function, or simply chr-to-whole-genome-index is also good.

(when (<= 1 pos len)
(inc (+ offset (dec pos))))))

(defn create-whole-genome-to-chr-index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment on create-chr-to-whole-genome-index

@alumi
Copy link
Member Author

alumi commented Aug 31, 2017

Thank you so much for the advice! 馃槀
I was wondering how to name these functions.

I also modified ->region function to make it return a vector.

@totakke totakke merged commit 79eed24 into master Sep 1, 2017
@totakke totakke deleted the feature/whole-genome branch September 1, 2017 02:37
@totakke
Copy link
Member

totakke commented Sep 1, 2017

Perfect, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants