feat: conversion from/to hgvs with GENCODE ID #45
- [PR 44](chrovis#44) and [PR 43](chrovis#43) implemented loading GENCODE files and looking up by GENCODE ID, but conversion from/to HGVS with a GENCODE ID was not yet supported; this PR adds it.
- Extracted a function (`->supported-transcript`) that checks whether a string is a supported transcript ID in HGVS to VCF conversion.
Codecov Report
@@ Coverage Diff @@
## master #45 +/- ##
==========================================
+ Coverage 43.19% 43.30% +0.11%
==========================================
Files 15 15
Lines 1799 1801 +2
Branches 39 39
==========================================
+ Hits 777 780 +3
+ Misses 983 982 -1
Partials 39 39
test/varity/hgvs_to_vcf_test.clj
Outdated
@@ -14,6 +14,36 @@
test-ref-seq-file]]
[varity.vcf-to-hgvs :as v2h]))

(deftest ->supported-transcript-test
Is it OK to add a test for a private function?
Of course! 👍
Thank you for the support of the conversion with GENCODE. It seems to be working. I have left a few suggestions.
src/varity/hgvs_to_vcf.clj
Outdated
(defn- ->supported-transcript
  [s]
  (re-find #"^((NM|NR)_|ENS(T|P))\d+(\.\d+)?$" s))
I think this fn's name and its return value are inconsistent. Furthermore, this fn only needs to check whether the transcript is supported.
Recommendation:
(defn- supported-transcript?
[s]
(some? (re-matches #"((NM|NR)_|ENS(T|P))\d+(\.\d+)?" (str s))))
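To make the difference between the two versions concrete, here is a minimal REPL sketch. The regex is the one recommended above; the var name `transcript-pattern` and the sample IDs are assumptions made for illustration, not part of the PR.

```clojure
;; Hypothetical name for the shared pattern; the regex itself is the one
;; recommended in the review.
(def ^:private transcript-pattern
  #"((NM|NR)_|ENS(T|P))\d+(\.\d+)?")

(defn- supported-transcript?
  "Returns true when s is an NM_/NR_ or ENST/ENSP prefixed ID, else false."
  [s]
  (some? (re-matches transcript-pattern (str s))))

;; The original re-find version returns the match vector, not a boolean:
(re-find #"^((NM|NR)_|ENS(T|P))\d+(\.\d+)?$" "NM_005228.3")
;; => ["NM_005228.3" "NM_" "NM" nil ".3"]

;; The predicate version always returns true or false:
(supported-transcript? "NM_005228.3")        ;; => true
(supported-transcript? "ENST00000257290.9")  ;; => true
(supported-transcript? "chr7")               ;; => false
(supported-transcript? nil)                  ;; => false, since (str nil) is ""
```

Wrapping the argument in `str` keeps the predicate safe for `nil` and non-string inputs, which is why the call site no longer needs its own `(str transcript)`.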
src/varity/hgvs_to_vcf.clj
Outdated
@@ -66,11 +70,11 @@
(throw (ex-info "supported HGVS kinds are only `:coding-dna` and `:protein`"
{:type ::unsupported-hgvs-kind
:hgvs-kind kind})))
rgs (if-let [[rs] (re-find #"^(NM|NR)_\d+\.?(\d+)?$" (str transcript))]
rgs (if-let [[rs] (->supported-transcript (str transcript))]
rgs (if (supported-transcript? transcript)
(rg/ref-genes transcript rgidx)
test/varity/vcf_to_hgvs_test.clj
Outdated
(deftest coding-dna-ref-gene?-test
  (testing "valid reference genes"
    (are [transcript] (#'v2h/coding-dna-ref-gene? {:name transcript})
I recommend using `true?` or `false?` to test a predicate fn strictly, because Clojure treats everything except `false` and `nil` as `true`.
(are [transcript] (true? (#'v2h/coding-dna-ref-gene? {:name transcript}))
...
(are [transcript] (false? (#'v2h/coding-dna-ref-gene? {:name transcript}))
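The effect of wrapping the call in `true?` or `false?` can be sketched at a REPL. The helpers `loose-check` and `strict-check` are hypothetical stand-ins, not functions from varity; they only contrast a truthy return value with a real boolean.

```clojure
;; Hypothetical helpers: loose-check returns the matched string or nil,
;; strict-check returns a real boolean.
(defn loose-check [s] (re-find #"^NM_\d+" s))
(defn strict-check [s] (some? (loose-check s)))

;; A bare truthiness check accepts any non-nil, non-false value:
(if (loose-check "NM_005228") :pass :fail)  ;; => :pass

;; true? and false? only accept the actual boolean values:
(true? (loose-check "NM_005228"))           ;; => false, the value is a string
(true? (strict-check "NM_005228"))          ;; => true
(false? (loose-check "chr7"))               ;; => false, nil is not false
(false? (strict-check "chr7"))              ;; => true
```

With the strict form, a test would fail if the predicate ever started returning a match vector or `nil` instead of a boolean.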
- rename function
- make tests more strict
Thank you for the revision. `lein test :all` fails because `rgidx` is missing. Please check it.
src/varity/hgvs_to_vcf.clj
Outdated
rgs (if-let [[rs] (re-find #"^(NM|NR)_\d+\.?(\d+)?$" (str transcript))]
(rg/ref-genes rs rgidx)
rgs (if (supported-transcript? transcript)
(rg/ref-genes (str transcript))
`rgidx` is missing.
src/varity/hgvs_to_vcf.clj
Outdated
@@ -33,6 +33,10 @@
[hgvs seq-rdr rgs]
(distinct (apply concat (keep #(prot/->vcf-variants hgvs seq-rdr %) rgs))))

(defn- supported-transcript?
  [s]
  (some? (re-matches #"^((NM|NR)_|ENS(T|P))\d+(\.\d+)?$" (str s))))
`^` and `$` are unnecessary for `re-matches`.
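The point about anchors can be checked with a short REPL sketch. The simplified pattern and the sample strings are assumptions chosen for illustration; the behavior shown is that of `clojure.core/re-matches`, which succeeds only when the whole string matches, and `re-find`, which searches for a match anywhere in the string.

```clojure
;; re-matches requires the entire string to match, so anchors add nothing:
(re-matches #"NM_\d+" "NM_005228")      ;; => "NM_005228"
(re-matches #"^NM_\d+$" "NM_005228")    ;; => "NM_005228"
(re-matches #"NM_\d+" "xNM_005228x")    ;; => nil

;; re-find matches any substring, so it needs anchors for a whole-string check:
(re-find #"NM_\d+" "xNM_005228x")       ;; => "NM_005228"
(re-find #"^NM_\d+$" "xNM_005228x")     ;; => nil
```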
@totakke
LGTM