Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: conversion from/to hgvs with GENCODE ID #45

Merged
merged 3 commits into from
Nov 10, 2021

Conversation

k-kom
Copy link
Contributor

@k-kom k-kom commented Nov 9, 2021

  • PR 44 and PR 43 implemented loading GENCODE files and looking up by GENCODE ID, but lacks GENCODE ID conversion support
  • extracted a function to check if a string is supported transcript ID in hgvs to vcf conversion (->supported-transcript)

- [PR 44](chrovis#44) and [PR 43](chrovis#43) implemented loading GENCODE files and looking up by GENCODE ID, but lacks GENCODE ID conversion support
- extracted a function to check if a string is supported transcript ID in hgvs to vcf conversion (`->supported-transcript`)
@k-kom k-kom self-assigned this Nov 9, 2021
@codecov
Copy link

codecov bot commented Nov 9, 2021

Codecov Report

Merging #45 (e4c6247) into master (a078cbf) will increase coverage by 0.11%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
+ Coverage   43.19%   43.30%   +0.11%     
==========================================
  Files          15       15              
  Lines        1799     1801       +2     
  Branches       39       39              
==========================================
+ Hits          777      780       +3     
+ Misses        983      982       -1     
  Partials       39       39              
Impacted Files Coverage Δ
src/varity/hgvs_to_vcf.clj 22.97% <40.00%> (+2.13%) ⬆️
src/varity/vcf_to_hgvs.clj 16.36% <100.00%> (+0.60%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a078cbf...e4c6247. Read the comment docs.

@@ -14,6 +14,36 @@
test-ref-seq-file]]
[varity.vcf-to-hgvs :as v2h]))

(deftest ->supported-transcript-test
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to add private function's test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course! 👍

@k-kom k-kom requested review from totakke and federkasten and removed request for federkasten November 9, 2021 03:32
Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the support of the conversion with GENCODE. It seems to be working. I have left a few suggestions.

Comment on lines 36 to 38
(defn- ->supported-transcript
[s]
(re-find #"^((NM|NR)_|ENS(T|P))\d+(\.\d+)?$" s))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fn name and the return value are discord. Furthermore, this fn has only to check the transcript.

Recommendation:

(defn- supported-transcript?
  [s]
  (some? (re-matches #"((NM|NR)_|ENS(T|P))\d+(\.\d+)?" (str s))))

@@ -66,11 +70,11 @@
(throw (ex-info "supported HGVS kinds are only `:coding-dna` and `:protein`"
{:type ::unsupported-hgvs-kind
:hgvs-kind kind})))
rgs (if-let [[rs] (re-find #"^(NM|NR)_\d+\.?(\d+)?$" (str transcript))]
rgs (if-let [[rs] (->supported-transcript (str transcript))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rgs (if (supported-transcript? transcript)
      (rg/ref-genes transcript rgidx)

@@ -14,6 +14,36 @@
test-ref-seq-file]]
[varity.vcf-to-hgvs :as v2h]))

(deftest ->supported-transcript-test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course! 👍


(deftest coding-dna-ref-gene?-test
(testing "valid reference genes"
(are [transcript] (#'v2h/coding-dna-ref-gene? {:name transcript})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend using true? or false? to test a predicate fn strictly because Clojure treats everything except false and nil as true.

(are [transcript] (true? (#'v2h/coding-dna-ref-gene? {:name transcript}))
...
(are [transcript] (false? (#'v2h/coding-dna-ref-gene? {:name transcript}))

- rename function
- makes tests more strict
@k-kom
Copy link
Contributor Author

k-kom commented Nov 9, 2021

@totakke
Thank you for reviewing. I updated the codes you commented on.

And can we release 0.8.0 after this PR is merged?
I created PR for releasing (sorry if something is missing). #46

@k-kom k-kom requested a review from totakke November 9, 2021 09:11
Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the revision. lein test :all fails because of the lack of rgidx. Please check it.

rgs (if-let [[rs] (re-find #"^(NM|NR)_\d+\.?(\d+)?$" (str transcript))]
(rg/ref-genes rs rgidx)
rgs (if (supported-transcript? transcript)
(rg/ref-genes (str transcript))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rgidx is missing.

@@ -33,6 +33,10 @@
[hgvs seq-rdr rgs]
(distinct (apply concat (keep #(prot/->vcf-variants hgvs seq-rdr %) rgs))))

(defn- supported-transcript?
[s]
(some? (re-matches #"^((NM|NR)_|ENS(T|P))\d+(\.\d+)?$" (str s))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^$ are unnecessary for re-matches.

@k-kom
Copy link
Contributor Author

k-kom commented Nov 10, 2021

@totakke
Sorry I didn't notice the test failure. I fixed that and confirmed :all test is passed.

@k-kom k-kom requested a review from totakke November 10, 2021 03:00
Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@totakke totakke merged commit e90b38b into chrovis:master Nov 10, 2021
@k-kom k-kom mentioned this pull request Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants