Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for generating CSI file from BCF. #196

Merged
merged 4 commits into from
May 28, 2020

Conversation

niyarin
Copy link
Contributor

@niyarin niyarin commented May 18, 2020

This PR is to add function that creates a CSI from BCF.
It is almost the same as #190 (creating CSI from VCF).

@niyarin niyarin requested review from alumi and a team as code owners May 18, 2020 06:28
@niyarin niyarin requested review from athos and removed request for a team May 18, 2020 06:28
@codecov
Copy link

codecov bot commented May 18, 2020

Codecov Report

Merging #196 into master will increase coverage by 0.04%.
The diff coverage is 96.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #196      +/-   ##
==========================================
+ Coverage   86.85%   86.90%   +0.04%     
==========================================
  Files          77       77              
  Lines        6223     6246      +23     
  Branches      520      519       -1     
==========================================
+ Hits         5405     5428      +23     
- Misses        298      299       +1     
+ Partials      520      519       -1     
Impacted Files Coverage Δ
src/cljam/io/bcf/reader.clj 90.28% <95.83%> (+0.50%) ⬆️
src/cljam/algo/vcf_indexer.clj 100.00% <100.00%> (ø)
src/cljam/io/csi.clj 90.42% <0.00%> (+0.53%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 81f05ee...00adb85. Read the comment docs.

Copy link
Member

@alumi alumi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!!

@@ -0,0 +1,18 @@
(ns cljam.algo.bcf-indexer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge this with cljam.algo.vcf-indexer and make create-index work with both VCF/BCF

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thankyou,I merged bcf-indexer and vcf-indexer.

@niyarin niyarin force-pushed the feature/csi-generator-for-bcf branch from d85bc3e to e03fd13 Compare May 25, 2020 16:39
@niyarin niyarin force-pushed the feature/csi-generator-for-bcf branch from e03fd13 to a4e482c Compare May 25, 2020 16:57
Copy link
Member

@alumi alumi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the linter action is not working. Please check it later 🙏

names (->> (map :chr offsets)
(concat (->> (.meta-info ^VCFReader r)
(concat (->> (.meta-info r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes a reflection. Use cljam.io.vcf/meta-info instead.

(:require [cljam.io.csi :as csi]
[cljam.io.util :as io-util]
[cljam.io.vcf :as vcf])
(:import cljam.io.vcf.reader.VCFReader))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use "io-util/vcf-reader?" to get variant-file-type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, it’s okay to remove the line..?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I misunderstood that io-util is unnecessary.

(map-indexed (fn [index contig]
[(:id contig) index]))
(into {}))
kws (mapv keyword (drop 8 (.header rdr)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

(when (pos? (.available input-stream))
(when-let [line (read-data-line-buffer input-stream)]
(let [end-pointer (.getFilePointer input-stream)]
(let [{:keys [chr pos ref info]} (parse-fn line)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant let

(do (vcf-indexer/create-index test-bcf-complex-file
tmp-csi-file {:shift 14 :depth 5})
(let [computed ^CSI (csi/read-index tmp-csi-file)
csi ^CSI (csi/read-index tmp-csi-file)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
csi ^CSI (csi/read-index tmp-csi-file)]
csi ^CSI (csi/read-index test-bcf-complex-csi-file)]

@niyarin
Copy link
Contributor Author

niyarin commented May 26, 2020

I'm sorry.
I know the cause of linter's troubles and need to add the base_sha option. (Doesn't work with the second commit in the branch)

I fix sources that are pointed.

@niyarin niyarin force-pushed the feature/csi-generator-for-bcf branch from 208e0cd to 12e2dbc Compare May 26, 2020 14:05
Copy link
Member

@alumi alumi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Member

@athos athos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Your code basically looks good 👍
Just added a comment on type hints.

(testing "bcf"
(with-open [r (vcf/reader test-bcf-changed-chr-order-file)]
(let [file-offsets (vcf/read-file-offsets r)
computed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a type hint ^CSI to the declaration of computed, and you won't need to type hint every subsequent use-site of the name. Try not to use type hints too much and keep your code concise 💪 The same goes for some of the test cases below.

@niyarin
Copy link
Contributor Author

niyarin commented May 28, 2020

Thank you for pointing. I fixed them.

Copy link
Member

@athos athos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 I will merge it now. Thank you for working on the PR!

@athos athos merged commit 806aad3 into master May 28, 2020
@athos athos deleted the feature/csi-generator-for-bcf branch May 28, 2020 09:12
@niyarin
Copy link
Contributor Author

niyarin commented May 29, 2020

Thank you for the reviewing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants