Hi,
First, thanks for this very useful tool and your investment in developing it!
Informatic context:
Running CNVkit v0.9.8 on a CentOS 6 virtual machine, installed through:
conda create -n my_env pip
conda activate my_env && pip install cnvkit
(plus a conda installation of 'r-base' for the DNAcopy R package)
Data:
Hybrid capture panel of genes and hotspots (capture probe coordinates used as the manifest), somatic DNA (FFPE) with no matched normal sample.
I was previously using CNVkit v0.9.6 on the same panel, but with an amplicon technology. Segmenting with HAAR used to give better results than CBS, so I wanted to try it on my new data and with an upgraded CNVkit.
=> I should mention that running v0.9.8 on my hybrid-capture data with CBS segmentation went fine (so no installation issue, I guess?)
Exact command and error:
cnvkit.py batch my_patient.bam -p 1 -n --segment-method haar -f hg19.fa -t capture_probes.manifest.bed --drop-low-coverage
Most patients ran fine, except for a couple, which raised the following error:
File "/home/bioinfo/miniconda3/envs/cnvkit_repo/lib/python3.9/site-packages/cnvlib/segmentation/haar.py", line 328, in HaarConv
highNonNormed += signal[highEnd] * weight[highEnd] - signal[k-1] * weight[k-1]
IndexError: index 2 is out of bounds for axis 0 with size 2
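For illustration, the failure mode can be reproduced with a stripped-down sketch of the inner loop; haar_conv_sketch and step_half below are hypothetical simplifications for this report, not CNVkit's actual HaarConv code:

```python
import numpy as np

def haar_conv_sketch(signal, weight, step_half):
    """Simplified sketch of a Haar step convolution: at each position k it
    adds the leading sample and subtracts the trailing one. When the signal
    is shorter than the window implied by step_half, high_end walks past the
    end of the array, reproducing the IndexError from the traceback."""
    result = np.zeros_like(signal)
    high_nonnormed = 0.0
    for k in range(1, len(signal)):
        high_end = k + step_half - 1
        # No bounds check here, mimicking the failure mode:
        high_nonnormed += signal[high_end] * weight[high_end] - signal[k - 1] * weight[k - 1]
        result[k] = high_nonnormed
    return result

sig = np.array([0.1, -0.2])   # a 2-bin signal, like the two SRY bins
wts = np.array([1.0, 1.0])
try:
    haar_conv_sketch(sig, wts, step_half=2)
except IndexError as e:
    print("IndexError:", e)  # → IndexError: index 2 is out of bounds for axis 0 with size 2
```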
Possible cause:
This error happens during segmentation of chrY and can be reproduced with:
cnvkit.py segment --drop-low-coverage -m haar my_patient.cnr
(I think I can even share with you both a buggy and a functioning ".cnr" with equivalent chrY depth, if needed.)
My panel has a single probe on SRY used as a control, which is binned by CNVkit into 1 antitarget and 1 target region within each .cnr, as exemplified here:
Y 2507140 2654431 Antitarget 0 -10.8612 0.0001
Y 2654931 2655203 NM_003140|SRY|3PUTR|1;NM_003140|SRY|1/1 1286.11 0.110645 0.932772
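For reference, such a .cnr excerpt can be inspected programmatically, assuming CNVkit's standard tab-separated .cnr column layout (chromosome, start, end, gene, depth, log2, weight); the excerpt below is hand-built from the two lines above:

```python
import csv
import io

# Hand-built .cnr excerpt (real files carry this header line at the top;
# the layout here is assumed from CNVkit's documented .cnr format).
cnr_text = (
    "chromosome\tstart\tend\tgene\tdepth\tlog2\tweight\n"
    "Y\t2507140\t2654431\tAntitarget\t0\t-10.8612\t0.0001\n"
    "Y\t2654931\t2655203\tNM_003140|SRY|3PUTR|1;NM_003140|SRY|1/1\t1286.11\t0.110645\t0.932772\n"
)
rows = list(csv.DictReader(io.StringIO(cnr_text), delimiter="\t"))
chr_y = [r for r in rows if r["chromosome"] == "Y"]
for r in chr_y:
    print(r["gene"], r["depth"], r["log2"])
```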
Every "HAAR-segmentable" patient has a mean depth of 0 for the SRY antitarget region and a "classic" depth for the target region (> 1000X for males, around 0 for females).
=> The low-covered antitarget is excluded, leading to a 1-sized signal array, so smoothing returns np.zeros(signalSize, dtype=np.float_) thanks to the early-return condition in cnvlib/segmentation/haar.py (line 289 at b218280).
On the contrary, my buggy patients have a very small but non-zero depth at their antitarget SRY region (0.000536353).
=> So the antitarget is kept, which leads to a 2-sized signal_arr that is then passed to smoothing.savgol() via cnarr.smooth_log(); after "winging", we obtain a 4-sized signal_arr.
=> This is where something goes wrong: going through smoothing.convolve_weighted(), the array shapes get messed up (7 for both signal_arr and weights_arr, when we expect them to be 4-sized), leading to the "index out of bounds" error described above.
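The "winging" (reflect-padding) step can be sketched roughly like this; a wing size of 1 is an assumption here, since the real cnvlib code derives it from the smoothing window width:

```python
import numpy as np

# Rough sketch of "winging": reflected copies of the signal are stitched
# onto each end before convolution. Wing size 1 is assumed for this demo.
signal = np.array([0.110645, -10.8612])   # log2 values of the 2 kept SRY bins
wing = 1
winged = np.concatenate([signal[wing - 1::-1], signal, signal[:-wing - 1:-1]])
print(winged.shape)  # → (4,), matching the 4-sized array described above
```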
Possible fixes:
My understanding of these smoothing internals is too limited to pinpoint the precise origin of this bug, but here are some ideas:
- In smoothing.savgol() (cnvlib/smoothing.py, line 174 at b218280), you have a condition on the input length; including the case where len(x) == 2 should not have a huge impact on segmentation?
- I don't know where in the code the ".cnr" regions with a depth of 0 are excluded, but maybe a cutoff could be inserted there, to also exclude regions having a depth below a certain value?
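Both ideas could be sketched like this; MIN_DEPTH, drop_low_depth_bins, and savgol_short_guard are hypothetical names invented for the sketch, not actual CNVkit code or options:

```python
import numpy as np

MIN_DEPTH = 0.01  # hypothetical cutoff, not an actual CNVkit option

def drop_low_depth_bins(depths, log2s, min_depth=MIN_DEPTH):
    """Idea 2: drop bins whose depth is tiny but non-zero (like the
    0.000536353 antitarget bin) the same way zero-depth bins are dropped."""
    keep = np.asarray(depths) >= min_depth
    return np.asarray(log2s)[keep]

def savgol_short_guard(x):
    """Idea 1: extend the short-input guard so that len(x) == 2 also skips
    smoothing and returns the input unchanged (a sketch of the condition,
    not the actual cnvlib.smoothing.savgol code)."""
    x = np.asarray(x, dtype=float)
    if len(x) <= 2:
        return x
    # ... real Savitzky-Golay smoothing would go here
    return x

print(drop_low_depth_bins([0.000536353, 1286.11], [-10.86, 0.11]))  # → [0.11]
```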
Thanks in advance for your help.
Best regards.
Felix.
tetedange13 changed the title from "[CNVkit v0.9.8] Errored smoothing when HAAR-segmentation ?" to "[CNVkit v0.9.8] Errored smoothing with HAAR-segmentation ?" on Mar 25, 2021
Hi @tetedange13, thank you for this amazingly detailed bug report! I've investigated this and submitted a PR with what I think might be a permanent solution.