Strange pattern in y axis after Kmeans used in plotHeatmap #972

@PLStenger

I am using deepTools version 3.3.0 (command line, installed from Bioconda).

I have whole-genome bisulfite sequencing data. I ran computeMatrix scale-regions -S on each file. Note that I did not use the whole genome here, only a selection of genes of interest (listed in the select_*.bed files).

for SELECTION in "$DATADIRECTORY"/select_*.bed
do
    for FILE in "$DATADIRECTORY"/*.bam_sorted.bam.bw
    do
        # Only basenames are passed, so this is run from inside $DATADIRECTORY
        computeMatrix scale-regions -S "${FILE##*/}" -R "${SELECTION##*/}" --beforeRegionStartLength 3000 --regionBodyLength 5000 --afterRegionStartLength 3000 -o "${FILE##*/}_${SELECTION##*/}_matrix.mat.gz"
    done
done

Afterwards I ran computeMatrixOperations rbind -m on the replicates to combine them, and then computeMatrixOperations cbind -m on my four sampling points.
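For reference, the general shape of those two steps can be sketched as below. This is a dry run that only prints the commands; every file name is a hypothetical placeholder rather than one of the files actually used, and the helper functions `rbind_cmd` and `cbind_cmd` are made up for illustration. In computeMatrixOperations, rbind concatenates matrices top to bottom (more regions) and cbind concatenates them left to right (more samples).

```shell
# Dry-run sketch: prints computeMatrixOperations commands instead of running them.
# All file names are hypothetical placeholders.

# Print an rbind command: first argument is the output, the rest are inputs.
rbind_cmd() {
    out="$1"; shift
    echo "computeMatrixOperations rbind -m $* -o $out"
}

# Print a cbind command: first argument is the output, the rest are inputs.
cbind_cmd() {
    out="$1"; shift
    echo "computeMatrixOperations cbind -m $* -o $out"
}

rbind_cmd point1_rbind.mat.gz point1_rep1.mat.gz point1_rep2.mat.gz
cbind_cmd all_points.mat.gz point1_rbind.mat.gz point2_rbind.mat.gz \
    point3_rbind.mat.gz point4_rbind.mat.gz
```

Once the printed commands look right, they can be pasted into the shell or piped to `sh`.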

Then I ran plotHeatmap with k-means clustering (k = 4):

plotHeatmap -m FILE_p0_05_rbind.bed_matrix.mat.gz \
     -out FILE_p0_05_rbind_kmeans_04.bed_matrix.mat.gz.pdf \
     --colorMap RdBu \
     --whatToShow 'plot, heatmap and colorbar' \
     --zMin -3 --zMax 3 \
     --kmeans 4 \
     --outFileSortedRegions FILE_p0_05_rbind_kmeans_04_out_clusters.bed

The resulting profile has a huge y-axis range (0 to 3000), because two genes drive cluster 1.

Screenshot 2020-07-15 at 12 07 26

When I remove those two genes from the BED file and rerun the analysis, everything looks fine.

Screenshot 2020-07-15 at 12 07 42
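A toy calculation shows how two regions alone could stretch the range this much. It assumes the profile line is a per-bin mean over the regions of a cluster, which has not been confirmed for this case. All numbers are made up, and `cluster_mean` is a hypothetical helper.

```shell
# Toy numbers only (hypothetical): N regions with a typical value in one bin,
# plus K outlier regions with an extreme value in that same bin.

# cluster_mean N VALUE K OUTLIER
# Prints the mean of N copies of VALUE together with K copies of OUTLIER.
cluster_mean() {
    awk -v n="$1" -v v="$2" -v k="$3" -v o="$4" \
        'BEGIN { printf "%.1f\n", (n * v + k * o) / (n + k) }'
}

cluster_mean 98 20 0 0        # 98 typical regions, no outliers: prints 20.0
cluster_mean 98 20 2 150000   # same regions plus 2 outliers:    prints 3019.6
```

Under that assumption, a small cluster is affected far more than the whole genome, where the same two regions are averaged with many more typical ones.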

Note also that when I run the analysis on the whole genome without any gene selection (so the two genes are still part of the set), the y values range between 0 and 40. This is just one example: with other subsets I sometimes (but not always) get the same huge y-axis range, driven by other genes.
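One way to see which regions carry the extreme values in a given subset is to look at the matrix file directly. The sketch below assumes the usual computeMatrix output layout: a gzipped text file with one header line starting with "@", followed by tab-separated rows of chrom, start, end, name, score and strand, then the per-bin values. The function name `top_regions` is hypothetical.

```shell
# top_regions MATRIX.gz [COUNT]
# Prints the COUNT regions (default 10) with the largest per-region maximum,
# as "max<TAB>name". Assumes six BED-like columns before the per-bin values.
top_regions() {
    gzip -dc "$1" | awk -F'\t' '
        /^@/ { next }                        # skip the header line
        {
            max = ""
            for (i = 7; i <= NF; i++) {
                if ($i == "nan") continue    # ignore bins without data
                if (max == "" || $i + 0 > max) max = $i + 0
            }
            if (max != "") print max "\t" $4
        }' | sort -k1,1nr | head -n "${2:-10}"
}
```

For example, `top_regions FILE_p0_05_rbind.bed_matrix.mat.gz 5` would list the five regions with the highest single-bin value in the matrix passed to plotHeatmap above.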

Until now I assumed the y axis was a kind of methylation percentage (because it always ranged between 0 and 40 in my plots), but now I am not sure what the y axis really represents (unless there is a bug in these command lines). So my question is: what exactly is the y axis? And if it is indeed a methylation percentage, is there a bug that could explain this pattern?

Thanks in advance!
