cluster_counts interpretation #210

sunsetyerin · 2023-02-02T23:08:14Z

I have a question about the cluster_counts column in the nanocompore sampcomp results.
I saw in previous issues that cluster_counts is the number of reads assigned to each cluster,
but I wonder which value is the number of reads and which is the number of clusters.

control_1:32/12__control_2:21/9__test_1:60/23__test_2:38/16

For example, in control_1:32/12, is 32 the number of reads and 12 the number of clusters?


lmulroney · 2023-02-03T13:03:36Z

Hi @sunsetyerin,

Nanocompore limits the GMM to 2 clusters. If 1 cluster fits the data better than 2, no further processing is done and the site is considered unmodified. If 2 clusters fit the data better than 1, the statistical test (usually the logistic regression test) is performed.

From this example there are 32 reads from control 1 assigned to cluster 1 (c1) and 12 reads assigned to cluster 2 (c2). Control 2 has 21 reads assigned to c1 and 9 reads assigned to c2. Test 1 has 60 reads assigned to c1 and 23 reads assigned to c2, and test 2 has 38 reads assigned to c1 and 16 reads assigned to c2.

The basic way to read those lines is:
[sample name]:[number of reads assigned to cluster 1]/[number of reads assigned to cluster 2]__repeated for each further sample.
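
As a minimal sketch of reading that format programmatically (this is not part of Nanocompore; `parse_cluster_counts` is a hypothetical helper name, and it assumes sample names never contain `__`), the string from the question could be split like this in Python:

```python
def parse_cluster_counts(field):
    """Parse 'sample:c1/c2__sample:c1/c2' into {sample: (c1_reads, c2_reads)}."""
    counts = {}
    # Samples are separated by a double underscore.
    for entry in field.split("__"):
        # Split on the last ':' so a sample name containing ':' still works.
        sample, _, pair = entry.rpartition(":")
        # The two numbers are reads in cluster 1 and reads in cluster 2.
        c1, c2 = pair.split("/")
        counts[sample] = (int(c1), int(c2))
    return counts


example = "control_1:32/12__control_2:21/9__test_1:60/23__test_2:38/16"
counts = parse_cluster_counts(example)

for sample, (c1, c2) in counts.items():
    print(f"{sample}: {c1} reads in cluster 1, {c2} reads in cluster 2")
```

Both numbers are read counts, so adding them gives the total number of reads clustered for that sample at the site (44 for control_1 in this example).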

I hope this explanation helps.

lmulroney closed this as completed May 9, 2023

lmulroney mentioned this issue May 18, 2023

nanocompore sampcomp output question. #219

Closed