Hello,
I have been working on 6mA methylation analysis for human brain samples using modkit v0.5.1 and have a few questions about the calculated threshold for modification calling. Previously, we noticed that the threshold modkit calculated for CpG-specific 5mC/5hmC calls in our data was alarmingly low, at ~0.6. Based on our background research and focus, we set --filter-threshold to 0.75 instead, which worked well.
However, there is much less consensus about 6mA modifications in the human genome. So while modkit is calculating thresholds of ~0.81 for our samples, I am still a little skeptical about using this value. I understand the general advice is that the calculated threshold is a good default for modkit pileup. However, it seems that most of the validation behind the ~99% accuracy figures was performed on data sets with non-sparse modifications.
For our exploratory analysis, I am worried that even small shifts in the cutoff threshold could have an outsized effect on which 6mA calls are retained. I ran modkit sample-probs on the samples with the highest (Su602, Su792) and lowest (Su301, Su769) total depth. I have attached these visuals below; is there a better approach to estimating a cutoff threshold from these sampled probabilities than just relying on the calculated threshold?
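To make the concern concrete, here is a minimal sketch of the sensitivity check I have in mind. It assumes, as I understand it, that the calculated threshold is a low percentile of per-call confidence (the 10th percentile with the default --filter-percentile 0.1). The function names are hypothetical, the confidence values are synthetic stand-ins rather than our data, and the nearest-rank percentile may not match modkit's exact calculation.

```python
import random


def percentile_threshold(confidences, filter_percentile=0.1):
    """Confidence value at the given percentile (nearest-rank).

    Assumption: this mirrors the idea behind the calculated threshold;
    modkit's own interpolation and sampling may differ.
    """
    if not confidences:
        raise ValueError("no confidences supplied")
    ordered = sorted(confidences)
    idx = min(len(ordered) - 1, int(filter_percentile * len(ordered)))
    return ordered[idx]


def fraction_retained(confidences, threshold):
    """Fraction of calls whose confidence is at or above the threshold."""
    return sum(c >= threshold for c in confidences) / len(confidences)


def sensitivity_table(confidences, thresholds):
    """Pair each candidate threshold with the fraction of calls it keeps."""
    return [(t, fraction_retained(confidences, t)) for t in thresholds]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for per-call confidences in [0.5, 1.0].
    # Real values would come from the sampled probabilities of each BAM.
    synthetic = [0.5 + 0.5 * rng.betavariate(5, 1) for _ in range(10_000)]

    estimated = percentile_threshold(synthetic)
    print(f"estimated threshold: {estimated:.3f}")

    # Scan a small window around the estimate to see how quickly the
    # retained fraction changes with the cutoff.
    candidates = [round(estimated + delta, 3) for delta in (-0.05, 0.0, 0.05)]
    for threshold, kept in sensitivity_table(synthetic, candidates):
        print(f"threshold {threshold:.3f} keeps {kept:.1%} of calls")
```

If the retained fraction changes sharply within a small window around the estimate, that would suggest the cutoff sits in a dense part of the distribution and deserves a closer look.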
Thanks!

[Attached images: Su602, Su792, Su301, Su769]