I am trying to determine whether some mOTUs in my samples have more diversity within them than others. In particular, I would like to compare the percent nucleotide divergence (number of SNVs per bp) between two samples for mOTU 1 with the same quantity for mOTU 2. The distances returned by motus snv_call do not seem suited to this analysis, since they appear to be normalized differently for different mOTUs and also depend on how much variation is seen across the samples. For example, if I leave out some samples that contain a lot of diversity for a given mOTU and rerun motus snv_call, the distances between the remaining samples increase.
Is there a way to convert these distances to be in units of nucleotide changes per total marker-gene length, so that they can be compared across OTUs and so the distance between two samples won't depend on the amount of diversity across other samples?
Thank you for your help!
The distances themselves cannot be converted back to the information you are looking for.
However, if you run motus snv_call with the -k option, it will keep the intermediary files.
With those files you can either (1) parse the filtered output for the number of SNVs per mOTU, or (2) run:
python /path/to/metasnv/metaSNV_DistDiv.py --filt /path/to/snv_call_output/filtered-m5-d10-b80-c5-p0.9 --div --n_threads n
This command will create new files in the distance directory of the output (in this case distances-m5-d10-b80-c5-p0.9), including some with the suffix .diversity. These are matrices of the intra- and inter-sample nucleotide diversity, i.e. a measure of the number of SNVs weighted by their frequency. The intra-sample diversities are on the diagonal.
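To make "number of SNVs weighted by their frequency" concrete, here is a minimal Python sketch of one common textbook form of between-sample nucleotide diversity: the probability that two reads, one drawn from each sample, differ at a site, summed over variable sites and divided by the sequence length. This is an illustration only and is not taken from metaSNV_DistDiv.py, whose exact formula may differ. The function name, the input layout (one allele-frequency dict per variable site), and the `seq_length` parameter are all assumptions.

```python
def between_sample_diversity(freqs_a, freqs_b, seq_length):
    """Generic frequency-weighted diversity per bp (illustrative, not metaSNV's code).

    freqs_a, freqs_b: lists of dicts, one per variable site, mapping
        allele -> frequency in that sample (frequencies sum to 1 per site).
        Both lists must describe the same sites in the same order.
    seq_length: total number of positions considered (hypothetical parameter);
        sites not listed are assumed identical and fixed in both samples.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both samples must list the same sites")
    total = 0.0
    for site_a, site_b in zip(freqs_a, freqs_b):
        # Probability that one read from each sample carries the same allele.
        p_same = sum(f * site_b.get(allele, 0.0) for allele, f in site_a.items())
        total += 1.0 - p_same
    return total / seq_length


# Passing the same sample twice gives the intra-sample (diagonal) value.
sample = [{"A": 0.5, "C": 0.5}]
within = between_sample_diversity(sample, sample, seq_length=10)
```

Because the denominator is a sequence length rather than a quantity derived from the other samples, a value computed this way does not change when samples are added or removed.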
Thanks for your advice, Lucas. I wasn't able to get your option (2) to work, as I get the errors
ERROR: No such file '/.all_cov.tab',
ERROR: No such file '/.all_perc.tab',
ERROR: No such file '/bed_header'
and can't seem to tell metaSNV_DistDiv.py where these files are (they are in the snv_call_output/ folder), and copying them into the filtered-m5-d10-b80-c5-p0.9 folder didn't make a difference.
But I've decided for my project to work with the BAM files produced by motus map_snv to do my own genotyping and distance calculations.
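For anyone taking the same route, here is a minimal Python sketch of the kind of per-bp divergence asked about in the original question: the number of differing positions divided by the number of positions called in both samples. It assumes genotyping has already produced one consensus base per position per sample. The function name and the `{(gene, position): base}` input layout are hypothetical and do not correspond to any motus output format.

```python
def per_bp_divergence(calls_a, calls_b):
    """Fraction of shared, called positions at which two samples differ.

    calls_a, calls_b: dicts mapping (gene_id, position) -> consensus base,
        containing only positions with enough coverage to be called
        (hypothetical layout, produced by your own genotyping step).
    Returns None when the samples share no called positions.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    n_diff = sum(1 for pos in shared if calls_a[pos] != calls_b[pos])
    return n_diff / len(shared)


# Example: three positions are called in both samples, one of them differs.
sample_1 = {("g1", 1): "A", ("g1", 2): "C", ("g1", 3): "G", ("g2", 1): "T"}
sample_2 = {("g1", 1): "A", ("g1", 2): "T", ("g1", 3): "G", ("g1", 4): "A"}
divergence = per_bp_divergence(sample_1, sample_2)
```

Restricting the denominator to positions called in both samples keeps the value comparable across mOTUs with different marker-gene lengths and coverage, and it depends only on the two samples being compared.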