rmun NOT work by default #99

holypeggy · 2023-08-17T03:03:59Z

Hi, thanks for your excellent work.
In my abundance analysis, the unknown taxa, such as "g_un_xxx" as you mentioned, were not grouped into Others with the default "rmun=FALSE" (figure 1). However, when I set "rmun=TRUE", they were removed from the analysis (figure 2). I want to group them into Others. I would appreciate any help with this issue.

xiangpin · 2023-08-18T02:40:36Z

Thanks. In fact, when rmun=TRUE, the unknown taxa are grouped into Others rather than removed. I should update the help doc.
Let's validate it.

> library(MicrobiotaProcess)
> data(mouse.time.mpse)
> # First, the unknown taxa are not grouped into `Others` when `rmun = FALSE` (the default).
> mouse.time.mpse %>% mp_rrarefy(.abundance=Abundance) %>% mp_plot_abundance(.abundance=RareAbundance, .group=time, taxa.class=Genus, order.by.feature=TRUE, width=4/5) -> p1
> # Second, the unknown taxa are grouped into `Others` when `rmun = TRUE`.
> mouse.time.mpse %>% mp_rrarefy(.abundance=Abundance) %>% mp_plot_abundance(.abundance=RareAbundance, .group=time, taxa.class=Genus, rmun=TRUE, order.by.feature=TRUE, width=4/5) -> p2
> p1 / p2

We can see that the unknown taxa are not displayed in p2; they were in fact grouped into Others. In addition, since we display the top 10 most abundant taxa, g__Roseburia, g__A2 and g__Acetatifactor are now displayed in p2 but not in p1 (because they were grouped into Others in p1). So let's check whether the total abundance of Others plus the unknown taxa in p1 equals the total abundance of Others plus those three taxa (g__Roseburia, g__A2 and g__Acetatifactor) in p2.

> p1$data |> dplyr::filter(grepl("__un_", Genus) | grepl('Others', Genus)) %>% dplyr::group_by(Sample) %>% dplyr::summarize(total=sum(RelRareAbundanceBySample)) %>% dplyr::arrange(total)
# A tibble: 19 × 2
   Sample total
   <fct>  <dbl>
 1 F3D8    58.4
 2 F3D9    61.6
 3 F3D1    62.1
 4 F3D5    64.5
 5 F3D6    68.0
 6 F3D7    72.0
 7 F3D0    72.6
 8 F3D2    72.7
 9 F3D3    74.1
10 F3D149  76.4
11 F3D141  77.5
12 F3D148  77.6
13 F3D146  78.4
14 F3D142  78.5
15 F3D144  79.7
16 F3D143  80.0
17 F3D150  81.1
18 F3D145  82.3
19 F3D147  83.0
> p2$data |> dplyr::filter(Genus %in% c('g__Roseburia', 'g__A2', 'g__Acetatifactor', 'Others'))  %>% dplyr::group_by(Sample) %>% dplyr::summarize(total=sum(RelRareAbundanceBySample)) %>% dplyr::arrange(total)
# A tibble: 19 × 2
   Sample total
   <fct>  <dbl>
 1 F3D8    58.4
 2 F3D9    61.6
 3 F3D1    62.1
 4 F3D5    64.5
 5 F3D6    68.0
 6 F3D7    72.0
 7 F3D0    72.6
 8 F3D2    72.7
 9 F3D3    74.1
10 F3D149  76.4
11 F3D141  77.5
12 F3D148  77.6
13 F3D146  78.4
14 F3D142  78.5
15 F3D144  79.7
16 F3D143  80.0
17 F3D150  81.1
18 F3D145  82.3
19 F3D147  83.0

Yes, they are equal. So rmun = TRUE does do what you want. But I should update the help doc to avoid misunderstanding.
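
For readers who want to see why the two totals must match, here is a toy base-R sketch of the lumping behaviour described above. It is not the MicrobiotaProcess implementation: `lump_taxa()` is a hypothetical helper, the taxon names and abundances are made up, and it assumes that with rmun = TRUE the unknown taxa are excluded before the top N taxa are chosen, as the p1/p2 comparison suggests.

```r
# Toy illustration only: NOT how mp_plot_abundance() is implemented.
# `lump_taxa()` is a hypothetical helper written for this sketch.
lump_taxa <- function(abund, topn = 2, rmun = FALSE) {
  # abund: named numeric vector of relative abundances for one sample
  is_unknown <- grepl("__un_", names(abund), fixed = TRUE)
  # Assumption: with rmun = TRUE, unknown taxa cannot occupy a top-N slot
  candidates <- if (rmun) abund[!is_unknown] else abund
  n_keep <- min(topn, length(candidates))
  keep <- names(sort(candidates, decreasing = TRUE))[seq_len(n_keep)]
  # Everything that is not kept is summed into a single `Others` entry
  c(abund[keep], Others = sum(abund[!names(abund) %in% keep]))
}

# Made-up relative abundances (in %) for a single sample
sample1 <- c(g__Bacteroides = 40, g__un_f__Muribaculaceae = 30,
             g__Lactobacillus = 20, g__Roseburia = 10)

p1_like <- lump_taxa(sample1, topn = 2, rmun = FALSE)
p2_like <- lump_taxa(sample1, topn = 2, rmun = TRUE)

# Same check as in the thread: Others + unknown taxa in the first result
# versus Others + the taxon that newly enters the top N in the second.
total1 <- p1_like[["Others"]] + p1_like[["g__un_f__Muribaculaceae"]]
total2 <- p2_like[["Others"]] + p2_like[["g__Lactobacillus"]]
print(c(total1 = total1, total2 = total2))
```

In this sketch nothing is dropped in either mode: the per-sample total stays the same, and only the set of taxa shown by name changes.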
