rmun NOT work by default #99

Open
holypeggy opened this issue Aug 17, 2023 · 1 comment

@holypeggy

Hi, thanks for your excellent work.
In my abundance analysis, the unknown taxa, such as "g_un_xxx" as you mentioned, were not grouped into Others with the default "rmun=FALSE" (figure 1). However, when I set "rmun=TRUE", they were removed from the analysis (figure 2). I want to group them into Others. I would appreciate any help with this issue.
[figure 1]
[figure 2]

@xiangpin
Member

Thanks. In fact, when rmun=TRUE, the unknown taxa are grouped into Others rather than removed. I should update the help doc.
Let's validate it.

> library(MicrobiotaProcess)
> data(mouse.time.mpse)
> # First, the unknown taxa are not grouped into `Others` when `rmun = FALSE` (the default).
> mouse.time.mpse %>% mp_rrarefy(.abundance=Abundance) %>% mp_plot_abundance(.abundance=RareAbundance, .group=time, taxa.class=Genus, order.by.feature=TRUE, width=4/5) -> p1
> # Second, the unknown taxa are grouped into `Others` when `rmun = TRUE`.
> mouse.time.mpse %>% mp_rrarefy(.abundance=Abundance) %>% mp_plot_abundance(.abundance=RareAbundance, .group=time, taxa.class=Genus, rmun=TRUE, order.by.feature=TRUE, width=4/5) -> p2
> p1 / p2

[image: p1 / p2]

We can see that the unknown taxa are not displayed in p2; they were in fact grouped into Others. In addition, because only the 10 most abundant taxa are displayed, g__Roseburia, g__A2 and g__Acetatifactor now appear in p2 but not in p1 (where they were grouped into Others). So let's check whether the total abundance of Others plus the unknown taxa in p1 equals the total abundance of Others plus those three taxa (g__Roseburia, g__A2 and g__Acetatifactor) in p2.

> p1$data |> dplyr::filter(grepl("__un_", Genus) | grepl('Others', Genus)) %>% dplyr::group_by(Sample) %>% dplyr::summarize(total=sum(RelRareAbundanceBySample)) %>% dplyr::arrange(total)
# A tibble: 19 × 2
   Sample total
   <fct>  <dbl>
 1 F3D8    58.4
 2 F3D9    61.6
 3 F3D1    62.1
 4 F3D5    64.5
 5 F3D6    68.0
 6 F3D7    72.0
 7 F3D0    72.6
 8 F3D2    72.7
 9 F3D3    74.1
10 F3D149  76.4
11 F3D141  77.5
12 F3D148  77.6
13 F3D146  78.4
14 F3D142  78.5
15 F3D144  79.7
16 F3D143  80.0
17 F3D150  81.1
18 F3D145  82.3
19 F3D147  83.0
> p2$data |> dplyr::filter(Genus %in% c('g__Roseburia', 'g__A2', 'g__Acetatifactor', 'Others'))  %>% dplyr::group_by(Sample) %>% dplyr::summarize(total=sum(RelRareAbundanceBySample)) %>% dplyr::arrange(total)
# A tibble: 19 × 2
   Sample total
   <fct>  <dbl>
 1 F3D8    58.4
 2 F3D9    61.6
 3 F3D1    62.1
 4 F3D5    64.5
 5 F3D6    68.0
 6 F3D7    72.0
 7 F3D0    72.6
 8 F3D2    72.7
 9 F3D3    74.1
10 F3D149  76.4
11 F3D141  77.5
12 F3D148  77.6
13 F3D146  78.4
14 F3D142  78.5
15 F3D144  79.7
16 F3D143  80.0
17 F3D150  81.1
18 F3D145  82.3
19 F3D147  83.0

Yes, they are equal. So rmun = TRUE does do what you want. But I should update the help doc to avoid misunderstanding.
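
The bookkeeping behind this check can also be illustrated on made-up numbers. The following is only a minimal base-R sketch: the `toy` data frame, its values and the helper `collapse_unknown` are hypothetical and are not part of MicrobiotaProcess, and it does not reproduce how `mp_plot_abundance` is implemented. It only shows that relabelling unknown taxa as Others leaves each sample's total unchanged.

```r
# Toy relative abundances (percent) for two samples; all values are invented.
toy <- data.frame(
  Sample    = rep(c("S1", "S2"), each = 4),
  Genus     = rep(c("g__Bacteroides", "g__un_f__Lachnospiraceae",
                    "g__Roseburia", "Others"), times = 2),
  Abundance = c(40, 30, 10, 20, 50, 25, 5, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper: relabel unknown taxa ("__un_" in the name) as Others,
# then sum abundances per Sample and Genus.
collapse_unknown <- function(df) {
  df$Genus[grepl("__un_", df$Genus, fixed = TRUE)] <- "Others"
  aggregate(Abundance ~ Sample + Genus, data = df, FUN = sum)
}

collapsed <- collapse_unknown(toy)

# Per-sample totals before and after collapsing.
total_before <- tapply(toy$Abundance, toy$Sample, sum)
total_after  <- tapply(collapsed$Abundance, collapsed$Sample, sum)

# The unknown taxa moved into Others; nothing was dropped.
stopifnot(isTRUE(all.equal(total_before, total_after)))

# Others for S1 now holds its original value plus the unknown genus.
others_s1 <- collapsed$Abundance[collapsed$Sample == "S1" &
                                 collapsed$Genus == "Others"]
```

On the real plots, the same idea applies to the two per-sample `total` columns printed above: if they match row by row, the unknown taxa were regrouped rather than removed.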
