updated mp_import_humann_regroup to keep the abundance of contributed taxa #98

Open
xiangpin opened this issue Aug 15, 2023 · 0 comments

Introduced the keep.contribute.abundance argument in mp_import_humann_regroup() (commit 6ccd981).

With the default, keep.contribute.abundance = FALSE, only the names of the contributed taxa are kept.

> library(MicrobiotaProcess)
MicrobiotaProcess v1.13.2.992 For help:
https://github.com/YuLab-SMU/MicrobiotaProcess/issues

If you use MicrobiotaProcess in published research, please cite the
paper:

Shuangbin Xu, Li Zhan, Wenli Tang, Qianwen Wang, Zehan Dai, Lang Zhou,
Tingze Feng, Meijun Chen, Tianzhi Wu, Erqiang Hu, Guangchuang Yu.
MicrobiotaProcess: A comprehensive R package for deep mining
microbiome. The Innovation. 2023, 4(2):100388. doi:
10.1016/j.xinn.2023.100388

Export the citation to BibTex by citation('MicrobiotaProcess')

This message can be suppressed by:
suppressPackageStartupMessages(library(MicrobiotaProcess))

Attaching package: ‘MicrobiotaProcess’

The following object is masked from ‘package:stats’:

    filter

> mpse.ko1 <- mp_import_humann_regroup('./QJ.humann3_ko.tsv', './SRP190865_meta.csv')
> mpse.ko1
# A MPSE-tibble (MPSE object) abstraction: 498,387 × 6
# OTU=5359 | Samples=93 | Assays=Abundance | Taxonomy=NULL
   OTU    Sample     Abundance geo_loc_name_country Group contribute.taxa
   <chr>  <chr>          <dbl> <chr>                <chr> <list>
 1 K00001 SRR8849198       0   China                PCOS  <tibble [8 × 1]>
 2 K00002 SRR8849198       0   China                PCOS  <tibble [3 × 1]>
 3 K00003 SRR8849198      55.1 China                PCOS  <tibble [29 × 1]>
 4 K00004 SRR8849198       0   China                PCOS  <tibble [3 × 1]>
 5 K00005 SRR8849198      83.0 China                PCOS  <tibble [24 × 1]>
 6 K00007 SRR8849198       0   China                PCOS  <tibble [1 × 1]>
 7 K00008 SRR8849198       0   China                PCOS  <tibble [6 × 1]>
 8 K00009 SRR8849198      39.4 China                PCOS  <tibble [23 × 1]>
 9 K00010 SRR8849198       0   China                PCOS  <tibble [16 × 1]>
10 K00012 SRR8849198    1878.  China                PCOS  <tibble [27 × 1]>
# ℹ 498,377 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko1 %>% mp_extract_feature() %>% tidyr::unnest(contribute.taxa)
# A tibble: 85,919 × 2
   OTU    contribute.taxa
   <chr>  <chr>
 1 K00001 s__Bifidobacterium_bifidum
 2 K00001 s__Bifidobacterium_longum
 3 K00001 s__Eggerthella_lenta
 4 K00001 s__Enterobacter_cloacae_complex
 5 K00001 s__Klebsiella_pneumoniae
 6 K00001 s__Lactobacillus_gasseri
 7 K00001 s__Lactobacillus_paragasseri
 8 K00001 s__Megasphaera_elsdenii
 9 K00002 s__Blautia_obeum
10 K00002 s__Blautia_producta
# ℹ 85,909 more rows
# ℹ Use `print(n = ...)` to see more rows

With keep.contribute.abundance = TRUE, the abundance of each contributed taxon in each sample is kept as well, and it can be extracted with mp_extract_feature().

> mpse.ko2 <- mp_import_humann_regroup('./QJ.humann3_ko.tsv', './SRP190865_meta.csv', keep.contribute.abundance=TRUE)
> mpse.ko2
# A MPSE-tibble (MPSE object) abstraction: 498,387 × 6
# OTU=5359 | Samples=93 | Assays=Abundance | Taxonomy=NULL
   OTU    Sample     Abundance geo_loc_name_country Group contribute.taxa
   <chr>  <chr>          <dbl> <chr>                <chr> <list>
 1 K00001 SRR8849198       0   China                PCOS  <tibble [8 × 94]>
 2 K00002 SRR8849198       0   China                PCOS  <tibble [3 × 94]>
 3 K00003 SRR8849198      55.1 China                PCOS  <tibble [29 × 94]>
 4 K00004 SRR8849198       0   China                PCOS  <tibble [3 × 94]>
 5 K00005 SRR8849198      83.0 China                PCOS  <tibble [24 × 94]>
 6 K00007 SRR8849198       0   China                PCOS  <tibble [1 × 94]>
 7 K00008 SRR8849198       0   China                PCOS  <tibble [6 × 94]>
 8 K00009 SRR8849198      39.4 China                PCOS  <tibble [23 × 94]>
 9 K00010 SRR8849198       0   China                PCOS  <tibble [16 × 94]>
10 K00012 SRR8849198    1878.  China                PCOS  <tibble [27 × 94]>
# ℹ 498,377 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko2 %>% mp_extract_feature()
# A tibble: 5,359 × 2
   OTU    contribute.taxa
   <chr>  <list>
 1 K00001 <tibble [8 × 94]>
 2 K00002 <tibble [3 × 94]>
 3 K00003 <tibble [29 × 94]>
 4 K00004 <tibble [3 × 94]>
 5 K00005 <tibble [24 × 94]>
 6 K00007 <tibble [1 × 94]>
 7 K00008 <tibble [6 × 94]>
 8 K00009 <tibble [23 × 94]>
 9 K00010 <tibble [16 × 94]>
10 K00012 <tibble [27 × 94]>
# ℹ 5,349 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse.ko2 %>% mp_extract_feature() %>% tidyr::unnest(contribute.taxa)
# A tibble: 85,421 × 95
   OTU    contribute.taxa SRR8849198 SRR8849199 SRR8849200 SRR8849201 SRR8849202
   <chr>  <chr>                <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
 1 K00001 s__Bifidobacte…          0          0          0       3.77          0
 2 K00001 s__Bifidobacte…          0          0          0       0             0
 3 K00001 s__Eggerthella…          0          0          0       0             0
 4 K00001 s__Enterobacte…          0          0          0       0             0
 5 K00001 s__Klebsiella_…          0          0          0       0             0
 6 K00001 s__Lactobacill…          0          0          0       0             0
 7 K00001 s__Lactobacill…          0          0          0       0             0
 8 K00001 s__Megasphaera…          0          0          0      66.0           0
 9 K00002 s__Blautia_obe…          0          0          0       0             0
10 K00002 s__Blautia_pro…          0          0          0       0             0
# ℹ 85,411 more rows
# ℹ 88 more variables: SRR8849203 <dbl>, SRR8849204 <dbl>, SRR8849205 <dbl>,
#   SRR8849206 <dbl>, SRR8849207 <dbl>, SRR8849208 <dbl>, SRR8849209 <dbl>,
#   SRR8849210 <dbl>, SRR8849211 <dbl>, SRR8849212 <dbl>, SRR8849213 <dbl>,
#   SRR8849214 <dbl>, SRR8849215 <dbl>, SRR8849216 <dbl>, SRR8849217 <dbl>,
#   SRR8849218 <dbl>, SRR8849219 <dbl>, SRR8849220 <dbl>, SRR8849221 <dbl>,
#   SRR8849222 <dbl>, SRR8849223 <dbl>, SRR8849224 <dbl>, SRR8849225 <dbl>, …
# ℹ Use `print(n = ...)` to see more rows
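
The unnested table above is wide, with one column per sample. As a minimal sketch of how it could be reshaped to long format to see which taxa actually contribute in which sample, the following uses a small made-up tibble in place of the real output. The object names contrib and contrib_long, the sample names S1 and S2, and all values are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy stand-in for:
#   mpse.ko2 %>% mp_extract_feature() %>% tidyr::unnest(contribute.taxa)
# The values and the sample names S1/S2 are made up.
contrib <- tibble(
  OTU = c("K00001", "K00001", "K00002"),
  contribute.taxa = c("s__Bifidobacterium_bifidum",
                      "s__Megasphaera_elsdenii",
                      "s__Blautia_obeum"),
  S1 = c(0, 66, 0),
  S2 = c(3.5, 0, 1.5)
)

# One row per (gene, taxon, sample); keep only non-zero contributions.
contrib_long <- contrib %>%
  pivot_longer(
    cols = -c(OTU, contribute.taxa),
    names_to = "Sample",
    values_to = "Abundance"
  ) %>%
  filter(Abundance > 0)

print(contrib_long)
```

On the real data, the same pivot_longer() call should apply unchanged, since every column other than OTU and contribute.taxa is a sample column.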

The gene abundance of specified taxa can then be extracted quickly and converted to an MPSE object. For example, the following code extracts the gene abundance contributed by Bifidobacterium, recalculates the total abundance of each gene by summing over the contributed taxa, and generates a new MPSE object.

> mpse.ko2 %>%
+   mp_extract_feature() %>%
+   tidyr::unnest(contribute.taxa) %>%
+   dplyr::filter(grepl('s__Bifidobact', contribute.taxa)) %>%
+   dplyr::select(-contribute.taxa) %>%
+   dplyr::group_by(OTU) %>%
+   dplyr::summarize(dplyr::across(dplyr::everything(), sum)) %>%
+   tibble::column_to_rownames(var='OTU') %>%
+   MPSE() %>%
+   dplyr::left_join(mpse.ko2 %>% mp_extract_sample())
# A MPSE-tibble (MPSE object) abstraction: 82,398 × 5
# OTU=886 | Samples=93 | Assays=Abundance | Taxonomy=NULL
   OTU    Sample     Abundance geo_loc_name_country Group
   <chr>  <chr>          <dbl> <chr>                <chr>
 1 K00001 SRR8849198      0    China                PCOS
 2 K00012 SRR8849198      8.03 China                PCOS
 3 K00013 SRR8849198     47.4  China                PCOS
 4 K00016 SRR8849198     51.8  China                PCOS
 5 K00031 SRR8849198      0    China                PCOS
 6 K00052 SRR8849198     40.5  China                PCOS
 7 K00053 SRR8849198    146.   China                PCOS
 8 K00057 SRR8849198      0    China                PCOS
 9 K00058 SRR8849198     27.3  China                PCOS
10 K00059 SRR8849198      5.59 China                PCOS
# ℹ 82,388 more rows
# ℹ Use `print(n = ...)` to see more rows
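
To make the filter-and-sum step above easier to follow, here is a small sketch of the same aggregation on a made-up table, stopping just before the call to MPSE(). The toy values, the sample names S1 and S2, and the object names contrib and bifido are assumptions for illustration, not part of the package.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the unnested contribute.taxa table (values are made up).
contrib <- tibble(
  OTU = c("K00001", "K00001", "K00001", "K00012", "K00002"),
  contribute.taxa = c("s__Bifidobacterium_bifidum",
                      "s__Bifidobacterium_longum",
                      "s__Megasphaera_elsdenii",
                      "s__Bifidobacterium_bifidum",
                      "s__Blautia_obeum"),
  S1 = c(1, 3, 66, 0, 5),
  S2 = c(2, 0, 0, 8, 5)
)

bifido <- contrib %>%
  # keep only rows contributed by Bifidobacterium species
  filter(grepl("s__Bifidobact", contribute.taxa)) %>%
  # drop the taxon name so only numeric sample columns remain
  select(-contribute.taxa) %>%
  # sum the per-taxon abundances within each gene
  group_by(OTU) %>%
  summarise(across(everything(), sum), .groups = "drop") %>%
  # genes as row names, samples as columns
  column_to_rownames(var = "OTU")

print(bifido)
```

The result is a plain data frame with genes as rows and samples as columns, which is the shape passed to MPSE() in the pipeline above. Genes with no Bifidobacterium contribution, such as K00002 in the toy data, drop out, which is why the new object above has fewer features (886) than the original (5359).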