# Merge traits with a map

Created by 
* [Li, Chaonan (李超男)](https://www.researchgate.net/profile/Chaonan-Li-5) / licn@mtc.edu.cn / [Ecological Security and Protection Key Laboratory of Sichuan Province, Mianyang Normal University](https://zdsys.mtc.edu.cn/)
* Liao, Haijun (廖海君) / liaohj@mtc.edu.cn / [Engineering Research Center of Chuanxibei RHS Construction at Mianyang Normal University of Sichuan Province](https://rhs.mtc.edu.cn/)

Reviewed by [Li, Xiangzhen (李香真)](https://www.researchgate.net/profile/Xiangzhen-Li-2) / lixz@fafu.edu.cn / [College of Resources and Environment, Fujian Agriculture and Forestry University](https://zhxy.fafu.edu.cn/main.htm)

To support microbiome data analysis in conjunction with metadata extracted from a map, microgeo provides several functions that merge microbial traits with a `SpatialPolygonsDataFrame` or extract a new metadata table from one. With these functions, we can run statistical comparisons across administrative areas or grid cells.

Now, let's go through each of these functions and see how they are used.

## Load required R packages

This section of the microgeo tutorial needs four R packages. Run the following code to install any that are missing and load them into the R environment.

In [1]:
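# Install (if missing) and load the `magrittr`, `ggplot2`, `devtools` and `microgeo` packages
# Each package is loaded right after installation, so `%>%` is always available to later lines
if (!suppressMessages(require(magrittr))) { install.packages("magrittr"); library(magrittr) }
if (!suppressMessages(require(ggplot2)))  { install.packages("ggplot2");  library(ggplot2) }
if (!suppressMessages(require(devtools))) { install.packages("devtools"); library(devtools) }
if (!suppressMessages(require(microgeo))) { devtools::install_github("ChaonanLi/microgeo"); library(microgeo) }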

## Create a standard microgeo dataset

We also need a standard microgeo dataset for the examples in this section.

In [2]:
# Example by using the map downloaded from DataV.GeoAtlas
data(qtp)
map <- read_aliyun_map(adcode = c(540000, 630000, 510000)) %>% suppressMessages() 
dataset.dts.aliyun <- create_dataset(mat = qtp$asv, ant = qtp$tax, met = qtp$met, map = map,
                                     phy = qtp$tre, env = qtp$env, lon = "longitude", lat = "latitude")
dataset.dts.aliyun %<>% rarefy_count_table()
dataset.dts.aliyun %<>% tidy_dataset()
dataset.dts.aliyun %<>% calc_alpha_div(measures = c("observed", "shannon")) 
dataset.dts.aliyun %<>% calc_beta_div(measures = c("bray", "jaccard")) 
dataset.dts.aliyun %>% show_dataset()

ℹ [2023-10-12 10:50:00] INFO ==> all samples fall within the map area!

ℹ [2023-10-12 10:50:00] INFO ==> dataset has been created successfully!

ℹ [2023-10-12 10:50:00] INFO ==> use `object %>% show_dataset()` to check the summary of dataset.

ℹ [2023-10-12 10:50:03] INFO ==> the ASV/gene abundance table has been rarefied with a sub-sample depth of 5310

✔ [2023-10-12 10:50:07] SAVE ==> new results have been saved to: object$div$alpha

✔ [2023-10-12 10:50:54] SAVE ==> new results have been saved to: object$div$beta

── The Summary of Microgeo Dataset ─────────────────────────────────────────────

ℹ object$mat: 6808 ASVs/genes and 1244 samples [subsample depth: 5310]

ℹ object$ant: 6808 ASVs/genes and 7 annotation levels (Kingdom, Phylum, Class, Order, Family, Genus, Species)

ℹ object$met: 1244 samples and 2 variables (longitude, latitude)

ℹ object$map: a SpatialPolygonsDataFrame with the CRS of '+proj=longlat +datum=WGS84 +no_defs'

ℹ object$phy: a phylogenetic tree with 6808 tip labels

ℹ object$env: 1244 samples and 10 variables

── The Summary of Biogeographic Traits ─────────────────────────────────────────

✔ object$div$alpha: 2 alpha diversity index/indices (observed, shannon)

✔ object$div$beta: 2 beta diversity distance matrix/matrices (bray, jaccard)

• To check the summary of dataset, Replace `object` with the variable name of your dataset

• For example, if the variable name is `dataset.dts`you can run `head(dataset.dts$met)` to check the content of `met`


## Merge a `data.frame` with a map

First, we check the `data.frame` of alpha diversity indices and the `SpatialPolygonsDataFrame`.

In [3]:
# Check the data.frame of alpha diversity indices 
head(dataset.dts.aliyun$div$alpha)

| | observed | shannon |
|---|---|---|
| | `<dbl>` | `<dbl>` |
| s1 | 1016 | 6.357566 |
| s2 | 937 | 6.257915 |
| s3 | 860 | 6.122338 |
| s4 | 1041 | 6.277216 |
| s5 | 897 | 6.196335 |
| s6 | 980 | 6.196634 |


In [4]:
# Check the SpatialPolygonsDataFrame
head(dataset.dts.aliyun$map@data)

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER |
|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | DataV.GeoAtlas | microgeo | 西藏自治区 | 88.38828 | 31.56375 |
| 2 | DataV.GeoAtlas | microgeo | 青海省 | 96.04353 | 35.7264 |
| 3 | DataV.GeoAtlas | microgeo | 四川省 | 102.69345 | 30.67454 |


Then, we merge the `data.frame` of alpha diversity indices with a `SpatialPolygonsDataFrame`.

In [5]:
# Merge data to a SpatialPolygonsDataFrame
common.map.mean4df <- merge_dfs_to_map(map = dataset.dts.aliyun$map, dat = dataset.dts.aliyun$div$alpha, 
                                       met = dataset.dts.aliyun$met, med = 'mean')
head(common.map.mean4df@data[,1:12])
# Now, you can visualize the microbial traits (alpha diversity indices) onto a map

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER | observed_mean | shannon_mean | observed_sd | shannon_sd | observed_se | shannon_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| 1 | DataV.GeoAtlas | microgeo | 西藏自治区 | 88.38828 | 31.56375 | 663.7004 | 5.847481 | 239.1759 | 0.4757448 | 10.44845 | 0.02078301 | 524 |
| 2 | DataV.GeoAtlas | microgeo | 青海省 | 96.04353 | 35.7264 | 648.1538 | 5.837383 | 246.6328 | 0.5394032 | 11.73112 | 0.02565679 | 442 |
| 3 | DataV.GeoAtlas | microgeo | 四川省 | 102.69345 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
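
The `_mean`, `_sd`, `_se` and `sample.num` columns can be understood with a few lines of base R. The sketch below is not the implementation of `merge_dfs_to_map()`. It assumes each sample has already been assigned to a polygon (the `region` vector and the helper `summarise_by_region()` are hypothetical), and it takes the standard error as `sd / sqrt(n)`, which is consistent with the values printed above (for example, 239.1759 / sqrt(524) ≈ 10.448).

```r
# Toy alpha-diversity table; values are illustrative only
alpha <- data.frame(
  observed = c(1016, 937, 860, 1041, 897, 980),
  shannon  = c(6.36, 6.26, 6.12, 6.28, 6.20, 6.20),
  row.names = paste0("s", 1:6)
)

# Hypothetical polygon membership of each sample (an assumption, not microgeo output)
region <- c("A", "A", "A", "B", "B", "B")

# Summarise every numeric column per region: mean, sd, se = sd / sqrt(n), and n
summarise_by_region <- function(dat, grp) {
  stats <- lapply(split(dat, grp), function(d) {
    n <- nrow(d)
    m <- colMeans(d)
    s <- apply(d, 2, sd)
    c(setNames(m, paste0(names(d), "_mean")),
      setNames(s, paste0(names(d), "_sd")),
      setNames(s / sqrt(n), paste0(names(d), "_se")),
      sample.num = n)
  })
  as.data.frame(do.call(rbind, stats))
}

res <- summarise_by_region(alpha, region)
print(res)
```

Each row of `res` corresponds to one region, in the same column layout as the merged `@data` table.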


We can also merge the `data.frame` of alpha diversity indices with a gridded `SpatialPolygonsDataFrame`.

In [6]:
# Grid the map [SpatialPolygonsDataFrame]
gridded.map <- grid_map(map = dataset.dts.aliyun$map, res = 1.5) %>% suppressMessages
head(gridded.map@data)

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER |
|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | Gridded.Map | microgeo | 1 | 83.74702 | 29.73742 |
| 2 | Gridded.Map | microgeo | 2 | 85.46302 | 28.50944 |
| 3 | Gridded.Map | microgeo | 3 | 86.67299 | 28.33105 |
| 4 | Gridded.Map | microgeo | 4 | 89.49169 | 28.25211 |
| 5 | Gridded.Map | microgeo | 5 | 88.12468 | 28.29693 |
| 6 | Gridded.Map | microgeo | 6 | 85.14919 | 29.46224 |


In [7]:
# Merge data to a gridded map
gridded.map.mean4df <- merge_dfs_to_map(map = gridded.map, dat = dataset.dts.aliyun$div$alpha, 
                                        met = dataset.dts.aliyun$met, med = 'mean')
head(gridded.map.mean4df@data[,1:12])
# Now, you can visualize the microbial traits (alpha diversity indices) onto a map

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER | observed_mean | shannon_mean | observed_sd | shannon_sd | observed_se | shannon_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| 1 | Gridded.Map | microgeo | 1 | 83.74702 | 29.73742 | 590.3846 | 5.808563 | 145.1508 | 0.2121631 | 40.25759 | 0.05884345 | 13 |
| 2 | Gridded.Map | microgeo | 2 | 85.46302 | 28.50944 | 700.0 | 5.958273 | 194.764 | 0.2918049 | 112.44702 | 0.16847362 | 3 |
| 3 | Gridded.Map | microgeo | 3 | 86.67299 | 28.33105 | 549.7692 | 5.773531 | 144.6502 | 0.2807023 | 40.11876 | 0.0778528 | 13 |
| 4 | Gridded.Map | microgeo | 4 | 89.49169 | 28.25211 | 667.75 | 5.776863 | 310.2401 | 0.5625262 | 155.12005 | 0.2812631 | 4 |
| 5 | Gridded.Map | microgeo | 5 | 88.12468 | 28.29693 | 472.0 | 5.427903 | 151.4959 | 0.4062918 | 39.11607 | 0.10490409 | 15 |
| 6 | Gridded.Map | microgeo | 6 | 85.14919 | 29.46224 | 555.3333 | 5.751697 | 145.5349 | 0.2339007 | 29.70719 | 0.04774478 | 24 |
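
Grid cells with few samples give noisy estimates: cells 2 and 4 above hold only 3 and 4 samples, and their standard errors are several times larger than those of the other cells. One possible way to handle this before plotting is to mask cells below a minimum sample count. The sketch below works on a plain `data.frame` copy of a few columns from the table above; the threshold `min.n = 5` and the column name `observed_mean_masked` are arbitrary assumptions for illustration, not microgeo defaults.

```r
# A plain data.frame copy of selected columns from the merged grid table
cells <- data.frame(
  NAME          = as.character(1:6),
  observed_mean = c(590.3846, 700.0, 549.7692, 667.75, 472.0, 555.3333),
  observed_se   = c(40.25759, 112.44702, 40.11876, 155.12005, 39.11607, 29.70719),
  sample.num    = c(13L, 3L, 13L, 4L, 15L, 24L)
)

# Hypothetical threshold: cells with fewer samples are set to NA
min.n <- 5
cells$observed_mean_masked <- ifelse(cells$sample.num >= min.n,
                                     cells$observed_mean, NA)

# Names of the grid cells that survive the filter
kept <- cells$NAME[!is.na(cells$observed_mean_masked)]
print(kept)
```

The same assignment could in principle be applied to the `@data` slot of the merged map, so that masked cells are drawn as missing values.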


## Merge a `distance matrix` with a map

First, we check the distance `matrix` and the `SpatialPolygonsDataFrame`.

In [8]:
# Check the distance matrix 
dataset.dts.aliyun$div$beta$bray[1:5, 1:5]

| | s1 | s2 | s3 | s4 | s5 |
|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| s1 | 0.0 | 0.4824859 | 0.5909605 | 0.5045198 | 0.4084746 |
| s2 | 0.4824859 | 0.0 | 0.4745763 | 0.4146893 | 0.3973635 |
| s3 | 0.5909605 | 0.4745763 | 0.0 | 0.4054614 | 0.4800377 |
| s4 | 0.5045198 | 0.4146893 | 0.4054614 | 0.0 | 0.4220339 |
| s5 | 0.4084746 | 0.3973635 | 0.4800377 | 0.4220339 | 0.0 |


In [9]:
# Check the SpatialPolygonsDataFrame
head(dataset.dts.aliyun$map@data)

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER |
|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | DataV.GeoAtlas | microgeo | 西藏自治区 | 88.38828 | 31.56375 |
| 2 | DataV.GeoAtlas | microgeo | 青海省 | 96.04353 | 35.7264 |
| 3 | DataV.GeoAtlas | microgeo | 四川省 | 102.69345 | 30.67454 |


Then, we merge the distance `matrix` with a `SpatialPolygonsDataFrame`. 

In [10]:
# Merge distance matrix to a common map
common.map.mean4mx <- merge_mtx_to_map(map = dataset.dts.aliyun$map, dat = dataset.dts.aliyun$div$beta$bray, 
                                        met = dataset.dts.aliyun$met, var = 'bray', med = 'mean')
head(common.map.mean4mx@data[,1:9])
# Now, you can visualize the microbial traits (beta diversity distance matrix) onto a map

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER | bray_mean | bray_sd | bray_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| 1 | DataV.GeoAtlas | microgeo | 西藏自治区 | 88.38828 | 31.56375 | 0.8126941 | 0.1042846 | 0.0002817208 | 524 |
| 2 | DataV.GeoAtlas | microgeo | 青海省 | 96.04353 | 35.7264 | 0.7973568 | 0.1222254 | 0.000391513 | 442 |
| 3 | DataV.GeoAtlas | microgeo | 四川省 | 102.69345 | 30.67454 | 0.721191 | 0.1388104 | 0.0007074157 | 278 |
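
The printed `bray_se` values are consistent with a standard error taken over all within-region sample pairs: 524 samples give 524 × 523 / 2 = 137,026 pairs, and 0.1042846 / sqrt(137026) ≈ 0.00028. The sketch below illustrates that reading with base R. It is an assumption about what the summary means, not the source of `merge_mtx_to_map()`; the helper `within_region_stats()` and the `region` vector are hypothetical, and Euclidean distances from `dist()` stand in for Bray-Curtis.

```r
# Toy 1-D "communities"; dist() gives Euclidean distances as a stand-in
x  <- c(s1 = 0, s2 = 1, s3 = 3, s4 = 10, s5 = 12, s6 = 15)
dm <- as.matrix(dist(x))

# Hypothetical polygon membership of each sample
region <- c(s1 = "A", s2 = "A", s3 = "A", s4 = "B", s5 = "B", s6 = "B")

# Summarise the pairwise distances among the samples of one region
within_region_stats <- function(dm, ids) {
  sub <- dm[ids, ids, drop = FALSE]
  v   <- sub[lower.tri(sub)]          # each unordered pair counted once
  c(mean = mean(v), sd = sd(v),
    se = sd(v) / sqrt(length(v)),     # se over pairs, not over samples
    sample.num = length(ids))
}

res <- t(sapply(split(names(region), region),
                function(ids) within_region_stats(dm, ids)))
print(res)
```

Only distances between samples inside the same region enter the summary; between-region distances are ignored in this sketch.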


We can also merge a distance `matrix` with a gridded `SpatialPolygonsDataFrame`.

In [11]:
# Grid the map 
gridded.map <- grid_map(map = dataset.dts.aliyun$map, res = 1.5) %>% suppressMessages
head(gridded.map@data)

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER |
|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | Gridded.Map | microgeo | 1 | 83.74702 | 29.73742 |
| 2 | Gridded.Map | microgeo | 2 | 85.46302 | 28.50944 |
| 3 | Gridded.Map | microgeo | 3 | 86.67299 | 28.33105 |
| 4 | Gridded.Map | microgeo | 4 | 89.49169 | 28.25211 |
| 5 | Gridded.Map | microgeo | 5 | 88.12468 | 28.29693 |
| 6 | Gridded.Map | microgeo | 6 | 85.14919 | 29.46224 |


In [12]:
# Merge distance matrix to a gridded map
gridded.map.mean4mx <- merge_mtx_to_map(map = gridded.map, dat = dataset.dts.aliyun$div$beta$bray, 
                                        met = dataset.dts.aliyun$met, var = 'bray', med = 'mean')
head(gridded.map.mean4mx@data[,1:9])
# Now, you can visualize the microbial traits (beta diversity distance matrix) onto a map

| | TYPE | FMTS | NAME | X.CENTER | Y.CENTER | bray_mean | bray_sd | bray_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| 1 | Gridded.Map | microgeo | 1 | 83.74702 | 29.73742 | 0.6607803 | 0.09350795 | 0.01058769 | 13 |
| 2 | Gridded.Map | microgeo | 2 | 85.46302 | 28.50944 | 0.6412429 | 0.09637935 | 0.055644645 | 3 |
| 3 | Gridded.Map | microgeo | 3 | 86.67299 | 28.33105 | 0.7319209 | 0.13096241 | 0.014828573 | 13 |
| 4 | Gridded.Map | microgeo | 4 | 89.49169 | 28.25211 | 0.7658192 | 0.23505938 | 0.095962592 | 4 |
| 5 | Gridded.Map | microgeo | 5 | 88.12468 | 28.29693 | 0.7739127 | 0.13827951 | 0.013494699 | 15 |
| 6 | Gridded.Map | microgeo | 6 | 85.14919 | 29.46224 | 0.7697815 | 0.10741188 | 0.006465435 | 24 |


## Extract the metadata table from a map

In [13]:
# Extract metadata from a common map
# This new metadata table can be used for subsequent statistical analysis
metadata <- dataset.dts.aliyun$map %>% extract_metadata_from_map(met = dataset.dts.aliyun$met)
head(metadata)

| | longitude | latitude | NAME | TYPE | FMTS | X.CENTER | Y.CENTER |
|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| s1 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
| s2 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
| s3 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
| s4 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
| s5 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
| s6 | 98.20639 | 33.1028 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 |
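
Because the extracted table carries one region label (`NAME`) per sample, it can serve as the grouping variable in a statistical comparison. The sketch below shows one possible follow-up, a Kruskal-Wallis test of Shannon diversity across regions, using toy data in place of the real tables. The region labels `A`, `B`, `C` and all values are made up, and the choice of test is an assumption rather than a microgeo recommendation.

```r
# Toy stand-in for the extracted metadata: one region label per sample
metadata <- data.frame(NAME = rep(c("A", "B", "C"), each = 4),
                       row.names = paste0("s", 1:12))

# Toy alpha diversity table, deliberately stored in a different row order
alpha <- data.frame(
  shannon = c(5.1, 5.3, 5.2, 5.4, 5.9, 6.0, 6.1, 5.8, 6.5, 6.6, 6.4, 6.7),
  row.names = paste0("s", 12:1)
)

# Align the two tables by sample ID before combining them
merged <- cbind(metadata, alpha[rownames(metadata), , drop = FALSE])
merged$NAME <- factor(merged$NAME)

# Compare Shannon diversity among regions
kt <- kruskal.test(shannon ~ NAME, data = merged)
print(kt)
```

Indexing by `rownames(metadata)` guards against the two tables listing samples in different orders.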


In [14]:
# Extract metadata from a common map
# This new metadata table can be used for subsequent statistical analysis
metadata.from.c.df <- common.map.mean4df %>% extract_metadata_from_map(met = dataset.dts.aliyun$met)
head(metadata.from.c.df)

| | longitude | latitude | NAME | TYPE | FMTS | X.CENTER | Y.CENTER | observed_mean | shannon_mean | observed_sd | shannon_sd | observed_se | shannon_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| s1 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
| s2 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
| s3 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
| s4 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
| s5 | 98.20894 | 33.10321 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |
| s6 | 98.20639 | 33.1028 | 四川省 | DataV.GeoAtlas | microgeo | 102.6935 | 30.67454 | 706.8094 | 5.974206 | 197.926 | 0.3673581 | 11.87082 | 0.02203268 | 278 |


In [15]:
# Extract metadata from a gridded map
# This new metadata table can be used for subsequent statistical analysis
metadata.from.g.mx <- gridded.map.mean4mx %>% extract_metadata_from_map(met = dataset.dts.aliyun$met)
head(metadata.from.g.mx)

| | longitude | latitude | NAME | TYPE | FMTS | X.CENTER | Y.CENTER | bray_mean | bray_sd | bray_se | sample.num |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| s1 | 98.20894 | 33.10321 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
| s2 | 98.20894 | 33.10321 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
| s3 | 98.20894 | 33.10321 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
| s4 | 98.20894 | 33.10321 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
| s5 | 98.20894 | 33.10321 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
| s6 | 98.20639 | 33.1028 | 101 | Gridded.Map | microgeo | 98.64554 | 32.45835 | 0.6565824 | 0.1579915 | 0.0050213 | 45 |
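
In the output above, every sample that falls in grid cell 101 carries the same cell-level values (`bray_mean`, `bray_sd`, `bray_se`, `sample.num`). Conceptually this is a lookup from a per-cell table to a per-sample table. The sketch below reproduces that pattern with base R `match()`; it is an illustration with made-up values, not the code behind `extract_metadata_from_map()`, and the names `cell.stats` and `sample.cell` are hypothetical.

```r
# Toy per-cell summary table (values are illustrative only)
cell.stats <- data.frame(
  NAME       = c("101", "102"),
  bray_mean  = c(0.65, 0.70),
  sample.num = c(2L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical assignment of samples to grid cells
sample.cell <- c(s1 = "101", s2 = "101", s3 = "102")

# Look up the cell-level row for every sample and label rows by sample ID
per.sample <- cell.stats[match(sample.cell, cell.stats$NAME), ]
rownames(per.sample) <- names(sample.cell)
print(per.sample)
```

Samples sharing a cell receive identical rows, so such columns describe the cell and should not be treated as independent per-sample measurements.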
