From d627774015e7648c6340a2a4a4ab4e8b77fefda7 Mon Sep 17 00:00:00 2001 From: Anne Chao Date: Tue, 6 Sep 2016 14:53:57 +0000 Subject: [PATCH] version 0.1.1 --- DESCRIPTION | 16 + MD5 | 95 + NAMESPACE | 15 + R/BasicFun.R | 81 + R/BootstrapFunMa.R | 26 + R/Cf0Fun.R | 17 + R/Chao1_bcEstFun.R | 15 + R/Chao1_bcFun.R | 11 + R/Chao1_sharedEstFun.R | 20 + R/Chao1_sharedFun.R | 19 + R/Chao2_bcEstFun.R | 19 + R/Chao2_bcFun.R | 11 + R/Chao2_sharedEstFun.R | 24 + R/Chao2_sharedFun.R | 20 + R/ChaoShared.Ind.R | 46 + R/ChaoShared.Sam.R | 33 + R/DataTransform.R | 33 + R/Diversity_subroutine.R | 946 ++++++++++ R/ExtenProbFunMa.R | 58 + R/Genetic_subroutine.R | 438 +++++ R/HeteroEstFun.R | 67 + R/HeteroFun.R | 13 + R/HomoEstFun.R | 25 + R/HomoFun.R | 11 + R/InfreqSpeciesGroupInprove.R | 12 + R/Multiple_Community_Measure_subroutine.R | 1530 ++++++++++++++++ R/PanEstFun.R | 23 + R/PanEstFun.Sam.R | 25 + R/PanFun.R | 18 + R/PanFun.Sam.R | 20 + R/PanbcEstFun.R | 18 + R/PanbcEstFun.Sam.R | 20 + R/PanbcFun.R | 11 + R/PanbcFun.Sam.R | 11 + R/Q.R | 2 + R/RareSpeciesGroup.R | 14 + R/RareSpeciesGroupInprove.R | 21 + R/SortDataFun.R | 19 + R/SpecAbunAce.R | 161 ++ R/SpecAbunAce1.R | 181 ++ R/SpecAbunChao1.R | 89 + R/SpecAbunChao1bc.R | 84 + R/SpecAbunHomo.R | 112 ++ R/SpecAbunHomoMle.R | 97 + R/SpecAbunJack1.R | 109 ++ R/SpecAbunJack2.R | 121 ++ R/SpecAbunOut.R | 39 + R/SpecAbuniChao1.R | 156 ++ R/SpecInciChao2.R | 81 + R/SpecInciChao2bc.R | 73 + R/SpecInciHomo.R | 132 ++ R/SpecInciJack1.R | 103 ++ R/SpecInciJack2.R | 104 ++ R/SpecInciModelh.R | 153 ++ R/SpecInciModelh1.R | 170 ++ R/SpecInciModelth.R | 150 ++ R/SpecInciModelth1.R | 163 ++ R/SpecInciOut.R | 88 + R/SpecInciiChao2.R | 150 ++ R/SpeciesAbundance.R | 6 + R/SpeciesIncidence.R | 5 + R/Two_Community_similarity_subroutine.R | 2025 +++++++++++++++++++++ R/VarEstFun.R | 62 + R/VarEstFun.Sam.R | 64 + R/basicAbun.R | 87 + R/basicInci.R | 74 + R/diff_Chao1.R | 28 + R/diff_Chao1bc.R | 25 + R/diff_Chao2.R | 30 + R/diff_Chao2bc.R | 28 + R/diff_Pan.R | 31 + R/diff_Panbc.R | 28 + R/f.R | 2 + R/logCI.R | 11 + R/print.ChaoShared.R | 71 + R/print.ChaoSpecies.R | 69 + R/spader.R | 1388 ++++++++++++++ data/ChaoSharedData.rda | Bin 0 -> 10121 bytes data/ChaoSpeciesData.rda | Bin 0 -> 1018 bytes data/DiversityData.rda | Bin 0 -> 2606 bytes data/GeneticsDataAbu.rda | Bin 0 -> 251 bytes data/SimilarityMultData.rda | Bin 0 -> 7457 bytes data/SimilarityPairData.rda | Bin 0 -> 6332 bytes man/ChaoShared.Rd | 47 + man/ChaoSpecies.Rd | 53 + man/DivDemoData.Rd | 34 + man/Diversity.Rd | 56 + man/Genetics.Rd | 54 + man/GeneticsDataAbu.Rd | 18 + man/MulComSimDemoData.Rd | 24 + man/SharedSpecDemoData.Rd | 25 + man/SimilarityMult.Rd | 76 + man/SimilarityPair.Rd | 64 + man/SpecDemoData.Rd | 32 + man/TwoComSimDemoData.Rd | 25 + man/spader-package.Rd | 49 + 96 files changed, 10910 insertions(+) create mode 100644 DESCRIPTION create mode 100644 MD5 create mode 100644 NAMESPACE create mode 100644 R/BasicFun.R create mode 100644 R/BootstrapFunMa.R create mode 100644 R/Cf0Fun.R create mode 100644 R/Chao1_bcEstFun.R create mode 100644 R/Chao1_bcFun.R create mode 100644 R/Chao1_sharedEstFun.R create mode 100644 R/Chao1_sharedFun.R create mode 100644 R/Chao2_bcEstFun.R create mode 100644 R/Chao2_bcFun.R create mode 100644 R/Chao2_sharedEstFun.R create mode 100644 R/Chao2_sharedFun.R create mode 100644 R/ChaoShared.Ind.R create mode 100644 R/ChaoShared.Sam.R create mode 100644 R/DataTransform.R create mode 100644 R/Diversity_subroutine.R create mode 100644 R/ExtenProbFunMa.R create mode 100644 R/Genetic_subroutine.R create mode 100644 R/HeteroEstFun.R create mode 100644 R/HeteroFun.R create mode 100644 R/HomoEstFun.R create mode 100644 R/HomoFun.R create mode 100644 R/InfreqSpeciesGroupInprove.R create mode 100644 R/Multiple_Community_Measure_subroutine.R create mode 100644 R/PanEstFun.R create mode 100644 R/PanEstFun.Sam.R create mode 100644 R/PanFun.R create mode 100644 R/PanFun.Sam.R create mode 100644 R/PanbcEstFun.R create mode 100644 R/PanbcEstFun.Sam.R create mode 100644 R/PanbcFun.R create mode 100644 R/PanbcFun.Sam.R create mode 100644 R/Q.R create mode 100644 R/RareSpeciesGroup.R create mode 100644 R/RareSpeciesGroupInprove.R create mode 100644 R/SortDataFun.R create mode 100644 R/SpecAbunAce.R create mode 100644 R/SpecAbunAce1.R create mode 100644 R/SpecAbunChao1.R create mode 100644 R/SpecAbunChao1bc.R create mode 100644 R/SpecAbunHomo.R create mode 100644 R/SpecAbunHomoMle.R create mode 100644 R/SpecAbunJack1.R create mode 100644 R/SpecAbunJack2.R create mode 100644 R/SpecAbunOut.R create mode 100644 R/SpecAbuniChao1.R create mode 100644 R/SpecInciChao2.R create mode 100644 R/SpecInciChao2bc.R create mode 100644 R/SpecInciHomo.R create mode 100644 R/SpecInciJack1.R create mode 100644 R/SpecInciJack2.R create mode 100644 R/SpecInciModelh.R create mode 100644 R/SpecInciModelh1.R create mode 100644 R/SpecInciModelth.R create mode 100644 R/SpecInciModelth1.R create mode 100644 R/SpecInciOut.R create mode 100644 R/SpecInciiChao2.R create mode 100644 R/SpeciesAbundance.R create mode 100644 R/SpeciesIncidence.R create mode 100644 R/Two_Community_similarity_subroutine.R create mode 100644 R/VarEstFun.R create mode 100644 R/VarEstFun.Sam.R create mode 100644 R/basicAbun.R create mode 100644 R/basicInci.R create mode 100644 R/diff_Chao1.R create mode 100644 R/diff_Chao1bc.R create mode 100644 R/diff_Chao2.R create mode 100644 R/diff_Chao2bc.R create mode 100644 R/diff_Pan.R create mode 100644 R/diff_Panbc.R create mode 100644 R/f.R create mode 100644 R/logCI.R create mode 100644 R/print.ChaoShared.R create mode 100644 R/print.ChaoSpecies.R create mode 100644 R/spader.R create mode 100644 data/ChaoSharedData.rda create mode 100644 data/ChaoSpeciesData.rda create mode 100644 data/DiversityData.rda create mode 100644 data/GeneticsDataAbu.rda create mode 100644 data/SimilarityMultData.rda create mode 100644 data/SimilarityPairData.rda create mode 100644 man/ChaoShared.Rd create mode 100644 man/ChaoSpecies.Rd create mode 100644 man/DivDemoData.Rd create mode 100644 man/Diversity.Rd create mode 100644 man/Genetics.Rd create mode 100644 man/GeneticsDataAbu.Rd create mode 100644 man/MulComSimDemoData.Rd create mode 100644 man/SharedSpecDemoData.Rd create mode 100644 man/SimilarityMult.Rd create mode 100644 man/SimilarityPair.Rd create mode 100644 man/SpecDemoData.Rd create mode 100644 man/TwoComSimDemoData.Rd create mode 100644 man/spader-package.Rd diff --git a/DESCRIPTION b/DESCRIPTION new file mode 100644 index 0000000..cc9b84a --- /dev/null +++ b/DESCRIPTION @@ -0,0 +1,16 @@ +Package: SpadeR +Type: Package +Title: Species-Richness Prediction and Diversity Estimation with R +Version: 0.1.1 +Date: 2016-09-06 +Author: Anne Chao, K. H. Ma, T. C. Hsieh and Chun-Huo Chiu +Maintainer: Anne Chao +Description: Estimation of various biodiversity indices and related (dis)similarity measures based on individual-based (abundance) data or sampling-unit-based (incidence) data taken from one or multiple communities/assemblages. +License: GPL (>= 3) +Imports: stats +Depends: R (>= 2.14) +RoxygenNote: 5.0.1 +NeedsCompilation: no +Packaged: 2016-09-06 05:40:30 UTC; USER +Repository: CRAN +Date/Publication: 2016-09-06 14:53:57 diff --git a/MD5 b/MD5 new file mode 100644 index 0000000..0b3093e --- /dev/null +++ b/MD5 @@ -0,0 +1,95 @@ +ef63b6781a5d00b24a8ceab08ec09fe7 *DESCRIPTION +cc5423c19ec4d8c4dbe8294ed31230ef *NAMESPACE +96457fe9e2885e62a1730637e7585d58 *R/BasicFun.R +5d7753da40d64deb34cafe045c765a23 *R/BootstrapFunMa.R +c334bc7b849cf6539ecb451226c9df20 *R/Cf0Fun.R +cb0265a58940b0cce40a390f757bcbd0 *R/Chao1_bcEstFun.R +b3fb7c9989c503646096f1abd50a6547 *R/Chao1_bcFun.R +039be360ad01867f44254330949c6e1c *R/Chao1_sharedEstFun.R +7c4a548deef925116e8a48984a1addc1 *R/Chao1_sharedFun.R +3f77a1cdcf2f1b2fb5263817f44a8f88 *R/Chao2_bcEstFun.R +c153837bbc449206f7ea681e6d94fd28 *R/Chao2_bcFun.R +c89b36add1fbc292d20f6eb55c2ce85f *R/Chao2_sharedEstFun.R +cdd0cbf8f385812ff5f55794227e5a5d *R/Chao2_sharedFun.R +d7469eabd5f4cf5b483c553ad6a8fe4e *R/ChaoShared.Ind.R +ca71c23b7a874f01ffdf1a6dd33c3696 *R/ChaoShared.Sam.R +cbee1eb899e2ddc829cc2ef62fe93748 *R/DataTransform.R +fe89e0e7eae2864010f4f9a7f6f65909 *R/Diversity_subroutine.R +6e5bd623f03da44acdf0d4540fa18cd0 *R/ExtenProbFunMa.R +d980172e8d892a76caf5a2cf79f19335 *R/Genetic_subroutine.R +c00bbe6954c76b4b27602fe3bc288439 *R/HeteroEstFun.R +f1f0e4cf7ecb19fd7412805156ed8446 *R/HeteroFun.R +83517ffa83e8b8e087904f10c9168b78 *R/HomoEstFun.R +3d9a45bb13ce034198be5d1cf71c86fb *R/HomoFun.R +d8dfae8845816236c93c32cdcb813438 *R/InfreqSpeciesGroupInprove.R +987f73637aa92efcfaf395505d6ccf82 *R/Multiple_Community_Measure_subroutine.R +39dc8325397490ed3140fe8a0eb115e7 *R/PanEstFun.R +89a11bdd038507caf5f60c5485125f9b *R/PanEstFun.Sam.R +4769699365e75ef21c150f2240d06ca6 *R/PanFun.R +50c6687b94e782758c0fa10dc05b165a *R/PanFun.Sam.R +d18b8fe73323d4a9c371a5bf8141c777 *R/PanbcEstFun.R +a4a5d47abe22d13a1e5c38a0864a5949 *R/PanbcEstFun.Sam.R +833b94bcf514a054b073792ad26acd28 *R/PanbcFun.R +d90ec73de3b2b91288b73276754f85dd *R/PanbcFun.Sam.R +1b0cf8a5e1f11a53d261294b3dc577ff *R/Q.R +cd3ddba9f74ce2de02a929414d52e85b *R/RareSpeciesGroup.R +337792c0859f55b0920aa893ae7e785b *R/RareSpeciesGroupInprove.R +a5682eb9cad07cef9939ea16619c6f9c *R/SortDataFun.R +5bd3709d9e635e97d18bb61cdb116680 *R/SpecAbunAce.R +79e74c922e2c0598fc10aefba21974fc *R/SpecAbunAce1.R +7e41eb21e99a160367481f9744c8cd60 *R/SpecAbunChao1.R +ab21bf01562f267d0f7d11c6f02719fd *R/SpecAbunChao1bc.R +f62cbc6946fe98b5e36830a0844b3078 *R/SpecAbunHomo.R +b21e53f1c4ca1840c5103d0f18dfa2ac *R/SpecAbunHomoMle.R +42390b87ed8cae0a0f8db06c69b1224d *R/SpecAbunJack1.R +681da6a5f421d9013dd1e75b958bb871 *R/SpecAbunJack2.R +30ccbe67d4577795408bbd8bac8a84a2 *R/SpecAbunOut.R +a51e167166a6d1d17b8bb3d0542942c5 *R/SpecAbuniChao1.R +c7227301d3b6fa7a5b115354f380b142 *R/SpecInciChao2.R +803be732f3dcf3100349318b899a2c33 *R/SpecInciChao2bc.R +b669edf7d22082ec617e7271384dc86f *R/SpecInciHomo.R +45b25c518af62ce91a89f2762a725ea0 *R/SpecInciJack1.R +87f660e1cbee3bfbd7d94ab90de835cb *R/SpecInciJack2.R +867c837cf023c8003bc314b3227ab7da *R/SpecInciModelh.R +593fcd8e64681d9737d4d9106a13ae1a *R/SpecInciModelh1.R +5e43e5933dea74aacad4469b1d0f7649 *R/SpecInciModelth.R +aa944c0a60bf41c64aaf797f72ca579b *R/SpecInciModelth1.R +482486ae9d1f4e36466e4f005f78f0e5 *R/SpecInciOut.R +bd8f9a1282d1e3df2048307f13057da6 *R/SpecInciiChao2.R +4ca5e2eeabaac55241e63b572870c3a7 *R/SpeciesAbundance.R +a320dc46f58c26278c4ef7cf6d6f6fb2 *R/SpeciesIncidence.R +cf42cbba7113351d64dd37f8dbbdbf52 *R/Two_Community_similarity_subroutine.R +c0d0e1cc1cb39f11182ad8520967f128 *R/VarEstFun.R +e649a92435617a44bceabb8516a522ed *R/VarEstFun.Sam.R +2a0b9fa3376a386b4c69da5668aacb44 *R/basicAbun.R +f593c24ad2d29cc948107730f5c3770d *R/basicInci.R +1c1ff82971d608d0fe05d66563a56483 *R/diff_Chao1.R +8b08b749f3ee692d1f260f9a736f2d35 *R/diff_Chao1bc.R +8b324719c0b5d1e10ae7ca78224224fd *R/diff_Chao2.R +68e1fd85b96535239af6c0516cd98357 *R/diff_Chao2bc.R +908c46104df13696e05df6b165e4fdf5 *R/diff_Pan.R +7452f504e68c93f2bfd9eae8f579bdeb *R/diff_Panbc.R +218fd6e8ce56b30d220b8bdc3bf6d315 *R/f.R +a5744186fadffea992bfc37f66b43974 *R/logCI.R +f13660ab18e0d0274707d659badfbd71 *R/print.ChaoShared.R +be07b5b687c48cc73df8c5f3ccd3debd *R/print.ChaoSpecies.R +d77cef44b1a076fa6719fcda59ad422a *R/spader.R +0ac3c5357791fc3998def99e6c0f7c82 *data/ChaoSharedData.rda +b78e7681dd20cd52f35879ba61a6cb4f *data/ChaoSpeciesData.rda +52ca0aaa3349ee6db5ec9e241567a5fd *data/DiversityData.rda +0352f4e62caada865d1cbcf9f5ed27e4 *data/GeneticsDataAbu.rda +5a832c3b55c6048eecf020e50a0a591a *data/SimilarityMultData.rda +b938ca1206c477d439364ea5ac20a227 *data/SimilarityPairData.rda +b911af0e3d807849a3d4ac2143bb216d *man/ChaoShared.Rd +842d4a37df37aa2ebff1507f2a811f46 *man/ChaoSpecies.Rd +2d08ebb6103373a38ed376afa3f4ecba *man/DivDemoData.Rd +8c3205b588f99e61d2b8e15fb5af1988 *man/Diversity.Rd +960f74e6b04717faa5f1f2ac331fcde6 *man/Genetics.Rd +ef85ac24c7e724ab0c7caea534c9c486 *man/GeneticsDataAbu.Rd +b0fa6c87e38b26e832ccb1b8c7d12ab3 *man/MulComSimDemoData.Rd +c5af1479718c074e02cd3b98d756956d *man/SharedSpecDemoData.Rd +5041135dca36d2b5cb4f413c60694ad3 *man/SimilarityMult.Rd +b685a0965a551f5775f8360e92e258a6 *man/SimilarityPair.Rd +921b76b91b118770fc93c3cf1641f408 *man/SpecDemoData.Rd +a1584e2d81563c4f968d707df9eced61 *man/TwoComSimDemoData.Rd +96c41d4f4d6b4fe6866ca6893674cf7a *man/spader-package.Rd diff --git a/NAMESPACE b/NAMESPACE new file mode 100644 index 0000000..8d7eac2 --- /dev/null +++ b/NAMESPACE @@ -0,0 +1,15 @@ +import(stats) +importFrom("grDevices", "adjustcolor") +importFrom("graphics", "legend", "lines", "plot", "polygon") +S3method(print,ChaoShared) +S3method(print,ChaoSpecies) +S3method(print,spadeDiv) +S3method(print,spadeTwo) +S3method(print,spadeMult) +S3method(print,spadeGenetic) +export(ChaoShared) +export(ChaoSpecies) +export(Diversity) +export(SimilarityMult) +export(SimilarityPair) +export(Genetics) diff --git a/R/BasicFun.R b/R/BasicFun.R new file mode 100644 index 0000000..01cd30b --- /dev/null +++ b/R/BasicFun.R @@ -0,0 +1,81 @@ +BasicFun <- function(x1, x2, B, datatype) { + if(datatype=="abundance"){ + n1 <- sum(x1) + n2 <- sum(x2) + D1 <- sum(x1 > 0) + D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + x1_share <- x1[which(x1 > 0 & x2 > 0)] + x2_share <- x2[which(x1 > 0 & x2 > 0)] + f11 <- sum(x1_share == 1 & x2_share == 1) + f1.plus.rare <- sum(x1_share == 1 & x2_share <= 10) + fplus.1.rare <- sum(x2_share == 1 & x1_share <= 10) + f2.plus.rare <- sum(x1_share == 2 & x2_share <= 10) + fplus.2.rare <- sum(x2_share == 2 & x1_share <= 10) + f1.plus <- sum(x1_share == 1 & x2_share >0) + fplus.1 <- sum(x2_share == 1 & x1_share >0) + f2.plus <- sum(x1_share == 2 & x2_share >0) + fplus.2 <- sum(x2_share == 2 & x1_share >0) + f22 <- sum(x1_share == 2 & x2_share == 2) + D12_rare <- sum(x1_share <= 10 & x2_share <= 10) + + pos <- (x1 > 0 & x2 > 0) & (x1 > 10 | x2 > 10) + n1_rare <- n1 - sum(x1[pos]) + n2_rare <- n2 - sum(x2[pos]) + + pos_r <- (x1_share <= 10 & x2_share <= 10) + pos1_r <- (x1_share == 1 & x2_share <= 10) + pos2_r <- (x2_share == 1 & x1_share <= 10) + + tmp <- sum(x1_share[pos_r] * x2_share[pos_r]) + C12_rare <- 1 - (sum(x2_share[pos1_r]) + sum(x1_share[pos2_r]) - f11) / tmp + # C12_rare <- round(C12_rare, 4) + + T10 <- sum(x1_share[x1_share <= 10 & x2_share <= 10]) + T01 <- sum(x2_share[x1_share <= 10 & x2_share <= 10]) + T11 <- tmp + T21 <- sum(x1_share[pos_r] * (x1_share - 1)[pos_r] * x2_share[pos_r]) + T12 <- sum(x1_share[pos_r] * (x2_share - 1)[pos_r] * x2_share[pos_r]) + + T22 <- sum(x1_share[pos_r] * x2_share[pos_r] * + (x1_share - 1)[pos_r] * (x2_share - 1)[pos_r]) + + S12_0 <- D12_rare / C12_rare + CCV_1 <- S12_0 * n1_rare * T21 / (n1_rare - 1) / T10 / T11 - 1 + CCV_2 <- S12_0 * n2_rare * T12 / (n2_rare - 1) / T01 / T11 - 1 + CCV_12 <- n1_rare * n2_rare * S12_0^2 * T22 / + ((n1_rare - 1) * (n2_rare - 1) * T10 * T01 * T11) - + S12_0 * T11 / T10 / T01 - CCV_1 - CCV_2 + out <- list(n1=n1, n2=n2, D1=D1, D2=D2, D12=D12, B=B, f11=f11, f1.plus=f1.plus, + fplus.1=fplus.1, f2.plus=f2.plus, fplus.2=fplus.2, f22=f22, + f1.plus.rare=f1.plus.rare,fplus.1.rare=fplus.1.rare, f2.plus.rare=f2.plus.rare, fplus.2.rare=fplus.2.rare, + n1_rare=n1_rare, n2_rare=n2_rare, D12_rare=D12_rare, + C12_rare=C12_rare, CCV_1=CCV_1, CCV_2=CCV_2, CCV_12=CCV_12, + datatype="abundance") + + }else if(datatype=="incidence_freq"){ + y1 <- x1 + y2 <- x2 + t1 <- y1[1] + t2 <- y2[1] + x1 <- y1[-1] + x2 <- y2[-1] + u1 <- sum(x1) + u2 <- sum(x2) + D1 <- sum(x1 > 0) + D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + + Q11 <- sum(x1 == 1 & x2 == 1) + Q1.plus <- sum(x1 == 1 & x2 >= 1) + Qplus.1 <- sum(x2 == 1 & x1 >= 1) + Q2.plus <- sum(x1 == 2 & x2 >= 1) + Qplus.2 <- sum(x2 == 2 & x1 >= 1) + Q22 <- sum(x1 == 2 & x2 == 2) + out <- list(T1=t1, T2=t2, U1=u1, U2=u2, D1=D1, D2=D2, D12=D12, B=B, Q11=Q11, Q1.plus=Q1.plus, + Qplus.1=Qplus.1, Q2.plus=Q2.plus, Qplus.2=Qplus.2, Q22=Q22, + datatype="incidence") + + }else warning("invaild datatype, please set datatype as abundance or incidence.") + return(out) + } diff --git a/R/BootstrapFunMa.R b/R/BootstrapFunMa.R new file mode 100644 index 0000000..41a7105 --- /dev/null +++ b/R/BootstrapFunMa.R @@ -0,0 +1,26 @@ +BootstrapFunMa <- +function(x1, x2, B, FunName) { + n1 <- sum(x1); n2 <- sum(x2) + z <- SortDataFun(x1, x2) + z1 <- z[, 1] + z2 <- z[, 2] + newprob <- ExtenProbFunMa(z1, z2) + p1 <- newprob$prob1 + p2 <- newprob$prob2 + set.seed(123) + X1 <- rmultinom(B, n1, p1) + set.seed(123) + X2 <- rmultinom(B, n2, p2) + X <- rbind(X1, X2) + + se <- sd(apply(X, 2, function(x) { + y <- matrix(x, ncol=2) + y1 <- y[, 1] + y2 <- y[, 2] +# y1 <- x[1 : length(p1)] +# y2 <- x[(length(p1) + 1) : (2 * length(p1))] + FunName(y1, y2) + }), na.rm=T) + + return(se) +} diff --git a/R/Cf0Fun.R b/R/Cf0Fun.R new file mode 100644 index 0000000..43911b9 --- /dev/null +++ b/R/Cf0Fun.R @@ -0,0 +1,17 @@ +Cf0Fun <- +function(x) { + n <- sum(x) + f1 <- sum(x == 1); f2 <- sum(x == 2) + if (f2 > 0) { + C <- 1 - f1 / n * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2)) + f0 <- (n - 1) / n * f1^2 / (2 * f2) + } else if (f2 == 0 & f1 != 0) { + C <- 1 - f1 / n * ((n - 1) * (f1 - 1) / ((n - 1) * (f1 - 1) + 2)) + f0 <- (n - 1) / n * f1 * (f1 - 1) / 2 + } else { + C <- 1 + f0 <- (n - 1) / n * f1 * (f1 - 1) / 2 + } + f0 <- ceiling(f0) + return(c(C, f0)) +} diff --git a/R/Chao1_bcEstFun.R b/R/Chao1_bcEstFun.R new file mode 100644 index 0000000..e5f8130 --- /dev/null +++ b/R/Chao1_bcEstFun.R @@ -0,0 +1,15 @@ +Chao1_bcEstFun <- +function(x1, x2) { + D12 <- sum(x1 > 0 & x2 > 0) + x1_share <- x1[which(x1 > 0 & x2 > 0)] + x2_share <- x2[which(x1 > 0 & x2 > 0)] + f11 <- sum(x1_share == 1 & x2_share == 1) + f1.plus <- sum(x1_share == 1 & x2_share >= 1) + fplus.1 <- sum(x2_share == 1 & x1_share >= 1) + f2.plus <- sum(x1_share == 2 & x2_share >= 1) + fplus.2 <- sum(x2_share == 2 & x1_share >= 1) + est <- D12 + f11 * f1.plus * fplus.1 / (4 * (f2.plus + 1) * (fplus.2 + 1)) + + f1.plus * (f1.plus - 1) / (2 * (f2.plus + 1)) + + fplus.1 * (fplus.1 - 1) / (2 * (fplus.2 + 1)) + return(est) +} diff --git a/R/Chao1_bcFun.R b/R/Chao1_bcFun.R new file mode 100644 index 0000000..ceee5a7 --- /dev/null +++ b/R/Chao1_bcFun.R @@ -0,0 +1,11 @@ +Chao1_bcFun <- +function(x1, x2, conf=0.95) { + est <- Chao1_bcEstFun(x1, x2) + se <- VarEstFun(x1, x2, diffFun=diff_Chao1bc, FunName=Chao1_bcEstFun) + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Chao1-shared-bc") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/Chao1_sharedEstFun.R b/R/Chao1_sharedEstFun.R new file mode 100644 index 0000000..5ad1a99 --- /dev/null +++ b/R/Chao1_sharedEstFun.R @@ -0,0 +1,20 @@ +Chao1_sharedEstFun <- +function(x1, x2) { + D12 <- sum(x1 > 0 & x2 > 0) + x1_share <- x1[which(x1 > 0 & x2 > 0)] + x2_share <- x2[which(x1 > 0 & x2 > 0)] + f11 <- sum(x1_share == 1 & x2_share == 1) + f1.plus <- sum(x1_share == 1 & x2_share >= 1) + fplus.1 <- sum(x2_share == 1 & x1_share >= 1) + f2.plus <- sum(x1_share == 2 & x2_share >= 1) + fplus.2 <- sum(x2_share == 2 & x1_share >= 1) + if (f2.plus == 0 || fplus.2 == 0) { + est <- D12 + f11 * f1.plus * fplus.1 / (4 * (f2.plus + 1) * (fplus.2 + 1)) + + f1.plus * (f1.plus - 1) / (2 * (f2.plus + 1)) + + fplus.1 * (fplus.1 - 1) / (2 * (fplus.2 + 1)) + } else { + est <- D12 + f11 * f1.plus * fplus.1 / (4 * f2.plus * fplus.2) + + f1.plus^2 / (2 * f2.plus) + fplus.1^2 / (2 * fplus.2) + } + return(est) +} diff --git a/R/Chao1_sharedFun.R b/R/Chao1_sharedFun.R new file mode 100644 index 0000000..cf09b55 --- /dev/null +++ b/R/Chao1_sharedFun.R @@ -0,0 +1,19 @@ +Chao1_sharedFun <- +function(x1, x2, conf=0.95) { + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + + est <- Chao1_sharedEstFun(x1, x2) + if (f2p == 0 || fp2 == 0) { + se <- VarEstFun(x1, x2, diffFun=diff_Chao1bc, FunName=Chao1_bcEstFun) + } else { + se <- VarEstFun(x1, x2, diff_Chao1, FunName=Chao1_sharedEstFun) + } + + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Chao1(shared)") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/Chao2_bcEstFun.R b/R/Chao2_bcEstFun.R new file mode 100644 index 0000000..64ac9ca --- /dev/null +++ b/R/Chao2_bcEstFun.R @@ -0,0 +1,19 @@ +Chao2_bcEstFun <- +function(y1, y2) { + t1 <- y1[1] + t2 <- y2[1] + c1 <- (t1 - 1) / t1 + c2 <- (t2 - 1) / t2 + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + Q11 <- sum(x1 == 1 & x2 == 1) + Q1.plus <- sum(x1 == 1 & x2 >= 1) + Qplus.1 <- sum(x2 == 1 & x1 >= 1) + Q2.plus <- sum(x1 == 2 & x2 >= 1) + Qplus.2 <- sum(x2 == 2 & x1 >= 1) + est <- D12 + Q11 * c1 * c2 * Q1.plus * Qplus.1 / (4 * (Q2.plus + 1) * (Qplus.2 + 1)) + + c1 * Q1.plus * (Q1.plus - 1) / (2 * (Q2.plus + 1)) + + c2 * Qplus.1 * (Qplus.1 - 1) / (2 * (Qplus.2 + 1)) + return(est) +} diff --git a/R/Chao2_bcFun.R b/R/Chao2_bcFun.R new file mode 100644 index 0000000..43092b6 --- /dev/null +++ b/R/Chao2_bcFun.R @@ -0,0 +1,11 @@ +Chao2_bcFun <- +function(y1, y2, conf=0.95) { + est <- Chao2_bcEstFun(y1, y2) + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Chao2bc, FunName=Chao2_bcEstFun) + CI <- logCI(y1, y2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Chao2-shared-bc") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/Chao2_sharedEstFun.R b/R/Chao2_sharedEstFun.R new file mode 100644 index 0000000..7b2d8fe --- /dev/null +++ b/R/Chao2_sharedEstFun.R @@ -0,0 +1,24 @@ +Chao2_sharedEstFun <- +function(y1, y2) { + t1 <- y1[1] + t2 <- y2[1] + c1 <- (t1 - 1) / t1 + c2 <- (t2 - 1) / t2 + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + Q11 <- sum(x1 == 1 & x2 == 1) + Q1.plus <- sum(x1 == 1 & x2 >= 1) + Qplus.1 <- sum(x2 == 1 & x1 >= 1) + Q2.plus <- sum(x1 == 2 & x2 >= 1) + Qplus.2 <- sum(x2 == 2 & x1 >= 1) + if (Q2.plus == 0 || Qplus.2 == 0) { + est <- D12 + Q11 * c1 * c2 * Q1.plus * Qplus.1 / (4 * (Q2.plus + 1) * (Qplus.2 + 1)) + + c1 * Q1.plus * (Q1.plus - 1) / (2 * (Q2.plus + 1)) + + c2 * Qplus.1 * (Qplus.1 - 1) / (2 * (Qplus.2 + 1)) + } else { + est <- D12 + Q11 * c1 * c2 * Q1.plus * Qplus.1 / (4 * Q2.plus * Qplus.2) + + c1 * Q1.plus^2 / (2 * Q2.plus) + c2 * Qplus.1^2 / (2 * Qplus.2) + } + return(est) +} diff --git a/R/Chao2_sharedFun.R b/R/Chao2_sharedFun.R new file mode 100644 index 0000000..94f226a --- /dev/null +++ b/R/Chao2_sharedFun.R @@ -0,0 +1,20 @@ +Chao2_sharedFun <- +function(y1, y2, conf=0.95) { + x1 <- y1[-1] + x2 <- y2[-1] + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + + est <- Chao2_sharedEstFun(y1, y2) + if (f2p == 0 || fp2 == 0) { + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Chao2bc, FunName=Chao2_bcEstFun) + } else { + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Chao2, FunName=Chao2_sharedEstFun) + } + CI <- logCI(y1, y2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Chao2-shared") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/ChaoShared.Ind.R b/R/ChaoShared.Ind.R new file mode 100644 index 0000000..7b58c6c --- /dev/null +++ b/R/ChaoShared.Ind.R @@ -0,0 +1,46 @@ +ChaoShared.Ind <- +function(x1, x2, method = c("all", "Homogeneous", + "Heterogeneous(ACE-shared)", + "Chao1-shared", + "Chao1-shared-bc"), + B = 200, conf = 0.95, se = TRUE) { +#function(x1, x2, method = c("all", "Homogeneous", +# "Heterogeneous(ACE-shared)", +# "Chao1(shared)", +# "Chao1-shared-bc", +# "Lower-bound", +# "Lower-bound-bc"), +# B = 200, conf = 0.95, se = TRUE) { + + method <- match.arg(method) + if (se == FALSE) + B <- 1 + if (method == "all") { + a <- HomoFun(x1, x2, B, conf) + b <- HeteroFun(x1, x2, B, conf) + #c <- Chao1_sharedFun(x1, x2, conf) + #d <- Chao1_bcFun(x1, x2, conf) + e <- PanFun(x1, x2, conf) + f <- PanbcFun(x1, x2, conf) + #out <- rbind(a, b, c, d, e, f) + out <- rbind(a, b, e, f) + rownames(out)<-c(" Homogeneous"," Heterogeneous(ACE-shared)"," Chao1-shared"," Chao1-shared-bc") + } + if (method == "Homogeneous") + out <- HomoFun(x1, x2, B, conf) + if (method == "Heterogeneous(ACE-shared)") + out <- HeteroFun(x1, x2, B, conf) + if (method == "Chao1(shared)") + out <- Chao1_sharedFun(x1, x2, conf) + if (method == "Chao1-shared-bc") + out <- Chao1_bcFun(x1, x2, conf) + if (method == "Lower-bound") + out <- PanFun(x1, x2, conf) + if (method == "Lower-bound-bc") + out <- PanbcFun(x1, x2, conf) + + if (se == FALSE) { + out <- data.frame(Estimator = out[, 1], row.names = rownames(out)) + } + return(out) +} diff --git a/R/ChaoShared.Sam.R b/R/ChaoShared.Sam.R new file mode 100644 index 0000000..05acd57 --- /dev/null +++ b/R/ChaoShared.Sam.R @@ -0,0 +1,33 @@ +ChaoShared.Sam <- +function(y1, y2, method = c("all","Chao2-shared","Chao2-shared-bc"),conf = 0.95, se = TRUE) { +#function(y1, y2, method = c("all", +# "Chao2-shared", +# "Chao2-shared-bc", +# "Lower-bound", +# "Lower-bound-bc"), +# conf = 0.95, se = TRUE) { + method <- match.arg(method) + + if (method == "all") { + #a <- Chao2_sharedFun(y1, y2, conf) + #b <- Chao2_bcFun(y1, y2, conf) + c <- PanFun.Sam(y1, y2, conf) + d <- PanbcFun.Sam(y1, y2, conf) + #out <- rbind(a, b, c, d) + out <- rbind(c, d) + rownames(out)<-c(" Chao2-shared"," Chao2-shared-bc") + } + #if (method == "Chao2-shared") + # out <- Chao2_sharedFun(y1, y2, conf) + #if (method == "Chao2-shared-bc") + # out <- Chao2_bcFun(y1, y2, conf) + if (method == "Chao2-shared") + out <- PanFun.Sam(y1, y2, conf) + if (method == "Chao2-shared-bc") + out <- PanbcFun.Sam(y1, y2, conf) + + if (se == FALSE) { + out <- data.frame(Estimator = out[, 1], row.names = rownames(out)) + } + return(out) +} diff --git a/R/DataTransform.R b/R/DataTransform.R new file mode 100644 index 0000000..07a8822 --- /dev/null +++ b/R/DataTransform.R @@ -0,0 +1,33 @@ +DataTransform <- +function(data, type = c("FreqCount", "MatrixInci", "MatrixAbun", "InciCount")){ + Freq2Abun <- function(data){ + dat <- data[-(1:2)] + j <- 1:length(dat) + data.abun <- as.numeric(rep(dat[which(j %% 2 == 1)], dat[which(j %% 2 == 0)])) + return(data.abun) + } + Mat2Inci <- function(data){ + t <- length(data[1,]) + dat <- apply(data, 1, sum) + data.inci <- c(t, dat) + return(data.inci) + } + Count2Inci <- function(data){ + t <- data[1] + dat <- data[-(1:3)] + j <- 1:length(dat) + data.inci <- as.numeric(c(t, as.numeric(rep(dat[which(j %% 2 == 1)], dat[which(j %% 2 == 0)])))) + return(data.inci) + } + + if (type == "FreqCount"){ + data <- Freq2Abun(data) + } else if (type == "MatrixInci"){ + data <- Mat2Inci(data) + } else if (type == "MatrixAbun") { + data <- apply(data, 1, sum) + } else { + data <- Count2Inci(data) + } + return(data) +} diff --git a/R/Diversity_subroutine.R b/R/Diversity_subroutine.R new file mode 100644 index 0000000..57d61cf --- /dev/null +++ b/R/Diversity_subroutine.R @@ -0,0 +1,946 @@ +CV.Ind=function(x) +{ + x <- x[x>0] + n <- sum(x) + f1 <- sum(x == 1) + C.hat=1-f1/n + Sobs=sum(x>0) + S0=Sobs/C.hat + r.square=max(S0*sum(x*(x-1))/n/(n-1)-1,0 ) + r.square^0.5 +} +Chao1=function(x,conf=0.95) +{ + z <--qnorm((1 - conf)/2) + x=x[x>0] + D=sum(x>0) + f1=sum(x==1) + f2=sum(x==2) + n=sum(x) + if (f1 > 0 & f2 > 0) + { + S_Chao1 <- D + (n - 1)/n*f1^2/(2*f2) + var_Chao1 <- f2*((n - 1)/n*(f1/f2)^2/2 + + ((n - 1)/n)^2*(f1/f2)^3 + ((n - 1 )/n)^2*(f1/f2)^4/4) + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } + else if (f1 > 1 & f2 == 0) + { + S_Chao1 <- D + (n - 1)/n*f1*(f1 - 1)/(2*(f2 + 1)) + var_Chao1 <- (n - 1)/n*f1*(f1 - 1)/2 + + ((n - 1)/n)^2*f1*(2*f1 - 1)^2/4 - ((n - 1)/n)^2*f1^4/4/S_Chao1 + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } + else + { + S_Chao1 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i) sum(x==i)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*sum(x==i))))^2/n + var_Chao1 <- var_obs + P <- sum(sapply(i, function(i) sum(x==i)*exp(-i)/D)) + CI_Chao1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + return( c( round(c(S_Chao1,var_Chao1^0.5,CI_Chao1[1],CI_Chao1[2]),1),conf) ) +} + +Chao1_bc=function(x,conf=0.95) +{ + z <- -qnorm((1 - conf)/2) + x=x[x>0] + D=sum(x>0) + f1=sum(x==1) + f2=sum(x==2) + n=sum(x) + + S_Chao1_bc <- D + (n - 1)/n*f1*(f1 - 1)/(2*(f2 + 1)) + var_Chao1_bc <- (n - 1)/n*f1*(f1 - 1)/2/(f2 + 1) + + ((n - 1)/n)^2*f1*(2*f1 - 1)^2/4/(f2 + 1)^2 + ((n - 1)/n)^2*f1^2*f2*(f1 - 1)^2/4/(f2 + 1)^4 + + t <- round(S_Chao1_bc - D, 5) + if (t != 0) + { + K <- exp(z*sqrt(log(1 + var_Chao1_bc/t^2))) + CI_Chao1_bc <- c(D + t/K, D + t*K) + } + if(t == 0) + { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)sum(x==i)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*sum(x==i))))^2/n + var_Chao1_bc <- var_obs + P <- sum(sapply(i, function(i)sum(x==i)*exp(-i)/D)) + CI_Chao1_bc <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + round(c(S_Chao1_bc,var_Chao1_bc^0.5,CI_Chao1_bc[1],CI_Chao1_bc[2]) ,1) +} + + + + +EstiBootComm.Ind <- function(Spec) +{ + Sobs <- sum(Spec > 0) #observed species + n <- sum(Spec) #sample size + f1 <- sum(Spec == 1) #singleton + f2 <- sum(Spec == 2) #doubleton + a <- ifelse(f1 == 0, 0, (n - 1) * f1 / ((n - 1) * f1 + 2 * f2) * f1 / n) + b <- sum(Spec / n * (1 - Spec / n) ^ n) + w <- a / b #adjusted factor for rare species in the sample + f0.hat <- ceiling(ifelse(f2 == 0, (n - 1) / n * f1 * (f1 - 1) / 2, (n - 1) / n * f1 ^ 2/ 2 / f2)) #estimation of unseen species via Chao1 + Prob.hat <- Spec / n * (1 - w * (1 - Spec / n) ^ n) #estimation of relative abundance of observed species in the sample + Prob.hat.Unse <- rep(2 * f2/((n - 1) * f1 + 2 * f2), f0.hat) #estimation of relative abundance of unseen species in the sample + return(c(Prob.hat, Prob.hat.Unse)) #Output: a vector of estimated relative abundance +} + +entropy_MEE_equ=function(X) +{ + x=X + x=x[x>0] + n=sum(x) + UE <- sum(x/n*(digamma(n)-digamma(x))) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0) + { + A <-1-ifelse(f2 > 0, (n-1)*f1/((n-1)*f1+2*f2), (n-1)*f1/((n-1)*f1+2)) + B=sum(x==1)/n*(1-A)^(-n+1)*(-log(A)-sum(sapply(1:(n-1),function(k){1/k*(1-A)^k}))) + } + if(f1==0){B=0} + if(f1==1 & f2==0){B=0} + UE+B +} +entropy_HT_equ<-function(X) +{ + x=X + x=x[x>0] + n=sum(x) + f1=sum(x==1) + C_head=1-f1/n + a=-sum(C_head*(x/n)*log(C_head*(x/n))/(1-(1-C_head*(x/n))^n)) + a +} +entropy_J1_equ=function(X) +{ + X=X[X>0] + Y=X[X>1] + n=sum(X) + -n*sum(X/n*log(X/n))-(n-1)/n*sum( (n-X)*(-X/(n-1)*log(X/(n-1))) )-(n-1)/n*sum(-Y*(Y-1)/(n-1)*log((Y-1)/(n-1))) +} +entropy_MLE_equ=function(X) +{ + X=X[X>0] + n=sum(X) + -sum(X/n*log(X/n)) +} +entropy_MLE_bc_equ=function(X) +{ + entropy_MLE_equ(X)+(SpecAbunChao1(X,k=10,conf=0.95)[1]-1)/2/sum(X) +} +Shannon_index=function(x,boot=50) +{ + x=x[x>0] + n=sum(x) + MLE=entropy_MLE_equ(x) + MLE_bc=entropy_MLE_bc_equ(x) + J1=entropy_J1_equ(x) + HT=entropy_HT_equ(x) + MEE=entropy_MEE_equ(x) + p_hat=EstiBootComm.Ind(x) + Boot.X=rmultinom(boot,n,p_hat) + temp1=apply(Boot.X,2,entropy_MLE_equ) + temp2=apply(Boot.X,2,entropy_MLE_bc_equ) + temp3=apply(Boot.X,2,entropy_J1_equ) + temp4=apply(Boot.X,2,entropy_HT_equ) + temp5=apply(Boot.X,2,entropy_MEE_equ) + MLE_sd=sd(temp1) + MLE_bc_sd=sd(temp2) + J1_sd=sd(temp3) + HT_sd=sd(temp4) + MEE_sd=sd(temp5) + + MLE_exp_sd=sd(exp(temp1)) + MLE_bc_exp_sd=sd(exp(temp2)) + J1_exp_sd=sd(exp(temp3)) + HT_exp_sd=sd(exp(temp4)) + MEE_exp_sd=sd(exp(temp5)) + + a=matrix(0,10,4) + a[1,]=c(MLE,MLE_sd,MLE-1.96*MLE_sd,MLE+1.96*MLE_sd) + a[2,]=c(MLE_bc,MLE_bc_sd,MLE_bc-1.96*MLE_bc_sd,MLE_bc+1.96*MLE_bc_sd) + a[3,]=c(J1,J1_sd,J1-1.96*J1_sd,J1+1.96*J1_sd) + a[4,]=c(HT,HT_sd,HT-1.96*HT_sd,HT+1.96*HT_sd) + a[5,]=c(MEE,MEE_sd,MEE-1.96*MEE_sd,MEE+1.96*MEE_sd) + a[6,]=c(exp(MLE),MLE_exp_sd,exp(MLE)-1.96*MLE_exp_sd,exp(MLE)+1.96*MLE_exp_sd) + a[7,]=c(exp(MLE_bc),MLE_bc_exp_sd,exp(MLE_bc)-1.96*MLE_bc_exp_sd,exp(MLE_bc)+1.96*MLE_bc_exp_sd) + a[8,]=c(exp(J1),J1_exp_sd,exp(J1)-1.96*J1_exp_sd,exp(J1)+1.96*J1_exp_sd) + a[9,]=c(exp(HT),HT_exp_sd,exp(HT)-1.96*HT_exp_sd,exp(HT)+1.96*HT_exp_sd) + a[10,]=c(exp(MEE),MEE_exp_sd,exp(MEE)-1.96*MEE_exp_sd,exp(MEE)+1.96*MEE_exp_sd) + return(a) +} + +simpson_MLE_equ=function(X) +{ + X=X[X>0] + n=sum(X) + a=sum((X/n)^2) + a +} +simpson_MVUE_equ=function(X) +{ + X=X[X>0] + n=sum(X) + a=sum(X*(X-1))/n/(n-1) + a +} + +Simpson_index=function(x,boot=50) +{ + x=x[x>0] + n=sum(x) + MVUE=simpson_MVUE_equ(x) + MLE=simpson_MLE_equ(x) + + #ACE=SpecAbunAce(x)[1] + #AA=sum( ( x*(x-1)/n/(n-1)-x*(2*n-1)/n/(n-1)*MVUE )^2 ) + #BB=sum( x*(x-1)/n/(n-1)-x*(2*n-1)/n/(n-1)*MVUE ) + #MVUE_sd=(AA-BB^2/ACE)^0.5 + + #AA=sum( ( (x/n)^2-2*x/n*MLE )^2 ) + #BB=sum( (x/n)^2-2*x/n*MLE ) + #MLE_sd=(AA-BB^2/ACE)^0.5 + #MVUE_recip_sd=MVUE_sd/MVUE + #MLE_recip_sd=MLE_sd/MLE + p_hat=EstiBootComm.Ind(x) + Boot.X=rmultinom(boot,n,p_hat) + temp1=apply(Boot.X,2,simpson_MVUE_equ) + temp2=apply(Boot.X,2,simpson_MLE_equ) + MVUE_sd=sd(temp1) + MVUE_recip_sd=sd(1/temp1) + MLE_sd=sd(temp2) + MLE_recip_sd=sd(1/temp2) + + a=matrix(0,4,4) + a[1,]=c(MVUE,MVUE_sd,MVUE-1.96*MVUE_sd,MVUE+1.96*MVUE_sd) + a[2,]=c(MLE,MLE_sd,MLE-1.96*MLE_sd,MLE+1.96*MLE_sd) + a[3,]=c(1/MVUE,MVUE_recip_sd,1/MVUE-1.96*MVUE_recip_sd,1/MVUE+1.96*MVUE_recip_sd) + a[4,]=c(1/MLE,MLE_recip_sd,1/MLE-1.96*MLE_recip_sd,1/MLE+1.96*MLE_recip_sd) + return(a) +} +######################################################2015.09.14 +SpecInci <- function(data, k=10, conf=0.95) +{ + Chao2 <- SpecInciChao2(data, k = k, conf = conf) + Chao2bc <- SpecInciChao2bc(data, k = k, conf = conf) + iChao2 <- SpecInciiChao2(data, k=k, conf=conf)[1,] + Modelh <- SpecInciModelh(data, k = k, conf = conf)[-c(5)] + Modelh1 <- SpecInciModelh1(data, k = k, conf = conf)[-c(5)] + table <- rbind(Chao2, Chao2bc, iChao2, Modelh, Modelh1) + table <- round(table,1) + colnames(table) <- c("Estimate", "s.e.", "95%Lower", "95%Upper") + rownames(table) <- c("Chao2 (Chao, 1987)", "Chao2-bc","iChao2", "ICE (Lee & Chao, 1994)", "ICE-1 (Lee & Chao, 1994)") + return(table) +} + +EstiBootComm.Sam <- function(data) +{ + data = data[data>0] + T1 <- data[1] + X = data[-c(1)] + U = sum(X) #observed species + Q1 = length(X[X==1]) #singleton + Q2 = length(X[X==2]) #doubleton + if(Q2>0) + { + A = 2*Q2/((T1-1)*Q1+2*Q2) + } + else if(Q1>1) + { + A=2/((T1-1)*(Q1-1)+2) + }else + { + A=0 + } + C1 = 1 - Q1/U*(1 - A) + W = U/T1*(1 - C1)/sum(X/T1*(1-X/T1)^T1) #adjusted factor for rare species in the sample + Q0 = ceiling(ifelse(Q2>0, (T1-1)/T1*Q1^2/2/Q2, (T1-1)/T1*Q1*(Q1-1)/2)) #estimation of unseen species via Chao2 + Prob.hat = X/T1*(1-W*(1-X/T1)^T1) #estimation of detection probability of observed species in the sample + Prob.hat.Unse <- rep(U/T1*(1-C1)/Q0, Q0) #estimation of detection probability of unseen species in the sample + return(c(Prob.hat, Prob.hat.Unse)) #Output: a vector of estimated detection probability +} +entropy_MLE_Inci_equ <- function(X) +{ + X <- X[-1] + X <- X[X > 0] + U <- sum(X) + H_MLE <- -sum(X/U*log(X/U)) + return(H_MLE) +} +entropy_MLE_bc_Inci_equ <- function(X) +{ + t <- X[1] + X <- X[-1] + X <- X[X > 0] + U <- sum(X) + X_freq <- X[X > 10] + X_infreq <- X[X <= 10] + D_freq <- length(X_freq) + D_infreq <- length(X_infreq) + Q1 <- sum(X == 1) + Q2 <- sum(X == 2) + if(Q1 > 0 & Q2 > 0) + { + A <- 2*Q2/((t-1)*Q1 + 2*Q2) + } + else if (Q1 > 0 & Q2 == 0) + { + A <- 2/((t-1)*(Q1 - 1) + 2) + } + else + { + A <- 1 + } + C_infreq <- 1 - Q1/sum(X_infreq)*(1-A) + + j <- c(1:10) + b1 <- sum(sapply(j, function(j){j*(j-1)*sum(X == j)})) + b2 <- sum(sapply(j, function(j){j*sum(X == j)})) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t-1)*b1/b2/(b2-1) - 1, 0) + + ICE <- D_freq + D_infreq/C_infreq + Q1/C_infreq*gamma_infreq_square + + H_MLE <- -sum(X/U*log(X/U)) + H_MLE_bc <- H_MLE + (ICE/U + 1/t)/1 + + return(H_MLE_bc) +} +entropy_HT_Inci_equ <- function(X) +{ + t <- X[1] + X <- X[-1] + X <- X[X > 0] + U <- sum(X) + Q1 <- sum(X == 1) + Q2 <- sum(X == 2) + if(Q1 > 0 & Q2 > 0){ + A <- 2*Q2/((t-1)*Q1 + 2*Q2) + } else if (Q1 > 0 & Q2 == 0){ + A <- 2/((t-1)*(Q1 - 1) + 2) + } else { + A <- 1 + } + C <- 1 - Q1/U*(1-A) + H_HT <- t/U*(-sum(C*X/t*log(C*X/t)/(1-(1-C*X/t)^t))) + log(U/t) + return(H_HT) +} +entropy_MEE_Inci_equ <- function(X) +{ + t <- X[1] + X <- X[-1] + X <- X[X > 0] + U <- sum(X) + Q1 <- sum(X == 1) + Q2 <- sum(X == 2) + if(Q1 > 0 & Q2 > 0){ + A <- 2*Q2/((t-1)*Q1 + 2*Q2) + } else if (Q1 > 0 & Q2 == 0){ + A <- 2/((t-1)*(Q1 - 1) + 2) + } else { + A <- 1 + } + + UE <- sum(X/t*(digamma(t)-digamma(X))) + if(Q1 > 0 & A!=1){ + B <- Q1/t*(1-A)^(-t+1)*(-log(A)-sum(sapply(1:(t-1), function(k){1/k*(1-A)^k}))) + H_MEE <- t/U*(UE + B) + log(U/t) + }else{ + H_MEE <- t/U*UE + log(U/t) + } + return(H_MEE) +} +Shannon_Inci_index=function(x,boot=50) +{ + x = unlist(x) + t = x[1] + MLE=entropy_MLE_Inci_equ(x) + #MLE_bc=entropy_MLE_bc_Inci_equ(x) + #HT=entropy_HT_Inci_equ(x) + MEE=entropy_MEE_Inci_equ(x) + p_hat=EstiBootComm.Sam(x) + Boot.X = sapply(1:length(p_hat), function(i){ + rbinom(boot,t,p_hat[i])}) + Boot.X = cbind(rep(t,boot), Boot.X) + temp1=apply(Boot.X,1,entropy_MLE_Inci_equ) + #temp2=apply(Boot.X,1,entropy_MLE_bc_Inci_equ) + #temp4=apply(Boot.X,1,entropy_HT_Inci_equ) + temp5=apply(Boot.X,1,entropy_MEE_Inci_equ) + MLE_sd=sd(temp1) + #MLE_bc_sd=sd(temp2) + #HT_sd=sd(temp4) + MEE_sd=sd(temp5) + + MLE_exp_sd=sd(exp(temp1)) + #MLE_bc_exp_sd=sd(exp(temp2)) + #HT_exp_sd=sd(exp(temp4)) + MEE_exp_sd=sd(exp(temp5)) + + a=matrix(0,8,4) + a[1,]=c(MLE,MLE_sd,MLE-1.96*MLE_sd,MLE+1.96*MLE_sd) + #a[2,]=c(MLE_bc,MLE_bc_sd,MLE_bc-1.96*MLE_bc_sd,MLE_bc+1.96*MLE_bc_sd) + #a[3,]=c(HT,HT_sd,HT-1.96*HT_sd,HT+1.96*HT_sd) + a[4,]=c(MEE,MEE_sd,MEE-1.96*MEE_sd,MEE+1.96*MEE_sd) + a[5,]=c(exp(MLE),MLE_exp_sd,exp(MLE)-1.96*MLE_exp_sd,exp(MLE)+1.96*MLE_exp_sd) + #a[6,]=c(exp(MLE_bc),MLE_bc_exp_sd,exp(MLE_bc)-1.96*MLE_bc_exp_sd,exp(MLE_bc)+1.96*MLE_bc_exp_sd) + #a[7,]=c(exp(HT),HT_exp_sd,exp(HT)-1.96*HT_exp_sd,exp(HT)+1.96*HT_exp_sd) + a[8,]=c(exp(MEE),MEE_exp_sd,exp(MEE)-1.96*MEE_exp_sd,exp(MEE)+1.96*MEE_exp_sd) + return(a) +} +simpson_Inci_MVUE_equ=function(Y) +{ + t=Y[1] + Y=Y[-1] + Y=Y[Y>0] + U=sum(Y) + a=(sum(Y*(Y-1))/U^2/(1-1/t)) +} +simpson_Inci_MLE_equ=function(Y) +{ + t=Y[1] + Y=Y[-1] + Y=Y[Y>0] + a=(sum(Y^2)/sum(Y)^2) +} +Simpson_Inci_index=function(x,boot=200) +{ + x=x[x>0] + t = x[1] + MVUE=simpson_Inci_MVUE_equ(x) + MLE=simpson_Inci_MLE_equ(x) + + p_hat=EstiBootComm.Sam(x) + #set.seed(1) + Boot.X = sapply(1:length(p_hat), function(i){ + rbinom(boot,t,p_hat[i])}) + Boot.X = cbind(rep(t,boot), Boot.X) + temp1=apply(Boot.X,1,simpson_Inci_MVUE_equ) + temp2=apply(Boot.X,1,simpson_Inci_MLE_equ) + + MVUE_sd=sd(temp1) + MLE_sd=sd(temp2) + + #MVUE_recip_sd=MVUE_sd/MVUE + #MLE_recip_sd=MLE_sd/MLE + MVUE_recip_sd=sd(1/temp1) + MLE_recip_sd=sd(1/temp2) + + a=matrix(0,4,4) + a[1,]=c(MVUE,MVUE_sd,MVUE-1.96*MVUE_sd,MVUE+1.96*MVUE_sd) + a[2,]=c(MLE,MLE_sd,MLE-1.96*MLE_sd,MLE+1.96*MLE_sd) + a[3,]=c(1/MVUE,MVUE_recip_sd,1/MVUE-1.96*MVUE_recip_sd,1/MVUE+1.96*MVUE_recip_sd) + a[4,]=c(1/MLE,MLE_recip_sd,1/MLE-1.96*MLE_recip_sd,1/MLE+1.96*MLE_recip_sd) + return(a) +} +######################################################2015.09.14 +conf.reg=function(x,LCL,UCL,...) polygon(c(x,rev(x)),c(LCL,rev(UCL)), ...) + + +#X=read.table("Data4a.txt") +#Y=read.table("Data4b1_t.txt") +#Diversity(datatype="Abundance",X) +#Diversity(datatype="Frequencies_of_Frequencies",Y) + +print.spadeDiv <- function(x, digits = max(3L, getOption("digits") - 3L), ...){ + + if(x$datatype=="abundance"){ + + cat("\n(1) BASIC DATA INFORMATION:\n") + print(x$Basic_data) + cat("\n(2) ESTIMATION OF SPECIES RICHNESS (DIVERSITY OF ORDER 0):\n\n") + print(x$Species_richness) + cat(" + Descriptions of richness estimators (See Species Part) + ") + cat("\n(3a) SHANNON ENTROPY:\n\n") + print(x$Shannon_index) + #cat("\n") + #cat(" For a review of the four estimators, see Chao and Shen (2003).\n") + #MLE_bc: bias-corrected empirical estimator. + cat(" + MLE: empirical or observed entropy. + Jackknife: see Zahl (1977). + Chao & Shen: based on the Horvitz-Thompson estimator and sample coverage method; see Chao and Shen (2003). + see Chao and Shen (2003). + Chao et al. (2013): A nearly optimal estimator of Shannon entropy; see Chao et al. (2013). + Estimated standard error is computed based on a bootstrap method. + \n") + + cat("(3b) SHANNON DIVERSITY (EXPONENTIAL OF SHANNON ENTROPY):\n\n") + print(x$Shannon_diversity) + + cat("\n(4a) SIMPSON CONCENTRATION INDEX:\n\n") + print(x$Simpson_index) + + cat(" + MVUE: minimum variance unbiased estimator; see Eq. (2.27) of Magurran (1988). + MLE: maximum likelihood estimator or empirical index; see Eq. (2.26) of Magurran (1988). + ") + + cat("\n(4b) SIMPSON DIVERSITY (INVERSE OF SIMPSON CONCENTRATION):\n\n") + print(x$Simpson_diversity) + + cat("\n(5) CHAO AND JOST (2015) ESTIMATES OF HILL NUMBERS \n\n") + print(x$Hill_numbers) + + cat(" + ChaoJost: diversity profile estimator derived by Chao and Jost (2015). + Empirical: maximum likelihood estimator (observed index). + ") + }else{ + cat("\n(1) BASIC DATA INFORMATION:\n") + print(x$Basic_data) + cat("\n(2) ESTIMATION OF SPECIES RICHNESS (DIVERSITY OF ORDER 0):\n\n") + print(x$Species_richness) + cat(" + Descriptions of richness estimators (See Species Part) + ") + cat("\n(3a) SHANNON INDEX:\n\n") + print(x$Shannon_index) + + cat("\n(3b) EXPONENTIAL OF SHANNON INDEX (DIVERSITY OF ORDER 1):\n\n") + print(x$Shannon_diversity) + cat("\n(4a) SIMPSON INDEX:\n\n") + print(x$Simpson_index) + cat("\n(4b) INVERSE OF SIMPSON INDEX (DIVERSITY OF ORDER 2):\n\n") + print(x$Simpson_diversity) + cat("\n(5) Chao and Jost (2015) estimates of Hill numbers of order q from 0 to 3\n\n") + print(x$Hill_numbers) + + cat(" + ChaoJost: diversity profile estimator derived by Chao and Jost (2015). + Empirical: maximum likelihood estimator (observed index). + ") + + } + Lower=min(x$Hill_numbers[,3],x$Hill_numbers[,6]) + Upper=max(x$Hill_numbers[,4],x$Hill_numbers[,7]) + plot(0,type="n",xlim=c(min(x$Hill_numbers[,1]),max(x$Hill_numbers[,1])),ylim=c(Lower,Upper),xlab="Order q",ylab="Hill numbers") + conf.reg(x$Hill_numbers[,1],x$Hill_numbers[,3],x$Hill_numbers[,4], col=adjustcolor(2, 0.2), border=NA) + conf.reg(x$Hill_numbers[,1],x$Hill_numbers[,6],x$Hill_numbers[,7], col=adjustcolor(4, 0.2), border=NA) + lines(x$Hill_numbers[,1],x$Hill_numbers[,2],col=2,lwd=3) + lines(x$Hill_numbers[,1],x$Hill_numbers[,5],col=4,lty=3,lwd=3) + legend("topright", c("ChaoJost","Empirical"),col=c(2,4),lwd=c(3,3),lty=c(1,3),bty="n",cex=0.8) +} + + + +Chao_Hill_abu = function(x,q){ + x = x[x>0] + n = sum(x) + f1 = sum(x==1) + f2 = sum(x==2) + p1 = ifelse(f2>0,2*f2/((n-1)*f1+2*f2),ifelse(f1>0,2/((n-1)*(f1-1)+2),1)) + + Sub <- function(q){ + if(q==0){ + sum(x>0) + (n-1)/n*ifelse(f2>0, f1^2/2/f2, f1*(f1-1)/2) + } + else if(q==1){ + r <- 1:(n-1) + A <- sum(x/n*(digamma(n)-digamma(x))) + B <- ifelse(f1==0|p1==1,0,f1/n*(1-p1)^(1-n)*(-log(p1)-sum((1-p1)^r/r))) + exp(A+B) + }else if(abs(q-round(q))==0){ + A <- sum(exp(lchoose(x,q)-lchoose(n,q))) + ifelse(A==0,NA,A^(1/(1-q))) + }else { + sort.data = sort(unique(x)) + tab = table(x) + term = sapply(sort.data,function(z){ + k=0:(n-z) + sum(choose(k-q,k)*exp(lchoose(n-k-1,z-1)-lchoose(n,z))) + }) + r <- 0:(n-1) + A = sum(tab*term) + B = ifelse(f1==0|p1==1,0,f1/n*(1-p1)^(1-n)*(p1^(q-1)-sum(choose(q-1,r)*(p1-1)^r))) + (A+B)^(1/(1-q)) + } + } + sapply(q, Sub) +} + + + +Chao_Hill_inc = function(x,q){ + n = x[1] + x = x[-1];x = x[x>0] + U = sum(x) + f1 = sum(x==1) + f2 = sum(x==2) + p1 = ifelse(f2>0,2*f2/((n-1)*f1+2*f2),ifelse(f1>0,2/((n-1)*(f1-1)+2),1)) + r <- 0:(n-1) + Sub <- function(q){ + if(q==0){ + sum(x>0) + (n-1)/n*ifelse(f2>0, f1^2/2/f2, f1*(f1-1)/2) + } + else if(q==1){ + A <- sum(x/U*(digamma(n)-digamma(x))) + B <- ifelse(f1==0|p1==1,0,f1/U*(1-p1)^(1-n)*(-log(p1)-sum(sapply(1:(n-1), function(r)(1-p1)^r/r)))) + exp(A+B)*U/n + }else if(abs(q-round(q))==0){ + A <- sum(exp(lchoose(x,q)-lchoose(n,q))) + ifelse(A==0,NA,((n/U)^q*A)^(1/(1-q))) + }else { + sort.data = sort(unique(x)) + tab = table(x) + term = sapply(sort.data,function(z){ + k=0:(n-z) + sum(choose(k-q,k)*exp(lchoose(n-k-1,z-1)-lchoose(n,z))) + }) + A = sum(tab*term) + B = ifelse(f1==0|p1==1,0,f1/n*(1-p1)^(1-n)*(p1^(q-1)-sum(choose(q-1,r)*(p1-1)^r))) + ((n/U)^q*(A+B))^(1/(1-q)) + } + } + sapply(q, Sub) +} + + + +Chao_Hill = function(x,q,datatype = c("abundance","incidence_freq")){ + datatype = match.arg(datatype,c("abundance","incidence_freq")) + if(datatype == "abundance"){ + est = Chao_Hill_abu(x,q) + }else{ + est = Chao_Hill_inc(x,q) + } + return(est) +} + + + +Hill <- function(x,q,datatype = c("abundance","incidence_freq")){ + if(datatype=="incidence_freq"){x = x[-1]} + p <- x[x>0]/sum(x) + Sub <- function(q){ + if(q==0) sum(p>0) + else if(q==1) exp(-sum(p*log(p))) + else exp(1/(1-q)*log(sum(p^q))) + } + sapply(q, Sub) +} + + + +Bt_prob_abu = function(x){ + x = x[x>0] + n = sum(x) + f1 = sum(x==1) + f2 = sum(x==2) + C = 1 - f1/n*ifelse(f2>0,(n-1)*f1/((n-1)*f1+2*f2),ifelse(f1>0,(n-1)*(f1-1)/((n-1)*(f1-1)+2),0)) + W = (1-C)/sum(x/n*(1-x/n)^n) + + p.new = x/n*(1-W*(1-x/n)^n) + f0 = ceiling(ifelse(f2>0,(n-1)/n*f1^2/(2*f2),(n-1)/n*f1*(f1-1)/2)) + p0 = (1-C)/f0 + p.new=c(p.new,rep(p0,f0)) + return(p.new) +} + + + +Bt_prob_inc = function(x){ + n = x[1] + x = x[-1] + U = sum(x) + f1 = sum(x==1) + f2 = sum(x==2) + A = ifelse(f2>0,2*f2/((n-1)*f1+2*f2),ifelse(f1>0,2/((n-1)*(f1-1)+2),1)) + C=1-f1/U*(1-A) + W=U/n*(1-C)/sum(x/n*(1-x/n)^n) + + p.new=x/n*(1-W*(1-x/n)^n) + f0 = ceiling(ifelse(f2>0,(n-1)/n*f1^2/(2*f2),(n-1)/n*f1*(f1-1)/2)) + p0=U/n*(1-C)/f0 + p.new=c(p.new,rep(p0,f0)) + return(p.new) +} + + + +Bt_prob = function(x,datatype = c("abundance","incidence_freq")){ + datatype = match.arg(datatype,c("abundance","incidence_freq")) + if(datatype == "abundance"){ + prob = Bt_prob_abu(x) + }else{ + prob = Bt_prob_inc(x) + } + return(prob) +} + + +Bootstrap.CI = function(x,q,B = 200,datatype = c("abundance","incidence_freq"),conf = 0.95){ + datatype = match.arg(datatype,c("abundance","incidence_freq")) + p.new = Bt_prob(x,datatype) + n = ifelse(datatype=="abundance",sum(x),x[1]) + # set.seed(456) + if(datatype=="abundance"){ + data.bt = rmultinom(B,n,p.new) + }else{ + data.bt = rbinom(length(p.new)*B,n,p.new) + data.bt = matrix(data.bt,ncol=B) + #data.bt = rbind(rep(n,B),data.bt) + } + + mle = apply(data.bt,2,function(x)Hill(x,q,datatype)) + + ################ Abundance ############### + if(datatype == "abundance"){ + d1 = sort(unique(data.bt[data.bt>0])) + M = max(d1) + Sub = function(q){ + d2 = sapply(1:length(d1),function(i){ + k=0:(n-d1[i]) + sum(choose(k-q,k)*exp(lchoose(n-k-1,d1[i]-1)-lchoose(n,d1[i]))) + }) + d3 = rep(0,M) + d3[d1] = d2 + bt.pro = sapply(1:B,function(b){ + f1=sum(data.bt[,b]==1) + f2=sum(data.bt[,b]==2) + if(f2>0){ + A=2*f2/((n-1)*f1+2*f2) + }else if(f1!=0){ + A=2/((n-1)*(f1-1)+2) + }else{ + A=1 + } + if(q!=1){ + t1=table(data.bt[,b][data.bt[,b]>0]) + t2=as.numeric(names(t1)) + aa=d3[t2] + e1=sum(t1*aa) + + if(A==1){e2=0}else{ + r=0:(n-1) + e2=f1/n*(1-A)^(-n+1)*(A^(q-1)-sum(choose(q-1,r)*(A-1)^r)) + } + + if(e1+e2!=0){ + e=(e1+e2)^(1/(1-q)) + }else{e=NA} + + }else{ + y2=data.bt[,b][which(data.bt[,b]>0 & data.bt[,b]<=(n-1))] + e1=sum(y2/n*(digamma(n)-digamma(y2))) + + if(A==1){e2=0}else{ + r=1:(n-1) + e2=f1/n*(1-A)^(-n+1)*(-log(A)-sum((1-A)^r/r)) + } + e=exp(e1+e2) + } + e + }) + bt.pro + } + pro = t(sapply(q,Sub)) + }else{ + d1 = sort(unique(data.bt[data.bt>0])) + M = max(d1) + Sub = function(q){ + d2 = sapply(1:length(d1),function(i){ + k = 0:(n-d1[i]) + sum(choose(k-q,k)*exp(lchoose(n-k-1,d1[i]-1)-lchoose(n,d1[i]))) + }) + d3 = rep(0,M) + d3[d1] = d2 + + bt.pro = sapply(1:B,function(b){ + y2=data.bt[,b];y2 = y2[y2>0] + U = sum(y2) + Q1=sum(y2==1) + Q2=sum(y2==2) + if(Q2>0){ + A=2*Q2/((n-1)*Q1+2*Q2) + }else if(Q1!=0){ + A=2/((n-1)*(Q1-1)+2) + }else{ + A=1 + } + if(q!=1){ + t1=table(data.bt[,b][data.bt[,b]>0]) + t2=as.numeric(names(t1)) + aa=d3[t2] + e1=sum(t1*aa) + + if(A==1){e2=0}else{ + r=0:(n-1) + e2=Q1/n*(1-A)^(-n+1)*(A^(q-1)-sum(choose(q-1,r)*(A-1)^r)) + } + + if(e1+e2!=0){ + e=((n/U)^q*(e1+e2))^(1/(1-q)) + }else{e=NA} + + }else{ + + e1=sum(y2/U*(digamma(n)-digamma(y2))) + r =1:(n-1) + e2 = ifelse(Q1==0|A==1,0,Q1/U*(1-A)^(1-n)*(-log(A)-sum((1-A)^r/r))) + e = exp(e1+e2)*U/n + } + e + }) + bt.pro + } + + + pro = t(sapply(q,Sub)) + + } + + + #pro = apply(data.bt,2,function(x)Chao_Hill(x,q,datatype)) + + mle.mean = rowMeans(mle) + pro.mean = rowMeans(pro) + + LCI.mle = -apply(mle,1,function(x)quantile(x,probs = (1-conf)/2)) + mle.mean + UCI.mle = apply(mle,1,function(x)quantile(x,probs = 1-(1-conf)/2)) - mle.mean + + LCI.pro = -apply(pro,1,function(x)quantile(x,probs = (1-conf)/2)) + pro.mean + UCI.pro = apply(pro,1,function(x)quantile(x,probs = 1-(1-conf)/2)) - pro.mean + + LCI = rbind(LCI.mle,LCI.pro) + UCI = rbind(UCI.mle,UCI.pro) + + sd.mle = apply(mle,1,sd) + sd.pro = apply(pro,1,function(x)sd(x,na.rm = T)) + se = rbind(sd.mle,sd.pro) + + return(list(LCI=LCI,UCI=UCI,se=se)) + +} + + +ChaoHill <- function(dat, datatype=c("abundance", "incidence_freq"),q=NULL, from=0, to=3, interval=0.1, B=1000, conf=0.95){ + datatype = match.arg(datatype,c("abundance","incidence_freq")) + # for real data estimation + + if (is.matrix(dat) == T || is.data.frame(dat) == T){ + if (ncol(dat) != 1 & nrow(dat) != 1) + stop("Error: The data format is wrong.") + if (ncol(dat) == 1){ + dat <- dat[, 1] + } else { + dat <- dat[1, ] + } + } + dat <- as.numeric(dat) + if(is.null(q)){q <- seq(from, to, by=interval)} + if(!is.null(q)){q <- q} + #------------- + #Estimation + #------------- + MLE=Hill(dat,q,datatype) + + qD_pro=Chao_Hill(dat,q,datatype) + + CI_bound = Bootstrap.CI(dat,q,B,datatype,conf) + se = CI_bound$se + #------------------- + #Confidence interval + #------------------- + tab.est=data.frame(rbind(MLE,qD_pro)) + + LCI <- tab.est - CI_bound$LCI + UCI <- tab.est + CI_bound$UCI + + colnames(tab.est) <- colnames(se) <- colnames(LCI) <- colnames(UCI) <- paste("q = ", q, sep="") + rownames(tab.est) <- rownames(se) <- rownames(LCI) <- rownames(UCI) <- c("Observed", "Chao_2015") + return(list(EST = tab.est, + SD = se, + LCI = LCI, + UCI = UCI)) + +} + + + + +conf.reg=function(x_axis,LCL,UCL,...) { + x.sort <- order(x_axis) + x <- x_axis[x.sort] + LCL <- LCL[x.sort] + UCL <- UCL[x.sort] + polygon(c(x,rev(x)),c(LCL,rev(UCL)), ...) +} + + +reshapeChaoHill <- function(out){ + + tab <- data.frame(q=as.numeric(substring(colnames(out$EST), 5)), + method=rep(rownames(out$EST), each=ncol(out$EST)), + est=c(t(out$EST)[,1],t(out$EST)[,2]), + se=c(t(out$SD)[,1],t(out$SD)[,2]), + qD.95.LCL=c(t(out$LCI)[,1],t(out$LCI)[,2]), + qD.95.UCL=c(t(out$UCI)[,1],t(out$UCI)[,2])) + tab$est <- round(tab$est,3) + tab$se <- round(tab$se,3) + tab$qD.95.LCL <- round(tab$qD.95.LCL,3) + tab$qD.95.UCL <- round(tab$qD.95.UCL,3) + + tab +} + + +Diversity_Inc=function(X) +{ + X=X[,1] + + Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence_freq", from=0, to=3, interval=0.25, B=50, conf=0.95)) + #df$method <- factor(df$method, c("Observed", "Chao_2013"), c("Empirical", "Estimation")) + Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Hill[14:26,4],Hill[1:13,4]) + Hill<-round(Hill,3) + Hill <- data.frame(Hill) + colnames(Hill)<-c("q","Chao","Empirical","Chao(s.e.)","Empirical(s.e.)") + + + z <- list("HILL.NUMBERS"= Hill) + class(z) <- c("spadeDiv_Inc") + return(z) + + #cat("\n") + #cat("(5) FISHER ALPHA INDEX:\n\n") + #table_alpha=round(alpha(X),3) + #colnames(table_alpha)<-c("Estimator", "Est_s.e.", paste("95% Lower Bound"), paste("95% Upper Bound")) + #rownames(table_alpha)<-c(" alpha") + #print( table_alpha) + #cat("\n") + #cat(" See Eq. (2.9) of Magurran (1988) for a definition of Fisher's alpha index.\n") +} +#X=read.table("Data4a.txt") +#Y=read.table("Data4b1_t.txt") +#Diversity(datatype="Abundance",X) +#Diversity(datatype="Frequencies_of_Frequencies",Y) + +print.spadeDiv_Inc <- function(x, digits = max(3L, getOption("digits") - 3L), ...){ + + + cat("\n(5) The estimates of Hill's number at order q from 0 to 3\n\n") + print(x$HILL.NUMBERS) + + cat(" + Chao: see Chao and Jost (2015). + Empirical: maximum likelihood estimator. + ") + +} diff --git a/R/ExtenProbFunMa.R b/R/ExtenProbFunMa.R new file mode 100644 index 0000000..ffb98d3 --- /dev/null +++ b/R/ExtenProbFunMa.R @@ -0,0 +1,58 @@ +ExtenProbFunMa <- +function(z1, z2) { + x1 <- z1; x2 <- z2 # Sorted data + n1 <- sum(x1); n2 <- sum(x2) + D1 <- sum(x1 > 0); D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1; K2 <- (n2 - 1) / n2 + f0p <- ifelse(f2p == 0, + f0p <- ceiling(K1 * f1p * (f1p - 1) / 2 / (f2p + 1)), + f0p <- ceiling(K1 * f1p^2 / 2 / f2p)) + fp0 <- ifelse(fp2 == 0, + fp0 <- ceiling(K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1)), + fp0 <- ceiling(K2 * fp1^2 / 2 / fp2 )) + f00 <- ifelse(f22 == 0, + f00 <- ceiling(K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1)), + f00 <- ceiling(K1 * K2 * f11^2 / 4 / f22)) + + ga1 <- D1 - D12; ga2 <- D2 - D12 + # d1 <- max(f0p, ga2) # maximum of f0+ & gamma2 + # d2 <- max(fp0, ga1) # maximum of f+0 & gamma1 + + + ## Community 1 + tmp1 <- Cf0Fun(x1) + Chat1 <- tmp1[1] ; f0_1 <- tmp1[2] + more1 <- max(f0_1, (max(ga1, fp0) - ga1 + f0p + f00)) # unseen species in commuity 1 + add1 <- max(f0_1 - max(ga1, fp0) + ga1 - f0p - f00, 0) # unseen endemic in commuity 1 + lambda1 <- (1 - Chat1) / sum(x1 / n1 * (1 - x1 / n1)^n1) + pi1 <- x1 / n1 * (1 - lambda1 * (1 - x1 /n1)^n1) + p0_1 <- (1 - Chat1) / more1 + + ## Community 2 + tmp2 <- Cf0Fun(x2) + Chat2 <- tmp2[1] ; f0_2 <- tmp2[2] + more2 <- max(f0_2, (max(ga2, f0p) - ga2 + fp0 + f00)) # unseen species in commuity 2 + add2 <- max(f0_2 - max(ga2, f0p) + ga2 - fp0 - f00, 0) # unseen endemic in commuity 2 + lambda2 <- (1 - Chat2) / sum(x2 / n2 * (1 - x2 / n2)^n2) + pi2 <- x2 / n2 * (1 - lambda2 * (1 - x2 /n2)^n2) + p0_2 <- (1 - Chat2) / more2 + + ## Extension probility for Community 1 + prob1 <- c(pi1[pi1 > 0], rep(p0_1, max(fp0-ga1, 0)), rep(p0_1, f0p), + rep(0, max(ga2-f0p, 0)), rep(p0_1, f00), rep(p0_1, add1), + rep(0, add2)) + + ## Extension probility for Community 2 + prob2 <- c(pi2[1:D12], rep(p0_2, fp0), rep(0, max(ga1-fp0, 0)), + pi2[(D12 + ga1 + 1):(D12 + ga1 + ga2)], rep(p0_2, max(f0p-ga2, 0)), + rep(p0_2, f00), rep(0, add1), rep(p0_2, add2)) + out <- list(prob1=prob1, prob2=prob2) + return(out) +} diff --git a/R/Genetic_subroutine.R b/R/Genetic_subroutine.R new file mode 100644 index 0000000..7b304a5 --- /dev/null +++ b/R/Genetic_subroutine.R @@ -0,0 +1,438 @@ +GST_equ=function(X,method=c("est","mle")) +{ + if(method=="est"){ + X=as.matrix(X) + no.community=length(X[1,]) + n=colSums(X) + temp1=X%*%matrix(c(1/n),no.community,1) + temp2=sum(sapply(1:no.community,function(k) sum((X[,k]/n[k])^2) )) + + Hs_hat=1-1/no.community*sum(sapply(1:no.community,function(k) sum(X[,k]*(X[,k]-1)/n[k]/(n[k]-1)))) + Ht_hat=1-1/no.community^2*sum(sapply(1:no.community,function(k) sum(X[,k]*(X[,k]-1)/n[k]/(n[k]-1))))-1/no.community^2*(sum(temp1^2)-temp2) + GST=1-Hs_hat/Ht_hat + }else{ + X = as.matrix(X) + N <- length(X[1,]) + n <- colSums(X) + ps <- sapply(1:N, FUN = function(i){ + (X[,i] /n[i] )^2 + }) + Hs <- 1 - sum(ps)/N + Ht <- 1 - sum((apply(sapply(1:N, FUN = function(i){ + (X[,i] /n[i] ) + }),MARGIN = 1, FUN = sum)/N)^2) + GST=1-Hs/Ht + } + return(GST) +} +GST_se_equ <- function(X, nboot=50) +{ + nboot=50 + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + boot.gst.mle=rep(0,nboot) + boot.gst.est=rep(0,nboot) + gst.mle=GST_equ(X, method="mle") + gst.est=GST_equ(X, method="est") + for(i in 1:nboot) + { + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(j) rmultinom(1,sum(X[,j]),p[,j] ) ) + boot.gst.mle[i]=GST_equ(boot.X, method="mle") + boot.gst.est[i]=GST_equ(boot.X, method="est") + } + se_mle=sd(boot.gst.mle) + se_est=sd(boot.gst.est) + out1= plus_CI(c(gst.mle, se_mle)) + out2= plus_CI(c(gst.est, se_est)) + out <- rbind(out1, out2) + return(out) +} +print.spadeGenetic <- function(x, ...) +{ + cat('\n(1) BASIC DATA INFORMATION:\n\n') + cat(' The loaded set includes abundance (or frequency) data from',x$info[1],'subpopulations\n') + cat(' and a total of',x$info[2],'distinct alleles are found.\n\n') + cat(' Sample size in each subpopulation n1 =', x$info[3],'\n') + N <- x$info[1] + q <- x$q + for(j in 2:N){ + cat(' ','n') + cat(j,' =',x$info[2+j],'\n') + } + cat('\n') + cat(' Number of observed alleles in one subpopulation D1 =', x$info[N+3],'\n') + + for(j in 2:N){ + cat(' ','D') + cat(j,' =',x$info[N+2+j],'\n') + } + cat('\n') + cat(' Number of observed shared alleles in two subpopulations D12 =', x$info[3+2*N], '\n') + + if(N>2){ + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + if(i==1 & j==2) next + cat(' ','D') + cat(i,j,' = ', x$info[3+2*N+k], '\n', sep="") + k <- k + 1 + } + } + } + cat('\n') + if(N==3) + { + cat(' Number of shared alleles in three subpopulations D123 =',rev(x$info)[2],'\n\n') + } + cat(' Number of bootstrap replications for s.e. estimate ',rev(x$info)[1],'\n\n') + cat('(2) EMPIRICAL DIS-SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based dis-similarity\n\n') + temp <- apply(as.matrix(x$Empirical_richness), 2, as.numeric) + cat(' 1-C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n') + cat(' 1-U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n') + cat(' (b) Measures for comparing alleles relative abundances\n\n') + temp <- apply(as.matrix(x$Empirical_relative), 2, as.numeric) + cat(' 1-C1');cat(N);cat('=1-U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' 1-C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' 1-U2');cat(N);cat(' (q=2, Regional diff.) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Gst ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted alleles relative abundances\n\n') + temp <- x$Empirical_WtRelative + cat(' Horn size-weighted(q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat('(3) ESTIMATED DIS-SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based dis-similarity\n\n') + temp <- apply(as.matrix(x$estimated_richness), 2, as.numeric) + if(temp[1,1]>1) {cat(' 1-C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[1,1]<=1){cat(' 1-C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[2,1]>1) {cat(' 1-U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",1) ,' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + if(temp[2,1]<=1){cat(' 1-U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + cat(' (b) Measures for comparing alleles relative abundances\n\n') + temp <- apply(as.matrix(x$estimated_relative), 2, as.numeric) + cat(' 1-C1');cat(N);cat('=1-U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' 1-C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' 1-U2');cat(N);cat(' (q=2, Regional diff.) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Gst ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted alleles relative abundances\n\n') + temp <- x$estimated_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat('(4) ESTIMATED PAIRWISE DIS-SIMILARITY:\n\n') + if(q == 0){ + cat(' -----------------------Measure 1-C02------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + ###################################################CqN_ Equal weight + Cqn_PC <- x$pairwise$C02 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' 1-C02(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM=x$dissimilarity_matrix$C02 + cat(' 1-C02(i,j)') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM <- x$dissimilarity_matrix$U02 + + cat(' 1-U02(i,j)') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM=x$dissimilarity_matrix$C12 + cat(' 1-C12(i,j)') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM <- x$dissimilarity_matrix$Horn + + cat(' Horn(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM=x$dissimilarity_matrix$C22 + cat(' 1-C22(i,j)') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise dis-similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise dis-similarity matrix: \n\n') + C_SM <- x$dissimilarity_matrix$U22 + + cat(' 1-U22(i,j)') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i==j){cat(round(0,0),' ')} + if(i 0) + D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + x1_share <- x1[which(x1 > 0 & x2 > 0)] + x2_share <- x2[which(x1 > 0 & x2 > 0)] + f11 <- sum(x1_share == 1 & x2_share == 1) + f1.plus <- sum(x1_share == 1 & x2_share <= 10) + fplus.1 <- sum(x2_share == 1 & x1_share <= 10) + f2.plus <- sum(x1_share == 2 & x2_share <= 10) + fplus.2 <- sum(x2_share == 2 & x1_share <= 10) + D12_rare <- sum(x1_share <= 10 & x2_share <= 10) + + pos <- (x1 > 0 & x2 > 0) & (x1 > 10 | x2 > 10) + n1_rare <- n1 - sum(x1[pos]) + n2_rare <- n2 - sum(x2[pos]) + + # correct when n1_rare = 1 or n2_rare = 0 by Y.H. Lee + if (n1_rare == 1) + n1_rare <- 2 + if (n2_rare == 1) + n2_rare <- 2 + + pos_r <- (x1_share <= 10 & x2_share <= 10) + pos1_r <- (x1_share == 1 & x2_share <= 10) + pos2_r <- (x2_share == 1 & x1_share <= 10) + + tmp <- sum(x1_share[pos_r] * x2_share[pos_r]) + if (tmp == 0) # correct when number of Xi * Yi = 10 equal to 0 by Y.H. Lee + tmp <- 1 + + C12_rare <- 1 - (sum(x2_share[pos1_r]) + sum(x1_share[pos2_r]) - f11) / tmp + if (C12_rare == 0 || C12_rare > 1) # Correct when C12 = 0 or C12 > 1 !!! by c++ + C12_rare <- 1 + # C12_rare <- round(C12_rare, 4) + + T10 <- sum(x1_share[x1_share <= 10 & x2_share <= 10]) + T01 <- sum(x2_share[x1_share <= 10 & x2_share <= 10]) + T11 <- tmp + T21 <- sum(x1_share[pos_r] * (x1_share - 1)[pos_r] * x2_share[pos_r]) + T12 <- sum(x1_share[pos_r] * (x2_share - 1)[pos_r] * x2_share[pos_r]) + + T22 <- sum(x1_share[pos_r] * x2_share[pos_r] * + (x1_share - 1)[pos_r] * (x2_share - 1)[pos_r]) + + if (T10 == 0) # correct when equal to 0 by Y.H. Lee + T10 <- 1 + if (T11 == 0) + T11 <-1 + if (T01 == 0) + T01 <- 1 + + S12_0 <- D12_rare / C12_rare + CCV_1 <- S12_0 * n1_rare * T21 / (n1_rare - 1) / T10 / T11 - 1 + CCV_2 <- S12_0 * n2_rare * T12 / (n2_rare - 1) / T01 / T11 - 1 + CCV_12 <- n1_rare * n2_rare * S12_0^2 * T22 / + ((n1_rare - 1) * (n2_rare - 1) * T10 * T01 * T11) - + S12_0 * T11 / T10 / T01 - CCV_1 - CCV_2 + + tmp1 <- D12 - D12_rare + D12_rare / C12_rare + tmp2 <- (f1.plus * CCV_1 + fplus.1 * CCV_2 + f11 * CCV_12) / C12_rare + est <- tmp1 + tmp2 + return(est) +} diff --git a/R/HeteroFun.R b/R/HeteroFun.R new file mode 100644 index 0000000..2f75318 --- /dev/null +++ b/R/HeteroFun.R @@ -0,0 +1,13 @@ +HeteroFun <- +function(x1, x2, B, conf=0.95) { + est <- HeteroEstFun(x1, x2) + se <- BootstrapFunMa(x1, x2, B, FunName=HeteroEstFun) + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + out <- data.frame(out) + rownames(out) <- c("Heterogeneous(ACE-shared)") + + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/HomoEstFun.R b/R/HomoEstFun.R new file mode 100644 index 0000000..8b4689c --- /dev/null +++ b/R/HomoEstFun.R @@ -0,0 +1,25 @@ +HomoEstFun <- +function(x1, x2) { + D1 <- sum(x1 > 0) + D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + x1_share <- x1[which(x1 > 0 & x2 > 0)] + x2_share <- x2[which(x1 > 0 & x2 > 0)] + f11 <- sum(x1_share == 1 & x2_share == 1) + + D12_rare <- sum(x1_share <= 10 & x2_share <= 10) + + pos_r <- (x1_share <= 10 & x2_share <= 10) + pos1_r <- (x1_share == 1 & x2_share <= 10) + pos2_r <- (x2_share == 1 & x1_share <= 10) + + tmp <- sum(x1_share[pos_r] * x2_share[pos_r]) + if (tmp == 0) # correct when number of Xi * Yi = 0 equal to 0 by Y.H. Lee + tmp <- 1 + + C12_rare <- 1 - (sum(x2_share[pos1_r]) + sum(x1_share[pos2_r]) - f11) / tmp + if (C12_rare == 0 || C12_rare > 1) # Correct when C12 = 0 or C12 > 1 !!! by c++ + C12_rare <- 1 + est <- D12 - D12_rare + D12_rare / C12_rare + return(est) +} diff --git a/R/HomoFun.R b/R/HomoFun.R new file mode 100644 index 0000000..771c357 --- /dev/null +++ b/R/HomoFun.R @@ -0,0 +1,11 @@ +HomoFun <- +function(x1, x2, B, conf=0.95) { + est <- HomoEstFun(x1, x2) + se <- BootstrapFunMa(x1, x2, B, FunName=HomoEstFun) + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Homogeneous") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/InfreqSpeciesGroupInprove.R b/R/InfreqSpeciesGroupInprove.R new file mode 100644 index 0000000..9218668 --- /dev/null +++ b/R/InfreqSpeciesGroupInprove.R @@ -0,0 +1,12 @@ +InfreqSpeciesGroup <- function(data, k){ + data <- data[-1] + data <- as.numeric(data) + Q <- function(i, data){length(data[which(data == i)])} + + x <- data[which(data != 0)] + r <- c(1:k) + Rare.Species.Group <- matrix(sapply(r, function(r)Q(r, x)), 1, k) + rownames(Rare.Species.Group) <- c(" Incidence freq. counts") + colnames(Rare.Species.Group) <- paste("Q", r, sep="") + return(Rare.Species.Group) +} diff --git a/R/Multiple_Community_Measure_subroutine.R b/R/Multiple_Community_Measure_subroutine.R new file mode 100644 index 0000000..df80807 --- /dev/null +++ b/R/Multiple_Community_Measure_subroutine.R @@ -0,0 +1,1530 @@ +###########################################2016.07.24-(P.L.Lin) +Horn_MLE_Multi = function(X, method=c("equal effort", "equal weight")) +{ + N <- ncol(X) + n <- apply(X = X, MARGIN = 2, FUN = sum) + p <- sapply(1:N, FUN = function(i){ + X[, i]/n[i] + }) + if(method == "equal effort"){ + w <- n/sum(n) + }else{w <- rep(1/N,N)} + pbar <- w%*%t(p) + pbar <- pbar[pbar>0] + Hr <- sum(- pbar*log( pbar)) + Ha <- sum(sapply(1:N, FUN = function(i){ + p <- p[, i][p[, i]>0] + w[i]*sum(-p*log(p)) + })) + Ch <- 1-(Hr-Ha)/sum(-w*log(w)) + out=c(Ch) + return(out) +} +Multi_Esti_GammaEntropy <- function(Mat, method="H-T") +{ + n <- apply(Mat, 2, sum) + N <- ncol(Mat) + pij <- sweep(Mat, 2, colSums(Mat), "/") + pi. <- rowMeans(pij) + + if(method=="H-T"){ + Cj <- 1 - apply(Mat,2,function(x)sum(x==1)/sum(x)) + tmpFun <- function(Mat){ + tmp <- rowSums(Mat)==1 + if(sum(tmp)==0){ + rep(0,ncol(Mat)) + } else if(sum(tmp)==1){ + Mat[tmp,] + } else { + colSums(Mat[tmp,]) + } + } + + Ct <- 1 - mean(tmpFun(Mat)/n) + A <- matrix(0,ncol=ncol(Mat), nrow=nrow(Mat)) + for(j in 1:N){ + A[,j] <- (1-Cj[j]*pij[,j])^n[j] + } + + -sum(Ct*pi.*log(Ct*pi.)/(1-apply(A,1,prod)), na.rm=TRUE) + } else{ + pi. <- pi.[pi.>0] + -sum(pi.*log(pi.)) #MLE + } +} +Equal_weight_Horn_Esti_equ=function(X, datatype="abundance") +{ + nboot=50 + boot.Horn=rep(0,nboot) + for(i in 1:nboot) + { + if(datatype=="abundance"){ + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(k) rmultinom(1, sum(X[,k]), p[,k])) + }else{ + p <- Boots.pop_inc(X) + boot.X=sapply(1:dim(X)[2],function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, X[1,k], p[i,1]) )) + } + boot.Hr=Multi_Esti_GammaEntropy(boot.X, method="H-T") + boot.Ha=mean(sapply(1:dim(X)[2],function(j) entropy_MEE_equ(boot.X[,j]))) + boot.Horn[i]=1-(boot.Hr-boot.Ha)/log(dim(X)[2]) + } + if(datatype=="incidence") X <- X[-1, ] + Hr=Multi_Esti_GammaEntropy(X, method="H-T") + Ha=mean(sapply(1:dim(X)[2],function(j) entropy_MEE_equ(X[,j]))) + Horn=1-(Hr-Ha)/log(dim(X)[2]) + se_hat=sd(boot.Horn) + out=c(Horn,se_hat,Horn-1.96*se_hat,Horn+1.96*se_hat) + return(out) +} +Boots.pop=function(data) +{ + N=ncol(data);n=colSums(data); + pool=rowSums(data);OBS=length(pool[pool>0]); + data=data[pool>0,]; + obs=sapply(1:N,function(k) length(data[,k][data[,k]>0])); + F1=sum(pool==1);F2=sum(pool==2); + F0=round(ifelse(F2==0,F1*(F1-1)/2,F1^2/(2*F2))); + + f1=sapply(1:N,function(k) sum(data[,k]==1)); + f2=sapply(1:N,function(k) sum(data[,k]==2)); + C=sapply(1:N,function(k) 1-f1[k]/n[k]); + + f0=round(sapply(1:N,function(k) ifelse(f2[k]==0,f1[k]*(f1[k]-1)/2,f1[k]^2/(2*f2[k])))); + r.data=sapply(1:N,function(k) data[,k]/n[k]); + W=sapply(1:N,function(k) (1-C[k])/sum(r.data[,k]*(1-r.data[,k])^n[k])) + + if(F0>0){ boots.pop=rbind(r.data,matrix(0,ncol=N,nrow=F0)) + }else{boots.pop=r.data} + + for(i in 1:N) + { + if(f0[i]>0) + { + f0[i]=ifelse(f0[i]+obs[i]>OBS+F0, OBS+F0-obs[i],f0[i]) + boots.pop[,i][1:OBS]=boots.pop[,i][1:OBS]*(1-W[i]*(1-boots.pop[,i][1:OBS])^n[i]) # + I=which(boots.pop[,i]==0);II=sample(I,f0[i]) + boots.pop[II,i]=rep((1-C[i])/f0[i],f0[i]) + } + } + return(boots.pop) +} +Boots.pop_inc=function(data) +{ + data <- as.matrix(data) + t=data[1,];Tt=sum(t);data=data[-1,]; + N=ncol(data); + pool=rowSums(data);OBS=length(pool[pool>0]); + data=data[pool>0,]; + obs=sapply(1:N,function(k) length(data[,k][data[,k]>0])); + Q1=sum(pool==1);Q2=sum(pool==2); + Q0=round(((Tt-1)/Tt)*ifelse(Q2==0,Q1*(Q1-1)/2,Q1^2/(2*Q2))); + + q1=sapply(1:N,function(k) sum(data[,k]==1)); + q2=sapply(1:N,function(k) sum(data[,k]==2)); + P1=sapply(1:N,function(k) ifelse(q1[k]+q2[k]==0,0,2*q2[k]/((t[k]-1)*q1[k]+2*q2[k]))); + + q0=round(sapply(1:N,function(k) ((t[k]-1)/t[k])*ifelse(q2[k]==0,q1[k]*(q1[k]-1)/2,q1[k]^2/(2*q2[k])))); + r.data=sapply(1:N,function(k) data[,k]/t[k]); + W=sapply(1:N,function(k) (1-P1[k])*(q1[k]/t[k])/sum(r.data[,k]*(1-r.data[,k])^t[k])); + + if(Q0>0){ boots.pop=rbind(r.data,matrix(0,ncol=N,nrow=Q0)) + }else{boots.pop=r.data} + + for(i in 1:N){ + if(q0[i]>0){ + q0[i]=ifelse(q0[i]+obs[i]>OBS+Q0, OBS+Q0-obs[i],q0[i]) + boots.pop[,i][1:OBS]=boots.pop[,i][1:OBS]*(1-W[i]*(1-boots.pop[,i][1:OBS])^t[i]) # + I=which(boots.pop[,i]==0);II=sample(I,q0[i]) + boots.pop[II,i]=rep((q1[i]/t[i])/q0[i],q0[i]) + } + } + return(boots.pop) +} +Horn_Multi_equ <- function(X, datatype="abundance", nboot=50,method=c("equal", "unequal")) +{ + boot=matrix(0,2,nboot) + for(i in 1:nboot) + { + if(datatype=="abundance"){ + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(k) rmultinom(1, sum(X[,k]), p[,k])) + }else{ + p <- Boots.pop_inc(X) + boot.X=sapply(1:dim(X)[2],function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, X[1,k], p[i,1]) )) + } + boot[,i]=Horn.Est(boot.X, method) + } + se <- apply(boot, MARGIN = 1, FUN = sd) + if(datatype=="incidence") X <- X[-1, ] + value <- Horn.Est(as.matrix(X), method) + out <- c(value[1], se[1], max(0,value[1]-1.96*se[1]), min(1,value[1]+1.96*se[1])) + out2 <- c(value[2], se[2],max(0,value[2]-1.96*se[2]), min(1,value[2]+1.96*se[2])) + return(list(est=out,mle=out2)) + return(out) +} +SimilarityMul=function(X ,q, nboot=50, datatype="abundance", method=c("equal weight","unequal weight")) +{ + if(datatype=="incidence"){ + Y <- X ; X <- X[-1, ] + } + N=ncol(X);ni=colSums(X);n=sum(X); + pool=rowSums(X); + bX=apply(X,2,function(x) x/sum(x));pool.x=rowSums(bX)/N; + + if(q==0){ + f1=apply(X,2,function(x) sum(x==1)); + f2=apply(X,2,function(x) sum(x==2)); + Sobs=apply(X,2,function(x) sum(x>0)); + Si=Sobs+sapply(1:N, function(k) ifelse(f2[k]==0, f1[k]*(f1[k]-1)/2,f1[k]^2/(2*f2[k]))); + Sa=mean(Si); + UqN.mle=(1/N-mean(Sobs)/sum(pool>0))/(1/N-1); + CqN.mle=(N-sum(pool>0)/mean(Sobs))/(N-1); + + F1=sum(pool==1);F2=sum(pool==2); + Sg=sum(pool>0)+ifelse(F2==0,F1*(F1-1)/2,F1^2/(2*F2)); + UqN=min(1,(1/N-Sa/Sg)/(1/N-1));UqN=max(0,UqN); + CqN=min(1,(N-Sg/Sa)/(N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot); b.UqN.mle=numeric(nboot); + b.CqN=numeric(nboot); b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + p <- Boots.pop(X) + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + p <- Boots.pop_inc(Y) + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, Y[1,k], p[i,1]) )) + } + f1=apply(XX,2,function(x) sum(x==1)); + f2=apply(XX,2,function(x) sum(x==2)); + Sobs=apply(XX,2,function(x) sum(x>0)); + Si=Sobs+sapply(1:N,function(k) ifelse(f2[k]==0,f1[k]*(f1[k]-1)/2,f1[k]^2/(2*f2[k]))) + Sa=mean(Si); + pool=rowSums(XX); + b.UqN.mle[i]=(1/N-mean(Sobs)/sum(pool>0))/(1/N-1); + b.CqN.mle[i]=(N-sum(pool>0)/mean(Sobs))/(N-1); + + F1=sum(pool==1);F2=sum(pool==2); + Sg=sum(pool>0)+ifelse(F2==0,F1*(F1-1)/2,F1^2/(2*F2)); + b.UqN[i]=min(1,(1/N-Sa/Sg)/(1/N-1)); b.UqN[i]=max(0,b.UqN[i]); + b.CqN[i]=min(1,(N-Sg/Sa)/(N-1));b.CqN[i]=max(0,b.CqN[i]); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle); #standard deviations of UqN est. and mle + se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); #standard deviations of CqN est. and mle + + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + + if(q==2){ + if(method=="equal weight"){ + a.mle=N/sum(bX^2) + g.mle=1/sum(pool.x^2); + b.mle=g.mle/a.mle; + UqN.mle=(N-b.mle)/(N-1); + CqN.mle=(1/N-1/b.mle)/(1/N-1); + + Ai=sapply(1:N,function(k) sum(X[,k]*(X[,k]-1)/(ni[k]*(ni[k]-1)))); + bX.1=apply(X,2,function(x) (x-1)/(sum(x)-1)); + temp=sapply(1:nrow(X),function(j) (sum(bX[j,]%*%t(bX[j,]))-sum(bX[j,]^2))+sum(bX[j,]*bX.1[j,])); + G=1/(sum(temp)/N^2); + + B=G/(1/mean(Ai)); + UqN=min(1,(N-B)/(N-1));UqN=max(0,UqN); + CqN=min(1,(1/N-1/B)/(1/N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot);b.UqN.mle=numeric(nboot); b.CqN=numeric(nboot);b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + p <- Boots.pop(X) + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + p <- Boots.pop_inc(Y) + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, Y[1,k], p[i,1]) )) + } + bXX=apply(XX,2,function(x) x/sum(x)); + pool.x=rowSums(bXX)/N; + a.mle=N/sum(bXX^2);g.mle=1/sum(pool.x^2);b.mle=g.mle/a.mle; + b.UqN.mle[i]=(N-b.mle)/(N-1); + b.CqN.mle[i]=(1/N-1/b.mle)/(1/N-1); + + Ai=sapply(1:N,function(k) sum(XX[,k]*(XX[,k]-1)/(ni[k]*(ni[k]-1)))); + bXX.1=apply(XX,2,function(x) (x-1)/(sum(x)-1)); + temp=sapply(1:nrow(XX),function(j) (sum(bXX[j,]%*%t(bXX[j,]))-sum(bXX[j,]^2))+sum(bXX[j,]*bXX.1[j,])); + G=1/(sum(temp)/N^2); + + B=G/(1/mean(Ai)); + b.UqN[i]=(N-B)/(N-1); + b.CqN[i]=(1/N-1/B)/(1/N-1); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle);se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + if(method=="unequal weight"){ + a.mle=1/(N*sum((X/n)^2));g.mle=1/sum((pool/n)^2);b.mle=g.mle/a.mle; + UqN.mle=(N-b.mle)/(N-1); + CqN.mle=(1/N-1/b.mle)/(1/N-1); + + A=(1/N)*(1/sum(X*(X-1)/(n*(n-1)))); + G=1/sum(pool*(pool-1)/(n*(n-1))); + B=G/A; + UqN=min(1,(N-B)/(N-1));UqN=max(0,UqN); + CqN=min(1,(1/N-1/B)/(1/N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot);b.UqN.mle=numeric(nboot);b.CqN=numeric(nboot);b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + p <- Boots.pop(X) + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + p <- Boots.pop_inc(Y) + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, Y[1,k], p[i,1]) )) + } + pool=rowSums(XX); + a.mle=1/(N*sum((XX/n)^2));g.mle=1/sum((pool/n)^2);b.mle=g.mle/a.mle; + b.UqN.mle[i]=(N-b.mle)/(N-1); + b.CqN.mle[i]=(1/N-1/b.mle)/(1/N-1); + + A=(1/N)*(1/sum(XX*(XX-1)/(n*(n-1)))); + G=1/sum(pool*(pool-1)/(n*(n-1))); + B=G/A; + b.UqN[i]=(N-B)/(N-1); + b.CqN[i]=(1/N-1/B)/(1/N-1); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle);se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + } + out1 <- cbind(out1[,c(1, 2)], out1[, 4], out1[, 3]) + colnames(out1)=c("UqN","se","95%.Lower","95%.Upper") + rownames(out1)=c("Emperical","Estimate") + out2 <- cbind(out2[,c(1, 2)], out2[, 4], out2[, 3]) + colnames(out2)=c("CqN","se","95%.Lower","95%.Upper") + rownames(out2)=c("Emperical","Estimate") + return(list(UqN=out1,CqN=out2)); +} +Cq2_est_equ <- function(X, q, boot, datatype="abundance" ,method=c("equal effort", "equal weight")) +{ + N <- ncol(X) + if(datatype=="abundance"){ + n <- apply(X = X, MARGIN = 2, FUN = sum) + }else{ + n <- apply(X = X[-1,], MARGIN = 2, FUN = sum) + } + weight <- n/sum(n) + weight <- - sum(weight*log(weight)) / log(N) + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + if(q == 0){ + mat <- Jaccard_Sorensen_Abundance_equ(datatype, X[, 1], X[, 2], boot)[, c(1, 2)] + out1 <- plus_CI(c(mat[4,1],mat[4,2])) + out2 <- plus_CI(c(mat[2,1],mat[2,2])) + out=rbind(out1, out2) + } + if(method == "equal effort"){ + if(q == 1){ + out2 = Two_Horn_equ(X[, 1], X[, 2], weight = "unequal", datatype, method = "est", boot) + out1 = plus_CI(c(weight*out2[1],out2[2])) + out=rbind(out1, out2) + } + if(q == 2){ + if(datatype=="incidence"){ + out=C2N_ee_se_inc(X, boot) + out<-rbind(plus_CI(out[3,]),plus_CI(out[4,])) + }else{ + out=SimilarityTwo(X, q, boot, datatype, method="unequal weight") + out<-rbind(out$CqN[2,], out$UqN[2,]) + } + } + } + if(method == "equal weight"){ + if(q == 1){ + out = Two_Horn_equ(X[, 1], X[, 2], weight = "equal", datatype, method = "est", boot) ; out=rbind(out, out) + } + if(q == 2){ + out=SimilarityTwo(X, q, boot, datatype, method="equal weight") + out<-rbind(out$CqN[2,], out$UqN[2,]) + } + } + colnames(out) <- c("Est","se","95%.Lower","95%.Upper") + return(out) +} +BC_equ <- function(X, datatype="abundance", nboot) +{ + boot=matrix(0,2,nboot) + for(i in 1:nboot) + { + if(datatype=="abundance"){ + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(k) rmultinom(1, sum(X[,k]), p[,k])) + }else{ + p <- Boots.pop_inc(X) + boot.X=sapply(1:dim(X)[2],function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, X[1,k], p[i,1]) )) + } + boot[,i]=BC.Est(boot.X) + } + se <- apply(boot, MARGIN = 1, FUN = sd) + if(datatype=="incidence") X <- X[-1, ] + value <- BC.Est(as.matrix(X)) + out <- c(value[1], se[1], max(0,value[1]-1.96*se[1]), min(1,value[1]+1.96*se[1])) + out2 <- c(value[2], se[2],max(0,value[2]-1.96*se[2]), min(1,value[2]+1.96*se[2])) + return(list(est=out,mle=out2)) +} +########################################### +C0n_equ=function(X) +{ + X=as.matrix(X) + I=which(X>0) + X[I]=rep(1,length(I)) + no.community=length(X[1,]) + C0n_num=sum(rowSums(X)>0) + C0n_dem=sum(X) + C0n=no.community/(1-no.community)*( C0n_num-C0n_dem)/C0n_dem + return( C0n) +} +correct_obspi<- function(X) +{ + Sobs <- sum(X > 0) + n <- sum(X) + f1 <- sum(X == 1) + f2 <- sum(X == 2) + if(f1>0 & f2>0){a= (n - 1) * f1 / ((n - 1) * f1 + 2 * f2) * f1 / n} + if(f1>0 & f2==0){a= f1 / n*(n-1)*(f1-1)/((n-1)*(f1-1)+2)} + if(f1==1 & f2==0){a=0} + if(f1==0){a=0} + b <- sum(X / n * (1 - X / n) ^ n) + w <- a / b + Prob.hat <- X / n * (1 - w * (1 - X / n) ^ n) + Prob.hat +} +entropy_MEE_equ=function(X) +{ + x=X + x=x[x>0] + n=sum(x) + UE <- sum(x/n*(digamma(n)-digamma(x))) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0) + { + A <-1-ifelse(f2 > 0, (n-1)*f1/((n-1)*f1+2*f2), (n-1)*f1/((n-1)*f1+2)) + B=sum(x==1)/n*(1-A)^(-n+1)*(-log(A)-sum(sapply(1:(n-1),function(k){1/k*(1-A)^k}))) + } + if(f1==0){B=0} + if(f1==1 & f2==0){B=0} + UE+B +} +Two_com_correct_obspi=function(X1,X2) +{ + n1=sum(X1) + n2=sum(X2) + f11=sum(X1==1) + f12=sum(X1==2) + f21=sum(X2==1) + f22=sum(X2==2) + C1=1-f11/n1*(n1 - 1) * f11 / ((n1 - 1) * f11 + 2 * f12) + C2=1-f21/n2*(n2 - 1) * f21 / ((n2 - 1) * f21 + 2 * f22) + + PP1=correct_obspi(X1) + PP2=correct_obspi(X2) + D12=which(X1>0 & X2>0) + + f0hat_1=ceiling( ifelse(f12 == 0, f11 * (f11 - 1) / 2, f11 ^ 2/ 2 / f12) ) + f0hat_2=ceiling( ifelse(f22 == 0, f21 * (f21 - 1) / 2, f21 ^ 2/ 2 / f22) ) + #----------------------------------------------------------------------------- + + r1=which(X1>0 & X2==0) + f.1=length(which(X1>0 & X2==1)) + f.2=length(which(X1>0 & X2==2)) + f.0=ceiling(ifelse(f.2>0,f.1^2/2/f.2,f.1*(f.1-1)/2)) + #------------------------------------------------------------------------------ + r2=which(X1==0 & X2>0) + f1.=length(which(X1==1 & X2>0)) + f2.=length(which(X1==2 & X2>0)) + f0.=ceiling(ifelse(f2.>0,f1.^2/2/f2.,f1.*(f1.-1)/2)) + #------------------------------------------------------------------------------ + t11=length(which(X1==1 & X2==1)) + t22=length(which(X1==2 & X2==2)) + f00hat=ceiling( ifelse(t22 == 0, t11 * (t11 - 1) / 4, t11 ^ 2/ 4 / t22) ) + #------------------------------------------------------------------------------ + temp1=max(length(r1),f.0)-length(r1)+f0.+f00hat + temp2=max(length(r2),f0.)-length(r2)+f.0+f00hat + p0hat_1=(1-C1)/max(f0hat_1,temp1) + p0hat_2=(1-C2)/max(f0hat_2,temp2) + #------------------------------------------------------------------------------ + P1=PP1[D12] + P2=PP2[D12] + if(length(r1)> f.0) + { + P1=c(P1,PP1[r1]) + Y=c(rep(p0hat_2, f.0), rep(0,length(r1)-f.0)) + P2=c(P2,sample(Y,length(Y)) ) + } + if(length(r1)< f.0) + { + P1=c(P1,PP1[r1],rep( p0hat_1,f.0-length(r1))) + P2=c(P2,rep(p0hat_2, f.0) ) + } + #---------------------------------------------------------------------------- + if(length(r2)> f0.) + { + Y=c(rep(p0hat_1,f0.),rep(0,length(r2)- f0.)) + P1=c(P1,sample(Y,length(Y))) + P2=c(P2,PP2[r2] ) + } + if(length(r2)< f0.) + { + P1=c(P1,rep(p0hat_1,f0.)) + P2=c(P2,PP2[r2],rep( p0hat_2,f0.-length(r2)) ) + } + P1=c(P1,rep( p0hat_1,f00hat)) + P2=c(P2,rep( p0hat_2,f00hat)) + P1=c(P1, rep(p0hat_1,max( f0hat_1-temp1,0)) , rep( 0 ,max( f0hat_2-temp2,0)) ) + P2=c(P2, rep( 0 ,max( f0hat_1-temp1,0)) , rep(p0hat_2,max( f0hat_2-temp2,0)) ) + #------------------------------------------------------------------------------------ + a=cbind(P1,P2) + return(a) +} +Horn_MLE_Multi_equ <- function(X, datatype="abundance", nboot=50,method=c("equal effort", "equal weight")) +{ + boot.Horn=rep(0,nboot) + for(i in 1:nboot) + { + if(datatype=="abundance"){ + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(k) rmultinom(1, sum(X[,k]), p[,k])) + }else{ + p <- Boots.pop_inc(X) + boot.X=sapply(1:dim(X)[2],function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, X[1,k], p[i,1]) )) + } + boot.Horn[i]=Horn_MLE_Multi(boot.X, method) + } + if(datatype=="incidence") X <- X[-1, ] + Horn=Horn_MLE_Multi(X, method) + se_hat=sd(boot.Horn) + out=c(Horn,se_hat,Horn-1.96*se_hat,Horn+1.96*se_hat) + return(out) +} +C1n_equ=function(method=c("absolute","relative"), X, datatype="abundance" , boot=200) +{ + X=as.matrix(X) + if(datatype=="incidence"){ + Y <- X ; X <- X[-1, ] + } + n=colSums(X) + min_n=min(n) + temp=sum(ifelse(n==min_n,0,1)) + no.community=length(X[1,]) + if(method=="relative") + { + if(temp>0) + { + D1_alpha=sum( sapply(1:no.community,function(k) entropy_MEE_equ(X[,k])) )/no.community + Horn=rep(0,200) + rarefy.X=matrix(0,length(X[,1]),no.community) + for(h in 1:200) + { + for(j in 1:no.community) + { + if(n[j]==min_n){rarefy.X[,j]=X[,j]} + if(n[j]>min_n) + { + Y=X[,j] + total=n[j] + y=0 + for(i in 1:min_n) + { + P=Y/total + z=rmultinom(1,1,P) + Y=Y-z + total=total-1 + y=y+z + } + rarefy.X[,j]=y + } + } + Horn[h]=( entropy_MEE_equ(rowSums(rarefy.X))-D1_alpha)/log(no.community) + } + C1n=1-mean(Horn) + C1n_se=sd(Horn) + a=c( C1n, C1n_se, max(0,C1n-1.96*C1n_se), min(1,C1n+1.96*C1n_se)) + return(a) + } + if(temp==0) + { + D1_alpha=sum( sapply(1:no.community,function(k) entropy_MEE_equ(X[,k])) )/(no.community) + D1_gamma=entropy_MEE_equ(rowSums(X)) + Horn=(D1_gamma-D1_alpha)/log(no.community) + if(no.community==2) + { + p_hat=Two_com_correct_obspi(X[,1],X[,2]) + boot.Horn=rep(0,boot) + for(h in 1:boot) + { + boot.X=cbind(rmultinom(1,n[1],p_hat[,1]),rmultinom(1,n[2],p_hat[,2]) ) + boot.D1_alpha=sum( sapply(1:2,function(k) entropy_MEE_equ(boot.X[,k])) )/no.community + boot.Horn[h]=( entropy_MEE_equ(rowSums(boot.X))-boot.D1_alpha)/log(no.community) + } + C1n=1-Horn + C1n_se=sd(boot.Horn) + a=c( C1n, C1n_se, max(0,C1n-1.96*C1n_se), min(1,C1n+1.96*C1n_se)) + return(a) + } + if(no.community>2) + { + boot.Horn=rep(0,boot) + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(k) rmultinom(1,n[k],X[,k]/n[k]) ) + boot.D1_alpha=sum( sapply(1:no.community,function(k) entropy_MEE_equ(boot.X[,k])) )/no.community + boot.Horn[h]=( entropy_MEE_equ(rowSums(boot.X))-boot.D1_alpha)/log(no.community) + } + C1n=1-Horn + C1n_se=sd(boot.Horn) + a=c( C1n, C1n_se, max(0,C1n-1.96*C1n_se), min(1,C1n+1.96*C1n_se)) + return(a) + } + } + } + if(method=="absolute") + { + pool.size=sum(n) + w=n/pool.size + D1_alpha=sum( sapply(1:no.community,function(k) w[k]*entropy_MEE_equ(X[,k])) ) + D1_gamma=entropy_MEE_equ(rowSums(X)) + Horn=(D1_gamma-D1_alpha)/sum( -w*log(w)) + if(no.community==2) + { + p_hat=Two_com_correct_obspi(X[,1],X[,2]) + boot.Horn=rep(0,boot) + for(h in 1:boot) + { + boot.X=cbind(rmultinom(1,n[1],p_hat[,1]),rmultinom(1,n[2],p_hat[,2]) ) + boot.D1_alpha=sum( sapply(1:2,function(k) w[k]*entropy_MEE_equ(boot.X[,k])) ) + boot.Horn[h]=( entropy_MEE_equ(rowSums(boot.X))-boot.D1_alpha)/sum( -w*log(w)) + } + C1n=1-Horn + C1n_se=sd(boot.Horn) + a=c( C1n, C1n_se, max(0,C1n-1.96*C1n_se), min(1,C1n+1.96*C1n_se)) + return(a) + } + if(no.community>2) + { + boot.Horn=rep(0,boot) + for(h in 1:boot) + { + if(datatype=="abundance"){ + p <- Boots.pop(X) + boot.X=sapply(1:dim(X)[2],function(k) rmultinom(1, sum(X[,k]), p[,k])) + }else{ + p <- Boots.pop_inc(Y) + boot.X=sapply(1:dim(Y)[2],function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, Y[1,k], p[i,1]) )) + } + boot.D1_alpha=sum( sapply(1:no.community,function(k) w[k]*entropy_MEE_equ(boot.X[,k])) ) + boot.Horn[h]=( entropy_MEE_equ(rowSums(boot.X))-boot.D1_alpha)/sum( -w*log(w)) + } + C1n=1-Horn + C1n_se=sd(boot.Horn) + a=c( C1n, C1n_se, max(0,C1n-1.96*C1n_se), min(1,C1n+1.96*C1n_se)) + return(a) + } + } +} +C2n_equ=function(method=c("MVUE","MLE"),X) +{ + X=as.matrix(X) + no.community=length(X[1,]) + sobs=sum(rowSums(X)>0) + n=colSums(X) + temp1=X%*%matrix(c(1/n),no.community,1) + temp2=sum(sapply(1:no.community,function(k) sum((X[,k]/n[k])^2) )) + if(method=="MVUE") + { + c2n_dem=sum(sapply(1:no.community,function(k) sum(X[,k]*(X[,k]-1)/n[k]/(n[k]-1)))) + } + if(method=="MLE") + { + c2n_dem=sum(sapply(1:no.community,function(k) sum(X[,k]*(X[,k])/n[k]/(n[k])))) + } + c2n_num=(sum(temp1^2)-temp2)/(no.community-1) + c2n=c2n_num/c2n_dem + return(c2n) +} +Cqn_se_equ=function(X,q=2,boot=200,method=c("relative","absolute")) +{ + X=as.matrix(X) + no.community=length(X[1,]) + n=colSums(X) + p_hat=sapply(1:no.community,function(j) X[,j]/n[j]) + + if(q==0) + { + C0n=C0n_equ(X) + boot.C0n=rep(0,boot) + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(j) rmultinom(1,n[j],p_hat[,j]) ) + boot.C0n[h]=C0n_equ(boot.X) + } + C0n_se=sd(boot.C0n) + a=c(C0n,C0n_se,max(0,C0n-1.96*C0n_se),min(1,C0n+1.96*C0n_se) ) + return(a) + } + if(q==1) + { + a=C1n_equ(method,X,boot) + return(a) + } + if(q==2) + { + C2n=C2n_equ(method="MVUE",X) + boot.C2n=rep(0,boot) + if(C2n>1) + { + C2n=C2n_equ(method="MLE",X) + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(j) rmultinom(1,n[j],p_hat[,j]) ) + boot.C2n[h]=C2n_equ(method="MLE",boot.X) + } + } + if(C2n<=1) + { + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(j) rmultinom(1,n[j],p_hat[,j]) ) + boot.C2n[h]=C2n_equ(method="MVUE",boot.X) + } + } + C2n_se=sd(boot.C2n) + Bootmean=mean(boot.C2n) + a=c(C2n,C2n_se,max(0,C2n-Bootmean+quantile(boot.C2n, probs = 0.025)),min(1,C2n-Bootmean+quantile(boot.C2n, probs = 0.975)), + max(0,1-C2n-(1-Bootmean)+(1-quantile(boot.C2n, probs = 0.975))),min(1,1-C2n-(1-Bootmean)+(1-quantile(boot.C2n, probs = 0.025))) + ) + return(a) + } +} +C33_equ=function(method=c("MVUE","MLE"),X) +{ + X=as.matrix(X) + n=colSums(X) + if(method=="MVUE") + { + C33_num=sum( 3*X[,1]*(X[,1]-1)/n[1]/(n[1]-1)*(X[,2]/n[2]+X[,3]/n[3])+ + 3*X[,2]*(X[,2]-1)/n[2]/(n[2]-1)*(X[,1]/n[1]+X[,3]/n[3])+ + 3*X[,3]*(X[,3]-1)/n[3]/(n[3]-1)*(X[,1]/n[1]+X[,2]/n[2])+ + 6*X[,1]/n[1]*X[,2]/n[2]*X[,3]/n[3])/8 + C33_dem=sum(sapply(1:3,function(k) sum( X[,k]*(X[,k]-1)*(X[,k]-2)/n[k]/(n[k]-1)/(n[k]-2)))) + } + if(method=="MLE") + { + C33_num=sum( 3*X[,1]*(X[,1])/n[1]/(n[1])*(X[,2]/n[2]+X[,3]/n[3])+ + 3*X[,2]*(X[,2])/n[2]/(n[2])*(X[,1]/n[1]+X[,3]/n[3])+ + 3*X[,3]*(X[,3])/n[3]/(n[3])*(X[,1]/n[1]+X[,2]/n[2])+ + 6*X[,1]/n[1]*X[,2]/n[2]*X[,3]/n[3])/8 + C33_dem=sum(sapply(1:3,function(k) sum( X[,k]*(X[,k])*(X[,k])/n[k]/(n[k])/(n[k])))) + } + return(C33_num/C33_dem) +} +C33_se_equ=function(X,boot=200) +{ + X=as.matrix(X) + no.community=length(X[1,]) + C33=C33_equ(method="MVUE",X) + n=colSums(X) + p_hat=sapply(1:no.community,function(j) X[,j]/n[j]) + boot.C33=rep(0,boot) + if(C33>1) + { + C33=C33_equ(method="MLE",X) + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(j) rmultinom(1,n[j],p_hat[,j]) ) + boot.C33[h]=C33_equ(method="MLE",boot.X) + } + } + if(C33<=1) + { + for(h in 1:boot) + { + boot.X=sapply(1:no.community,function(j) rmultinom(1,n[j],p_hat[,j]) ) + boot.C33[h]=C33_equ(method="MVUE",boot.X) + } + } + C33_se=sd(boot.C33) + Bootmean=mean(boot.C33) + a=c(C33,C33_se,max(0,C33-Bootmean+quantile(boot.C33, probs = 0.025)),min(1,C33-Bootmean+quantile(boot.C33, probs = 0.975)), + max(0,1-C33-(1-Bootmean)+(1-quantile(boot.C33, probs = 0.975))),min(1,1-C33-(1-Bootmean)+(1-quantile(boot.C33, probs = 0.025))) + ) + return(a) +} + +#Multiple_Community_Measure=function(X,q=2,boot=200,method=c("relative","absolute")) + +print.spadeMult <- function(x, ...){ + if(x$datatype=="abundance"){ + cat('\n(1) BASIC DATA INFORMATION:\n\n') + cat(' The loaded set includes abundance/incidence data from',x$info[1],'communities\n') + cat(' and a total of',x$info[2],'species.\n\n') + cat(' Sample size in each community n1 =', x$info[3],'\n') + N <- x$info[1] + q <- x$q + method <- x$goal + for(j in 2:N){ + cat(' ','n') + cat(j,' =',x$info[2+j],'\n') + } + cat('\n') + cat(' Number of observed species in one community D1 =', x$info[N+3],'\n') + + for(j in 2:N){ + cat(' ','D') + cat(j,' =',x$info[N+2+j],'\n') + } + cat('\n') + cat(' Number of observed shared species in two communities D12 =', x$info[3+2*N], '\n') + + if(N>2){ + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + if(i==1 & j==2) next + cat(' ','D') + cat(i,j,' = ', x$info[3+2*N+k], '\n', sep="") + k <- k + 1 + } + } + } + cat('\n') + if(N==3) + { + cat(' Number of observed shared species in three communities D123 =',rev(x$info)[2],'\n\n') + } + cat(' Number of bootstrap replications for s.e. estimate ',rev(x$info)[1],'\n\n') + cat('(2) EMPIRICAL SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$Empirical_richness), 2, as.numeric) + cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n') + cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$Empirical_relative), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$Empirical_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$Empirical_absolute), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat('(3) ESTIMATED SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$estimated_richness), 2, as.numeric) + if(temp[1,1]>1) {cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[1,1]<=1){cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[2,1]>1) {cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",1) ,' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + if(temp[2,1]<=1){cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + #cat(' Lennon et al (2001) ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$estimated_relative), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$estimated_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$estimated_absolute), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + + }else{ + + cat('\n(1) BASIC DATA INFORMATION:\n\n') + cat(' The loaded set includes abundance/incidence data from',x$info[[1]],'communities\n') + cat(' and a total of',x$info[2],'species.\n\n') + + cat(' Number of sample units in each community T1 =', x$info[3],'\n') + N <- x$info[1] + q <- x$q + method <- x$goal + for(j in 2:N){ + cat(' ','T') + cat(j,' =',x$info[2+j],'\n') + } + cat('\n') + cat(' Number of total incidences in each community U1 =', x$info[N+3],'\n') + N <- x$info[1] + q <- x$q + method <- x$goal + for(j in 2:N){ + cat(' ','U') + cat(j,' =',x$info[N+2+j],'\n') + } + cat('\n') + cat(' Number of observed species in one community D1 =', x$info[2*N+3],'\n') + for(j in 2:N){ + cat(' ','D') + cat(j,' =',x$info[2*N+2+j],'\n') + } + cat('\n') + cat(' Number of observed shared species in two communities D12 =', x$info[3+3*N], '\n') + + if(N>2){ + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + if(i==1 & j==2) next + cat(' ','D') + cat(i,j,' = ', x$info[3+3*N+k], '\n', sep="") + k <- k + 1 + } + } + } + cat('\n') + if(N==3) + { + cat(' Number of observed shared species in three communities D123 =',rev(x$info)[2],'\n\n') + } + cat(' Number of bootstrap replications for s.e. estimate ',rev(x$info)[1],'\n\n') + cat('(2) EMPIRICAL SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$Empirical_richness), 2, as.numeric) + cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n') + cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$Empirical_relative), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$Empirical_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$Empirical_absolute), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat('(3) ESTIMATED SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$estimated_richness), 2, as.numeric) + if(temp[1,1]>1) {cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[1,1]<=1){cat(' C0');cat(N);cat(' (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[2,1]>1) {cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",1) ,' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + if(temp[2,1]<=1){cat(' U0');cat(N);cat(' (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + #cat(' Lennon et al (2001) ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$estimated_relative), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$estimated_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$estimated_absolute), 2, as.numeric) + cat(' C1');cat(N);cat('=U1');cat(N);cat(' (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C2');cat(N);cat(' (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U2');cat(N);cat(' (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + } + cat('(4) ESTIMATED PAIRWISE SIMILARITY:\n\n') + if(q == 0){ + cat(' -------------------------Measure C02---------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + Cqn_PC <- x$pairwise$C02 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' C02(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise similarity=',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$C02 + cat(' C02(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + cat(' -------------------------Measure U02---------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + Cqn_PC <- x$pairwise$U02 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' U02(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + + cat('\n') + cat(' Average pairwise similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$U02 + cat(' U02(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + } + if(q == 1){ + cat(' ----------------------Measure C12 (=U12)------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + ###################################################CqN_ Equal weight + Cqn_PC <- x$pairwise$C12 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' C12(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise similarity=',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$C12 + cat(' C12(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + if(method=="relative"){ + cat(' -------------------Measure Horn size-weighted--------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + Cqn_PC <- x$pairwise$Horn + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' Horn(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + + cat('\n') + cat(' Average pairwise similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$Horn + + cat(' Horn(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + } + } + if(q == 2){ + cat(' -------------------------Measure C22---------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + Cqn_PC <- x$pairwise$C22 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' C22(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise similarity=',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$C22 + cat(' C22(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + ###################################################UqN_ Equal weight + cat(' -------------------------Measure U22---------------------------\n\n') + cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') + Cqn_PC <- x$pairwise$U22 + no.temp=1 + for(i in 1:(N-1)) + { + for(j in (i+1):N) + { + temp=Cqn_PC[no.temp,] + cat(' U22(') + cat(i) + cat(',') + cat(j) + if(temp[1]>1) + {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') + } + if(temp[1]<=1) + {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', + sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} + no.temp=no.temp+1 + } + } + cat('\n') + cat(' Average pairwise similarity =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n\n') + cat(' Pairwise similarity matrix: \n\n') + C_SM=x$similarity.matrix$U22 + cat(' U22(i,j) ') + for(i in 1:N) + { + cat(i," ") + } + cat('\n') + for(i in 1:N) + { + cat(' ',i,' ') + for(j in 1:N) + { + if(i>j){cat(' ')} + if(i<=j) { + if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' ') + else cat(sprintf("%.3f#",1),' ') + } + } + cat('\n') + } + cat('\n') + } + + cat(' NOTE: Any estimate greater than 1 is replaced by 1; any estimate less than 0 is replaced by 0.') + #cat(' + # References: + # + # Chao, A., Jost, L., Chiang, S. C., Jiang, Y.-H. and Chazdon, R. (2008). A Two- + # stage probabilistic approach to multiple-community similarity indices. + # Biometrics, 64, 1178-1186. + # + # Jost, L. (2008). GST and its relatives do not measure differentiation. Molecular + # Ecology, 17, 4015-4026. + # ') + cat('\n') +} + + + + +#print.spadeMult <- function(x, ...){ +# cat('\n(1) BASIC DATA INFORMATION:\n\n') +# cat(' The loaded set includes abundance (or frequency) data from',x$info[1],'communities\n') +# cat(' and a total of',x$info[2],'distinct species.\n\n') +# cat(' (Number of observed individuals in each community) n1 =', x$info[3],'\n') +# N <- x$info[1] +# q <- x$q +# for(j in 2:N){ +# cat(' ','n') +# cat(j,'=',x$info[2+j],'\n') +# } +# cat('\n') +# cat(' (Number of observed species in one community) D1 =', x$info[N+3],'\n') +# +# for(j in 2:N){ +# cat(' ','D') +# cat(j,'=',x$info[N+2+j],'\n') +# } +# cat('\n') +# cat(' (Number of observed shared species in two communities) ','D12 =', x$info[3+2*N], '\n') +# +# if(N>2){ +# k <- 1 +# for(i in 1:(N-1)){ +# for(j in (i+1):N){ +# if(i==1 & j==2) next +# cat(' ','D') +# cat(i,j,' = ', x$info[3+2*N+k], '\n', sep="") +# k <- k + 1 +# } +# } +# } +# cat('\n') +# if(N==3) +# { +# cat(' (Number of observed shared species in three communities)','D123','=',rev(x$info)[2],'\n\n') +# } +# cat(' (Bootstrap replications for s.e. estimate) ',rev(x$info)[1],'\n\n') +# cat('(2) ESTIMATION OF OVERLAP MEASURE IN',N,'COMMUNITIES:\n\n') +# cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') +# temp0n=x$overlap[1,] +# if(temp0n[1]>1) +# { +# cat(' C0') +# cat(N,'(Sorensen)',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp0n[2]),' (', +# sprintf("%.3f",temp0n[3]),',',sprintf("%.3f",temp0n[4]),')\n') +# } +# if(temp0n[1]<=1) +# { +# cat(' C0') +# cat(N,'(Sorensen)',sprintf("%.3f",temp0n[1]),' ',sprintf("%.3f",temp0n[2]),' (', +# sprintf("%.3f",temp0n[3]),',',sprintf("%.3f",temp0n[4]),')\n') +# } +# #temp1n=x$overlap[2,] +# #cat(' C1') +# #cat(N,' ',sprintf("%.3f",temp1n[1]),' ',sprintf("%.3f",temp1n[2]),' (', +# # sprintf("%.3f",temp1n[3]),',',sprintf("%.3f",temp1n[4]),')\n') +# temp1n=x$overlap[2,] +# if(temp1n[1]>1) +# { +# cat(' C1') +# cat(N) +# cat('*(Horn) ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp1n[2]),' (', +# sprintf("%.3f",temp1n[3]),',',sprintf("%.3f",temp1n[4]),')\n') +# } +# if(temp1n[1]<=1) +# { +# cat(' C1') +# cat(N) +# cat('*(Horn) ',sprintf("%.3f",temp1n[1]),' ',sprintf("%.3f",temp1n[2]),' (', +# sprintf("%.3f",temp1n[3]),',',sprintf("%.3f",temp1n[4]),')\n') + +# } + +# #cat('* ',sprintf("%.3f",1-temp1n[1]),' ',sprintf("%.3f",temp1n[2]),' (', +# # sprintf("%.3f",max(1-temp1n[1]-1.96*temp1n[2],0)),',',sprintf("%.3f",min(1-temp1n[1]+1.96*temp1n[2],1)),')\n') +# temp2n=x$overlap[3,] +# cat(' C2') +# cat(N,'(Morisita)',sprintf("%.3f",temp2n[1]),' ',sprintf("%.3f",temp2n[2]),' (', +# sprintf("%.3f",temp2n[3]),',',sprintf("%.3f",temp2n[4]),')\n') + +# if(N==3) +# { +# temp33=x$overlap[4,] +# cat(' C33',' ',sprintf("%.3f",temp33[1]),' ',sprintf("%.3f",temp33[2]),' (', +# sprintf("%.3f",temp33[3]),',',sprintf("%.3f",temp33[4]),')\n') +# } +# cat('\n') +# cat(' C0') +# cat(N,': A similarity measure of comparing',N, +# 'communities using empirical method.\n') +# #cat(' C1') +# #cat(N,': A similarity measure of comparing',N, +# # 'communities based on equal sample size among all communities.\n') +# cat(' C1') +# cat(N) +# cat('*') +# cat(': A similarity measure of comparing',N, +# 'communities based on equal-effort sample size among all communities.\n') +# cat(' C2') +# cat(N,': A similarity measure of comparing',N,'communities based on shared information between any two communities.\n') +# if(N==3) +# { +# cat(' C33',': A similarity measure of comparing 3 communities using all shared information.\n') +# } +# cat(' +# Confidence Interval: Based on an improved bootstrap percentile method. (recommend for use in the case when +# similarity is close to 0 or 1 ) \n\n') +# cat(' # if the estimate is greater than 1, it is replaced by 1.\n\n') +# cat(' Pairwise Comparison:\n\n') +# cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') +# Cqn_PC <- x$pairwise +# no.temp=1 +# for(i in 1:(N-1)) +# { +# for(j in (i+1):N) +# { +# temp=Cqn_PC[no.temp,] +# if(q==0){cat(' C02(')} +# if(q==1 & x$method=="relative"){cat(' C12(')} +# if(q==1 & x$method=="absolute"){cat(' C12*(')} +# if(q==2){cat(' C22(')} +# cat(i) +# cat(',') +# cat(j) +# if(temp[1]>1) +# {cat(')',' ',sprintf("%.3f",1) ,'# ',sprintf("%.3f",temp[2]),' (', +# sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n') +# } +# if(temp[1]<=1) +# {cat(')',' ',sprintf("%.3f",temp[1]),' ',sprintf("%.3f",temp[2]),' (', +# sprintf("%.3f",temp[3]),',',sprintf("%.3f",temp[4]),')\n')} +# no.temp=no.temp+1 +# } +# } + +# cat('\n') +# cat(' Average Pairwise =',sprintf("%.3f",mean(Cqn_PC[,1])),'\n') +# if(q==0 || q==1) +# { +# cat(' +# If the lower bound is less than 0, it is replaced by 0; if the upper bound +# is greater than 1, it is replaced by 1.\n\n') +# } +# if(q==2) +# { +# cat(' +# MLE is used for replacing nearly unbiased estimate because the estimate +# is greater than 1. +# If the lower bound is less than 0, it is replaced by 0; if the upper bound +# is greater than 1, it is replaced by 1.\n\n') +# } +# cat(' Similarity Matrix: \n\n') +# C_SM=x$similarity.matrix +# +# if(q==0){cat(' C02(i,j) \t')} +# if(q==1 & x$method=="relative"){cat(' C12(i,j) \t')} +# if(q==1 & x$method=="absolute"){cat(' C12*(i,j) ')} +# if(q==2){cat(' C22(i,j) \t')} +# for(i in 1:N) +# { +# cat(i,'\t') +# } +# cat('\n') +# for(i in 1:N) +# { +# cat(' ',i,'\t') +# for(j in 1:N) +# { +# if(i>j){cat('\t')} +# if(i<=j) { +# if(C_SM[i,j]<=1) cat(sprintf("%.3f",C_SM[i,j]),' \t') +# else cat(sprintf("%.3f#",1),'\t') +# } +# } +# cat('\n') +# } +# if(q==2) +# { +# cat('\n') +# cat('(3) ESTIMATION OF MORISITA DISSIMILARITY IN',N,'COMMUNITIES\n\n') +# cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') +# cat(' 1 - C2') +# cat(N,' ',sprintf("%.3f",1-temp2n[1]),' ',sprintf("%.3f",temp2n[2]),' (', +# sprintf("%.3f",max(0,1-temp2n[1]-1.96*temp2n[2])),',',sprintf("%.3f",min(1,1-temp2n[1]+1.96*temp2n[2])),')\n') +# if(N==3) +# { +# cat(' 1 - C33',' ',sprintf("%.3f",1-temp33[1]),' ',sprintf("%.3f",temp33[2]),' (', +# sprintf("%.3f",max(0,1-temp33[1]-1.96*temp33[2])),',',sprintf("%.3f",min(1,1-temp33[1]+1.96*temp33[2])),')\n') +# } +# cat('\n') +# cat(' 1 - C2') +# cat(N,': This is the genetic diversity measure D defined in Jost (2008) for +# comparing',N,'communities.\n') +# if(N==3) +# { +# cat(' 1 - C33',': A genetic diversity measure for comparing 3 subpopulations based on +# all shared information.\n') +# } +# cat('\n') +# cat(' Pairwise Comparison:\n\n') +# cat(' Estimator',' Estimate',' s.e.',' 95% Confidence Interval\n\n') +# no.temp=1 +# for(i in 1:(N-1)) +# { +# for(j in (i+1):N) +# { +# temp=Cqn_PC[no.temp,] +# cat(' 1-C22(') +# cat(i) +# cat(',') +# cat(j) +# cat(')',' ',sprintf("%.3f",1-temp[1]),' ',sprintf("%.3f",temp[2]),' (', +# sprintf("%.3f",temp[5]),',',sprintf("%.3f",temp[6]),')\n') +# no.temp=no.temp+1 +# } +# } +# cat('\n') +# cat(' Average Pairwise =',sprintf("%.3f",1-mean(Cqn_PC[,1])),'\n') +# cat(' +# MLE is used for replacing nearly unbiased estimate because the estimate +# is greater than 1. +# If the lower bound is less than 0, it is replaced by 0; if the upper bound +# is greater than 1, it is replaced by 1.\n\n') +# cat(' 1-C22: This is the genetic diversity measure D defined in Jost (2008) for +# comparing 2 subpopulations.') +# cat('\n\n') +# cat(' Dissimilarity Matrix: \n\n') +# cat(' 1-C22(i,j)\t') +# for(i in 1:N) +# { +# cat(i,'\t') +# } +# cat('\n') +# for(i in 1:N) +# { +# cat(' ',i,'\t') +# for(j in 1:N) +# { +# if(i>j){cat('\t')} +# if(i<=j){ +# if(C_SM[i,j]<=1) cat(sprintf("%.3f",1-C_SM[i,j]),' \t') +# else cat(sprintf("%.3f#",1),' \t') +# } +# } +# cat('\n') +# } +# cat('\n') +# } +# cat(' +# References: + +# Chao, A., Jost, L., Chiang, S. C., Jiang, Y.-H. and Chazdon, R. (2008). A Two- +# stage probabilistic approach to multiple-community similarity indices. +# Biometrics, 64, 1178-1186. +# +# Jost, L. (2008). GST and its relatives do not measure differentiation. Molecular +# Ecology, 17, 4015-4026. +# ') +# cat('\n') +# } + + + diff --git a/R/PanEstFun.R b/R/PanEstFun.R new file mode 100644 index 0000000..6e614de --- /dev/null +++ b/R/PanEstFun.R @@ -0,0 +1,23 @@ +PanEstFun <- +function(x1, x2) { + n1 <- sum(x1) + n2 <- sum(x2) + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + if (f2p == 0 || fp2 == 0 || f22 == 0) { + est <- D12 + K1 * f1p * (f1p - 1) / 2 / (f2p + 1) + + K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1) + + K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1) + } else { + est <- D12 + K1 * f1p^2 / 2 / f2p + K2 * fp1^2 / 2 / fp2 + + K1 * K2 * f11^2 / 4 / f22 + } + return(est) +} diff --git a/R/PanEstFun.Sam.R b/R/PanEstFun.Sam.R new file mode 100644 index 0000000..42114a7 --- /dev/null +++ b/R/PanEstFun.Sam.R @@ -0,0 +1,25 @@ +PanEstFun.Sam <- +function(y1, y2) { + n1 <- y1[1] + n2 <- y2[1] + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + if (f2p == 0 || fp2 == 0 || f22 == 0) { + est <- D12 + K1 * f1p * (f1p - 1) / 2 / (f2p + 1) + + K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1) + + K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1) + } else { + est <- D12 + K1 * f1p^2 / 2 / f2p + K2 * fp1^2 / 2 / fp2 + + K1 * K2 * f11^2 / 4 / f22 + } + return(est) +} diff --git a/R/PanFun.R b/R/PanFun.R new file mode 100644 index 0000000..b7f14a7 --- /dev/null +++ b/R/PanFun.R @@ -0,0 +1,18 @@ +PanFun <- +function(x1, x2, conf=0.95) { + f22 <- sum(x1 == 2 & x2 == 2) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + est <- PanEstFun(x1, x2) + if (f2p == 0 || fp2 == 0 || f22 == 0) { + se <- VarEstFun(x1, x2, diffFun=diff_Panbc, FunName=PanbcEstFun) + } else { + se <- VarEstFun(x1, x2, diffFun=diff_Pan, FunName=PanEstFun) + } + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Lower-bound") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/PanFun.Sam.R b/R/PanFun.Sam.R new file mode 100644 index 0000000..708e647 --- /dev/null +++ b/R/PanFun.Sam.R @@ -0,0 +1,20 @@ +PanFun.Sam <- +function(y1, y2, conf=0.95) { + x1 <- y1[-1] + x2 <- y2[-1] + f22 <- sum(x1 == 2 & x2 == 2) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + est <- PanEstFun.Sam(y1, y2) + if (f2p == 0 || fp2 == 0 || f22 == 0) { + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Panbc, FunName=PanbcEstFun.Sam) + } else { + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Pan, FunName=PanEstFun.Sam) + } + CI <- logCI(y1, y2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Lower-bound") + colnames(out) <- c("Estimate", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/PanbcEstFun.R b/R/PanbcEstFun.R new file mode 100644 index 0000000..3462b26 --- /dev/null +++ b/R/PanbcEstFun.R @@ -0,0 +1,18 @@ +PanbcEstFun <- +function(x1, x2) { + n1 <- sum(x1) + n2 <- sum(x2) + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + est <- D12 + K1 * f1p * (f1p - 1) / 2 / (f2p + 1) + + K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1) + + K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1) + return(est) +} diff --git a/R/PanbcEstFun.Sam.R b/R/PanbcEstFun.Sam.R new file mode 100644 index 0000000..cebb2fd --- /dev/null +++ b/R/PanbcEstFun.Sam.R @@ -0,0 +1,20 @@ +PanbcEstFun.Sam <- +function(y1, y2) { + n1 <- y1[1] + n2 <- y2[1] + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + est <- D12 + K1 * f1p * (f1p - 1) / 2 / (f2p + 1) + + K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1) + + K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1) + return(est) +} diff --git a/R/PanbcFun.R b/R/PanbcFun.R new file mode 100644 index 0000000..1e414c5 --- /dev/null +++ b/R/PanbcFun.R @@ -0,0 +1,11 @@ +PanbcFun <- +function(x1, x2, conf=0.95) { + est <- PanbcEstFun(x1, x2) + se <- VarEstFun(x1, x2, diffFun=diff_Panbc, FunName=PanbcEstFun) + CI <- logCI(x1, x2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Lower-bound-bc") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/PanbcFun.Sam.R b/R/PanbcFun.Sam.R new file mode 100644 index 0000000..2991003 --- /dev/null +++ b/R/PanbcFun.Sam.R @@ -0,0 +1,11 @@ +PanbcFun.Sam <- +function(y1, y2, conf=0.95) { + est <- PanbcEstFun.Sam(y1, y2) + se <- VarEstFun.Sam(y1, y2, diffFun=diff_Panbc, FunName=PanbcEstFun.Sam) + CI <- logCI(y1, y2, est, se, conf) + out <- matrix(c(est, se, CI), nrow = 1) + rownames(out) <- c("Lower-bound-bc") + colnames(out) <- c("Estimator", "Est_s.e.", + paste(conf*100, "% Lower"), paste(conf*100, "% Upper")) + return(out) +} diff --git a/R/Q.R b/R/Q.R new file mode 100644 index 0000000..a30fd1c --- /dev/null +++ b/R/Q.R @@ -0,0 +1,2 @@ +Q <- +function(i, data){length(data[which(data == i)])} diff --git a/R/RareSpeciesGroup.R b/R/RareSpeciesGroup.R new file mode 100644 index 0000000..f42397b --- /dev/null +++ b/R/RareSpeciesGroup.R @@ -0,0 +1,14 @@ +RareSpeciesGroup <- +function(data, k){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + + x <- data[which(data != 0)] + r <- c(1:k) + Rare.Species.Group <- matrix(sapply(r, function(r)f(r, x)), 1, k) + #rownames(Rare.Species.Group) <- c(" f(i)") + #colnames(Rare.Species.Group) <- c(1:k) + rownames(Rare.Species.Group) <- c(" Rare.Species.Group") + colnames(Rare.Species.Group) <- paste("f", r, sep="") + return(Rare.Species.Group) +} diff --git a/R/RareSpeciesGroupInprove.R b/R/RareSpeciesGroupInprove.R new file mode 100644 index 0000000..789d049 --- /dev/null +++ b/R/RareSpeciesGroupInprove.R @@ -0,0 +1,21 @@ +RareSpeciesGroup <- function(data, k){ + if (is.matrix(data) == T || is.data.frame(data) == T){ + if (ncol(data) != 1 & nrow(data) != 1) + stop("Error: The data format is wrong.") + if (ncol(data) == 1){ + data <- data[, 1] + } else { + data <- data[1, ] + } + } + + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + + x <- data[which(data != 0)] + r <- c(1:k) + Rare.Species.Group <- matrix(sapply(r, function(r)f(r, x)), 1, k) + rownames(Rare.Species.Group) <- c(" Frequency counts") + colnames(Rare.Species.Group) <- paste("f", r, sep="") + return(Rare.Species.Group) +} diff --git a/R/SortDataFun.R b/R/SortDataFun.R new file mode 100644 index 0000000..a5d2581 --- /dev/null +++ b/R/SortDataFun.R @@ -0,0 +1,19 @@ +SortDataFun <- +function(x1, x2) { + D1 <- sum(x1 > 0); D2 <- sum(x2 > 0) + D12 <- sum(x1 > 0 & x2 > 0) + sit <- order(x1 > 0 & x2 > 0, decreasing=T) + # set.seed(123) + # sit <- sample(sit[1:D12], replace=F) # random sample + common <- cbind(x1[sit[1:D12]], x2[sit[1:D12]]) + sit1 <- order(x1 > 0 & x2 == 0, decreasing=T) + set.seed(123) + sit1 <- sample(sit1[1:(D1-D12)], replace=F) # random sample + special1 <- cbind(x1[sit1[1:(D1-D12)]], x2[sit1[1:(D1-D12)]]) + sit2 <- order(x1 == 0 & x2 > 0, decreasing=T) + set.seed(123) + sit2 <- sample(sit2[1:(D2-D12)], replace=F) # random sample + special2 <- cbind(x1[sit2[1:(D2-D12)]], x2[sit2[1:(D2-D12)]]) + z <- rbind(common, special1, special2) + return(z) +} diff --git a/R/SpecAbunAce.R b/R/SpecAbunAce.R new file mode 100644 index 0000000..2f9a24f --- /dev/null +++ b/R/SpecAbunAce.R @@ -0,0 +1,161 @@ +SpecAbunAce <- + function(data, k=10, conf=0.95){ + data <- as.numeric(data) + + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_ACE <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + if(a2>0){temp=D_rare/C_rare*a1/a2/(a2 - 1) - 1} + else{temp=0} + gamma_rare_hat_square <- max(temp, 0) + }else{ + gamma_rare_hat_square <- 0 + } + ####################### 2016 05 05 ####################### + if (C_rare==0) {C_rare=1} + ####################### end ####################### + S_ace <- D_abun + D_rare/C_rare + f(1, x)/C_rare*gamma_rare_hat_square + return(list(S_ace, gamma_rare_hat_square)) + } + s_ace <- S_ACE(x, k)[[1]] + gamma_rare_hat_square <- S_ACE(x, k)[[2]] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (gamma_rare_hat_square != 0){ + si <- sum(sapply(u, function(u)u*(u - 1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(D_rare*si + f(1, x)*si) - + f(1, x)*D_rare*si*(-2*(1 - f(1, x)/n_rare)*(n_rare - f(1, x))/n_rare^2*n_rare*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*(2*n_rare - 1)) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 - #g2 + (1 - f(1, x)/n_rare + f(1, x)*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g3 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)*(si + D_rare*q*(q - 1)) - + f(1, x)*D_rare*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + #g2 + (q*(f(1, x))^2/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g3 + } + return(d) + } else { + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } + return(d) + } + } + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_ace) + } else { + cov.f <- -f(i, x)*f(j, x)/s_ace + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ace <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + ############################ 2016 05 05 ############################################# + if ( is.na(var_ace)>0 ) {var_ace <- NA} + ############################ end ############################################# + else if (var_ace > 0){ + var_ace <- var_ace + } else { + var_ace <- NA + } + ###################### + t <- round(s_ace - D, 5) + if (is.nan(t) == F){ + if (t != 0){ + ############################# 2016 05 05 ########################################## + if ( is.na(var_ace)>0 ) {C <- NA; CI_ACE <- c(NaN, NaN)} + else{ + C <- exp(z*sqrt(log(1 + var_ace/(s_ace - D)^2))) + CI_ACE <- c(D + (s_ace - D)/C, D + (s_ace - D)*C) + } + ############################# end ########################################## + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_ace <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_ACE <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_ACE <- c(NaN, NaN) + } + if ( is.na(var_ace)>0 ) {sd <- NA} + else { sd=sqrt(var_ace) } + table <- matrix(c(s_ace, sd, CI_ACE), ncol = 4) + if( (sum(x==1)/sum(x[x<=k])) == 1 ){table=SpecAbunChao1(x, k, conf)} + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "ACE (Chao & Lee, 1992)" + return(table) + } diff --git a/R/SpecAbunAce1.R b/R/SpecAbunAce1.R new file mode 100644 index 0000000..ac4476d --- /dev/null +++ b/R/SpecAbunAce1.R @@ -0,0 +1,181 @@ +SpecAbunAce1 <- + function(data ,k=10, conf=0.95){ + data <- as.numeric(data) + + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_ACE1 <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + ############################# 2016 05 05 ########################################## + if (C_rare==0) {C_rare=1} + ############################# end ########################################## + s_ace1 <- D_abun + D_rare/C_rare + f(1, x)/C_rare*gamma_rare_1_square + + return(list(s_ace1, gamma_rare_1_square)) + } + s_ace1 <- S_ACE1(x, k)[[1]] + gamma_rare_1_square <- S_ACE1(x, k)[[2]] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (gamma_rare_1_square != 0){ + u <- c(1:k) + si <- sum(sapply(u, function(u)u*(u-1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(D_rare*si + f(1, x)*si) - + f(1, x)*D_rare*si*(-2*(1 - f(1, x)/n_rare)*(n_rare - f(1, x))/n_rare^2*n_rare*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*(2*n_rare - 1)) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 - #g2 + (1 - f(1, x)/n_rare + f(1, x)*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g3 + ((1 - f(1, x)/n_rare)^3*(n_rare*(n_rare - 1))^2*(2*f(1, x)*D_rare*si^2 + f(1, x)^2*si^2) - #g4 + f(1, x)^2*D_rare*si^2*(3*(1 - f(1, x)/n_rare)^2*(f(1, x) - n_rare)/(n_rare)^2*(n_rare*(n_rare - 1))^2 + + (1 - f(1, x)/n_rare)^3*2*n_rare*(n_rare - 1)^2 + (1 - f(1, x)/n_rare)^3*n_rare^2*2*(n_rare - 1)) + )/(1 - f(1, x)/n_rare)^6/n_rare^4/(n_rare - 1)^4 - + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(2*f(1, x)*si) - #g5 + f(1, x)^2*si*(2*(1 - f(1, x)/n_rare)*(f(1, x) - n_rare)/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)*(si + D_rare*q*(q - 1)) - + f(1, x)*D_rare*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + #g2 + (q*(f(1, x))^2/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g3 + ((1 - f(1, x)/n_rare)^3*n_rare^2*(n_rare - 1)^2*f(1, x)^2*(si^2 + 2*D_rare*si*q*(q - 1)) - #g4 + f(1, x)^2*D_rare*si^2*(3*(1 - f(1, x)/n_rare)^2*(f(1, x)*q/n_rare^2)*(n_rare*(n_rare - 1))^2 + + 2*(1 - f(1, x)/n_rare)^3*n_rare*q*(n_rare - 1)^2 + 2*(1 - f(1, x)/n_rare)^3*n_rare^2*(n_rare - 1)*q) + )/(1 - f(1, x)/n_rare)^6/(n_rare)^4/(n_rare - 1)^4 - + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)^2*q*(q - 1) - #g5 + f(1, x)^2*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + } + return(d) + } else { + u <- c(1:k) + si <- sum(sapply(u, function(u)u*(u-1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } + return(d) + } + } + + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_ace1) + } else { + cov.f <- -f(i, x)*f(j, x)/s_ace1 + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ace1 <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + ############################ 2016 05 05 ############################################# + if ( is.na(var_ace1)>0 ) {var_ace1 <- NA} + ############################ end ############################################# + else if (var_ace1 > 0){ + var_ace1 <- var_ace1 + } else { + var_ace1 <- NA + } + ###################### + t <- round(s_ace1 - D, 5) + if (is.nan(t) == F){ + if (t != 0){ + ############################# 2016 05 05 ########################################## + if ( is.na(var_ace1)>0 ) {C <- NA; CI_ACE1 <- c(NaN, NaN)} + else{ + C <- exp(z*sqrt(log(1 + var_ace1/(s_ace1 - D)^2))) + CI_ACE1 <- c(D + (s_ace1 - D)/C, D + (s_ace1 - D)*C) + } + ############################# end ########################################## + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_ace1 <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_ACE1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_ACE1 <- c(NaN, NaN) + } + + table <- matrix(c(s_ace1, sqrt(var_ace1), CI_ACE1), ncol = 4) + if( (sum(x==1)/sum(x[x<=k])) == 1 ){table=SpecAbunChao1(x, k, conf)} + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "ACE-1 (Chao & Lee, 1992)" + return(table) + } diff --git a/R/SpecAbunChao1.R b/R/SpecAbunChao1.R new file mode 100644 index 0000000..83103f5 --- /dev/null +++ b/R/SpecAbunChao1.R @@ -0,0 +1,89 @@ +SpecAbunChao1 <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + if (f(1, x) > 0 & f(2, x) > 0){ + S_Chao1 <- D + (n - 1)/n*f(1, x)^2/(2*f(2, x)) + var_Chao1 <- f(2, x)*((n - 1)/n*(f(1, x)/f(2, x))^2/2 + + ((n - 1)/n)^2*(f(1, x)/f(2, x))^3 + ((n - 1 )/n)^2*(f(1, x)/f(2, x))^4/4) + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } else if (f(1, x) > 1 & f(2, x) == 0){ + S_Chao1 <- D + (n - 1)/n*f(1, x)*(f(1, x) - 1)/(2*(f(2, x) + 1)) + var_Chao1 <- (n - 1)/n*f(1, x)*(f(1, x) - 1)/2 + + ((n - 1)/n)^2*f(1, x)*(2*f(1, x) - 1)^2/4 - ((n - 1)/n)^2*f(1, x)^4/4/S_Chao1 + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } else { + S_Chao1 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_Chao1 <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_Chao1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + + table <- matrix(c(S_Chao1, sqrt(var_Chao1), CI_Chao1), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Chao1 (Chao, 1984)" + return(table) +} diff --git a/R/SpecAbunChao1bc.R b/R/SpecAbunChao1bc.R new file mode 100644 index 0000000..8d2cf05 --- /dev/null +++ b/R/SpecAbunChao1bc.R @@ -0,0 +1,84 @@ +SpecAbunChao1bc <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basic <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basic(data, k)[[2]] + D <- basic(data, k)[[3]] + n_rare <- basic(data, k)[[4]] + D_rare <- basic(data, k)[[5]] + C_rare <- basic(data, k)[[6]] + CV_rare <- basic(data, k)[[7]] + CV1_rare <- basic(data, k)[[8]] + n_abun <- basic(data, k)[[9]] + D_abun <- basic(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_Chao1_bc <- D + (n - 1)/n*f(1, x)*(f(1, x) - 1)/(2*(f(2, x) + 1)) + + if (f(2, x) > 0){ + var_Chao1_bc <- (n - 1)/n*f(1, x)*(f(1, x) - 1)/2/(f(2, x) + 1) + + ((n - 1)/n)^2*f(1, x)*(2*f(1, x) - 1)^2/4/(f(2, x) + 1)^2 + ((n - 1)/n)^2*f(1, x)^2*f(2, x)*(f(1, x) - 1)^2/4/(f(2, x) + 1)^4 + }else{ + var_Chao1_bc <- (n - 1)/n*f(1, x)*(f(1, x) - 1)/2 + + ((n - 1)/n)^2*f(1, x)*(2*f(1, x) - 1)^2/4 - ((n - 1)/n)^2*f(1, x)^4/4/S_Chao1_bc + } + t <- round(S_Chao1_bc - D, 5) + if (t != 0){ + K <- exp(z*sqrt(log(1 + var_Chao1_bc/t^2))) + CI_Chao1_bc <- c(D + t/K, D + t*K) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_Chao1_bc <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_Chao1_bc <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(S_Chao1_bc, sqrt(var_Chao1_bc), CI_Chao1_bc), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Chao1-bc" + return(table) +} diff --git a/R/SpecAbunHomo.R b/R/SpecAbunHomo.R new file mode 100644 index 0000000..d83990d --- /dev/null +++ b/R/SpecAbunHomo.R @@ -0,0 +1,112 @@ +SpecAbunHomo <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################ + S_HOMO <- function(x, k){ + s_homo <- D_abun + D_rare/C_rare + return(s_homo) + } + s_homo <- S_HOMO(x, k) + #### differential #### + diff <- function(q){ + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + } else if (q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare-D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 + } + return(d) + } + + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1-f(i, x)/s_homo) + } else { + cov.f <- -f(i, x)*f(j, x)/s_homo + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each=length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_mle <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + if(var_mle=="NaN"){var_mle=0} + if (var_mle > 0){ + var_mle <- var_mle + } else { + var_mle <- NA + cat("Warning: In this case, it can't estimate the variance of Homogeneous estimation", "\n\n") + } + ###################### + if (round(s_homo - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_mle/(s_homo - D)^2))) + CI_homo <- c(D + (s_homo - D)/C, D + (s_homo - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_mle <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_homo <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_homo, sqrt(var_mle), CI_homo), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Homogenous Model" + return(table) +} diff --git a/R/SpecAbunHomoMle.R b/R/SpecAbunHomoMle.R new file mode 100644 index 0000000..d14f6ed --- /dev/null +++ b/R/SpecAbunHomoMle.R @@ -0,0 +1,97 @@ +SpecAbunHomoMle <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbunAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbunAbun(data, k)[[2]] + D <- basicAbunAbun(data, k)[[3]] + n_rare <- basicAbunAbun(data, k)[[4]] + D_rare <- basicAbunAbun(data, k)[[5]] + C_rare <- basicAbunAbun(data, k)[[6]] + CV_rare <- basicAbunAbun(data, k)[[7]] + CV1_rare <- basicAbunAbun(data, k)[[8]] + n_abun <- basicAbunAbun(data, k)[[9]] + D_abun <- basicAbunAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_HOMO_MLE <- function(x, k){ + fun <- function(S_MLE){S_MLE - S_MLE*exp(-n/S_MLE) - D} + newton <- function(f, tol=1E-12,x0=1,N=20) { + h <- 0.001 + i <- 1; x1 <- x0 + p <- numeric(N) + while (i<=N) { + df.dx <- (f(x0+h)-f(x0))/h + x1 <- (x0 - (f(x0)/df.dx)) + p[i] <- x1 + i <- i + 1 + if (abs(x1-x0) < tol) break + x0 <- x1 + } + return(p[1:(i-1)]) + } + return(max(newton(fun))) + } + s_homo_mle <- S_HOMO_MLE(x, k) + #### differential #### + var_homo_mle <- s_homo_mle/(exp(n/s_homo_mle) - n/s_homo_mle - 1) + ###################### + t <- round(s_homo_mle - D, 5) + if (t != 0){ + C <- exp(z*sqrt(log(1 + var_homo_mle/(s_homo_mle - D)^2))) + CI_MLE <- c(D + (s_homo_mle - D)/C, D + (s_homo_mle - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_homo_mle <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_MLE <- c(max(D, D/(1 - P)-z*sqrt(var_obs)/(1 - P)), D/(1-P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_homo_mle, sqrt(var_homo_mle), CI_MLE), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Homogeneous (MLE)" + return(table) +} diff --git a/R/SpecAbunJack1.R b/R/SpecAbunJack1.R new file mode 100644 index 0000000..4266a34 --- /dev/null +++ b/R/SpecAbunJack1.R @@ -0,0 +1,109 @@ +SpecAbunJack1 <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_1st_JK <- function(x, k){ + S_1st_jk <- D + (n - 1)/n*f(1, x) + } + s_1st_jk <- S_1st_JK(x, k) + #### differential #### + diff <- function(q){ + if ( q == 1){ + d <- 1 + (n - 1)/n + } else { + d <- 1 + } + return(d) + } + + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_1st_jk) + } else { + cov.f <- -f(i, x)*f(j, x)/s_1st_jk + } + return(cov.f) + } + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_1st <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + if (var_1st > 0){ + var_1st <- var_1st + } else { + var_1st <- NA + cat("Warning: In this case, it can't estimate the variance of 1st-order-jackknife estimation", "\n\n") + } + ###################### + t <- round(s_1st_jk - D, 5) + if (t != 0){ + C <- exp(z*sqrt(log(1 + var_1st/(s_1st_jk - D)^2))) + CI_1st_jk <- c(D + (s_1st_jk - D)/C, D + (s_1st_jk - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_1st <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_1st_jk <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_1st_jk, sqrt(var_1st), CI_1st_jk), ncol = 4) + colnames(table) <- c("Estimator", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "1st order jackknife" + return(table) +} diff --git a/R/SpecAbunJack2.R b/R/SpecAbunJack2.R new file mode 100644 index 0000000..cee418f --- /dev/null +++ b/R/SpecAbunJack2.R @@ -0,0 +1,121 @@ +SpecAbunJack2 <- +function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_2nd_JK <- function(x, k){ + S_2nd_jk <- D + (2*n - 3)/n*f(1, x) - (n - 2)^2/n/(n - 1)*f(2, x) + } + s_2nd_jk <- S_2nd_JK(x, k) + #### differential #### + diff <- function(q){ + if ( q == 1){ + d <- 1 + (2*n - 3)/n + } else if (q == 2){ + d <- 1 - (n-2)^2/n/(n-1) + } else { + d <- 1 + } + return(d) + } + + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_2nd_jk) + } else { + cov.f <- -f(i, x)*f(j, x)/s_2nd_jk + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each=length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_2nd <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + if (var_2nd > 0){ + var_2nd <- var_2nd + } else { + var_2nd <- NA + cat("Warning: In this case, it can't estimate the variance of 2nd-order-jackknife estimation", "\n\n") + } + ###################### + t <- round(s_2nd_jk - D, 5) + if (t > 0){ + C <- exp(z*sqrt(log(1 + var_2nd/(s_2nd_jk - D)^2))) + CI_2nd_jk <- c(D + (s_2nd_jk - D)/C, D + (s_2nd_jk - D)*C) + }else if(t < 0){ + s_2nd_jk <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_2nd <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_2nd_jk <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + }else{ + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_2nd <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_2nd_jk <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_2nd_jk, sqrt(var_2nd), CI_2nd_jk), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "2nd order jackknife" + return(table) +} diff --git a/R/SpecAbunOut.R b/R/SpecAbunOut.R new file mode 100644 index 0000000..bfabd27 --- /dev/null +++ b/R/SpecAbunOut.R @@ -0,0 +1,39 @@ +SpecAbunOut <- +function(data, + method = c("all", "Homogeneous", "Chao", "CE", "Jackknife"), + k, conf){ + data <- as.numeric(data) + method <- match.arg(method) + if (method == "all") { + a <- SpecAbunHomo(data, k, conf) + b <- SpecAbunHomoMle(data, k, conf) + c <- SpecAbunChao1(data, k, conf) + d <- SpecAbunChao1bc(data, k, conf) + e <- SpecAbuniChao1(data, k, conf) + f <- SpecAbunAce(data, k, conf) + g <- SpecAbunAce1(data, k, conf) + h <- SpecAbunJack1(data, k, conf) + i <- SpecAbunJack2(data, k, conf) + out <- rbind(a, b, c, d, e, f, g, h, i) + rownames(out) <- c(" Homogeneous Model", + " Homogeneous (MLE)", + " Chao1 (Chao, 1984)", + " Chao1-bc", + " iChao1 (Chiu et al. 2014)", + " ACE (Chao & Lee, 1992)", + " ACE-1 (Chao & Lee, 1992)", + " 1st order jackknife", + " 2nd order jackknife") + } + + if (method == "Homogeneous") + out <- rbind(SpecAbunHomo(data, k, conf), SpecAbunHomoMle(data, k, conf)) + if (method == "Chao") + out <- rbind(SpecAbunChao1(data, k, conf), SpecAbunChao1bc(data, k, conf), SpecAbuniChao1(data, k, conf)) + if (method == "CE") + out <- rbind(SpecAbunAce(data, k, conf), SpecAbunAce1(data, k, conf)) + if (method == "Jackknife") + out <- rbind(SpecAbunJack1(data, k, conf), SpecAbunJack2(data, k, conf)) + colnames(out) <- c("Estimate", "s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + return(out) +} diff --git a/R/SpecAbuniChao1.R b/R/SpecAbuniChao1.R new file mode 100644 index 0000000..a631864 --- /dev/null +++ b/R/SpecAbuniChao1.R @@ -0,0 +1,156 @@ +SpecAbuniChao1 <- + function(data, k, conf){ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + f1 <- f(1, x); f2 <- f(2, x); f3 <- f(3, x); f4 <- f(4, x) + if (f1 > 0 & f2 > 0){ + s_Chao1 <- D + (n - 1)/n*f1^2/(2*f2) + var_Chao1 <- f2*((n - 1)/n*(f1/f2)^2/2 + + ((n - 1)/n)^2*(f1/f2)^3 + ((n - 1 )/n)^2*(f1/f2)^4/4) + } else if (f1 > 1 & f2 == 0){ + s_Chao1 <- D + (n - 1)/n*f1*(f1 - 1)/(2*(f2 + 1)) + var_Chao1 <- (n - 1)/n*f1*(f1 - 1)/2 + + ((n - 1)/n)^2*f1*(2*f1 - 1)^2/4 - ((n - 1)/n)^2*f1^4/4/s_Chao1 + } else { + s_Chao1 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_Chao1 <- var_obs + } + + if (f4 != 0){ + s_iChao1 <- s_Chao1 + f3/4/f4*max(f1 - f2*f3/2/f4, 0) + } else { + s_iChao1 <- s_Chao1 + f3/4/(f4 + 1)*max(f1 - f2*f3/2/(f4 + 1), 0) + } + + diff <- function(q, x){ # fq + f1 <- f(1, x); f2 <- f(2, x); f3 <- f(3, x); f4 <- f(4, x) + if (f1 > 0 & f2 != 0){ + if (q == 1){ + d <- (n - 1)/n*f1/f2 - f3/4/f4 + } else if (q == 2){ + d <- (n - 1)/n*f1^2/2/f2^2 - f3^2/8/f4^2 + } else if (q == 3){ + d <- f1/4/f4 + } else { + d <- -f1*f3/4/f4^2 + f2*f3^2/4/f4^3 + } + } else if (f1 > 1 & f2 == 0){ + if (q == 1){ + d <- (n - 1)/n*(2*f1 - 1)/2/(f2 + 1) + f3/4/f4 + } else if (q == 2){ + d <- -(n - 1)/n*f1*(f1 - 1)/2/(f2 + 1)^2 + } else if (q == 3){ + d <- f1/4/f4 + } else { + d <- -f1*f3/4/f4^2 + } + } else { + d=0 + } + return(d) + } + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_iChao1) + } else { + cov.f <- -f(i, x)*f(j, x)/s_iChao1 + } + return(cov.f) + } + + xx <- 1:4 + i <- rep(sort(unique(xx)),each = length(unique(xx))) + j <- rep(sort(unique(xx)),length(unique(xx))) # all combination + if(f2==0){f2=1} + if (f1 - f2*f3/2/f4 > 0 & f3 != 0){ + var_iChao1 <- sum(mapply(function(i, j)diff(i, x)*diff(j, x)*COV.f(i, j), i, j)) + } else { + var_iChao1 <- var_Chao1 + } + + if (var_iChao1 > 0){ + var_iChao1 <- var_iChao1 + } else { + var_iChao1 <- NA + } + + t <- round(s_iChao1 - D, 5) + if (is.nan(t) == F){ + if (t != 0){ + C <- exp(z*sqrt(log(1 + var_iChao1/(s_iChao1 - D)^2))) + CI_iChao1 <- c(D + (s_iChao1 - D)/C, D + (s_iChao1 - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_iChao1 <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_iChao1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_iChao1 <- c(NaN, NaN) + } + + table <- matrix(c(s_iChao1, sqrt(var_iChao1), CI_iChao1), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "iChao1 (Chiu et al. 2014)" + return(table) + } diff --git a/R/SpecInciChao2.R b/R/SpecInciChao2.R new file mode 100644 index 0000000..7f14937 --- /dev/null +++ b/R/SpecInciChao2.R @@ -0,0 +1,81 @@ +SpecInciChao2 <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + if (Q(1, x)>0 & Q(2, x) > 0){ + S_Chao2 <- D + (t - 1)/t*Q(1, x)^2/(2*Q(2, x)) + var_Chao2 <- Q(2, x)*((t - 1)/t*(Q(1, x)/Q(2, x))^2/2 + ((t - 1)/t)^2*(Q(1, x)/Q(2, x))^3 + ((t - 1)/t)^2*(Q(1, x)/Q(2, x))^4/4) + + tt <- S_Chao2 - D + K <- exp(z*sqrt(log(1 + var_Chao2/tt^2))) + CI_Chao2 <- c(D + tt/K, D + tt*K) + } else if (Q(1, x)>1 & Q(2, x) == 0){ + S_Chao2 <- D+(t-1)/t*Q(1,x)*(Q(1,x)-1)/(2*(Q(2,x)+1)) + var_Chao2=(t-1)/t*Q(1,x)*(Q(1,x)-1)/2+((t-1)/t)^2*Q(1,x)*(2*Q(1,x)-1)^2/4-((t-1)/t)^2*Q(1,x)^4/4/S_Chao2 + + tt=S_Chao2-D + K=exp(z*sqrt(log(1+var_Chao2/tt^2))) + CI_Chao2=c(D+tt/K,D+tt*K) + } else { + S_Chao2 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_Chao2 <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Chao2<- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(S_Chao2, sqrt(var_Chao2), CI_Chao2), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Chao2 (Chao, 1987)" + return(table) + +} diff --git a/R/SpecInciChao2bc.R b/R/SpecInciChao2bc.R new file mode 100644 index 0000000..39c34c5 --- /dev/null +++ b/R/SpecInciChao2bc.R @@ -0,0 +1,73 @@ +SpecInciChao2bc <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c(" Number of observed species"," Number of sample/quadrats"," Cut-off point", + " Number of observed species for infrequent species"," Estimated sample coverage for infrequent species", + " Estimated CV for infrequent species", + " Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + + S_Chao2_bc <- D + (t - 1)/t*Q(1, x)*(Q(1, x) - 1)/(2*(Q(2, x) + 1)) + var_Chao2_bc <- (t - 1)/t*Q(1, x)*(Q(1, x) - 1)/2/(Q(2, x) + 1) + ((t - 1)/t)^2*Q(1, x)*(2*Q(1, x) - 1)^2/4/(Q(2, x) + 1)^2 + ((t - 1)/t)^2*Q(1, x)^2*Q(2, x)*(Q(1, x) - 1)^2/4/(Q(2, x) + 1)^4 + + tt <- S_Chao2_bc - D + if (tt != 0){ + K <- exp(z*sqrt(log(1 + var_Chao2_bc/tt^2))) + CI_Chao2_bc <- c(D + tt/K, D + tt*K) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_Chao2_bc <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Chao2_bc <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(S_Chao2_bc, sqrt(var_Chao2_bc), CI_Chao2_bc), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Chao2-bc" + return(table) +} diff --git a/R/SpecInciHomo.R b/R/SpecInciHomo.R new file mode 100644 index 0000000..d969e1a --- /dev/null +++ b/R/SpecInciHomo.R @@ -0,0 +1,132 @@ +SpecInciHomo <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + + + S_HOMO <- function(x, k){ + D_infreq <- length(x[which(x <= k)]) + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2 - 1) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + s_homo <- D_freq + D_infreq/C_infreq + return(s_homo) + } + s_homo <- S_HOMO(x, k) + #### differential #### + diff <- function(q){ + n_rare <- sum(x[which(x <= k)]) + if ( q == 1){ + d <- (C_infreq - D_infreq*( - ((n_rare*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1)) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_rare) + 2*Q(2, x))) + /(n_rare*((t - 1)*Q(1, x) + 2*Q(2, x)))^2) + )/C_infreq^2 + } else if (q == 2){ + d <- (C_infreq - D_infreq*( - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*2 + 2*(2*Q(2, x) + n_rare)) + )/(n_rare*((t - 1)*Q(1, x) + 2*Q(2, x)))^2) + )/C_infreq^2 + } else if (q > k){ + d <- 1 + } else { + d <- (C_infreq - D_infreq*( - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q) + )/(n_rare*((t - 1)*Q(1, x) + 2*Q(2, x)))^2) + )/C_infreq^2 + } + return(d) + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1-Q(i, x)/S_HOMO(x, k)) + } else { + cov.q <- -Q(i, x)*Q(j, x)/S_HOMO(x, k) + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each=length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_mle <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_mle > 0){ + var_mle <- var_mle + } else { + var_mle <- NA + cat("Warning: In this case, it can't estimate the variance of Homogeneous estimation", "\n\n") + } + ###################### + if (round(s_homo - D, 5) != 0){ + C <- exp(z*sqrt(log(1+var_mle/(s_homo-D)^2))) + CI_homo <- c(D + (s_homo-D)/C, D + (s_homo - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_mle <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_homo <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_homo, sqrt(var_mle), CI_homo), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Homogenous Model" + return(table) + +} diff --git a/R/SpecInciJack1.R b/R/SpecInciJack1.R new file mode 100644 index 0000000..4e1980d --- /dev/null +++ b/R/SpecInciJack1.R @@ -0,0 +1,103 @@ +SpecInciJack1 <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_1st_JK <- function(x,k){ + s_1st_jk <- D + (t - 1)/t*Q(1, x) + return(s_1st_jk) + } + s_1st_jk <- S_1st_JK(x, k) + #### differential #### + diff <- function(q){ + if ( q == 1){ + d <- 1 + (t - 1)/t + } else { + d <- 1 + } + return(d) + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/S_1st_JK(x, k)) + } else { + cov.q <- -Q(i, x)*Q(j, x)/S_1st_JK(x, k) + } + return(cov.q) + } + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_1st <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_1st > 0){ + var_1st <- var_1st + } else { + var_1st <- NA + cat("Warning: In this case, it can't estimate the variance of 1st-order-jackknife estimation", "\n\n") + } + ###################### + if (round(s_1st_jk - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_1st/(s_1st_jk - D)^2))) + CI_1st_jk <- c(D + (s_1st_jk - D)/C, D + (s_1st_jk - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_1st <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_1st_jk <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_1st_jk, sqrt(var_1st), CI_1st_jk), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "1st order jackknife" + return(table) + +} diff --git a/R/SpecInciJack2.R b/R/SpecInciJack2.R new file mode 100644 index 0000000..bee92a2 --- /dev/null +++ b/R/SpecInciJack2.R @@ -0,0 +1,104 @@ +SpecInciJack2 <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_2nd_JK <- function(x,k){ + x <- x[which(x != 0)] + s_2nd_jk <- D + (2*t - 3)/t*Q(1, x) - (t - 2)^2/t/(t - 1)*Q(2, x) + return(s_2nd_jk) + } + s_2nd_jk <- S_2nd_JK(x, k) + #### differential #### + diff <- function(q){ + if ( q == 1){ + d <- 1 + (2*t - 3)/t + } else if (q == 2){ + d <- 1 - (t-2)^2/t/(t-1) + } else { + d <- 1 + } + return(d) + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/S_2nd_JK(x, k)) + } else { + cov.q <- -Q(i, x)*Q(j, x)/S_2nd_JK(x, k) + } + return(cov.q) + } + + i <- rep(sort(unique(x)), each = length(unique(x))) + j <- rep(sort(unique(x)), length(unique(x))) # all combination + + var_2nd <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_2nd > 0){ + var_2nd <- var_2nd + } else { + var_2nd <- NA + cat("Warning: In this case, it can't estimate the variance of 2nd-order-jackknife estimation", "\n\n") + } + ###################### + if (round(s_2nd_jk - D, 5) != 0){ + C <- exp(z*sqrt(log(1+var_2nd/(s_2nd_jk-D)^2))) + CI_2nd_jk <- c(D + (s_2nd_jk - D)/C, D + (s_2nd_jk - D)*C) + } else { + i <- c(1:max(x)) + pos <- i[unique(x)] + P <- sum(sapply(i, function(i)Q(i,x)*exp( - i)/D)) + CI_2nd_jk <- c(max(D, D/(1 - P) - z*sqrt(var_2nd)/(1 - P)), D/(1 - P) + z*sqrt(var_2nd)/(1 - P)) + } + table <- matrix(c(s_2nd_jk, sqrt(var_2nd), CI_2nd_jk), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "2nd order jackknife" + return(table) + +} diff --git a/R/SpecInciModelh.R b/R/SpecInciModelh.R new file mode 100644 index 0000000..0c3c3f2 --- /dev/null +++ b/R/SpecInciModelh.R @@ -0,0 +1,153 @@ +SpecInciModelh <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2-1) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_ICE <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*Q(j, x))) + a2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*a1/a2/(a2 - 1) - 1,0) + s_ice <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square + CV_infreq_h <- sqrt(gamma_infreq_square) + return(c(s_ice, CV_infreq_h)) + } + s_ice <- S_ICE(x, k)[1] + CV_infreq_h <- S_ICE(x, k)[2] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (CV_infreq_h != 0){ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*n_infreq*2) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } + return(d) + }else{ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_ice) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_ice + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ice <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_ice > 0){ + var_ice <- var_ice + } else { + var_ice <- NA + cat("Warning: In this case, it can't estimate the variance of Model(h) estimation", "\n\n") + } + ###################### + if (round(s_ice - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_ice/(s_ice - D)^2))) + CI_Model_h <- c(D + (s_ice - D)/C, D + (s_ice - D)*C) + }else{ + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_ice <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_h <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_ice, sqrt(var_ice), CI_Model_h), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Model(h) (ICE) (Lee & Chao, 1994)" + + #return(list(table, CV_infreq_h)) + return(c(table, CV_infreq_h)) +} diff --git a/R/SpecInciModelh1.R b/R/SpecInciModelh1.R new file mode 100644 index 0000000..f9bd4dd --- /dev/null +++ b/R/SpecInciModelh1.R @@ -0,0 +1,170 @@ +SpecInciModelh1 <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_Model_H1 <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*Q(j, x))) + a2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*a1/a2/(a2 - 1) - 1,0) + gamma_infreq_square_1 <- max(gamma_infreq_square*(1 + Q(1, x)/C_infreq*t/(t - 1)*a1/a2/(a2 - 1)), 0) + s_Model_h1 <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square_1 + CV_infreq_h1 <- sqrt(gamma_infreq_square_1) + return(c(s_Model_h1, CV_infreq_h1)) + } + s_Model_h1 <- S_Model_H1(x, k)[1] + CV_infreq_h1 <- S_Model_H1(x, k)[2] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (CV_infreq_h1 != 0){ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(2*Q(1, x)*D_infreq*si^2 + Q(1, x)^2*si^2) - #g5 + Q(1, x)^2*D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + (t/(t - 1))*si*(C_infreq^2*n_infreq*(n_infreq - 1)*2*Q(1, x) - Q(1, x)^2*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) #g6 + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*n_infreq*2) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*Q(1, x)^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(si^2 + D_infreq*2*si*2) - #g5 + D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*2*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)*2) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + t/(t - 1)*Q(1, x)^2*(C_infreq^2*n_infreq*(n_infreq - 1)*2 - si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*Q(1, x)^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(si^2 + D_infreq*2*si*q*(q - 1)) - #g5 + D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*q*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)*q) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + t/(t - 1)*Q(1, x)^2*(C_infreq^2*n_infreq*(n_infreq - 1)*q*(q - 1) - #g6 + si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + } + return(d) + }else{ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_Model_h1) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_Model_h1 + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ice1 <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_ice1 > 0){ + var_ice1 <- var_ice1 + } else { + var_ice1 <- NA + cat("Warning: In this case, it can't estimate the variance of Model(h)-1 estimation", "\n\n") + } + ###################### + if (round(s_Model_h1 - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_ice1/(s_Model_h1 - D)^2))) + CI_Model_h1 <- c(D + (s_Model_h1 - D)/C, D + (s_Model_h1 - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_ice1 <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_h1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_Model_h1, sqrt(var_ice1), CI_Model_h1), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Model(h)-1 (ICE-1)" + + #return(list(table, CV_infreq_h1)) + return(c(table, CV_infreq_h1)) +} diff --git a/R/SpecInciModelth.R b/R/SpecInciModelth.R new file mode 100644 index 0000000..32d0063 --- /dev/null +++ b/R/SpecInciModelth.R @@ -0,0 +1,150 @@ +SpecInciModelth <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- length(data[1, ]) + dat <- apply(data, 1, sum) + + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + u <- c(1:k) + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + data.f <- data[which(x <= k), ] + nd <- as.numeric(apply(data.f, 2, sum)) + nd_c <- nd[which(nd != 0)] + e <- c(1:length(nd_c)) + o <- rep(e ,each = length(e)) + p <- rep(e ,length(e)) + s <- sum(mapply(function(o, p)nd_c[o]*nd_c[p], o, p)) - sum(sapply(e, function(e)nd_c[e]*nd_c[e])) + S_Model_TH <- function(x, k){ + si <- sum(sapply(u, function(u)u*(u - 1)*Q(u, x))) + gamma_infreq_square_th <- max(D_infreq/C_infreq*si/s - 1, 0) + s_Model_th <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square_th + CV_infreq_th <- sqrt(gamma_infreq_square_th) + return(unlist(list(s_Model_th, CV_infreq_th))) + } + s_Model_th <- S_Model_TH(x, k)[1] + CV_infreq_th <- S_Model_TH(x, k)[2] + #### differential #### + diff <- function(q){ + if (CV_infreq_th != 0){ + if ( q == 1) { + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq) + )/C_infreq^4/s - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } else if (q == 2) { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq) + )/C_infreq^4/s - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } else if(q > k) { + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq) + )/C_infreq^4/s - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } + return(d) + } else { + if ( q == 1) { + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2) { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if(q > k) { + d <- 1 + } else { + w <- c(q:k) + ss <- sum(sapply(w, function(w)nd[w])) + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, dat)*(1 - Q(i, dat)/s_Model_th) + } else { + cov.q <- -Q(i, dat)*Q(j, dat)/s_Model_th + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_th <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + ###################### + if (round(s_Model_th - D, 5) != 0){ + C <- exp(z*sqrt(log(1+var_th/(s_Model_th - D)^2))) + CI_Model_th <- c(D + (s_Model_th - D)/C, D + (s_Model_th - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_th <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_Model_th, sqrt(var_th), CI_Model_th), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Model(th) (Lee & Chao, 1994)" + return(list(table, CV_infreq_th)) +} diff --git a/R/SpecInciModelth1.R b/R/SpecInciModelth1.R new file mode 100644 index 0000000..cf27277 --- /dev/null +++ b/R/SpecInciModelth1.R @@ -0,0 +1,163 @@ +SpecInciModelth1 <- +function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- length(data[1, ]) + dat <- apply(data, 1, sum) + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + u <- c(1:k) + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + data.f <- data[which(x <= k), ] + nd <- as.numeric(apply(data.f, 2, sum)) + nd_c <- nd[which(nd != 0)] + e <- c(1:length(nd_c)) + o <- rep(e ,each = length(e)) + p <- rep(e ,length(e)) + s <- sum(mapply(function(o, p)nd_c[o]*nd_c[p], o, p)) - sum(sapply(e, function(e)nd_c[e]*nd_c[e])) + S_Model_TH1 <- function(x, k){ + si <- sum(sapply(u, function(u)u*(u - 1)*Q(u, x))) + gamma_infreq_square_th <- max(D_infreq/C_infreq*si/s - 1, 0) + gamma_infreq_square_th1 <- max(gamma_infreq_square_th*(1 + Q(1, x)/C_infreq*si/s), 0) + s_Model_th1 <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square_th1 + CV_infreq_th1 <- sqrt(gamma_infreq_square_th1) + return(unlist(list(s_Model_th1, CV_infreq_th1))) + } + s_Model_th1 <- S_Model_TH1(x, k)[1] + CV_infreq_th1 <- S_Model_TH1(x, k)[2] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (CV_infreq_th1 != 0){ + if ( q == 1) { + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq) + )/C_infreq^4/s - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (C_infreq^3*(Q(1, x)^2*si^2 + D_infreq*2*Q(1, x)*si^2) - D_infreq*Q(1, x)^2*si^2*(3*C_infreq^2*dc_infreq) #g5 + )/C_infreq^6/s^2 - + (C_infreq^2*2*Q(1, x)*si - Q(1, x)^2*si*2*C_infreq*dc_infreq #g6 + )/C_infreq^4/s + } else if (q == 2) { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq) + )/C_infreq^4/s - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (C_infreq^3*Q(1, x)^2*(si^2 + D_infreq*2*si*2) - D_infreq*Q(1, x)^2*si^2*3*C_infreq^2*dc_infreq #g5 + )/C_infreq^6/s^2 - + (C_infreq^2*Q(1, x)^2*2 - Q(1, x)^2*si*2*C_infreq*dc_infreq #g6 + )/C_infreq^4/s + } else if(q > k) { + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + (C_infreq^2*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq) + )/C_infreq^4/s - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (C_infreq^3*Q(1, x)^2*(si^2 + D_infreq*2*si*q*(q - 1)) - D_infreq*Q(1, x)^2*si^2*3*C_infreq^2*dc_infreq #g5 + )/C_infreq^6/s^2 - + (C_infreq^2*Q(1, x)^2*q*(q - 1) - Q(1, x)^2*si*2*C_infreq*dc_infreq + )/C_infreq^4/s + } + return(d) + } else { + if ( q == 1) { + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2) { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if(q > k) { + d <- 1 + } else { + w <- c(q:k) + ss <- sum(sapply(w, function(w)nd[w])) + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_Model_th1) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_Model_th1 + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_th1 <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + ###################### + if (round(s_Model_th1 - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_th1/(s_Model_th1-D)^2))) + CI_Model_th1 <- c(D + (s_Model_th1 - D)/C, D + (s_Model_th1 - D)*C) + }else{ + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_th1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_Model_th1, sqrt(var_th1), CI_Model_th1), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "Model(th)-1 (Lee & Chao, 1994)" + return(list(table, CV_infreq_th1)) +} diff --git a/R/SpecInciOut.R b/R/SpecInciOut.R new file mode 100644 index 0000000..1291e9b --- /dev/null +++ b/R/SpecInciOut.R @@ -0,0 +1,88 @@ +SpecInciOut <- +function(data, method = c("all", "Homogeneous", "Chao", "CE", "Jackknife"), k, conf){ + a <- as.numeric(round(SpecInciHomo(data, k, conf), 3)) + b <- as.numeric(round(SpecInciChao2(data, k, conf), 3)) + c <- as.numeric(round(SpecInciChao2bc(data, k, conf), 3)) + d <- as.numeric(round(SpecInciiChao2(data, k, conf), 3)) + e <- as.numeric(round(SpecInciModelh(data, k, conf)[1:4], 3)) + f <- as.numeric(round(SpecInciModelh1(data, k, conf)[1:4], 3)) + g <- as.numeric(round(SpecInciJack1(data, k, conf), 3)) + h <- as.numeric(round(SpecInciJack2(data, k, conf), 3)) + est.cv <- data.frame(c("", "", "", "", round(SpecInciModelh(data, k, conf)[5], 3), + round(SpecInciModelh1(data, k, conf)[5], 3), "", "")) + colnames(est.cv) <- "Est.CV(rare)" + if (method == "all") { + #out <- data.frame(rbind(a, b, c, d, e, f, g, h), est.cv) + out <- data.frame(rbind(a, b, c, d, e, f, g, h)) + rownames(out) <- c(" Homogeneous Model", + " Chao2 (Chao, 1987)", + " Chao2-bc", + " iChao2 (Chiu et al. 2014)", + " ICE (Lee & Chao, 1994)", + " ICE-1 (Lee & Chao, 1994)", + " 1st order jackknife", + " 2nd order jackknife") + colnames(out) <- c("Estimate", "s.e.", "95%Lower", "95%Upper") + } + + if (method == "Homogeneous") + out <- a + if (method == "Chao") + out <- rbind(b, c, d) + if (method == "CE"){ + est.cv <- matrix(c(SpecInciModelh(data, k, conf)[5], SpecInciModelh1(data, k, conf)[5]), ncol = 1) + est.cv <- round(est.cv, 3) + colnames(est.cv) <- "Est.CV(rare)" + out1 <- rbind(e, f) + out <- cbind(out1, est.cv) + } + if (method == "Jackknife") + out <- rbind(g, h) + colnames(out) <- c("Estimate", "s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + return(out) +} + +#####J.H Lin 2016.7.11 #####add raw data input +SpecInciOut_raw <- + function(data, method = c("all", "Homogeneous", "Chao", "CE", "Jackknife"), k, conf){ + a <- round(SpecInciHomo(data[-2], k, conf), 3) + b <- round(SpecInciChao2(data[-1], k, conf), 3) + c <- round(SpecInciChao2bc(data[-1], k, conf), 3) + d <- round(SpecInciiChao2(data[-1], k, conf), 3) + e <- round(SpecInciModelh(data[-2], k, conf)[1:4], 3) + f <- round(SpecInciModelh1(data[-2], k, conf)[1:4], 3) + g <- round(SpecInciJack1(data[-1], k, conf), 3) + h <- round(SpecInciJack2(data[-1], k, conf), 3) + est.cv <- data.frame(c("", "", "", "", round(SpecInciModelh(data[-2], k, conf)[5], 3), + round(SpecInciModelh1(data[-2], k, conf)[5], 3), "", "")) + colnames(est.cv) <- "Est.CV(rare)" + if (method == "all") { + #out <- data.frame(rbind(a, b, c, d, e, f, g, h), est.cv) + out <- data.frame(rbind(a, b, c, d, e, f, g, h), row.names = NULL) + rownames(out) <- c(" Homogeneous Model", + " Chao2 (Chao, 1987)", + " Chao2-bc", + " iChao2 (Chiu et al. 2014)", + " ICE (Lee & Chao, 1994)", + " ICE-1 (Lee & Chao, 1994)", + " 1st order jackknife", + " 2nd order jackknife") + colnames(out) <- c("Estimator", "Est_s.e.", "95% Lower Bound", "95% Upper Bound") + } + + if (method == "Homogeneous") + out <- a + if (method == "Chao") + out <- rbind(b, c, d) + if (method == "CE"){ + est.cv <- matrix(c(SpecInciModelh(data[-2], k, conf)[5], SpecInciModelh1(data[-2], k, conf)[5]), ncol = 1) + est.cv <- round(est.cv, 3) + colnames(est.cv) <- "Est.CV(rare)" + out1 <- rbind(e, f) + out <- cbind(out1, est.cv) + } + if (method == "Jackknife") + out <- rbind(g, h) + colnames(out) <- c("Estimate", " s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + return(out) + } diff --git a/R/SpecInciiChao2.R b/R/SpecInciiChao2.R new file mode 100644 index 0000000..8ac3ea2 --- /dev/null +++ b/R/SpecInciiChao2.R @@ -0,0 +1,150 @@ +SpecInciiChao2 <- function(data, k, conf){ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + q1 <- Q(1, x); q2 <- Q(2, x); q3 <- Q(3, x); q4 <- Q(4, x) + if (q1 > 0 & q2 != 0){ + s_Chao2 <- D + (t - 1)/t*q1^2/(2*q2) + var_Chao2 <- q2*((t - 1)/t*(q1/q2)^2/2 + ((t - 1)/t)^2*(q1/q2)^3 + ((t - 1)/t)^2*(q1/q2)^4/4) + } else if (q1 > 1 & q2 == 0){ + s_Chao2 <- D + (t - 1)/t*q1*(q1 - 1)/(2*(q2 + 1)) + var_Chao2=(t-1)/t*q1*(q1 - 1)/2 + ((t - 1)/t)^2*q1*(2*q1-1)^2/4-((t-1)/t)^2*q1^4/4/s_Chao2 + } else { + s_Chao2 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_Chao2 <- var_obs + } + + if (q4 != 0){ + s_iChao2 <- s_Chao2 + (t - 3)/t*q3/4/q4*max(q1 - (t - 3)/(t - 1)*q2*q3/2/q4, 0) + } else { + s_iChao2 <- s_Chao2 + (t - 3)/t*q3/4/(q4 + 1)*max(q1 - (t - 3)/(t - 1)*q2*q3/2, 0) + } + + diff <- function(q, x){ # fq + q1 <- Q(1, x); q2 <- Q(2, x); q3 <- Q(3, x); q4 <- Q(4, x) + if (q1 > 0 & q2 != 0){ + if (q == 1){ + d <- (t - 1)/t*q1/q2 - (t - 3)/t*q3/4/q4 + } else if (q == 2){ + d <- (t - 1)/t*q1^2/2/q2^2 - (t - 3)^2/t/(t - 1)*q3^2/8/q4^2 + } else if (q == 3){ + d <- (t - 3)/t*q1/4/q4 + } else { + d <- -(t - 3)/t*q1*q3/4/q4^2 + (t - 3)^2/t/(t - 1)*q2*q3^2/4/q4^3 + } + } else if (q1 > 1 & q2 == 0){ + if (q == 1){ + d <- (t - 1)/t*(2*q1 - 1)/2/(q2 + 1) + (t - 3)/t*q3/4/q4 + } else if (q == 2){ + d <- -(t - 1)/t*q1*(q1 - 1)/2/(q2 + 1)^2 + } else if (q == 3){ + d <- (t - 3)/t*q1/4/q4 + } else { + d <- -(t - 3)/t*q1*q3/4/q4^2 + } + } else { + d=0 + } + return(d) + } + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_iChao2) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_iChao2 + } + return(cov.q) + } + + ind <- 1:4 + i <- rep(sort(unique(ind)),each = length(unique(ind))) + j <- rep(sort(unique(ind)),length(unique(ind))) # all combination + + # if (q1 - q2*q3/2/q4 > 0 & q3 != 0){ + if (q1 - (t - 3)/(t - 1)*q2*q3/2/q4 > 0 | + q1 - (t - 3)/(t - 1)*q2*q3/2 > 0){ + var_iChao2 <- sum(mapply(function(i, j)diff(i, x)*diff(j, x)*COV.q(i, j), i, j)) + } else { + var_iChao2 <- var_Chao2 + } + + if (var_iChao2 > 0){ + var_iChao2 <- var_iChao2 + } else { + var_iChao2 <- NA + } + + m <- round(s_iChao2 - D, 5) + if (is.nan(m) == F){ + if (m != 0){ + C <- exp(z*sqrt(log(1 + var_iChao2/(s_iChao2 - D)^2))) + CI_iChao2 <- c(D + (s_iChao2 - D)/C, D + (s_iChao2 - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_iChao2 <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_iChao2 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_iChao2 <- c(NaN, NaN) + } + + table <- matrix(c(s_iChao2, sqrt(var_iChao2), CI_iChao2), ncol = 4) + colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + rownames(table) <- "improved Chao2 (Chao, 1987)" + return(table) + +} diff --git a/R/SpeciesAbundance.R b/R/SpeciesAbundance.R new file mode 100644 index 0000000..978d063 --- /dev/null +++ b/R/SpeciesAbundance.R @@ -0,0 +1,6 @@ +SpeciesAbundance <- +function(data, method = c("all", "Homogeneous", "Chao", "CE", "Jackknife"), k = 10, conf = 0.95){ + method <- match.arg(method) + return(list(BASIC.DATA.INFORMATION = basicAbun(data, k)[[1]], Rare.Species.Group = RareSpeciesGroup(data, k), + SPECIES.TABLE = round(SpecAbunOut(data, method, k, conf), 3))) +} diff --git a/R/SpeciesIncidence.R b/R/SpeciesIncidence.R new file mode 100644 index 0000000..9969985 --- /dev/null +++ b/R/SpeciesIncidence.R @@ -0,0 +1,5 @@ +SpeciesIncidence <- +function(data, method = c("all", "Homogeneous", "Chao", "CE", "Jackknife"), k = 10, conf = 0.95){ + method <- match.arg(method) + return(list(BASIC.DATA.INFORMATION = basicInci(data, k)[[1]], SPECIES.TABLE = SpecInciOut(data, method, k, conf))) +} diff --git a/R/Two_Community_similarity_subroutine.R b/R/Two_Community_similarity_subroutine.R new file mode 100644 index 0000000..f0b0afc --- /dev/null +++ b/R/Two_Community_similarity_subroutine.R @@ -0,0 +1,2025 @@ +Chat.Ind <- function(x, m) +{ + x <- x[x>0] + n <- sum(x) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0 & f2>0) + { + a=(n - 1) * f1 / ((n - 1) * f1 + 2 * f2) + } + if(f1>1 & f2==0) + { + a=(n-1)*(f1-1) / ( (n-1)*(f1-1) + 2 ) + } + if(f1==1 & f2==0) {a=0} + if(f1==0) {a=0} + + Sub <- function(m){ + if(m < n) out <- 1-sum(x / n * exp(lchoose(n - x, m)-lchoose(n - 1, m))) + if(m == n) out <- 1-f1/n*a + if(m > n) out <- 1-f1/n*a^(m-n+1) + out + } + sapply(m, Sub) +} +Chat.Ind_Inc <- function(x, m) +{ + t <- x[1] + x=x[-1]; x <- x[x>0] + Q1 <- sum(x == 1) + Q2 <- sum(x == 2) + if(Q1>0 & Q2>0) + { + a=(t - 1) * Q1 / ((t - 1) * Q1 + 2 * Q2) + } + if(Q1>1 & Q2==0) + { + a=(t-1)*(Q1-1) / ( (t-1)*(Q1-1) + 2 ) + } + if(Q1==1 & Q2==0) {a=0} + if(Q1==0) {a=0} + + Sub <- function(m){ + if(m < t) out <- 1-sum(x / t * exp(lchoose(t - x, m)-lchoose(t - 1, m))) + if(m == t) out <- 1-Q1/t*a + if(m > t) out <- 1-Q1/t*a^(m-t+1) + out + } + sapply(m, Sub) +} +U2_equ=function(X,Y) +{ + n=sum(X) + m=sum(Y) + f1.=sum(X==1 & Y>0) + p1barhat_1=p1bar_equ(X) + out3=f1./n*(1-p1barhat_1) + ############################# + f11=sum(X==1 & Y==1) + p1barhat_2=p1bar_equ(Y) + out4=f11/n/m*(1-p1barhat_1)*(1-p1barhat_2)/p1barhat_2 + output=out3+out4 + return(output) +} +U2_equ_Inc=function(X, Y) +{ + t1=X[1] + t2=Y[1] + X = X[-1]; Y = Y[-1] + Q1.=sum(X==1 & Y>0) + p1barhat_1=p1bar_equ_Inc(c(t1,X)) + out3=Q1./t1*(1-p1barhat_1) + ############################# + Q11=sum(X==1 & Y==1) + p1barhat_2=p1bar_equ_Inc(c(t2,Y)) + out4=Q11/t1/t2*(1-p1barhat_1)*(1-p1barhat_2)/p1barhat_2 + if(p1barhat_2==0){out4=0} + output=out3+out4 + return(output) +} +correct_obspi<- function(X) +{ + Sobs <- sum(X > 0) + n <- sum(X) + f1 <- sum(X == 1) + f2 <- sum(X == 2) + if(f1>0 & f2>0) + { + a=(n - 1) * f1 / ((n - 1) * f1 + 2 * f2) * f1 / n + } + if(f1>1 & f2==0) + { + a=(n-1)*(f1-1) / ( (n-1)*(f1-1) + 2 )*f1/n + } + if(f1==1 & f2==0) {a=0} + if(f1==0 ) {a=0} + b <- sum(X / n * (1 - X / n) ^ n) + w <- a / b + Prob.hat <- X / n * (1 - w * (1 - X / n) ^ n) + Prob.hat +} +correct_obspi_Inc<- function(X) +{ + t = X[1] + x = X[-1] + Sobs <- sum(x > 0) + + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0 & f2>0) + { + a=(t - 1) * f1 / ((t - 1) * f1 + 2 * f2) * f1 / t + } + if(f1>1 & f2==0) + { + a=(t-1)*(f1-1) / ( (t-1)*(f1-1) + 2 )*f1/t + } + if(f1==1 & f2==0) {a=0} + if(f1==0 ) {a=0} + b <- sum(x / t * (1 - x / t) ^ t) + w <- a / b + Prob.hat <- x / t * (1 - w * (1 - x / t) ^ t) + Prob.hat +} +p1bar_equ=function(X) +{ + n=sum(X) + f1=sum(X==1) + f2=sum(X==2) + if(f1>0 & f2>0) + { + a=2*f2/( (n-1)*f1+2*f2 ) + } + if(f1>1 & f2==0) + { + a=2/( (n-1)*(f1-1)+2 ) + } + if(f1==1 & f2==0){a=0} + if(f1==0){a=0} + return(a) +} +p1bar_equ_Inc=function(X) +{ + t = X[1] + X = X[-1] + + Q1=sum(X==1) + Q2=sum(X==2) + a=1 + if(Q1>0 & Q2>0) + { + a=2*Q2/( (t-1)*Q1+2*Q2 ) + } + if(Q1>1 & Q2==0) + { + a=2/( (t-1)*(Q1-1)+2 ) + } + if(Q1==1 & Q2==0){a=1} + if(Q1==0){a=1} + return(a) +} +Two_com_correct_obspi=function(X1,X2) +{ + n1=sum(X1) + n2=sum(X2) + f11=sum(X1==1) + f12=sum(X1==2) + f21=sum(X2==1) + f22=sum(X2==2) + C1=1-f11/n1*(n1 - 1) * f11 / ((n1 - 1) * f11 + 2 * f12) + C2=1-f21/n2*(n2 - 1) * f21 / ((n2 - 1) * f21 + 2 * f22) + + PP1=correct_obspi(X1) + PP2=correct_obspi(X2) + D12=which(X1>0 & X2>0) + + f0hat_1=ceiling( ifelse(f12 == 0, f11 * (f11 - 1) / 2, f11 ^ 2/ 2 / f12) ) + f0hat_2=ceiling( ifelse(f22 == 0, f21 * (f21 - 1) / 2, f21 ^ 2/ 2 / f22) ) + #----------------------------------------------------------------------------- + + r1=which(X1>0 & X2==0) + f.1=length(which(X1>0 & X2==1)) + f.2=length(which(X1>0 & X2==2)) + f.0=ceiling(ifelse(f.2>0,f.1^2/2/f.2,f.1*(f.1-1)/2)) + #------------------------------------------------------------------------------ + r2=which(X1==0 & X2>0) + f1.=length(which(X1==1 & X2>0)) + f2.=length(which(X1==2 & X2>0)) + f0.=ceiling(ifelse(f2.>0,f1.^2/2/f2.,f1.*(f1.-1)/2)) + #------------------------------------------------------------------------------ + t11=length(which(X1==1 & X2==1)) + t22=length(which(X1==2 & X2==2)) + f00hat=ceiling( ifelse(t22 == 0, t11 * (t11 - 1) / 4, t11 ^ 2/ 4 / t22) ) + #------------------------------------------------------------------------------ + temp1=max(length(r1),f.0)-length(r1)+f0.+f00hat + temp2=max(length(r2),f0.)-length(r2)+f.0+f00hat + p0hat_1=(1-C1)/max(f0hat_1,temp1) + p0hat_2=(1-C2)/max(f0hat_2,temp2) + #------------------------------------------------------------------------------ + P1=PP1[D12] + P2=PP2[D12] + if(length(r1)> f.0) + { + P1=c(P1,PP1[r1]) + Y=c(rep(p0hat_2, f.0), rep(0,length(r1)-f.0)) + P2=c(P2,sample(Y,length(Y)) ) + } + if(length(r1)< f.0) + { + P1=c(P1,PP1[r1],rep( p0hat_1,f.0-length(r1))) + P2=c(P2,rep(p0hat_2, f.0) ) + } + #---------------------------------------------------------------------------- + if(length(r2)> f0.) + { + Y=c(rep(p0hat_1,f0.),rep(0,length(r2)- f0.)) + P1=c(P1,sample(Y,length(Y))) + P2=c(P2,PP2[r2] ) + } + if(length(r2)< f0.) + { + P1=c(P1,rep(p0hat_1,f0.)) + P2=c(P2,PP2[r2],rep( p0hat_2,f0.-length(r2)) ) + } + P1=c(P1,rep( p0hat_1,f00hat)) + P2=c(P2,rep( p0hat_2,f00hat)) + P1=c(P1, rep(p0hat_1,max( f0hat_1-temp1,0)) , rep( 0 ,max( f0hat_2-temp2,0)) ) + P2=c(P2, rep( 0 ,max( f0hat_1-temp1,0)) , rep(p0hat_2,max( f0hat_2-temp2,0)) ) + #------------------------------------------------------------------------------------ + a=cbind(P1,P2) + return(a) +} +Two_com_correct_obspi_Inc=function(X1,X2) +{ + x1=X1; x2=X2 + t1=X1[1]; X1=X1[-1] + t2=X2[1]; X2=X2[-1] + Q11=sum(X1==1) + Q12=sum(X1==2) + Q21=sum(X2==1) + Q22=sum(X2==2) + C1=Chat.Ind_Inc(x1,t1) + C2=Chat.Ind_Inc(x2,t2) + + PP1=correct_obspi_Inc(x1) + PP2=correct_obspi_Inc(x2) + D12=which(X1>0 & X2>0) + + Q0hat_1=ceiling( (t1-1)/t1* ifelse(Q12 == 0, Q11 * (Q11 - 1) / 2, Q11 ^ 2/ 2 / Q12) ) + Q0hat_2=ceiling( (t2-1)/t2* ifelse(Q22 == 0, Q21 * (Q21 - 1) / 2, Q21 ^ 2/ 2 / Q22) ) + #----------------------------------------------------------------------------- + + r1=which(X1>0 & X2==0) + Q.1=length(which(X1>0 & X2==1)) + Q.2=length(which(X1>0 & X2==2)) + Q.0=ceiling((t2-1)/t2* ifelse(Q.2>0,Q.1^2/2/Q.2,Q.1*(Q.1-1)/2)) + #------------------------------------------------------------------------------ + r2=which(X1==0 & X2>0) + Q1.=length(which(X1==1 & X2>0)) + Q2.=length(which(X1==2 & X2>0)) + Q0.=ceiling((t1-1)/t1*ifelse(Q2.>0,Q1.^2/2/Q2.,Q1.*(Q1.-1)/2)) + #------------------------------------------------------------------------------ + t11=length(which(X1==1 & X2==1)) + t22=length(which(X1==2 & X2==2)) + Q00hat=ceiling((t1-1)/t1*(t2-1)/t2* ifelse(t22 == 0, t11 * (t11 - 1) / 4, t11 ^ 2/ 4 / t22) ) + #------------------------------------------------------------------------------ + temp1=max(length(r1),Q.0)-length(r1)+Q0.+Q00hat + temp2=max(length(r2),Q0.)-length(r2)+Q.0+Q00hat + p1_us_sum=min(U2_equ_Inc(x1,x2),1-C1) + p1_us=p1_us_sum/temp1 + p2_us_sum=min(U2_equ_Inc(x2,x1),1-C2) + p2_us=p2_us_sum/temp2 + if(Q0hat_1-temp1>0){p0_1= (1-C1-p1_us_sum)/(Q0hat_1-temp1) } + if(Q0hat_1-temp1<=0){p0_1=0} + if(Q0hat_2-temp2>0){p0_2= (1-C2-p2_us_sum)/(Q0hat_2-temp2) } + if(Q0hat_2-temp2<=0){p0_2=0} + #------------------------------------------------------------------------------ + P1=PP1[D12] + P2=PP2[D12] + if(length(r1)> Q.0) + { + P1=c(P1,PP1[r1]) + Y=c(rep(p2_us, Q.0), rep(0,length(r1)-Q.0)) + P2=c(P2,sample(Y,length(Y)) ) + } + if(length(r1)< Q.0) + { + P1=c(P1,PP1[r1],rep( p1_us,Q.0-length(r1))) + P2=c(P2,rep(p2_us, Q.0) ) + } + #---------------------------------------------------------------------------- + if(length(r2)> Q0.) + { + Y=c(rep(p1_us,Q0.),rep(0,length(r2)- Q0.)) + P1=c(P1,sample(Y,length(Y))) + P2=c(P2,PP2[r2] ) + } + if(length(r2)< Q0.) + { + P1=c(P1,rep(p1_us,Q0.)) + P2=c(P2,PP2[r2],rep( p2_us,Q0.-length(r2)) ) + } + P1=c(P1,rep( p1_us,Q00hat)) + P2=c(P2,rep( p2_us,Q00hat)) + P1=c(P1, rep( p0_1,max( Q0hat_1-temp1,0)) , rep( 0 ,max( Q0hat_2-temp2,0)) ) + P2=c(P2, rep( 0 ,max( Q0hat_1-temp1,0)) , rep( p0_2,max( Q0hat_2-temp2,0)) ) + #------------------------------------------------------------------------------------ + a=cbind(P1,P2) + return(a) +} +Horn.Est=function(data,method=c("equal", "unequal")) +{ #data is a species*plot data matrix and w is a given weight vector. + N=ncol(data);data=data[rowSums(data)>0,]; + S=nrow(data); + n=colSums(data); + if(method == "unequal"){ + w <- n/sum(n) + }else{w <- rep(1/N,N)} + W=sum(-w*log(w)); + r.data=sapply(1:N,function(k) data[,k]/n[k]); + r.pool=c(r.data%*%w); + + U=numeric(N);K=numeric(N); + for(i in 1:N){ + I=which(data[,i]*(rowSums(data)-data[,i])>0) + is=data[,i][I];pools=rowSums(data)[I]-is; + r.is=is/n[i];r.pools=r.pool[I]; + U1=sum(r.is); + sf1=sum(pools==1);sf2=sum(pools==2);sf2=ifelse(sf2==0,1,sf2); + U2=sum(r.is[pools==1])*(sf1/(2*sf2)); + U[i]=max(0,1-U1-U2)*(-w[i]*log(w[i])) + K[i]=-sum(w[i]*r.is*log(r.pools/r.is)) + } + Est=(sum(U)+sum(K))/W; + + A=sum(sapply(1:N,function(k) -w[k]*sum(r.data[,k][r.data[,k]>0]*log(r.data[,k][r.data[,k]>0])))) + G=sum(-r.pool[r.pool>0]*log(r.pool[r.pool>0])); + Mle=(G-A)/W; + return(c(1-Est,1-Mle)) +} +Two_Horn_equ <- function(X1, X2, datatype="abundance", weight="equal",nboot=50, method="all") +{ + if(datatype=="abundance"){ + p <- Two_com_correct_obspi(X1 ,X2) + commnunity1 <- rmultinom(nboot, sum(X1), p[,1]) + commnunity2 <- rmultinom(nboot, sum(X2), p[,2]) + }else{ + p <- Two_com_correct_obspi_Inc(X1 ,X2) + commnunity1 <- t(sapply(1:nrow(p),FUN = function(i) rbinom(nboot, X1[1], p[i,1]) )) + commnunity2 <- t(sapply(1:nrow(p),FUN = function(i) rbinom(nboot, X2[1], p[i,2]) )) + X1 <- X1[-1] + X2 <- X2[-1] + } + if(method=="all"){ + se <- apply(sapply(1:nboot, FUN = function(x){ + Horn.Est(data=cbind(commnunity1[,x] ,commnunity2[,x]),method=weight) + }),MARGIN = 1, sd) + value <- Horn.Est(data=cbind(X1, X2),method=weight) + out <- c(value[1], se[1], max(0,value[1]-1.96*se[1]), min(1,value[1]+1.96*se[1])) + out2 <- c(value[2], se[2],max(0,value[2]-1.96*se[2]), min(1,value[2]+1.96*se[2])) + return(list(est=out,mle=out2)) + } + if(method=="est"){ + se <-sd(sapply(1:nboot, FUN = function(x){ + Horn.Est(data=cbind(commnunity1[,x] ,commnunity2[,x]),method=weight)[1] + })) + value <- Horn.Est(data=cbind(X1, X2),method=weight) + out <- c(value[1], se, max(0,value[1]-1.96*se), min(1,value[1]+1.96*se) ) + return(out) + } + if(method=="mle"){ + se <-sd(sapply(1:nboot, FUN = function(x){ + Horn.Est(data=cbind(commnunity1[,x] ,commnunity2[,x]),method=weight)[2] + })) + value <- Horn.Est(data=cbind(X1, X2),method=weight) + out <- c(value[2], se, max(0,value[2]-1.96*se), min(1,value[2]+1.96*se) ) + return(out) + } +} +BC.Est=function(data) +{ #data is species*plot data matrix + N=ncol(data);data=data[rowSums(data)>0,]; + n=colSums(data);Tn=sum(n);w=n/Tn; + pool=rowSums(data); + Mle=sum(abs(data-matrix(rep(pool/N,N),ncol=N)))/(2*(1-1/N)*Tn); + temp=rep(0,N); + for(i in 1:N){ + I=which(data[,i]*(pool-data[,i])>0); + s.i=data[,i][I];s.pool=rowSums(data)[I]-s.i; + sf1=sum(s.pool==1);sf2=sum(s.pool==2);sf2=ifelse(sf2==0,1,sf2); + U1=sum(s.i)/Tn; + U2=sf1/(2*sf2)*sum(s.i[s.pool==1])/Tn + temp1=((N-1)/N)*max(0,w[i]-U1-U2) #(w[i]-U1-U2)# + + II=which(data[,i]==pool) + temp2=sum(abs(data[,i][-II]-pool[-II]/N))/Tn; + temp[i]=temp1+temp2; + } + Est=sum(temp)/(2-2/N); + return(c(1-Est,1-Mle)) +} +Two_BC_equ <- function(X1, X2, datatype="abundance", nboot) +{ + if(datatype=="abundance"){ + p <- Two_com_correct_obspi(X1 ,X2) + commnunity1 <- rmultinom(nboot, sum(X1), p[,1]) + commnunity2 <- rmultinom(nboot, sum(X2), p[,2]) + }else{ + p <- Two_com_correct_obspi_Inc(X1 ,X2) + commnunity1 <- t(sapply(1:nrow(p),FUN = function(i) rbinom(nboot, X1[1], p[i,1]) )) + commnunity2 <- t(sapply(1:nrow(p),FUN = function(i) rbinom(nboot, X2[1], p[i,2]) )) + X1 <- X1[-1] + X2 <- X2[-1] + } + se <- apply(sapply(1:nboot, FUN = function(x){ + BC.Est(data=cbind(commnunity1[,x] ,commnunity2[,x])) + }),MARGIN = 1, sd) + value <- BC.Est(data=cbind(X1, X2)) + out <- c(value[1], se[1], max(0,value[1]-1.96*se[1]), min(1,value[1]+1.96*se[1])) + out2 <- c(value[2], se[2],max(0,value[2]-1.96*se[2]), min(1,value[2]+1.96*se[2])) + return(list(est=out,mle=out2)) +} +SimilarityTwo=function(X, q, nboot=50, datatype="abundance",method=c("equal weight","unequal weight")) +{ + ###############################(2016.07.19 P.L.Lin) + if(datatype=="abundance"){ + p <- Two_com_correct_obspi(X[,1] ,X[,2]) + }else{ + p <- Two_com_correct_obspi_Inc(X[,1] ,X[,2]) + t <- X[1, ] + X <- X[-1, ] + } + ############################### + N=ncol(X);ni=colSums(X);n=sum(X); + pool=rowSums(X); + bX=apply(X,2,function(x) x/sum(x));pool.x=rowSums(bX)/N; + + if(q==0){ + f1=apply(X,2,function(x) sum(x==1)); + f2=apply(X,2,function(x) sum(x==2)); + Sobs=apply(X,2,function(x) sum(x>0)); + Si=Sobs+sapply(1:N, function(k) ifelse(f2[k]==0, f1[k]*(f1[k]-1)/2,f1[k]^2/(2*f2[k]))); + Sa=mean(Si); + UqN.mle=(1/N-mean(Sobs)/sum(pool>0))/(1/N-1); + CqN.mle=(N-sum(pool>0)/mean(Sobs))/(N-1); + + F1=sum(pool==1);F2=sum(pool==2); + Sg=sum(pool>0)+ifelse(F2==0,F1*(F1-1)/2,F1^2/(2*F2)); + UqN=min(1,(1/N-Sa/Sg)/(1/N-1));UqN=max(0,UqN); + CqN=min(1,(N-Sg/Sa)/(N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot); b.UqN.mle=numeric(nboot); + b.CqN=numeric(nboot); b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, t[1,k], p[i,1]) )) + } + f1=apply(XX,2,function(x) sum(x==1)); + f2=apply(XX,2,function(x) sum(x==2)); + Sobs=apply(XX,2,function(x) sum(x>0)); + Si=Sobs+sapply(1:N,function(k) ifelse(f2[k]==0,f1[k]*(f1[k]-1)/2,f1[k]^2/(2*f2[k]))) + Sa=mean(Si); + pool=rowSums(XX); + b.UqN.mle[i]=(1/N-mean(Sobs)/sum(pool>0))/(1/N-1); + b.CqN.mle[i]=(N-sum(pool>0)/mean(Sobs))/(N-1); + + F1=sum(pool==1);F2=sum(pool==2); + Sg=sum(pool>0)+ifelse(F2==0,F1*(F1-1)/2,F1^2/(2*F2)); + b.UqN[i]=min(1,(1/N-Sa/Sg)/(1/N-1)); b.UqN[i]=max(0,b.UqN[i]); + b.CqN[i]=min(1,(N-Sg/Sa)/(N-1));b.CqN[i]=max(0,b.CqN[i]); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle); #standard deviations of UqN est. and mle + se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); #standard deviations of CqN est. and mle + + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + + if(q==2){ + if(method=="equal weight"){ + a.mle=N/sum(bX^2) + g.mle=1/sum(pool.x^2); + b.mle=g.mle/a.mle; + UqN.mle=(N-b.mle)/(N-1); + CqN.mle=(1/N-1/b.mle)/(1/N-1); + + Ai=sapply(1:N,function(k) sum(X[,k]*(X[,k]-1)/(ni[k]*(ni[k]-1)))); + bX.1=apply(X,2,function(x) (x-1)/(sum(x)-1)); + temp=sapply(1:nrow(X),function(j) (sum(bX[j,]%*%t(bX[j,]))-sum(bX[j,]^2))+sum(bX[j,]*bX.1[j,])); + G=1/(sum(temp)/N^2); + + B=G/(1/mean(Ai)); + UqN=min(1,(N-B)/(N-1));UqN=max(0,UqN); + CqN=min(1,(1/N-1/B)/(1/N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot);b.UqN.mle=numeric(nboot); b.CqN=numeric(nboot);b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, t[1,k], p[i,1]) )) + } + bXX=apply(XX,2,function(x) x/sum(x)); + pool.x=rowSums(bXX)/N; + a.mle=N/sum(bXX^2);g.mle=1/sum(pool.x^2);b.mle=g.mle/a.mle; + b.UqN.mle[i]=(N-b.mle)/(N-1); + b.CqN.mle[i]=(1/N-1/b.mle)/(1/N-1); + + Ai=sapply(1:N,function(k) sum(XX[,k]*(XX[,k]-1)/(ni[k]*(ni[k]-1)))); + bXX.1=apply(XX,2,function(x) (x-1)/(sum(x)-1)); + temp=sapply(1:nrow(XX),function(j) (sum(bXX[j,]%*%t(bXX[j,]))-sum(bXX[j,]^2))+sum(bXX[j,]*bXX.1[j,])); + G=1/(sum(temp)/N^2); + + B=G/(1/mean(Ai)); + b.UqN[i]=(N-B)/(N-1); + b.CqN[i]=(1/N-1/B)/(1/N-1); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle);se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + if(method=="unequal weight"){ + a.mle=1/(N*sum((X/n)^2));g.mle=1/sum((pool/n)^2);b.mle=g.mle/a.mle; + UqN.mle=(N-b.mle)/(N-1); + CqN.mle=(1/N-1/b.mle)/(1/N-1); + + A=(1/N)*(1/sum(X*(X-1)/(n*(n-1)))); + G=1/sum(pool*(pool-1)/(n*(n-1))); + B=G/A; + UqN=min(1,(N-B)/(N-1));UqN=max(0,UqN); + CqN=min(1,(1/N-1/B)/(1/N-1));CqN=max(0,CqN); + + b.UqN=numeric(nboot);b.UqN.mle=numeric(nboot);b.CqN=numeric(nboot);b.CqN.mle=numeric(nboot); + for(i in 1:nboot){ + if(datatype=="abundance"){ + XX=sapply(1:N,function(k) rmultinom(1, ni[k], p[,k])) + }else{ + XX=sapply(1:N,function(k) sapply(1:nrow(p),FUN = function(i) rbinom(1, t[1,k], p[i,1]) )) + } + pool=rowSums(XX); + a.mle=1/(N*sum((XX/n)^2));g.mle=1/sum((pool/n)^2);b.mle=g.mle/a.mle; + b.UqN.mle[i]=(N-b.mle)/(N-1); + b.CqN.mle[i]=(1/N-1/b.mle)/(1/N-1); + + A=(1/N)*(1/sum(XX*(XX-1)/(n*(n-1)))); + G=1/sum(pool*(pool-1)/(n*(n-1))); + B=G/A; + b.UqN[i]=(N-B)/(N-1); + b.CqN[i]=(1/N-1/B)/(1/N-1); + } + se.U=sd(b.UqN);se.U.mle=sd(b.UqN.mle);se.C=sd(b.CqN);se.C.mle=sd(b.CqN.mle); + out1=rbind(c(UqN.mle,se.U.mle,min(1,UqN.mle+1.96*se.U.mle),max(0,UqN.mle-1.96*se.U.mle)), + c(UqN,se.U,min(1,UqN+1.96*se.U),max(0,UqN-1.96*se.U))); + out2=rbind(c(CqN.mle,se.C.mle,min(1,CqN.mle+1.96*se.C.mle),max(0,CqN.mle-1.96*se.C.mle)), + c(CqN,se.C,min(1,CqN+1.96*se.C),max(0,CqN-1.96*se.C))); + } + } + out1 <- cbind(out1[,c(1, 2)], out1[, 4], out1[, 3]) + colnames(out1)=c("UqN","se","95%.Lower","95%.Upper") + rownames(out1)=c("Emperical","Estimate") + out2 <- cbind(out2[,c(1, 2)], out2[, 4], out2[, 3]) + colnames(out2)=c("CqN","se","95%.Lower","95%.Upper") + rownames(out2)=c("Emperical","Estimate") + return(list(UqN=out1,CqN=out2)); +} +C2N_ee_equ_inc <- function(X) +{ + X <- as.data.frame(X) + N <- ncol(X) + t <- X[1 ,] + Y <- X[-1,] + p1 <- sapply(1:N, function(i){ + Y[, i]/t[1,i] + }) + p2 <- sapply(1:N, function(i){ + (Y[, i]-1) / (t[1,i]-1) + }) + a <- b <- c <- 0 + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N ){ + a <- a + 2*sum(p1[, i]*p1[, j]) + } + } + for(i in 1:N){ + b <- b + sum(p1[, i]*p1[, i]) + c <- c + sum(p1[, i]*p2[, i]) + } + c_mle = a / (b*(N-1)) + u_mle = ( N/ (N-1) )*a / (b+a) + c_est = a / (c*(N-1)) + u_est = ( N/ (N-1) )*a / (c+a) + out <- c(c_mle, u_mle, c_est, u_est) + return(out) +} +C2N_ee_se_inc <- function(X, nboot=50) +{ + N <- ncol(X) + t <- as.matrix(X[1 ,]) + Y <- X[-1,] + mat <- matrix(0, nboot, 4) + value <- C2N_ee_equ_inc(X) + for(i in 1:nboot){ + if(N == 2){ + p <- Two_com_correct_obspi_Inc(X1=X[, 1],X2=X[, 2]) + }else{ + p <- Boots.pop_inc(X) + } + boot.X <- matrix(0, nrow = nrow(p), ncol = N ) + for(j in 1:N){ + boot.X[,j] <- sapply(1:nrow(p), function(x) rbinom(1, t[1,j], p[x,j])) + } + boot.X = data.frame(rbind(t, boot.X),row.names = NULL) + mat[i, ] <- C2N_ee_equ_inc(boot.X) + } + se <- apply(mat, MARGIN = 2, sd) + out <- cbind(value,se) + return(out) +} +Jaccard_Sorensen_Abundance_equ=function(datatype = c("abundance", "incidence"),X1,X2,boot) +{ + if(datatype == "incidence") + { + shared.species.hat=PanEstFun.Sam(X1,X2) + alpha.species.hat=SpecInciChao2(X1, k=10, conf=0.95)[1,1]+SpecInciChao2(X2, k=10, conf=0.95)[1,1] + Esti.Jaccard=shared.species.hat/(alpha.species.hat-shared.species.hat) + Esti.Sorensen=2*shared.species.hat/alpha.species.hat + w=X1[1];z=X2[1] + X1=X1[-1];X2=X2[-1] + } + n1=sum(X1);n2=sum(X2) + if(datatype == "abundance"){w=n1;z=n2} + I=which(X1>0 & X2>0) + + MLE.Jaccard=length(I)/sum(X1+X2>0) + MLE.Sorensen=2*length(I)/(sum(X1>0)+sum(X2>0)) + ############################################# + if(datatype == "abundance") + { + shared.species.hat=PanEstFun(X1,X2) + alpha.species.hat=SpecAbunChao1(X1, k=10, conf=0.95)[1,1]+SpecAbunChao1(X2, k=10, conf=0.95)[1,1] + Esti.Jaccard=shared.species.hat/(alpha.species.hat-shared.species.hat) + Esti.Sorensen=2*shared.species.hat/alpha.species.hat + } + ############################################# + MLE.Lennon=length(I)/(min(sum(X1>0),sum(X2>0))) + MLE.Bray_Curtis=sum(sapply(I,function(I) 2*min(X1[I],X2[I])))/(n1+n2) + Morisita_Horn=2*sum(X1[I]/n1*X2[I]/n2)/sum((X1/n1)^2+(X2/n2)^2) + Morisita_Original=2*sum(X1[I]/n1*X2[I]/n2)/sum( X1*(X1-1)/n1/(n1-1) +X2*(X2-1)/n2/(n2-1)) + + U_tilde=sum(X1[I])/n1;V_tilde=sum(X2[I])/n2 + fplus1=sum(X1>0 & X2==1);fplus2=sum(X1>0 & X2==2) + fplus2=ifelse(fplus2>0,fplus2,1) + f1plus=sum(X1==1 & X2>0);f2plus=sum(X1==2 & X2>0) + f2plus=ifelse(f2plus>0,f2plus,1) + U_hat=U_tilde+(z-1)/z*fplus1/2/fplus2*sum(X1[I][X2[I]==1])/n1 + U_hat=ifelse(U_hat<=1,U_hat,1) + V_hat=V_tilde+(w-1)/w*f1plus/2/f2plus*sum(X2[I][X1[I]==1])/n2 + V_hat=ifelse(V_hat<=1,V_hat,1) + JAu=U_tilde*V_tilde/(U_tilde+V_tilde-U_tilde*V_tilde) + JAa=U_hat*V_hat/(U_hat+V_hat-U_hat*V_hat) + SAu=2*U_tilde*V_tilde/(U_tilde+V_tilde) + SAa=2*U_hat*V_hat/(U_hat+V_hat) + + if(datatype=="incidence"){ + p <- Two_com_correct_obspi_Inc(X1=c(w,X1),X2=c(z,X2)) + p1 <- p[,1];p2 <- p[,2] + }else{ + p <- Two_com_correct_obspi(X1,X2) + p1 <- p[,1] ; p2 <- p[,2] + } + + boot.Jaccard=rep(0,boot) + boot.Esti.Jaccard=rep(0,boot) + boot.Sorensen=rep(0,boot) + boot.Esti.Sorensen=rep(0,boot) + boot.Lennon=rep(0,boot) + boot.Bray_Curtis=rep(0,boot) + boot.Morisita_Horn=rep(0,boot) + boot.Morisita_Original=rep(0,boot) + + + boot.U_hat=rep(0,boot) + boot.V_hat=rep(0,boot) + boot.JAu=rep(0,boot) + boot.JAa=rep(0,boot) + boot.SAu=rep(0,boot) + boot.SAa=rep(0,boot) + for(h in 1:boot) + { + if(datatype == "abundance") + { + boot.X1=rmultinom(1,w,p1) + boot.X2=rmultinom(1,z,p2) + boot.n1=sum( boot.X1);boot.n2=sum(boot.X2) + boot.shared.species.hat=PanEstFun(boot.X1,boot.X2) + boot.alpha.species.hat=SpecAbunChao1(boot.X1, k=10, conf=0.95)[1,1]+SpecAbunChao1(boot.X2, k=10, conf=0.95)[1,1] + boot.Esti.Jaccard[h]=boot.shared.species.hat/(boot.alpha.species.hat-boot.shared.species.hat) + boot.Esti.Sorensen[h]=2*boot.shared.species.hat/boot.alpha.species.hat + } + if(datatype == "incidence") + { + boot.X1=sapply(1:length(p1), function(i) rbinom(1, w, p1[i])) + boot.X2=sapply(1:length(p2), function(i) rbinom(1, z, p2[i])) + boot.shared.species.hat=PanEstFun.Sam(c(w,boot.X1), c(z,boot.X2) ) + boot.alpha.species.hat=SpecInciChao2(c(w,boot.X1), k=10, conf=0.95)[1,1]+SpecInciChao2(c(z,boot.X2), k=10, conf=0.95)[1,1] + boot.Esti.Jaccard[h]=boot.shared.species.hat/(boot.alpha.species.hat-boot.shared.species.hat) + boot.Esti.Sorensen[h]=2*boot.shared.species.hat/boot.alpha.species.hat + n1=sum(boot.X1) + n2=sum(boot.X2) + } + I=which( boot.X1>0 & boot.X2>0) + + boot.Jaccard[h]= length(I)/sum(boot.X1+boot.X2>0) + boot.Sorensen[h]=2*length(I)/(sum(boot.X1>0)+sum(boot.X2>0)) + boot.Lennon[h]=length(I)/(min(sum(boot.X1>0),sum(boot.X2>0))) + boot.Bray_Curtis[h]=sum(sapply(I,function(I) 2*min(boot.X1[I],boot.X2[I])))/sum(boot.X1+boot.X2) + boot.Morisita_Horn[h]=2*sum(boot.X1[I]/n1*boot.X2[I]/n2)/sum((boot.X1/n1)^2+(boot.X2/n2)^2) + boot.Morisita_Original[h]=2*sum(boot.X1[I]/n1*boot.X2[I]/n2)/sum( boot.X1*(boot.X1-1)/n1/(n1-1) +boot.X2*(boot.X2-1)/n2/(n2-1)) + + boot.U_tilde=sum( boot.X1[I])/n1 + boot.V_tilde=sum( boot.X2[I])/n2 + fplus1=sum(boot.X1>0 & boot.X2==1);fplus2=sum(boot.X1>0 & boot.X2==2) + fplus2=ifelse(fplus2>0,fplus2,1) + f1plus=sum(boot.X1==1 & boot.X2>0);f2plus=sum(boot.X1==2 & boot.X2>0) + f2plus=ifelse(f2plus>0,f2plus,1) + boot.U_hat[h]=boot.U_tilde+(z-1)/z*fplus1/2/fplus2*sum(boot.X1[I][boot.X2[I]==1])/n1 + boot.U_hat[h]=ifelse(boot.U_hat[h]<=1,boot.U_hat[h],1) + boot.V_hat[h]=boot.V_tilde+(w-1)/w*f1plus/2/f2plus*sum(boot.X2[I][boot.X1[I]==1])/n2 + boot.V_hat[h]=ifelse(boot.V_hat[h]<=1,boot.V_hat[h],1) + boot.JAu[h]=boot.U_tilde*boot.V_tilde/(boot.U_tilde+boot.V_tilde-boot.U_tilde*boot.V_tilde) + boot.JAa[h]=boot.U_hat[h]*boot.V_hat[h]/(boot.U_hat[h]+boot.V_hat[h]-boot.U_hat[h]*boot.V_hat[h]) + boot.SAu[h]=2*boot.U_tilde*boot.V_tilde/(boot.U_tilde+boot.V_tilde) + boot.SAa[h]=2*boot.U_hat[h]*boot.V_hat[h]/(boot.U_hat[h]+boot.V_hat[h]) + } + a=matrix(0,12,6) + a[1,]=c(min(MLE.Jaccard,1),sd(boot.Jaccard),rep(0,4)) + a[2,]=c(Esti.Jaccard,sd(boot.Esti.Jaccard),rep(0,4)) + a[3,]=c(min(MLE.Sorensen,1),sd(boot.Sorensen),rep(0,4)) + a[4,]=c(Esti.Sorensen,sd(boot.Esti.Sorensen),rep(0,4)) + a[5,]=c(MLE.Lennon,sd(boot.Lennon),rep(0,4)) + a[6,]=c(min(MLE.Bray_Curtis,1),sd(boot.Bray_Curtis),rep(0,4)) + a[7,]=c(min(Morisita_Horn,1),sd(boot.Morisita_Horn),rep(0,4)) + a[8,]=c(Morisita_Original,sd(boot.Morisita_Original),rep(0,4)) + a[9,]=c(min(JAu,1),sd(boot.JAu),U_tilde,V_tilde,rep(0,2)) + a[10,]=c(min(JAa,1),sd(boot.JAa),U_hat,sd(boot.U_hat),V_hat,sd(boot.V_hat)) + a[11,]=c(min(SAu,1),sd(boot.SAu),U_tilde,V_tilde,rep(0,2)) + a[12,]=c(min(SAa,1),sd(boot.SAa),U_hat,sd(boot.U_hat),V_hat,sd(boot.V_hat)) + round(a,4) +} +###################################################NO USE +entropy_MEE_equ=function(X) +{ + x=X + x=x[x>0] + n=sum(x) + UE <- sum(x/n*(digamma(n)-digamma(x))) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0) + { + A <-1-ifelse(f2 > 0, (n-1)*f1/((n-1)*f1+2*f2), (n-1)*f1/((n-1)*f1+2)) + B=sum(x==1)/n*(1-A)^(-n+1)*(-log(A)-sum(sapply(1:(n-1),function(k){1/k*(1-A)^k}))) + } + if(f1==0){B=0} + if(f1==1 & f2==0){B=0} + UE+B +} +Subentropy_MEE_equ=function(X,n) +{ + x=X + x=x[x>0] + UE <- sum(x/n*(digamma(n)-digamma(x))) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0) + { + A <-1-ifelse(f2 > 0, (n-1)*f1/((n-1)*f1+2*f2), (n-1)*(f1-1)/((n-1)*(f1-1)+2)) + B=sum(x==1)/n*(1-A)^(-n+1)*(-log(A)-sum(sapply(1:(n-1),function(k){1/k*(1-A)^k}))) + } + if(f1==1 & f2==0){B=0} + if(f1==0){B=0} + UE+B +} +#propose_equ=function(X1,X2,weight) +#{ + # if(weight=="equal"){ + # w1 <- 1/2 + # Y1=X1;Y2=X2 + #w2=1-w1 + #n1=sum(X1);n2=sum(X2) + #I=which(X1>0 & X2>0) + #X1=X1[I];X2=X2[I] + #if(length(I)>0) + #{ + # temp1=shared_entropy(X1,X2,length(I),n1,n2,w1) + #temp2=shared_entropy(X2,X1,length(I),n2,n1,w2) + #temp3=sum_fun(Y1,Y2,w1)+sum_fun(X1=Y2,X2=Y1,w1=w2) + #S12_part=temp1+ temp2+temp3 + #} + #if(length(I)==0){ S12_part=0} + ##################### + #r1=which(Y1>0 & Y2 ==0) + #Z1=Y1[r1] + #r1_phat=Z1/n1 + #temp5=-w1*log(w1)*sum(r1_phat)+w1*(Subentropy_MEE_equ(Z1,n1) ) + #r2=which(Y1==0 & Y2 >0) + #Z2=Y2[r2] + #r2_phat=Z2/n2 + #temp6=-w2*log(w2)*sum(r2_phat)+w2*(Subentropy_MEE_equ(Z2,n2) ) + #D1_alpha=( w1*entropy_MEE_equ(Y1)+ w2*entropy_MEE_equ(Y2)) + + #Hr=S12_part+temp5+temp6 + #Ha=D1_alpha + #Ch=1-(Hr-Ha)/(-w1*log(w1)-w2*log(w2)) + + #}else if(weight=="unequal"){ + # w1=sum(X1)/(sum(X1)+sum(X2)) + #w2=1-w1 + #Ha=w1*entropy_MEE_equ(X1)+w2*entropy_MEE_equ(X2) + #Hr=entropy_MEE_equ(X1+X2) + #Ch=1-(Hr-Ha)/(-w1*log(w1)-w2*log(w2)) + #} + #return(Ch) +#} +Horn_MLE_equ=function(X1,X2,weight) +{ + if(weight=="equal"){ + w1 <- 1/2 + }else{ + w1=sum(X1)/(sum(X1)+sum(X2)) + } + n1=sum(X1) + n2=sum(X2) + p1=X1/n1 + p2=X2/n2 + w2=1-w1 + pbar=w1*p1+w2*p2 + + pbar=pbar[pbar>0] + p1=p1[p1>0] + p2=p2[p2>0] + Hr=sum(- pbar*log( pbar)) + Ha=w1*sum(-p1*log(p1))+w2*sum(-p2*log(p2)) + Ch=1-(Hr-Ha)/(-w1*log(w1)-w2*log(w2)) + + out=c(Ch) + return(out) +} +KH_Bray_curtis_equ=function(X1,X2,w1) +{ + w2=1-w1 + p1bar_1=p1bar_equ(X1) + p1bar_2=p1bar_equ(X2) + I=which(X1*X2>0) + Y1=X1[I];Y2=X2[I] + n1=sum(X1);n2=sum(X2) + f.1=sum(X1>0 & X2==1) + f.2=sum(X1>0 & X2==2);f.2=ifelse(f.2>0,f.2,1) + f1.=sum(X1==1 & X2>0) + f2.=sum(X1==2 & X2>0);f2.=ifelse(f2.>0,f2.,1) + + temp=0 + for(i in 1:length(I) ) + { + a1=w1^2*ifelse(Y1[i]>1,Y1[i]*(Y1[i]-1)/n1/(n1-1),p1bar_1^2)- + 2*w1*w2*ifelse(Y1[i]>1,Y1[i]/n1,p1bar_1)*ifelse(Y2[i]>1,Y2[i]/n2,p1bar_2)+ + w2^2*ifelse(Y2[i]>1,Y2[i]*(Y2[i]-1)/n2/(n2-1),p1bar_2^2) + a1=max(0,a1) + if(a1==0) + { + a1=(abs(w1*ifelse(Y1[i]>1,Y1[i]/n1,p1bar_1)-w2*ifelse(Y2[i]>1,Y2[i]/n2,p1bar_2)))^2 + } + temp=temp+a1^0.5 + } + I=which(X1>0 & X2==1) + Y1=X1[I];Y2=X2[I] + if(length(I)>0) + { + for(i in 1:length(I) ) + { + a1=w1^2*ifelse(Y1[i]>1,Y1[i]*(Y1[i]-1)/n1/(n1-1),p1bar_1^2)- + 2*w1*w2*ifelse(Y1[i]>1,Y1[i]/n1,p1bar_1)*p1bar_2+ + w2^2*p1bar_2^2 + a1=max(0,a1) + if(a1==0) + { + a1=(abs(w1*ifelse(Y1[i]>1,Y1[i]/n1,p1bar_1)-w2*ifelse(Y2[i]>1,Y2[i]/n2,p1bar_2)))^2 + + } + temp=temp+a1^0.5*f.1/2/f.2 #(1-p1bar_2)/(n2*p1bar_2) + } + } + I=which(X1==1 & X2>0) + Y1=X1[I];Y2=X2[I] + if(length(I)>0) + { + for(i in 1:length(I) ) + { + a1=w2^2*ifelse(Y2[i]>1,Y2[i]*(Y2[i]-1)/n2/(n2-1),p1bar_2^2)- + 2*w1*w2*ifelse(Y2[i]>1,Y2[i]/n2,p1bar_2)*p1bar_1+ + w1^2*p1bar_1^2 + a1=max(0,a1) + if(a1==0) + { + a1=(abs(w1*ifelse(Y1[i]>1,Y1[i]/n1,p1bar_1)-w2*ifelse(Y2[i]>1,Y2[i]/n2,p1bar_2)))^2 + + } + temp=temp+a1^0.5*f1./2/f2.#(1-p1bar_1)/(n1*p1bar_1) + } + } + f11=sum(X1==1 & X2==1) + sumtemp=f11*abs(w1*p1bar_1-w2*p1bar_2)*(1-p1bar_1)/(n1*p1bar_1)*(1-p1bar_2)/(n2*p1bar_2) + if(p1bar_1==0 | p1bar_2==0) + { + sumtemp=0 + } + temp=temp+sumtemp + + ########################## + da1=X1 + da2=X2 + I=which(da1*da2>0); sda1=da1[I];sda2=da2[I] + + + U1=sum(sda1)/n1; V1=sum(sda2)/n2; + ff1=sum(sda2==1);#ff1=ifelse(ff1==0, 1,ff1); + ff2=sum(sda2==2);ff2=ifelse(ff2==0, 1,ff2); + + f1=sum(sda1==1);#f1=ifelse(f1==0, 1,f1); + f2=sum(sda1==2);f2=ifelse(f2==0, 1,f2); + + U2=(ff1/(2*ff2))*sum(sda1[sda2==1])/n1;uC1=f1/sum(sda1); + V2=(f1/(2*f2))*sum(sda2[sda1==1])/n2;uC2=ff1/sum(sda2); + U=U2+U1;U=min(U,1) + V=V1+V2;V=min(V,1) + ########################## + mle=sum(abs(w1*X1/n1-w2*X2/n2)) + out=w1*(1-U)+w2*(1-V)+temp + if(out<0 | out>1) + { + out=mle + } + return(out) +} +MLE_Braycurtis_equ=function(X1,X2,w1) +{ + w2=1-w1 + n1=sum(X1) + n2=sum(X2) + mle=1-sum(abs(w1*X1/n1-w2*X2/n2)) + p <- Two_com_correct_obspi(X1,X2) + p1hat=p[, 1] + p2hat=p[, 2] + boot.BC=rep(0,50) + for(h in 1:50) + { + boot.X1=rmultinom(1,n1,p1hat) + boot.X2=rmultinom(1,n2,p2hat) + boot.BC[h]=sum(abs(w1*boot.X1/n1-w2*boot.X2/n2)) + } + out=c(min(mle,1),sd(boot.BC));out=round(out,4) + return(out) +} +KH_Braycurtis_equ=function(X1,X2,w1) +{ + BC=1-KH_Bray_curtis_equ(X1,X2,w1) + n1=sum(X1) + n2=sum(X2) + p <- Two_com_correct_obspi(X1,X2) + p1hat=p[, 1] + p2hat=p[, 2] + boot.BC=rep(0,50) + for(h in 1:50) + { + boot.X1=rmultinom(1,n1,p1hat) + boot.X2=rmultinom(1,n2,p2hat) + boot.BC[h]=KH_Bray_curtis_equ(boot.X1,boot.X2,w1) + } + out=c(min(BC,1),sd(boot.BC));out=round(out,4) + return(out) +} +Two_horn_MLE_equ=function(X1,X2) +{ + horn_MLE_equ=function(X1,X2){ + n1=sum(X1) + n2=sum(X2) + w1=n1/(n1+n2);w2=1-w1 + pool.X= X1+X2 + pool.n= n1+n2 + pool.phat=pool.X/pool.n;pool.phat=pool.phat[pool.phat>0] + p1hat=X1/n1;p1hat=p1hat[p1hat>0] + p2hat=X2/n2;p2hat=p2hat[p2hat>0] + Hr=-sum(pool.phat*log(pool.phat)) + Ha=-w1*sum(p1hat*log(p1hat))-w2*sum(p2hat*log(p2hat)) + horn=(Hr-Ha)/(-w1*log(w1)-w2*log(w2)) + horn + } + n1=sum(X1) + n2=sum(X2) + horn=horn_MLE_equ(X1,X2) + boot.horn=rep(0,50) + boot.p1=X1/n1 + boot.p2=X2/n2 + for(h in 1:50) + { + boot.X1=rmultinom(1,n1,boot.p1) + boot.X2=rmultinom(1,n2,boot.p2) + boot.horn[h]=horn_MLE_equ(boot.X1,boot.X2) + } + out=c(min(horn,1),sd(boot.horn));out=round(out,4) + return(out) +} +Chao1_equ=function(x,conf=0.95) +{ + z <--qnorm((1 - conf)/2) + x=x[x>0] + D=sum(x>0) + f1=sum(x==1) + f2=sum(x==2) + n=sum(x) + if (f1 > 0 & f2 > 0) + { + S_Chao1 <- D + (n - 1)/n*f1^2/(2*f2) + var_Chao1 <- f2*((n - 1)/n*(f1/f2)^2/2 + + ((n - 1)/n)^2*(f1/f2)^3 + ((n - 1 )/n)^2*(f1/f2)^4/4) + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } + else if (f1 > 1 & f2 == 0) + { + S_Chao1 <- D + (n - 1)/n*f1*(f1 - 1)/(2*(f2 + 1)) + var_Chao1 <- (n - 1)/n*f1*(f1 - 1)/2 + + ((n - 1)/n)^2*f1*(2*f1 - 1)^2/4 - ((n - 1)/n)^2*f1^4/4/S_Chao1 + + t <- S_Chao1 - D + K <- exp(z*sqrt(log(1 + var_Chao1/t^2))) + CI_Chao1 <- c(D + t/K, D + t*K) + } + else + { + S_Chao1 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i) sum(x==i)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*sum(x==i))))^2/n + var_Chao1 <- var_obs + P <- sum(sapply(i, function(i) sum(x==i)*exp(-i)/D)) + CI_Chao1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + return( c( round(c(S_Chao1,var_Chao1^0.5,CI_Chao1[1],CI_Chao1[2]),1),conf) ) +} +Chao1_bc_equ=function(x,conf=0.95) +{ + z <- -qnorm((1 - conf)/2) + x=x[x>0] + D=sum(x>0) + f1=sum(x==1) + f2=sum(x==2) + n=sum(x) + + S_Chao1_bc <- D + (n - 1)/n*f1*(f1 - 1)/(2*(f2 + 1)) + var_Chao1_bc <- (n - 1)/n*f1*(f1 - 1)/2/(f2 + 1) + + ((n - 1)/n)^2*f1*(2*f1 - 1)^2/4/(f2 + 1)^2 + ((n - 1)/n)^2*f1^2*f2*(f1 - 1)^2/4/(f2 + 1)^4 + + t <- round(S_Chao1_bc - D, 5) + if (t != 0) + { + K <- exp(z*sqrt(log(1 + var_Chao1_bc/t^2))) + CI_Chao1_bc <- c(D + t/K, D + t*K) + } + if(t == 0) + { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)sum(x==i)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*sum(x==i))))^2/n + var_Chao1_bc <- var_obs + P <- sum(sapply(i, function(i)sum(x==i)*exp(-i)/D)) + CI_Chao1_bc <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + round(c(S_Chao1_bc,var_Chao1_bc^0.5,CI_Chao1_bc[1],CI_Chao1_bc[2]) ,1) +} +SpecAbunAce_equ<- function(data, k=10, conf=0.95) +{ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_ACE <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + }else{ + gamma_rare_hat_square <- 0 + } + S_ace <- D_abun + D_rare/C_rare + f(1, x)/C_rare*gamma_rare_hat_square + return(list(S_ace, gamma_rare_hat_square)) + } + s_ace <- S_ACE(x, k)[[1]] + gamma_rare_hat_square <- S_ACE(x, k)[[2]] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (gamma_rare_hat_square != 0){ + si <- sum(sapply(u, function(u)u*(u - 1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(D_rare*si + f(1, x)*si) - + f(1, x)*D_rare*si*(-2*(1 - f(1, x)/n_rare)*(n_rare - f(1, x))/n_rare^2*n_rare*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*(2*n_rare - 1)) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 - #g2 + (1 - f(1, x)/n_rare + f(1, x)*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g3 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)*(si + D_rare*q*(q - 1)) - + f(1, x)*D_rare*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + #g2 + (q*(f(1, x))^2/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g3 + } + return(d) + } else { + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } + return(d) + } + } + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_ace) + } else { + cov.f <- -f(i, x)*f(j, x)/s_ace + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ace <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + if (var_ace > 0){ + var_ace <- var_ace + } else { + var_ace <- NA + } + ###################### + t <- round(s_ace - D, 5) + if (is.nan(t) == F){ + if (t != 0){ + C <- exp(z*sqrt(log(1 + var_ace/(s_ace - D)^2))) + CI_ACE <- c(D + (s_ace - D)/C, D + (s_ace - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_ace <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_ACE <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_ACE <- c(NaN, NaN) + } + + table <- matrix(c(s_ace, sqrt(var_ace), CI_ACE), ncol = 4) + #colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + #rownames(table) <- "ACE (Chao & Lee, 1992)" + return(round(table,1)) +} +SpecAbunAce1_equ<- function(data ,k=10, conf=0.95) +{ + data <- as.numeric(data) + f <- function(i, data){length(data[which(data == i)])} + basicAbun <- function(data, k){ + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun), 1), + sep = "="), ncol = 1) + colnames(BASIC.DATA) <- c("Value") + rownames(BASIC.DATA) <- c("Number of observed individuals", "Number of observed species","Cut-off point", + "Number of observed in dividuals for rare species", "Number of observed species for rare species", + "Estimation of the sample converage for rare species", + "Estimation of CV for rare species in ACE", "Estimation of CV1 for rare species in ACE-1", + "Number of observed species for abundant species", "Number of observed species for abundant species") + return(list(BASIC.DATA, n, D, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) + } + + z <- -qnorm((1 - conf)/2) + + n <- basicAbun(data, k)[[2]] + D <- basicAbun(data, k)[[3]] + n_rare <- basicAbun(data, k)[[4]] + D_rare <- basicAbun(data, k)[[5]] + C_rare <- basicAbun(data, k)[[6]] + CV_rare <- basicAbun(data, k)[[7]] + CV1_rare <- basicAbun(data, k)[[8]] + n_abun <- basicAbun(data, k)[[9]] + D_abun <- basicAbun(data, k)[[10]] + x <- data[which(data != 0)] + ############################# + S_ACE1 <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + s_ace1 <- D_abun + D_rare/C_rare + f(1, x)/C_rare*gamma_rare_1_square + + return(list(s_ace1, gamma_rare_1_square)) + } + s_ace1 <- S_ACE1(x, k)[[1]] + gamma_rare_1_square <- S_ACE1(x, k)[[2]] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (gamma_rare_1_square != 0){ + u <- c(1:k) + si <- sum(sapply(u, function(u)u*(u-1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(D_rare*si + f(1, x)*si) - + f(1, x)*D_rare*si*(-2*(1 - f(1, x)/n_rare)*(n_rare - f(1, x))/n_rare^2*n_rare*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*(2*n_rare - 1)) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 - #g2 + (1 - f(1, x)/n_rare + f(1, x)*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g3 + ((1 - f(1, x)/n_rare)^3*(n_rare*(n_rare - 1))^2*(2*f(1, x)*D_rare*si^2 + f(1, x)^2*si^2) - #g4 + f(1, x)^2*D_rare*si^2*(3*(1 - f(1, x)/n_rare)^2*(f(1, x) - n_rare)/(n_rare)^2*(n_rare*(n_rare - 1))^2 + + (1 - f(1, x)/n_rare)^3*2*n_rare*(n_rare - 1)^2 + (1 - f(1, x)/n_rare)^3*n_rare^2*2*(n_rare - 1)) + )/(1 - f(1, x)/n_rare)^6/n_rare^4/(n_rare - 1)^4 - + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*(2*f(1, x)*si) - #g5 + f(1, x)^2*si*(2*(1 - f(1, x)/n_rare)*(f(1, x) - n_rare)/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare) + )/(1 - f(1, x)/n_rare)^4/n_rare^2/(n_rare - 1)^2 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g1 + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)*(si + D_rare*q*(q - 1)) - + f(1, x)*D_rare*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + #g2 + (q*(f(1, x))^2/n_rare^2)/(1 - f(1, x)/n_rare)^2 + #g3 + ((1 - f(1, x)/n_rare)^3*n_rare^2*(n_rare - 1)^2*f(1, x)^2*(si^2 + 2*D_rare*si*q*(q - 1)) - #g4 + f(1, x)^2*D_rare*si^2*(3*(1 - f(1, x)/n_rare)^2*(f(1, x)*q/n_rare^2)*(n_rare*(n_rare - 1))^2 + + 2*(1 - f(1, x)/n_rare)^3*n_rare*q*(n_rare - 1)^2 + 2*(1 - f(1, x)/n_rare)^3*n_rare^2*(n_rare - 1)*q) + )/(1 - f(1, x)/n_rare)^6/(n_rare)^4/(n_rare - 1)^4 - + ((1 - f(1, x)/n_rare)^2*n_rare*(n_rare - 1)*f(1, x)^2*q*(q - 1) - #g5 + f(1, x)^2*si*(2*(1 - f(1, x)/n_rare)*f(1, x)*q/n_rare^2*n_rare*(n_rare - 1) + + (1 - f(1, x)/n_rare)^2*q*(n_rare - 1) + (1 - f(1, x)/n_rare)^2*n_rare*q) + )/(1 - f(1, x)/n_rare)^4/(n_rare)^2/(n_rare - 1)^2 + } + return(d) + } else { + u <- c(1:k) + si <- sum(sapply(u, function(u)u*(u-1)*f(u, x))) + if ( q == 1){ + d <- (1 - f(1, x)/n_rare + D_rare*(n_rare - f(1, x))/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } else if(q > k){ + d <- 1 + } else { + d <- (1 - f(1, x)/n_rare - D_rare*q*f(1, x)/n_rare^2)/(1 - f(1, x)/n_rare)^2 #g1 + } + return(d) + } + } + + COV.f <- function(i,j){ + if (i == j){ + cov.f <- f(i, x)*(1 - f(i, x)/s_ace1) + } else { + cov.f <- -f(i, x)*f(j, x)/s_ace1 + } + return(cov.f) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ace1 <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.f(i, j), i, j)) + if (var_ace1 > 0){ + var_ace1 <- var_ace1 + } else { + var_ace1 <- NA + } + ###################### + t <- round(s_ace1 - D, 5) + if (is.nan(t) == F){ + if (t != 0){ + C <- exp(z*sqrt(log(1 + var_ace1/(s_ace1 - D)^2))) + CI_ACE1 <- c(D + (s_ace1 - D)/C, D + (s_ace1 - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)f(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*f(i, x))))^2/n + var_ace1 <- var_obs + P <- sum(sapply(i, function(i)f(i, x)*exp(-i)/D)) + CI_ACE1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + }else{ + CI_ACE1 <- c(NaN, NaN) + } + + table <- matrix(c(s_ace1, sqrt(var_ace1), CI_ACE1), ncol = 4) + #colnames(table) <- c("Estimate", "Est_s.e.", paste(conf*100,"% Lower Bound"), paste(conf*100,"% Upper Bound")) + #rownames(table) <- "ACE-1 (Chao & Lee, 1992)" + return(round(table,1)) +} +SpecInciChao2 <-function(data, k=10, conf=0.95) +{ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + if (Q(1, x)>0 & Q(2, x) > 0){ + S_Chao2 <- D + (t - 1)/t*Q(1, x)^2/(2*Q(2, x)) + var_Chao2 <- Q(2, x)*((t - 1)/t*(Q(1, x)/Q(2, x))^2/2 + ((t - 1)/t)^2*(Q(1, x)/Q(2, x))^3 + ((t - 1)/t)^2*(Q(1, x)/Q(2, x))^4/4) + + tt <- S_Chao2 - D + K <- exp(z*sqrt(log(1 + var_Chao2/tt^2))) + CI_Chao2 <- c(D + tt/K, D + tt*K) + } else if (Q(1, x)>1 & Q(2, x) == 0){ + S_Chao2 <- D+(t-1)/t*Q(1,x)*(Q(1,x)-1)/(2*(Q(2,x)+1)) + var_Chao2=(t-1)/t*Q(1,x)*(Q(1,x)-1)/2+((t-1)/t)^2*Q(1,x)*(2*Q(1,x)-1)^2/4-((t-1)/t)^2*Q(1,x)^4/4/S_Chao2 + + tt=S_Chao2-D + K=exp(z*sqrt(log(1+var_Chao2/tt^2))) + CI_Chao2=c(D+tt/K,D+tt*K) + } else { + S_Chao2 <- D + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_Chao2 <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Chao2<- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(S_Chao2, sqrt(var_Chao2), CI_Chao2), ncol = 4) + return(table) +} +SpecInciChao2bc <-function(data, k=10, conf=0.95) +{ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + + S_Chao2_bc <- D + (t - 1)/t*Q(1, x)*(Q(1, x) - 1)/(2*(Q(2, x) + 1)) + var_Chao2_bc <- (t - 1)/t*Q(1, x)*(Q(1, x) - 1)/2/(Q(2, x) + 1) + ((t - 1)/t)^2*Q(1, x)*(2*Q(1, x) - 1)^2/4/(Q(2, x) + 1)^2 + ((t - 1)/t)^2*Q(1, x)^2*Q(2, x)*(Q(1, x) - 1)^2/4/(Q(2, x) + 1)^4 + + tt <- S_Chao2_bc - D + if (tt != 0){ + K <- exp(z*sqrt(log(1 + var_Chao2_bc/tt^2))) + CI_Chao2_bc <- c(D + tt/K, D + tt*K) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_Chao2_bc <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Chao2_bc <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(S_Chao2_bc, sqrt(var_Chao2_bc), CI_Chao2_bc), ncol = 4) + return(table) +} +SpecInciModelh <-function(data, k=10, conf=0.95) +{ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_ICE <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*Q(j, x))) + a2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*a1/a2/(a2 - 1) - 1,0) + s_ice <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square + CV_infreq_h <- sqrt(gamma_infreq_square) + return(c(s_ice, CV_infreq_h)) + } + s_ice <- S_ICE(x, k)[1] + CV_infreq_h <- S_ICE(x, k)[2] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (CV_infreq_h != 0){ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*n_infreq*2) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 #g4 + } + return(d) + }else{ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_ice) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_ice + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ice <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_ice > 0){ + var_ice <- var_ice + } else { + var_ice <- NA + cat("Warning: In this case, it can't estimate the variance of Model(h) estimation", "\n\n") + } + ###################### + if (round(s_ice - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_ice/(s_ice - D)^2))) + CI_Model_h <- c(D + (s_ice - D)/C, D + (s_ice - D)*C) + }else{ + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_ice <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_h <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_ice, sqrt(var_ice), CI_Model_h), ncol = 4) + return(table) +} +SpecInciModelh1 <-function(data, k=10, conf=0.95) +{ + data <- as.numeric(data) + z <- -qnorm((1 - conf)/2) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + basicInci <- function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), + c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), + sep = "="), ncol=1) + colnames(BASIC.DATA)=c("Value") + rownames(BASIC.DATA)=c("Number of observed species","Number of sample/quadrats","Cut-off point", + "Number of observed species for infrequent species","Estimated sample coverage for infrequent species", + "Estimated CV for infrequent species", + "Number of observed species for frequent species") + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq, D_freq)) + } + D <- basicInci(data, k)[[2]] + D_infreq <- basicInci(data, k)[[4]] + C_infreq <- basicInci(data, k)[[5]] + CV_infreq <- basicInci(data, k)[[6]] + D_freq <- basicInci(data, k)[[7]] + + S_Model_H1 <- function(x, k){ + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*Q(j, x))) + a2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*a1/a2/(a2 - 1) - 1,0) + gamma_infreq_square_1 <- max(gamma_infreq_square*(1 + Q(1, x)/C_infreq*t/(t - 1)*a1/a2/(a2 - 1)), 0) + s_Model_h1 <- D_freq + D_infreq/C_infreq + Q(1, x)/C_infreq*gamma_infreq_square_1 + CV_infreq_h1 <- sqrt(gamma_infreq_square_1) + return(c(s_Model_h1, CV_infreq_h1)) + } + s_Model_h1 <- S_Model_H1(x, k)[1] + CV_infreq_h1 <- S_Model_H1(x, k)[2] + #### differential #### + u <- c(1:k) + diff <- function(q){ + if (CV_infreq_h1 != 0){ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*(D_infreq*si + Q(1, x)*si) - #g3 + Q(1, x)*D_infreq*si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + (C_infreq - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(2*Q(1, x)*D_infreq*si^2 + Q(1, x)^2*si^2) - #g5 + Q(1, x)^2*D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + (t/(t - 1))*si*(C_infreq^2*n_infreq*(n_infreq - 1)*2*Q(1, x) - Q(1, x)^2*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*(n_infreq - 1) + C_infreq^2*n_infreq) #g6 + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + 2*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*n_infreq*2) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*Q(1, x)^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(si^2 + D_infreq*2*si*2) - #g5 + D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*2*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)*2) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + t/(t - 1)*Q(1, x)^2*(C_infreq^2*n_infreq*(n_infreq - 1)*2 - si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*2*(n_infreq - 1) + C_infreq^2*2*n_infreq) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 + #g2 + t/(t - 1)*(C_infreq^2*n_infreq*(n_infreq - 1)*Q(1, x)*(si + q*(q - 1)*D_infreq) - Q(1, x)*D_infreq*si*( #g3 + 2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 - + ( - Q(1, x)*dc_infreq)/C_infreq^2 + #g4 + (t/(t - 1))^2*Q(1, x)^2*(C_infreq^3*n_infreq^2*(n_infreq - 1)^2*(si^2 + D_infreq*2*si*q*(q - 1)) - #g5 + D_infreq*si^2*(3*C_infreq^2*dc_infreq*n_infreq^2*(n_infreq - 1)^2 + C_infreq^3*2*n_infreq*q*(n_infreq - 1)^2 + C_infreq^3*n_infreq^2*2*(n_infreq - 1)*q) + )/C_infreq^6/n_infreq^4/(n_infreq - 1)^4 - + t/(t - 1)*Q(1, x)^2*(C_infreq^2*n_infreq*(n_infreq - 1)*q*(q - 1) - #g6 + si*(2*C_infreq*dc_infreq*n_infreq*(n_infreq - 1) + C_infreq^2*q*(n_infreq - 1) + C_infreq^2*n_infreq*q) + )/C_infreq^4/n_infreq^2/(n_infreq - 1)^2 + } + return(d) + }else{ + n_infreq <- sum(x[which(x <= k)]) + si <- sum(sapply(u, function(u)u*(u-1)*Q(u, x))) + if ( q == 1){ + dc_infreq <- - (n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x))*2*Q(1, x)*(t - 1) - + (t - 1)*Q(1, x)^2*((t - 1)*(Q(1, x) + n_infreq) + 2*Q(2, x)))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } else if (q == 2){ + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*(2*(t - 1)*Q(1, x) + 2*(n_infreq + 2*Q(2, x))))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + }else if(q > k){ + d <- 1 + } else { + dc_infreq <- - ( - (t - 1)*Q(1, x)^2*((t - 1)*Q(1, x)*q + 2*Q(2, x)*q))/(n_infreq*((t - 1)*Q(1, x) + 2*Q(2, x)))^2 + d <- (C_infreq - D_infreq*dc_infreq)/C_infreq^2 #g2 + } + return(d) + } + } + + COV.q <- function(i,j){ + if (i == j){ + cov.q <- Q(i, x)*(1 - Q(i, x)/s_Model_h1) + } else { + cov.q <- -Q(i, x)*Q(j, x)/s_Model_h1 + } + return(cov.q) + } + + i <- rep(sort(unique(x)),each = length(unique(x))) + j <- rep(sort(unique(x)),length(unique(x))) # all combination + + var_ice1 <- sum(mapply(function(i, j)diff(i)*diff(j)*COV.q(i, j), i, j)) + if (var_ice1 > 0){ + var_ice1 <- var_ice1 + } else { + var_ice1 <- NA + cat("Warning: In this case, it can't estimate the variance of Model(h)-1 estimation", "\n\n") + } + ###################### + if (round(s_Model_h1 - D, 5) != 0){ + C <- exp(z*sqrt(log(1 + var_ice1/(s_Model_h1 - D)^2))) + CI_Model_h1 <- c(D + (s_Model_h1 - D)/C, D + (s_Model_h1 - D)*C) + } else { + i <- c(1:max(x)) + i <- i[unique(x)] + var_obs <- sum(sapply(i, function(i)Q(i, x)*(exp(-i) - exp(-2*i)))) - + (sum(sapply(i, function(i)i*exp(-i)*Q(i, x))))^2/t + var_ice1 <- var_obs + P <- sum(sapply(i, function(i)Q(i, x)*exp(-i)/D)) + CI_Model_h1 <- c(max(D, D/(1 - P) - z*sqrt(var_obs)/(1 - P)), D/(1 - P) + z*sqrt(var_obs)/(1 - P)) + } + table <- matrix(c(s_Model_h1, sqrt(var_ice1), CI_Model_h1), ncol = 4) + return(table) +} +###################################################### + + + +print.spadeTwo <- function(x, ...){ + if(x$datatype=="abundance"){ + cat('(1) BASIC DATA INFORMATION:\n\n') + cat(' The loaded set includes abundance/incidence data from 2 communities\n') + cat(' and a total of',x$info[1],'species.\n\n') + cat(' Samples size in Community 1 n1 =',x$info[2],'\n') + cat(' Samples size in Community 2 n2 =',x$info[3],'\n') + cat(' Number of observed species in Community 1 D1 =',x$info[4],'\n') + cat(' Number of observed species in Community 2 D2 =',x$info[5],'\n') + cat(' Number of observed shared species in two communities D12 =',x$info[6],'\n') + cat(' Number of bootstrap replications for s.e. estimate ',x$info[7],'\n\n') + cat(' Some statistics:\n') + cat(' f[11]=',x$info[8],'; f[1+]=',x$info[9], + '; f[+1]=',x$info[10],'; f[2+]=',x$info[11], + '; f[+2]=',x$info[12],'; f[22]=',x$info[13],'\n\n') + + cat('(2) EMPIRICAL SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$Empirical_richness), 2, as.numeric) + cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n') + cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$Empirical_relative), 2, as.numeric) + cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' ChaoJaccard-abundance ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n') + cat(' ChaoSorensen-abundance ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",temp[5,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$Empirical_WtRelative + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$Empirical_absolute), 2, as.numeric) + cat(' C12=U12 (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C22 (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U22 (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat('(3) ESTIMATED SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity:\n\n') + temp <- apply(as.matrix(x$estimated_richness), 2, as.numeric) + if(temp[1,1]>1) {cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[1,1]<=1){cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[2,1]>1) {cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",1) ,' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[2,1]<=1){cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$estimated_relative), 2, as.numeric) + if(temp[1,1]>1) {cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + if(temp[2,1]>1) {cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[2,1]<=1){cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n')} + if(temp[3,1]>1) {cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[3,1]<=1){cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n')} + #if(temp[4,1]>1) {cat(' Bray-Curtis (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n\n')} + #if(temp[4,1]<=1){cat(' Bray-Curtis (q=1) ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n')} + if(temp[4,1]>1) {cat(' ChaoJaccard-abundance ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[4,1]<=1){cat(' ChaoJaccard-abundance ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n')} + if(temp[5,1]>1) {cat(' ChaoSorensen-abundance ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[5,1]<=1){cat(' ChaoSorensen-abundance ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",temp[5,4]),'\n\n')} + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$estimated_WtRelative + if(temp[1,1]>1) {cat(' Horn size-weighted (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$estimated_absolute), 2, as.numeric) + if(temp[1,1]>1) {cat(' C12=U12 (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' C12=U12 (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + if(temp[2,1]>1) {cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[2,1]<=1){cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n')} + if(temp[3,1]>1) {cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[3,1]<=1){cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n')} + if(temp[4,1]>1) {cat(' Bray-Curtis ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[4,1]<=1){cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n')} + cat(' NOTE: If an estimate is greater than 1, it is replaced by 1.') + }else{ + cat('(1) BASIC DATA INFORMATION:\n\n') + cat(' The loaded set includes abundance/incidence data from 2 communities\n') + cat(' and a total of',x$info[1],'species.\n\n') + cat(' Number of sampling units in Community 1 T1 =',x$info[2],'\n') + cat(' Number of sampling units in Community 2 T2 =',x$info[3],'\n') + cat(' Number of total incidences in Community 1 U1 =',x$info[4],'\n') + cat(' Number of total incidences in Community 2 U2 =',x$info[5],'\n') + cat(' Number of observed species in Community 1 D1 =',x$info[6],'\n') + cat(' Number of observed species in Community 2 D2 =',x$info[7],'\n') + cat(' Number of observed shared species in two communities D12 =',x$info[8],'\n') + cat(' Number of bootstrap replications for s.e. estimate ',x$info[9],'\n\n') + cat(' Some Statistics:\n') + cat(' Q[11]=',x$info[10], + '; Q[1+]=',x$info[11], '; Q[+1]=',x$info[12], + '; Q[2+]=',x$info[13], '; Q[+2]=',x$info[14], '; Q[22]=',x$info[15],'\n\n') + + cat('(2) EMPIRICAL SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity\n\n') + temp <- apply(as.matrix(x$Empirical_richness), 2, as.numeric) + cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n') + cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n') + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$Empirical_relative), 2, as.numeric) + cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' ChaoJaccard-abundance ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n') + cat(' ChaoSorensen-abundance ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",temp[5,4]),'\n\n') + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$Empirical_size_weighted + cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$Empirical_absolute), 2, as.numeric) + cat(' C12=U12 (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n') + cat(' C22 (Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n') + cat(' U22 (Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n') + cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n') + cat('(3) ESTIMATED SIMILARITY INDICES: \n\n') + cat(' Estimate s.e. 95%Lower 95%Upper\n') + cat(' (a) Classic richness-based similarity:\n\n') + temp <- apply(as.matrix(x$estimated_richness), 2, as.numeric) + if(temp[1,1]>1) {cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[1,1]<=1){cat(' C02 (q=0, Sorensen) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n')} + if(temp[2,1]>1) {cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",1) ,' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[2,1]<=1){cat(' U02 (q=0, Jaccard) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n\n')} + cat(' (b) Measures for comparing species relative abundances\n\n') + temp <- apply(as.matrix(x$estimated_relative), 2, as.numeric) + if(temp[1,1]>1) {cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' C12=U12 (q=1, Horn) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + if(temp[2,1]>1) {cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[2,1]<=1){cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n')} + if(temp[3,1]>1) {cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[3,1]<=1){cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n')} + #if(temp[4,1]>1) {cat(' Bray-Curtis (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n\n')} + #if(temp[4,1]<=1){cat(' Bray-Curtis (q=1) ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n')} + if(temp[4,1]>1) {cat(' ChaoJaccard-abundance ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[4,1]<=1){cat(' ChaoJaccard-abundance ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n')} + if(temp[5,1]>1) {cat(' ChaoSorensen-abundance ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[5,1]<=1){cat(' ChaoSorensen-abundance ',sprintf("%.4f",temp[5,1]),' ',sprintf("%.4f",temp[5,2]),' ',sprintf("%.4f",temp[5,3]),' ',sprintf("%.4f",temp[5,4]),'\n\n')} + cat(' (c) Measures for comparing size-weighted species relative abundances\n\n') + temp <- x$estimated_WtRelative + if(temp[1,1]>1) {cat(' Horn size-weighted (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' Horn size-weighted (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + cat(' (d) Measures for comparing species absolute abundances\n\n') + temp <- apply(as.matrix(x$estimated_absolute), 2, as.numeric) + if(temp[1,1]>1) {cat(' C12=U12 (q=1) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[1,1]<=1){cat(' C12=U12 (q=1) ',sprintf("%.4f",temp[1,1]),' ',sprintf("%.4f",temp[1,2]),' ',sprintf("%.4f",temp[1,3]),' ',sprintf("%.4f",temp[1,4]),'\n\n')} + if(temp[2,1]>1) {cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",1),'\n')} + if(temp[2,1]<=1){cat(' C22 (q=2, Morisita-Horn) ',sprintf("%.4f",temp[2,1]),' ',sprintf("%.4f",temp[2,2]),' ',sprintf("%.4f",temp[2,3]),' ',sprintf("%.4f",temp[2,4]),'\n')} + if(temp[3,1]>1) {cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[3,1]<=1){cat(' U22 (q=2, Regional overlap) ',sprintf("%.4f",temp[3,1]),' ',sprintf("%.4f",temp[3,2]),' ',sprintf("%.4f",temp[3,3]),' ',sprintf("%.4f",temp[3,4]),'\n\n')} + if(temp[4,1]>1) {cat(' Bray-Curtis ',sprintf("%.4f",1),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",1),'\n\n')} + if(temp[4,1]<=1){cat(' Bray-Curtis ',sprintf("%.4f",temp[4,1]),' ',sprintf("%.4f",temp[4,2]),' ',sprintf("%.4f",temp[4,3]),' ',sprintf("%.4f",temp[4,4]),'\n\n')} + cat(' NOTE: If an estimate is greater than 1, it is replaced by 1.') + } +} diff --git a/R/VarEstFun.R b/R/VarEstFun.R new file mode 100644 index 0000000..65d8776 --- /dev/null +++ b/R/VarEstFun.R @@ -0,0 +1,62 @@ +VarEstFun <- +function(x1, x2, diffFun, FunName) { + n1 <- sum(x1) + n2 <- sum(x2) + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f12 <- sum(x1 == 1 & x2 == 2) + f21 <- sum(x1 == 2 & x2 == 1) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + ksi <- c(D12, f11, fp1, f1p, fp2, f2p, f22, K1, K2) + + S12 <- FunName(x1, x2) + CovFun <- function(i, j) { + if (i == j) { + cov <- ksi[i] - ksi[i]^2 / S12 + } else if ((i == 1 & j == 2) | (j == 1 & i == 2)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 3) | (j == 1 & i == 3)) { + cov <- fp1 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 4) | (j == 1 & i == 4)) { + cov <- f1p - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 5) | (j == 1 & i == 5)) { + cov <- fp2 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 6) | (j == 1 & i == 6)) { + cov <- f2p - ksi[i] * ksi[j] / S12 + } else if ((i == 2 & j == 3) | (j == 2 & i == 3)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 2 & j == 4) | (j == 2 & i == 4)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 3 & j == 4) | (j == 3 & i == 4)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 3 & j == 6) | (j == 3 & i == 6)) { + cov <- f21 - ksi[i] * ksi[j] / S12 + } else if ((i == 4 & j == 5) | (j == 4 & i == 5)) { + cov <- f12 - ksi[i] * ksi[j] / S12 + } else if ((i == 5 & j == 6) | (j == 5 & i == 6)) { + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 7) | (j == 1 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 5 & j == 7) | (j == 5 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 6 & j == 7) | (j == 6 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else { + cov <- 0 - ksi[i] * ksi[j] / S12 + } + return(cov) + } + + diff <- diffFun + i <- rep(c(1:7), 7) + j <- rep(c(1:7), each=7) + var <- sum(mapply(function(i, j) diff(ksi, i) * diff(ksi, j) * CovFun(i, j), i, j)) + se <- sqrt(var) + return(se) +} diff --git a/R/VarEstFun.Sam.R b/R/VarEstFun.Sam.R new file mode 100644 index 0000000..267fe28 --- /dev/null +++ b/R/VarEstFun.Sam.R @@ -0,0 +1,64 @@ +VarEstFun.Sam <- +function(y1, y2, diffFun, FunName) { + n1 <- y1[1] + n2 <- y2[1] + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + f11 <- sum(x1 == 1 & x2 == 1) + f22 <- sum(x1 == 2 & x2 == 2) + f12 <- sum(x1 == 1 & x2 == 2) + f21 <- sum(x1 == 2 & x2 == 1) + f1p <- sum(x1 == 1 & x2 >= 1) + fp1 <- sum(x1 >= 1 & x2 == 1) + f2p <- sum(x1 == 2 & x2 >= 1) + fp2 <- sum(x1 >= 1 & x2 == 2) + K1 <- (n1 - 1) / n1 + K2 <- (n2 - 1) / n2 + ksi <- c(D12, f11, fp1, f1p, fp2, f2p, f22, K1, K2) + + S12 <- FunName(y1, y2) + CovFun <- function(i, j) { + if (i == j) { + cov <- ksi[i] - ksi[i]^2 / S12 + } else if ((i == 1 & j == 2) | (j == 1 & i == 2)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 3) | (j == 1 & i == 3)) { + cov <- fp1 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 4) | (j == 1 & i == 4)) { + cov <- f1p - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 5) | (j == 1 & i == 5)) { + cov <- fp2 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 6) | (j == 1 & i == 6)) { + cov <- f2p - ksi[i] * ksi[j] / S12 + } else if ((i == 2 & j == 3) | (j == 2 & i == 3)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 2 & j == 4) | (j == 2 & i == 4)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 3 & j == 4) | (j == 3 & i == 4)) { + cov <- f11 - ksi[i] * ksi[j] / S12 + } else if ((i == 3 & j == 6) | (j == 3 & i == 6)) { + cov <- f21 - ksi[i] * ksi[j] / S12 + } else if ((i == 4 & j == 5) | (j == 4 & i == 5)) { + cov <- f12 - ksi[i] * ksi[j] / S12 + } else if ((i == 5 & j == 6) | (j == 5 & i == 6)) { + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 1 & j == 7) | (j == 1 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 5 & j == 7) | (j == 5 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else if ((i == 6 & j == 7) | (j == 6 & i == 7)) { # Pan has 7 parameter + cov <- f22 - ksi[i] * ksi[j] / S12 + } else { + cov <- 0 - ksi[i] * ksi[j] / S12 + } + return(cov) + } + + diff <- diffFun + i <- rep(c(1:7), 7) + j <- rep(c(1:7), each=7) + var <- sum(mapply(function(i, j) diff(ksi, i) * diff(ksi, j) * CovFun(i, j), i, j)) + se <- sqrt(var) + return(se) +} diff --git a/R/basicAbun.R b/R/basicAbun.R new file mode 100644 index 0000000..e93102e --- /dev/null +++ b/R/basicAbun.R @@ -0,0 +1,87 @@ +Chat.Ind <- function(x, m) +{ + x <- x[x>0] + n <- sum(x) + f1 <- sum(x == 1) + f2 <- sum(x == 2) + if(f1>0 & f2>0) + { + a=(n - 1) * f1 / ((n - 1) * f1 + 2 * f2) + } + if(f1>1 & f2==0) + { + a=(n-1)*(f1-1) / ( (n-1)*(f1-1) + 2 ) + } + if(f1==1 & f2==0) {a=0} + if(f1==0) {a=0} + + Sub <- function(m){ + if(m < n) out <- 1-sum(x / n * exp(lchoose(n - x, m)-lchoose(n - 1, m))) + if(m == n) out <- 1-f1/n*a + if(m > n) out <- 1-f1/n*a^(m-n+1) + out + } + sapply(m, Sub) +} +basicAbun <- function(data, k){ + f <- function(i, data){length(data[which(data == i)])} + data <- as.numeric(data) + + x <- data[which(data != 0)] + n <- sum(x) + D <- length(x) + n_rare <- sum(x[which(x <= k)]) + D_rare <- length(x[which(x <= k)]) + if (n_rare != 0){ + C_rare <- 1 - f(1, x)/n_rare + } else { + C_rare = 1 + } + n_abun <- n - n_rare + D_abun <- length(x[which(x > k)]) + + j <- c(1:k) + a1 <- sum(sapply(j, function(j)j*(j - 1)*f(j, x))) + a2 <- sum(sapply(j, function(j)j*f(j, x))) + if (C_rare != 0){ + gamma_rare_hat_square <- max(D_rare/C_rare*a1/a2/(a2 - 1) - 1, 0) + gamma_rare_1_square <- max(gamma_rare_hat_square*(1 + (1 - C_rare)/C_rare*a1/(a2 - 1)), 0) + }else{ + gamma_rare_hat_square <- 0 + gamma_rare_1_square <- 0 + } + CV_rare <- sqrt(gamma_rare_hat_square) + CV1_rare <- sqrt(gamma_rare_1_square) + + C<-Chat.Ind(x,n) + CV<-CV.Ind(x) + +# BASIC.DATA <- matrix(paste(c("n", "D", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), +# round(c(n, D, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun),3), +# sep = "="), ncol = 1) + BASIC.DATA <- matrix(round(c(n, D, C, CV, k, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun),3), + ncol = 1) + # + nickname <- matrix(c("n", "D", "C", "CV", "k", "n_rare", "D_rare", "C_rare", "CV_rare", "CV1_rare", "n_abun", "D_abun"), + ncol = 1) + BASIC.DATA <- cbind(nickname, BASIC.DATA) + colnames(BASIC.DATA) <- c("Variable", "Value") + rownames(BASIC.DATA) <- c(" Sample size", " Number of observed species", + " Coverage estimate for entire dataset"," CV for entire dataset"," Cut-off point", + " Number of observed individuals for rare group", " Number of observed species for rare group", + " Estimate of the sample coverage for rare group", + " Estimate of CV for rare group in ACE", " Estimate of CV1 for rare group in ACE-1", + " Number of observed individuals for abundant group", " Number of observed species for abundant group") +# rownames(BASIC.DATA) <- c("(Number of observed individuals) n =", +# "(Number of observed species) D =", +# "(Cut-off point) k =", +# "(Number of observed individuals for rare species) n_rare =", +# "(Number of observed species for rare species) D_rare =", +# "(Estimation of the sample coverage for rare species) C_rare =", +# "(Estimation of CV for rare species in ACE) CV_rare =", +# "(Estimation of CV1 for rare species in ACE-1) CV1_rare =", +# "(Number of observed individuals for abundant species) n_abun =", +# "(Number of observed species for abundant species) D_abun =") + BASIC.DATA <- data.frame(BASIC.DATA) + return(list(BASIC.DATA, n, D, C, CV, n_rare, D_rare, C_rare, CV_rare, CV1_rare, n_abun, D_abun)) +} diff --git a/R/basicInci.R b/R/basicInci.R new file mode 100644 index 0000000..c962af0 --- /dev/null +++ b/R/basicInci.R @@ -0,0 +1,74 @@ +Chat.Sam <- function(x, t) +{ + nT <- x[1] + y <- x[-1] + y <- y[y>0] + U <- sum(y) + Q1 <- sum(y == 1) + Q2 <- sum(y == 2) + Q0.hat <- ifelse(Q2 == 0, (nT - 1) / nT * Q1 * (Q1 - 1) / 2, (nT - 1) / nT * Q1 ^ 2/ 2 / Q2) #estimation of unseen species via Chao2 + A <- ifelse(Q1>0, nT*Q0.hat/(nT*Q0.hat+Q1), 1) + Sub <- function(t){ + if(t < nT) { + yy <- y[(nT-y)>=t] + out <- 1 - sum(yy / U * exp(lgamma(nT-yy+1)-lgamma(nT-yy-t+1)-lgamma(nT)+lgamma(nT-t))) + } + #if(t < nT) out <- 1 - sum(y / U * exp(lchoose(nT - y, t) - lchoose(nT - 1, t))) + if(t == nT) out <- 1 - Q1 / U * A + if(t > nT) out <- 1 - Q1 / U * A^(t - nT + 1) + out + } + sapply(t, Sub) +} + +basicInci <- +function(data, k){ + data <- as.numeric(data) + t <- data[1] + dat <- data[-1] + x <- dat[which(dat != 0)] + Q <- function(i, data){length(data[which(data == i)])} + + D <- length(x) + D_infreq <- length(x[which(x <= k)]) + + if (Q(1, x) > 0 & Q(2, x) > 0){ + A <- 2*Q(2, x)/((t-1)*Q(1, x) + 2*Q(2, x)) + } else if (Q(1, x) > 0 & Q(2, x) == 0){ + A <- 2/((t-1)*(Q(1, x) - 1) + 2) + } else { + A <- 1 + } + C_infreq <- 1 - Q(1, x)/sum(x[which(x <= k)])*(1-A) + + + j <- c(1:k) + b1 <- sum(sapply(j, function(j)j*(j-1)*Q(j, x))) + b2 <- sum(sapply(j, function(j)j*Q(j, x))) + gamma_infreq_square <- max(D_infreq/C_infreq*t/(t - 1)*b1/b2/(b2 -1) - 1, 0) + CV_infreq <- sqrt(gamma_infreq_square) + D_freq <- length(x[which(x > k)]) + + U<-sum(x) + C<-Chat.Sam(data, t) + CV_squre<-max( D/C*t/(t-1)*sum(x*(x-1))/U^2-1, 0) + CV<-CV_squre^0.5 + U_infreq<-sum(x[x<=k]) + gamma_infreq_square_1 <- max(gamma_infreq_square*(1 + Q(1, x)/C_infreq*t/(t - 1)*b1/b2/(b2 - 1)), 0) + CV1_infreq <- sqrt(gamma_infreq_square_1) +# BASIC.DATA <- matrix(paste(c("D", "t", "k", "D_infreq", "C_infreq", "CV_infreq", "D_freq"), +# round(c(D,t,k,D_infreq,C_infreq,CV_infreq,D_freq), 3), +# sep = "="), ncol = 1) + BASIC.DATA <- matrix(round(c(D,t,U,C,CV,k,U_infreq,D_infreq,C_infreq,CV_infreq,CV1_infreq,D_freq), 3), ncol = 1) + nickname <- c("D", "T", "U", "C", "CV", "k", "U_infreq", "D_infreq", "C_infreq", "CV_infreq","CV1_infreq", "D_freq") + BASIC.DATA <- cbind(nickname, BASIC.DATA) + colnames(BASIC.DATA)=c("Variable", "Value") + rownames(BASIC.DATA)=c(" Number of observed species"," Number of sampling units", " Total number of incidences", + " Coverage estimate for entire dataset", " CV for entire dataset", " Cut-off point"," Total number of incidences in infrequent group", + " Number of observed species for infrequent group"," Estimated sample coverage for infrequent group", + " Estimated CV for infrequent group in ICE", + " Estimated CV1 for infrequent group in ICE-1", + " Number of observed species for frequent group") + BASIC.DATA <- data.frame(BASIC.DATA) + return(list(BASIC.DATA, D, t, D_infreq, C_infreq, CV_infreq,CV1_infreq, D_freq)) +} diff --git a/R/diff_Chao1.R b/R/diff_Chao1.R new file mode 100644 index 0000000..3bbdfd8 --- /dev/null +++ b/R/diff_Chao1.R @@ -0,0 +1,28 @@ +diff_Chao1 <- +function(ksi, s) { + if (sum(ksi == 0) != 0) { + ksi[which(ksi == 0)] <- 1 + } + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- f1p * fp1 / (4 * f2p * fp2) + } else if (s == 3) { + d <- f11 * f1p / (4 * f2p * fp2) + fp1 / fp2 + } else if (s == 4) { + d <- f11 * fp1 / (4 * f2p * fp2) + f1p / f2p + } else if (s == 5) { + d <- - f11 * f1p * fp1 / (4 * f2p) / fp2^2 - (fp1 / fp2)^2 / 2 + } else if (s == 6) { + d <- - f11 * f1p * fp1 / (4 * fp2) / f2p^2 - (f1p / f2p)^2 / 2 + } else { + d <- 0 + } + return(d) +} diff --git a/R/diff_Chao1bc.R b/R/diff_Chao1bc.R new file mode 100644 index 0000000..09e2e54 --- /dev/null +++ b/R/diff_Chao1bc.R @@ -0,0 +1,25 @@ +diff_Chao1bc <- +function(ksi, s) { + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- f1p * fp1 / (4 * (f2p + 1) * (fp2 + 1)) + } else if (s == 3) { + d <- f11 * f1p / (4 * (f2p + 1) * (fp2 + 1)) + (2 * fp1 - 1)/ (2 * (fp2 + 1)) + } else if (s == 4) { + d <- f11 * fp1 / (4 * (f2p + 1) * (fp2 + 1)) + (2 * f1p - 1)/ (2 * (f2p + 1)) + } else if (s == 5) { + d <- - f11 * f1p * fp1 / (4 * (f2p + 1)) / (fp2 + 1)^2 - fp1 * (fp1 - 1) / 2 / (fp2 + 1)^2 + } else if (s == 6) { + d <- - f11 * f1p * fp1 / (4 * (fp2 + 1)) / (f2p + 1)^2 - f1p * (f1p - 1) / 2 / (f2p + 1)^2 + } else { + d <- 0 + } + return(d) +} diff --git a/R/diff_Chao2.R b/R/diff_Chao2.R new file mode 100644 index 0000000..d6448f2 --- /dev/null +++ b/R/diff_Chao2.R @@ -0,0 +1,30 @@ +diff_Chao2 <- +function(ksi, s) { + if (sum(ksi == 0) != 0) { + ksi[which(ksi == 0)] <- 1 + } + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + K1 <- ksi[8] + K2 <- ksi[9] + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- K1 * K2 * f1p * fp1 / (4 * f2p * fp2) + } else if (s == 3) { + d <- K1 * K2 * f11 * f1p / (4 * f2p * fp2) + K2 * fp1 / fp2 + } else if (s == 4) { + d <- K1 * K2 * f11 * fp1 / (4 * f2p * fp2) + K1 * f1p / f2p + } else if (s == 5) { + d <- - K1 * K2 * f11 * f1p * fp1 / (4 * f2p) / fp2^2 - K2 * (fp1 / fp2)^2 / 2 + } else if (s == 6) { + d <- - K1 * K2 * f11 * f1p * fp1 / (4 * fp2) / f2p^2 - K1 * (f1p / f2p)^2 / 2 + } else { + d <- 0 + } + return(d) +} diff --git a/R/diff_Chao2bc.R b/R/diff_Chao2bc.R new file mode 100644 index 0000000..2a310c4 --- /dev/null +++ b/R/diff_Chao2bc.R @@ -0,0 +1,28 @@ +diff_Chao2bc <- +function(ksi, s) { + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + K1 <- ksi[8] + K2 <- ksi[9] + + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- K1 * K2 * f1p * fp1 / (4 * (f2p + 1) * (fp2 + 1)) + } else if (s == 3) { + d <- K1 * K2 * f11 * f1p / (4 * (f2p + 1) * (fp2 + 1)) + K2 * (2 * fp1 - 1)/ (2 * (fp2 + 1)) + } else if (s == 4) { + d <- K1 * K2 * f11 * fp1 / (4 * (f2p + 1) * (fp2 + 1)) + K1 * (2 * f1p - 1)/ (2 * (f2p + 1)) + } else if (s == 5) { + d <- - K1 * K2 * f11 * f1p * fp1 / (4 * (f2p + 1)) / (fp2 + 1)^2 - K2 * fp1 * (fp1 - 1) / 2 / (fp2 + 1)^2 + } else if (s == 6) { + d <- - K1 * K2 * f11 * f1p * fp1 / (4 * (fp2 + 1)) / (f2p + 1)^2 - K1 * f1p * (f1p - 1) / 2 / (f2p + 1)^2 + } else { + d <- 0 + } + return(d) +} diff --git a/R/diff_Pan.R b/R/diff_Pan.R new file mode 100644 index 0000000..960ede0 --- /dev/null +++ b/R/diff_Pan.R @@ -0,0 +1,31 @@ +diff_Pan <- +function(ksi, s) { + if (sum(ksi == 0) != 0) { + ksi[which(ksi == 0)] <- 1 + } + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + f22 <- ksi[7] + K1 <- ksi[8] + K2 <- ksi[9] + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- K1 * K2 * f11 / 2 / f22 + } else if (s == 3) { + d <- K2 * fp1 / fp2 + } else if (s == 4) { + d <- K1 * f1p / f2p + } else if (s == 5) { + d <- - K2 * (fp1 / fp2)^2 / 2 + } else if (s == 6) { + d <- - K1 * (f1p / f2p)^2 / 2 + } else { + d <- - K1 * K2 * f11^2 / 4 / f22^2 + } + return(d) +} diff --git a/R/diff_Panbc.R b/R/diff_Panbc.R new file mode 100644 index 0000000..d018040 --- /dev/null +++ b/R/diff_Panbc.R @@ -0,0 +1,28 @@ +diff_Panbc <- +function(ksi, s) { + D12 <- ksi[1] + f11 <- ksi[2] + fp1 <- ksi[3] + f1p <- ksi[4] + fp2 <- ksi[5] + f2p <- ksi[6] + f22 <- ksi[7] + K1 <- ksi[8] + K2 <- ksi[9] + if (s == 1){ + d <- 1 + } else if (s == 2) { + d <- K1 * K2 * (2 * f11 - 1) / (4 * (f22 + 1)) + } else if (s == 3) { + d <- K2 * (2 * fp1 -1) / (2 * (fp2 + 1)) + } else if (s == 4) { + d <- K1 * (2 * f1p - 1) / (2 * (f2p + 1)) + } else if (s == 5) { + d <- - K2 * fp1 * (fp1 - 1) / (fp2 + 1)^2 / 2 + } else if (s == 6) { + d <- - K1 * f1p * (f1p - 1) / (f2p + 1)^2 / 2 + } else { + d <- - K1 * K2 * f11 * (f11 - 1) / 4 / (f22 + 1)^2 + } + return(d) +} diff --git a/R/f.R b/R/f.R new file mode 100644 index 0000000..0653698 --- /dev/null +++ b/R/f.R @@ -0,0 +1,2 @@ +f <- +function(i, data){length(data[which(data == i)])} diff --git a/R/logCI.R b/R/logCI.R new file mode 100644 index 0000000..0124c12 --- /dev/null +++ b/R/logCI.R @@ -0,0 +1,11 @@ +logCI <- +function(y1, y2, est, se, conf) { + x1 <- y1[-1] + x2 <- y2[-1] + D12 <- sum(x1 > 0 & x2 > 0) + t <- est - D12 + z <- qnorm((1-conf)/2, lower.tail=F) + K <- exp(z * sqrt(log(1 + se^2 / t^2))) + CI <- c(D12 + t / K, D12 + t * K) + return(CI) +} diff --git a/R/print.ChaoShared.R b/R/print.ChaoShared.R new file mode 100644 index 0000000..4ce0067 --- /dev/null +++ b/R/print.ChaoShared.R @@ -0,0 +1,71 @@ +print.ChaoShared <- function(x, ...){ + cat('\n(1) BASIC DATA INFORMATION:\n\n') + + if(nrow(x[[2]])==4){ + cat(" Sample size in Community 1 n1 = ", x[[1]]$n1, "\n") + cat(" Sample size in Community 2 n2 = ", x[[1]]$n2, "\n") + cat(" Number of observed species in Community 1 D1 = ", x[[1]]$D1, "\n") + cat(" Number of observed species in Community 2 D2 = ", x[[1]]$D2, "\n") + cat(" Number of observed shared species D12 = ", x[[1]]$D12, "\n") + cat(" Bootstrap replications for s.e. estimate ", x[[1]]$B, "\n\n") + cat(" \"Entire\" Shared Species Group:", "\n") + cat(" Some statistics:", "\n") + cat(" ---------------------------------------------------------------------------", "\n") + cat(" f[11] =", x[[1]]$f11, "; ", "f[1+] =", x[[1]]$f1.plus, ";", "f[+1] =", x[[1]]$fplus.1, "; ", "f[2+] =", x[[1]]$f2.plus, "; ", "f[+2] =", x[[1]]$fplus.2, ";" , "f[22] =", x[[1]]$f22 , "\n") + cat(" ---------------------------------------------------------------------------", "\n\n") + cat(" \"Rare\" Shared Species Group: (Both frequencies can only up to 10)", "\n") + cat(" Some statistics:", "\n") + cat(" -------------------------------------------------------------------", "\n") + cat(" f[1+]_rare =", x[[1]]$f1.plus.rare, ";", "f[+1]_rare =", x[[1]]$fplus.1.rare, "; ", "f[2+]_rare =", x[[1]]$f2.plus.rare, "; ", "f[+2]_rare =", x[[1]]$fplus.2.rare, "\n") + cat(" -------------------------------------------------------------------", "\n") + cat(" Number of observed individuals in Community 1 n1_rare = ", x[[1]]$n1_rare, "\n") + cat(" Number of observed individuals in Community 2 n2_rare = ", x[[1]]$n2_rare, "\n") + cat(" Number of observed shared species D12_rare = ", x[[1]]$D12_rare, "\n") + cat(" Estimated sample coverage C12_rare = ", round(x[[1]]$C12_rare,3), "\n") + cat(" Estimated CCVs CCV_1 = ", round(x[[1]]$CCV_1,3), "\n") + cat(" CCV_2 = ", round(x[[1]]$CCV_2,3), "\n") + cat(" CCV_12 = ", round(x[[1]]$CCV_12,3), "\n") + + }else{ + #cat("(1) BASIC DATA INFORMATION:", "\n\n") + cat(" Number of sampling units in Community 1 T1 = ", x[[1]]$T1, "\n") + cat(" Number of sampling units in Community 2 T2 = ", x[[1]]$T2, "\n") + cat(" Number of total incidences in Community 1 U1 = ", x[[1]]$U1, "\n") + cat(" Number of total incidences in Community 2 U2 = ", x[[1]]$U2, "\n") + cat(" Number of observed species in Community 1 D1 = ", x[[1]]$D1, "\n") + cat(" Number of observed species in Community 2 D2 = ", x[[1]]$D2, "\n") + cat(" Number of observed shared species in two communities D12 = ", x[[1]]$D12, "\n") + cat(" Bootstrap replications for s.e. estimate ", x[[1]]$B, "\n\n") + cat(" Some statistics:", "\n") + cat(" --------------------------------------------------------------------------", "\n") + cat(" Q[11] =", x[[1]]$Q11, "; ", "Q[1+] =", x[[1]]$Q1.plus, ";", "Q[+1] =", x[[1]]$Qplus.1, "; ", "Q[2+] =", x[[1]]$Q2.plus, "; ", "Q[+2] =", x[[1]]$Qplus.2, ";" , "Q[22] =", x[[1]]$Q22, "\n") + cat(" --------------------------------------------------------------------------", "\n") + + } + + cat('\n') + cat('\n(2) ESTIMATION RESULTS OF THE NUMBER OF SHARED SPECIES:\n\n') + print(round(x$Estimation_results,3)) + cat('\n') + if(nrow(x[[2]])==4){ + cat(' +(3) DESCRIPTION OF MODELS FOR ESTIMATING SHARED SPECIES RICHNESS: + +Homogeneous: This model assumes that the shared species in each community have the same discovery probabilities; see the Eq. (3.11a) of Chao et al. (2000). + +Heterogeneous (ACE-shared): This model allows for heterogeneous discovery probabilities among shared species; see Eq. (3.11b) of Chao et al. (2000). It is an extension of the ACE estimator to two communities. It is replaced by Chao1-shared when the estimated sample coverage for rare shared species group (C12_rare in the output) is zero. + +Chao1-shared: An extension of the Chao1 estimator to estimate shared species richness between two communities. It provides a lower bound of shared species richness. See Eq. (3.6) of Pan et al. (2009). It is replaced by Chao1-shared-bc for the case f[2+]=0 or f[+2]=0. + +Chao1-shared-bc: A bias-corrected form of Chao1-shared estimator; See Pan et al. (2009). + ') + }else{ + cat(' +(3) DESCRIPTION OF MODELS FOR ESTIMATING SHARED SPECIES RICHNESS: + +Chao2-shared: An extension of the Chao2 estimator to estimate shared species richness between two communities. It provides a lower bound of shared species richness. See Pan et al. (2009). It is replaced by Chao2-shared-bc for the case Q[2+]=0 or Q[+2]=0. + +Chao2-shared-bc: A bias-corrected form of Chao2-shared. See Pan et al. (2009). + ')} +} + diff --git a/R/print.ChaoSpecies.R b/R/print.ChaoSpecies.R new file mode 100644 index 0000000..a00510b --- /dev/null +++ b/R/print.ChaoSpecies.R @@ -0,0 +1,69 @@ +print.ChaoSpecies <- function(x, ...){ + cat('\n(1) BASIC DATA INFORMATION:\n\n') + if(names(x)[2]=="Rare.Species.Group"){ + print(x$Basic_data_information[1:4,]) + cat('\n') + print(x$Basic_data_information[-c(1:4),]) + cat('\n') + print(x$Rare_species.Group) + cat('\n') + cat('\n(2) SPECIES RICHNESS ESTIMATORS TABLE:\n\n') + print(x$Species_table) + cat('\n') + cat('\n(3) DESCRIPTION OF ESTIMATORS/MODELS: + +Homogeneous Model: This model assumes that all species have the same abundances or discovery probabilities. See Eq. (2.3) of Chao and Lee (1992) or Eq. (7a) of Chao and Chiu (2016b). + +Homogeneous (MLE): An approximate maximum likelihood estimate under homogeneous model. See Eq. (1.1) and Eq. (1.2) of Chao and Lee (1992) or Eq. (3) of Chao and Chiu (2016b). + +Chao1 (Chao, 1984): This approach uses the numbers of singletons and doubletons to estimate the number of undetected species because undetected species information is mostly concentrated on those low frequency counts; see Chao (1984), and Chao and Chiu (2012, 2016a,b). + +Chao1-bc: A bias-corrected form for the Chao1 estimator; see Chao (2005) or Eq. (6b) of Chao and Chiu (2016b). + +iChao1: An improved Chao1 estimtor; see Chiu et al. (2014). + +ACE (Abundance-based Coverage Estimator): A non-parametric estimator proposed by Chao and Lee (1992) and Chao, Ma and Yang (1993). The observed species are separated as rare and abundant groups; only data in the rare group is used to estimate the number of undetected species. The estimated CV of the species in rare group characterizes the degree of heterogeneity among species discovery probabilities. See Eq. (2.14) in Chao and Lee (1992) or Eq. (7b) of Chao and Chiu (2016b). + +ACE-1: A modified ACE for highly-heterogeneous communities when CV of the entire dataset > 2 and species richness > 1000. See Eq. (2.15) in Chao and Lee (1992). + +1st order jackknife: It uses the number of singletons to estimate the number of undetected species; see Burnham and Overton (1978). + +2nd order jackknife: It uses the numbers of singletons and doubletons to estimate the number of undetected species; see Burnham and Overton (1978). + +95% Confidence interval: A log-transformation is used for all estimators so that the lower bound of the resulting interval is at least the number of observed species. See Chao (1987). +')} + if(names(x)[2]!="Rare.Species.Group"){ + print(x$Basic_data_information[1:5,]) + cat('\n') + print(x$Basic_data_information[-c(1:5),]) + cat('\n') + print(x$Infreq_species_group) + cat('\n') + cat('\n(2) SPECIES RICHNESS ESTIMATORS TABLE:\n\n') + print(x$Species_table) + cat('\n') + cat('\n(3) DESCRIPTION OF ESTIMATORS/MODELS: + +Homogeneous Model: This model assumes that all species have the same incidence or detection probabilities. See Eq. (3.2) of Lee and Chao (1994) or Eq. (12a) in Chao and Chiu (2016b). + +Chao2 (Chao, 1987): This approach uses the frequencies of uniques and duplicates to estimate the number of undetected species; see Chao (1987) or Eq. (11a) in Chao and Chiu (2016b). + +Chao2-bc: A bias-corrected form for the Chao2 estimator; see Chao (2005). + +iChao2: An improved Chao2 estimator; see Chiu et al. (2014). + +ICE (Incidence-based Coverage Estimator): A non-parametric estimator originally proposed by Lee and Chao (1994) in the context of capture-recapture data analysis. The observed species are separated as frequent and infrequent species groups; only data in the infrequent group are used to estimate the number of undetected species. The estimated CV for species in the infrequent group characterizes the degree of heterogeneity among species incidence probabilities. See Eq. (12b) of Chao and Chiu (2016b), which is an improved version of Eq. (3.18) in Lee and Chao (1994). This model is also called Model(h) in capture-recapture literature where h denotes "heterogeneity". + +ICE-1: A modified ICE for highly-heterogeneous cases. + +1st order jackknife: It uses the frequency of uniques to estimate the number of undetected species; see Burnham and Overton (1978). + +2nd order jackknife: It uses the frequencies of uniques and duplicates to estimate the number of undetected species; see Burnham and Overton (1978). + +95% Confidence interval: A log-transformation is used for all estimators so that the lower bound of the resulting interval is at least the number of observed species. See Chao (1987). +') + } +} + + + \ No newline at end of file diff --git a/R/spader.R b/R/spader.R new file mode 100644 index 0000000..49b4dac --- /dev/null +++ b/R/spader.R @@ -0,0 +1,1388 @@ +# +# +########################################### +#' Estimation of species richness in a community +#' +#' \code{ChaoSpecies}: Estimation of species richness in a single community based on five types of data: +#' Type (1) abundance data (datatype="abundance"), Type (1A) abundance-frequency counts \cr +#' (datatype="abundance_freq_count"), Type (2) incidence-frequency data (datatype = +#' "incidence_freq"), Type (2A) incidence-frequency counts (datatype="incidence_freq_count"), and +#' Type (2B) incidence-raw data (datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +#' @param data a matrix/data.frame of species abundances/incidences.\cr +#' @param datatype type of input data, "abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count" or "incidence_raw". \cr +#' @param k the cut-off point (default = 10), which separates species into "abundant" and "rare" groups for abundance data for the estimator ACE; it separates species into "frequent" and +#' "infrequent" groups for incidence data for the estimator ICE. +#' @param conf a positive number \eqn{\le} 1 specifying the level of confidence interval. +#' @return a list of three objects: \cr\cr +#' \code{$Basic_data_information} and \code{$Rare_species_group}/\code{$Infreq_species_group} for summarizing data information. \cr\cr +#' \code{$Species_table} for showing a table of various species richness estimates, standard errors, and the associated confidence intervals. \cr\cr +#' @examples +#' data(ChaoSpeciesData) +#' # Type (1) abundance data +#' ChaoSpecies(ChaoSpeciesData$Abu,"abundance",k=10,conf=0.95) +#' # Type (1A) abundance-frequency counts data +#' ChaoSpecies(ChaoSpeciesData$Abu_count,"abundance_freq_count",k=10,conf=0.95) +#' # Type (2) incidence-frequency data +#' ChaoSpecies(ChaoSpeciesData$Inci,"incidence_freq",k=10,conf=0.95) +#' # Type (2A) incidence-frequency counts data +#' ChaoSpecies(ChaoSpeciesData$Inci_count,"incidence_freq_count",k=10,conf=0.95) +#' # Type (2B) incidence-raw data +#' ChaoSpecies(ChaoSpeciesData$Inci_raw,"incidence_raw",k=10,conf=0.95) +#' @references +#' Chao, A., and Chiu, C. H. (2012). Estimation of species richness and shared species richness. In N. Balakrishnan (ed). Methods and Applications of Statistics in the Atmospheric and Earth Sciences. p.76-111, Wiley, New York.\cr\cr +#' Chao, A., and Chiu, C. H. (2016). Nonparametric estimation and comparison of species richness. Wiley Online Reference in the Life Science. In: eLS. John Wiley and Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0026329.\cr\cr +#' Chao, A., and Chiu, C. H. (2016). Species richness: estimation and comparison. Wiley StatsRef: Statistics Reference Online. 1-26.\cr\cr +#' Chiu, C. H., Wang Y. T., Walther B. A. and Chao A. (2014). An improved non-parametric lower bound of species richness via the Good-Turing frequency formulas. Biometrics, 70, 671-682. \cr\cr +#' Gotelli, N. G. and Chao, A. (2013). Measuring and estimating species richness, species diver- sity, and biotic similarity from sampling data. Encyclopedia of Biodiversity, 2nd Edition, Vol. 5, 195-211, Waltham, MA. \cr\cr +#' @export + + +ChaoSpecies <- function(data, datatype = c("abundance","abundance_freq_count", "incidence_freq", "incidence_freq_count", "incidence_raw"), k = 10, conf = 0.95) +{ + if (is.matrix(data) == T || is.data.frame(data) == T){ + #if (ncol(data) != 1 & nrow(data) != 1) + #stop("Error: The data format is wrong.") + if(datatype != "incidence_raw"){ + if (ncol(data) == 1){ + data <- data[, 1] + } else { + data <- data[1, ] + } + } else{ + t <- ncol(data) + dat <- rowSums(data) + dat <- as.integer(dat) + t_infreq <- sum(colSums(data[which(dat=1) + data <- dat + data <- c(t_infreq, t , data) + } + + } + + if(datatype == "abundance_freq_count"){ + data <- as.integer(data) + length4b <- length(data) + data <- rep(data[seq(1,length4b,2)],data[seq(2,length4b,2)]) + names(data) <- paste("x",1:length(data),sep="") + datatype <- "abundance" + } + if (datatype == "incidence_freq_count"){ + t <- as.integer(data[1]) + data <- data[-c(1)] + data <- as.integer(data) + lengthdat <- length(data) + data <- rep(data[seq(1,lengthdat,2)],data[seq(2,lengthdat,2)]) + data <- c(t,data) + names(data) <- c("T", paste("y",1:(length(data)-1),sep="")) + datatype <- "incidence_freq" + } + method <- "all" + if (k != round(k) || k < 0) + stop("Error: The cutoff t to define less abundant species must be non-negative integer!") + if (is.numeric(conf) == FALSE || conf > 1 || conf < 0) + stop("Error: confidence level must be a numerical value between 0 and 1, e.g. 0.95") + # data <- as.numeric(round(data)) + + if (datatype == "abundance"){ + f <- function(i, data){length(data[which(data == i)])} + if (f(1, data) == sum(data)){ + stop("Error: The information of data is not enough.")} + z <- (list(Basic_data_information = basicAbun(data, k)[[1]], Rare_species_group = RareSpeciesGroup(data, k), + Species_table = round(SpecAbunOut(data, method, k, conf), 3))) + } else if (datatype == "incidence_raw"){ + dat <- data[-1]; Q <- function(i, data){length(data[which(data == i)])} + if (Q(1, dat) == sum(dat)){ + stop("Error: The information of data is not enough.")} + z <- (list(Basic_data_information = basicInci(data[-1], k)[[1]], Infreq_species_group = InfreqSpeciesGroup(data[-1], k), + Species_table = round(SpecInciOut_raw(data, method, k, conf),3))) + } else if (datatype == "incidence_freq"){ + dat <- data[-1]; + Q <- function(i, data){length(data[which(data == i)])} + if (Q(1, dat) == sum(dat)){ + stop("Error: The information of data is not enough.")} + z <- (list(Basic_data_information = basicInci(data, k)[[1]], Infreq_species_group = InfreqSpeciesGroup(data, k), + Species_table = round(SpecInciOut(data, method, k, conf),3))) + } + else{ + stop("Error: The data type is wrong.") + } + class(z) <- c("ChaoSpecies") + z +} + + +# +# +########################################### +#' Estimation of the number of shared species between two communities/assemblages +#' +#' \code{ChaoShared}: Estimation of shared species richness between two communities/assemblages based on +#' three types of data: Type (1) abundance data (datatype="abundance"), Type (2) incidence-frequency +#' data (datatype="incidence_freq"), and Type (2B) incidence-raw data (datatype="incidence\cr +#' _raw"); see \code{SpadeR-package} details for data input formats. +#' @param data a matrix/data.frame of species abundances/incidences.\cr +#' @param datatype type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr +#' @param units number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data.\cr +#' @param se a logical variable to calculate the bootstrap standard error and the associated confidence interval. \cr +#' @param nboot an integer specifying the number of bootstrap replications. \cr +#' @param conf a positive number \eqn{\le} 1 specifying the level of confidence interval. +#' @return a list of two objects: \cr\cr +#' \code{$Basic_data_information} for summarizing data information. \cr\cr +#' \code{$Estimation_results} for showing a table of various shared richess estimates, standard errors, and the associated confidence intervals. \cr\cr +#' @examples +#' data(ChaoSharedData) +#' # Type (1) abundance data +#' ChaoShared(ChaoSharedData$Abu,"abundance",se=TRUE,nboot=200,conf=0.95) +#' # Type (2) incidence-frequency data +#' ChaoShared(ChaoSharedData$Inci,"incidence_freq",se=TRUE,nboot=200,conf=0.95) +#' # Type (2B) incidence-raw data +#' ChaoShared(ChaoSharedData$Inci_raw,"incidence_raw",units=c(16,17),se=TRUE,nboot=200,conf=0.95) +#' @references +#' Chao, A., Hwang, W.-H., Chen, Y.-C. and Kuo. C.-Y. (2000). Estimating the number of shared species in two communities. Statistica Sinica, 10, 227-246.\cr\cr +#' Pan, H.-Y., Chao, A. and Foissner, W. (2009). A non-parametric lower bound for the number of species shared by multiple communities. Journal of Agricultural, Biological and Environmental Statistics, 14, 452-468. +#' @export + +ChaoShared <- + function(data, datatype = c("abundance", "incidence_freq", "incidence_raw"), units, + se = TRUE, nboot = 200, conf = 0.95) { + + method <- "all" + if (se == TRUE) { + if (nboot < 1) + nboot <- 1 + if (nboot == 1) + cat("Warning: When \"nboot\" =" ,nboot, ", the bootstrap s.e. and confidence interval can't be calculated.", + "\n\n") + } + + if (is.numeric(conf) == FALSE || conf > 1 || conf < 0) { + cat("Warning: \"conf\"(confidence level) must be a numerical value between 0 and 1, e.g. 0.95.", + "\n") + cat(" We use \"conf\" = 0.95 to calculate!", + "\n\n") + conf <- 0.95 + } + + datatype <- match.arg(datatype) + if (datatype == "abundance") { + if(class(data)=="list"){data <-cbind(data[[1]],data[[2]]) } + x1 <- data[, 1] + x2 <- data[, 2] + Basic <- BasicFun(x1, x2, nboot, datatype) + # cat("(2) ESTIMATION RESULTS OF THE NUMBER OF SHARED SPECIES: ", "\n") + output <- ChaoShared.Ind(x1, x2, method, nboot, conf, se) + colnames(output) <- c("Estimate", "s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + } + if (datatype == "incidence_freq") { + if(class(data)=="list"){data <-cbind(data[[1]],data[[2]]) } + y1 <- data[, 1] + y2 <- data[, 2] + Basic <- BasicFun(y1, y2, B=nboot, datatype) + # cat("(2) ESTIMATION RESULTS OF THE NUMBER OF SHARED SPECIES: ", "\n") + output <- ChaoShared.Sam(y1, y2, method, conf, se) + colnames(output) <- c("Estimate", "s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + } + if (datatype=="incidence_raw"){ + t = units + if(ncol(data) != sum(t)) stop("Number of columns does not euqal to the sum of key in sampling units") + dat <- matrix(0, ncol = length(t), nrow = nrow(data)) + n <- 0 + for(i in 1:length(t)){ + dat[, i] <- as.integer(rowSums(data[,(n+1):(n+t[i])] ) ) + n <- n+t[i] + } + t <- as.integer(t) + dat <- apply(dat, MARGIN = 2, as.integer) + dat <- data.frame(rbind(t, dat),row.names = NULL) + y1 <- dat[,1] + y2 <- dat[,2] + datatype = "incidence_freq" + Basic <- BasicFun(y1, y2, B=nboot, datatype) + output <- ChaoShared.Sam(y1, y2, method, conf, se) + colnames(output) <- c("Estimate", "s.e.", paste(conf*100,"%Lower",sep=""), paste(conf*100,"%Upper",sep="")) + } + out <- list(Basic_data_information=Basic, + Estimation_results=output) + class(out) <- c("ChaoShared") + return(out) + } + + +# +# +########################################### +#' Estimation of species diversity (Hill numbers) +#' +#' \code{Diversity}: Estimating a continuous diversity profile in one community including species rich- +#' ness, Shannon diversity and Simpson diversity). This function also supplies plots of empirical and +#' estimated continuous diversity profiles. Various estimates for Shannon entropy and the Gini- +#' Simpson index are also computed. All five types of data are supported: Type (1) abundance data +#' (datatype="abundance"), Type (1A) abundance-frequency counts +#' (datatype="abundance_freq_count"), Type (2) incidence-frequency data (datatype = +#' "incidence_freq"), Type (2A) incidence-frequency counts (datatype="incidence_freq_count"), and +#' Type (2B) incidence-raw data (datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +#' @param data a matrix/data.frame of species abundances/incidences.\cr +#' @param datatype type of input data, "abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count" or "incidence_raw". \cr +#' @param q a vector of nonnegative numbers specifying the diversity orders for which Hill numbers will be estimated. If \code{NULL}, then +#' Hill numbers will be estimated at order q from 0 to 3 with increments of 0.25. +#' @return a list of seven objects: \cr\cr +#' \code{$Basic_data} for summarizing data information. \cr\cr +#' \code{$Species_richness} for showing various species richness estimates along with related statistics. \cr\cr +#' \code{$Shannon_index} and \code{$Shannon_diversity} for showing various Shannon index/diversity estimates. \cr\cr +#' \code{$Simpson_index} and \code{$Simpson_diversity} for showing two Simpson index/diversity estimates. \cr\cr +#' \code{$Hill_numbers} for showing Hill number (diversity) estimates of diversity orders specified in the argument \code{q}. \cr\cr +#' @examples +#' \dontrun{ +#' data(DiversityData) +#' # Type (1) abundance data +#' Diversity(DiversityData$Abu,"abundance",q=c(0,0.5,1,1.5,2)) +#' # Type (1A) abundance-frequency counts data +#' Diversity(DiversityData$Abu_count,"abundance_freq_count",q=seq(0,3,by=0.5)) +#' # Type (2) incidence-frequency data +#' Diversity(DiversityData$Inci,"incidence_freq",q=NULL) +#' # Type (2A) incidence-frequency counts data +#' Diversity(DiversityData$Inci_freq_count,"incidence_freq_count",q=NULL) +#' # Type (2B) incidence-raw data +#' Diversity(DiversityData$Inci_raw,"incidence_raw",q=NULL) +#' } +#' @references +#' Chao, A., and Chiu, C. H. (2012). Estimation of species richness and shared species richness. In N. Balakrishnan (ed). Methods and Applications of Statistics in the Atmospheric and Earth Sciences. p.76-111, Wiley, New York.\cr\cr +#' Chao, A. and Jost, L. (2015). Estimating diversity and entropy profiles via discovery rates of new species. Methods in Ecology and Evolution, 6, 873-882.\cr\cr +#' Chao, A., Wang, Y. T. and Jost, L. (2013). Entropy and the species accumulation curve: a novel estimator of entropy via discovery rates of new species. Methods in Ecology and Evolution 4, 1091-1110.\cr\cr +#' @export + +Diversity=function(data, datatype=c("abundance","abundance_freq_count", "incidence_freq", "incidence_freq_count", "incidence_raw"), q=NULL) +{ + if (is.matrix(data) == T || is.data.frame(data) == T){ + #if (ncol(data) != 1 & nrow(data) != 1) + #stop("Error: The data format is wrong.") + if(datatype != "incidence_raw"){ + if (ncol(data) == 1){ + data <- data[, 1] + } else { + data <- as.vector(data[1, ]) + } + } else{ + t <- ncol(data) + dat <- rowSums(data) + dat <- as.integer(dat) + data <- c(t , dat) + } + + } + X <- data + if(datatype == "abundance_freq_count"){ + data <- as.integer(data) + length4b <- length(data) + data <- rep(data[seq(1,length4b,2)],data[seq(2,length4b,2)]) + names(data) <- paste("x",1:length(data),sep="") + datatype <- "abundance" + X <- data + } + if(datatype=="abundance"){ + type="abundance" + if(!is.vector(X)) X <- as.numeric(unlist(c(X))) + + BASIC.DATA <- matrix(round(c(sum(X), sum(X>0), 1-sum(X==1)/sum(X), CV.Ind(X)),3), ncol = 1) + nickname <- matrix(c("n", "D", "C", "CV"), ncol = 1) + BASIC.DATA <- cbind(nickname, BASIC.DATA) + + colnames(BASIC.DATA) <- c("Variable", "Value") + rownames(BASIC.DATA) <- c(" Sample size", " Number of observed species", + " Estimated sample coverage", + " Estimated CV") + BASIC.DATA <- data.frame(BASIC.DATA) + + table0 <- matrix(0,5,4) + table0[1,]=c(Chao1(X)[-5]) + table0[2,]=c(Chao1_bc(X)) + table0[3,]=round(SpecAbuniChao1(X, k=10, conf=0.95)[1,],1) + table0[4,]=round(c(SpecAbunAce(X)),1) + table0[5,]=round(c(SpecAbunAce1(X)),1) + colnames(table0) <- c("Estimate", "s.e.", paste(Chao1(X)[5]*100,"%Lower", sep=""), paste(Chao1(X)[5]*100,"%Upper", sep="")) + rownames(table0) <- c(" Chao1 (Chao, 1984)"," Chao1-bc ", " iChao1"," ACE (Chao & Lee, 1992)", + " ACE-1 (Chao & Lee, 1992)") + + SHANNON=Shannon_index(X) + table1=round(SHANNON[c(1:5),],3) + table1=table1[-2,] ##2016.05.09 + colnames(table1) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1) <- c(" MLE"," MLE_bc"," Jackknife", + # " Chao & Shen"," Chao et al. (2013)") + rownames(table1) <- c(" MLE"," Jackknife", + " Chao & Shen"," Chao et al. (2013)") + + table1_exp=round(SHANNON[c(6:10),],3) + table1_exp=table1_exp[-2,] ##2016.05.09 + colnames(table1_exp) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1_exp) <- c(" MLE"," MLE_bc"," Jackknife", + # " Chao & Shen"," Chao et al. (2013)") + rownames(table1_exp) <- c(" MLE"," Jackknife", + " Chao & Shen"," Chao et al. (2013)") + + table2=round(Simpson_index(X)[c(1:2),],5) + colnames(table2) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2) <- c(" MVUE"," MLE") + + table2_recip=round(Simpson_index(X)[c(3:4),],5) + colnames(table2_recip) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2_recip) <- c(" MVUE"," MLE") + + if(is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "abundance", q=NULL, from=0, to=3, interval=0.25, B=50, conf=0.95))} + if(!is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "abundance", q=q, from=0, to=3, interval=0.25, B=50, conf=0.95))} + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Hill[14:26,4],Hill[1:13,4]) + #Chao.LCL <- Hill[14:26,3] - 1.96*Hill[14:26,4] + #Chao.UCL <- Hill[14:26,3] + 1.96*Hill[14:26,4] + #Emperical.LCL <- Hill[1:13,3] - 1.96*Hill[1:13,4] + #Emperical.UCL <- Hill[1:13,3] + 1.96*Hill[1:13,4] + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Chao.LCL,Chao.UCL,Emperical.LCL,Emperical.UCL) + #Hill<-round(Hill,3) + #Hill <- data.frame(Hill) + q_length<-length(Hill[,1])/2 + + Chao.LCL <- Hill[(q_length+1):(2*q_length),3] - 1.96*Hill[(q_length+1):(2*q_length),4] + Chao.UCL <- Hill[(q_length+1):(2*q_length),3] + 1.96*Hill[(q_length+1):(2*q_length),4] + Emperical.LCL <- Hill[1:q_length,3] - 1.96*Hill[1:q_length,4] + Emperical.UCL <- Hill[1:q_length,3] + 1.96*Hill[1:q_length,4] + Hill<-cbind(Hill[1:q_length,1],Hill[(q_length+1):(2*q_length),3],Chao.LCL,Chao.UCL,Hill[1:q_length,3],Emperical.LCL,Emperical.UCL) + Hill<-round(Hill,3) + Hill <- data.frame(Hill) + #colnames(Hill)<-c("q","Chao","Empirical","Chao(s.e.)","Empirical(s.e.)") + colnames(Hill)<-c("q","ChaoJost","95%Lower","95%Upper","Empirical","95%Lower","95%Upper") + q_hill <- nrow(Hill) + rownames(Hill) <- paste(" ",1:q_hill) + z <- list("datatype"= type,"Basic_data"=BASIC.DATA,"Species_richness"=table0, + "Shannon_index"=table1,"Shannon_diversity"=table1_exp, + "Simpson_index"=table2,"Simpson_diversity"=table2_recip, + "Hill_numbers"= Hill) + } + if(datatype == "incidence_freq_count"){ + t <- as.integer(data[1]) + data <- data[-c(1)] + data <- as.integer(data) + lengthdat <- length(data) + data <- rep(data[seq(1,lengthdat,2)],data[seq(2,lengthdat,2)]) + data <- c(t,data) + names(data) <- c("T", paste("y",1:(length(data)-1),sep="")) + datatype <- "incidence_freq" + X <- data + } + if(datatype=="incidence_freq"){ + if(!is.vector(X)) X <- as.numeric(unlist(c(X))) + type="incidence" + U<-sum(X[-1]) + D<-sum(X[-1]>0) + T<-X[1] + C<-Chat.Sam(X,T) + CV_squre<-max( D/C*T/(T-1)*sum(X[-1]*(X[-1]-1))/U^2-1, 0) + CV<-CV_squre^0.5 + BASIC.DATA <- matrix(round(c(D,T,U, C, CV),3), ncol = 1) + nickname <- matrix(c("D", "T","U", "C", "CV"), ncol = 1) + BASIC.DATA <- cbind(nickname, BASIC.DATA) + + colnames(BASIC.DATA) <- c("Variable", "Value") + rownames(BASIC.DATA) <- c(" Number of observed species", " Number of Sampling units"," Total number of incidences", + " Estimated sample coverage", + " Estimated CV") + BASIC.DATA <- data.frame(BASIC.DATA) + #BASIC.DATA <- basicInci(X, k=10)[[1]] + ############################################################ + table0=SpecInci(X, k=10, conf=0.95) + rownames(table0) <- c(" Chao2 (Chao, 1987)"," Chao2-bc ", " iChao2"," ICE (Lee & Chao, 1994)", + " ICE-1 (Lee & Chao, 1994)") + SHANNON=Shannon_Inci_index(X) + table1=round(SHANNON[c(1,4),],3) + colnames(table1) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1) <- c(" MLE"," MLE_bc"," Chao & Shen"," Chao et al. (2013)") + rownames(table1) <- c(" MLE"," Chao et al. (2013)") + table1_exp=round(SHANNON[c(5,8),],3) + colnames(table1_exp) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1_exp) <- c(" MLE"," MLE_bc"," Chao & Shen"," Chao et al. (2013)") + rownames(table1_exp) <- c(" MLE"," Chao et al. (2013)") + + SIMPSON=Simpson_Inci_index(X) + table2=round(SIMPSON[c(1:2),],5) + colnames(table2) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2) <- c(" MVUE"," MLE") + + table2_recip=round(SIMPSON[c(3:4),],5) + colnames(table2_recip) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2_recip) <- c(" MVUE"," MLE") + + + ############################################################ + #Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence", from=0, to=3, interval=0.25, B=50, conf=0.95)) + if(is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence_freq", q=NULL, from=0, to=3, interval=0.25, B=50, conf=0.95))} + if(!is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence_freq", q=q, from=0, to=3, interval=0.25, B=50, conf=0.95))} + + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Hill[14:26,4],Hill[1:13,4]) + #Chao.LCL <- Hill[14:26,3] - 1.96*Hill[14:26,4] + #Chao.UCL <- Hill[14:26,3] + 1.96*Hill[14:26,4] + #Emperical.LCL <- Hill[1:13,3] - 1.96*Hill[1:13,4] + #Emperical.UCL <- Hill[1:13,3] + 1.96*Hill[1:13,4] + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Chao.LCL,Chao.UCL,Emperical.LCL,Emperical.UCL) + #Hill<-round(Hill,3) + #Hill <- data.frame(Hill) + q_length<-length(Hill[,1])/2 + + Chao.LCL <- Hill[(q_length+1):(2*q_length),3] - 1.96*Hill[(q_length+1):(2*q_length),4] + Chao.UCL <- Hill[(q_length+1):(2*q_length),3] + 1.96*Hill[(q_length+1):(2*q_length),4] + Emperical.LCL <- Hill[1:q_length,3] - 1.96*Hill[1:q_length,4] + Emperical.UCL <- Hill[1:q_length,3] + 1.96*Hill[1:q_length,4] + Hill<-cbind(Hill[1:q_length,1],Hill[(q_length+1):(2*q_length),3],Chao.LCL,Chao.UCL,Hill[1:q_length,3],Emperical.LCL,Emperical.UCL) + Hill<-round(Hill,3) + Hill <- data.frame(Hill) + #colnames(Hill)<-c("q","Chao","Empirical","Chao(s.e.)","Empirical(s.e.)") + colnames(Hill)<-c("q","ChaoJost","95%Lower","95%Upper","Empirical","95%Lower","95%Upper") + q_hill <- nrow(Hill) + rownames(Hill) <- paste(" ",1:q_hill) + #z <- list("BASIC.DATA"=BASIC.DATA,"HILL.NUMBERS"= Hill) + z <- list("datatype"= type,"Basic_data"=BASIC.DATA,"Species_richness"=table0, + "Shannon_index"=table1,"Shannon_diversity"=table1_exp, + "Simpson_index"=table2,"Simpson_diversity"=table2_recip, + "Hill_numbers"= Hill) + } + if(datatype=="incidence_raw"){ + type="incidence" + datatype = "incidence_freq" + U<-sum(X[-1]) + D<-sum(X[-1]>0) + T<-X[1] + C<-Chat.Sam(X,T) + CV_squre<-max( D/C*T/(T-1)*sum(X[-1]*(X[-1]-1))/U^2-1, 0) + CV<-CV_squre^0.5 + BASIC.DATA <- matrix(round(c(D,T,U, C, CV),3), ncol = 1) + nickname <- matrix(c("D", "T","U", "C", "CV"), ncol = 1) + BASIC.DATA <- cbind(nickname, BASIC.DATA) + + colnames(BASIC.DATA) <- c("Variable", "Value") + rownames(BASIC.DATA) <- c(" Number of observed species", " Number of Sampling units"," Total number of incidences", + " Estimated sample coverage", + " Estimated CV") + BASIC.DATA <- data.frame(BASIC.DATA) + ############################################################ + table0=SpecInci(X, k=10, conf=0.95) + rownames(table0) <- c(" Chao2 (Chao, 1987)"," Chao2-bc ", " iChao2"," ICE (Lee & Chao, 1994)", + " ICE-1 (Lee & Chao, 1994)") + SHANNON=Shannon_Inci_index(X) + table1=round(SHANNON[c(1,4),],3) + colnames(table1) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1) <- c(" MLE"," MLE_bc"," Chao & Shen"," Chao et al. (2013)") + rownames(table1) <- c(" MLE"," Chao et al. (2013)") + table1_exp=round(SHANNON[c(5,8),],3) + colnames(table1_exp) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + #rownames(table1_exp) <- c(" MLE"," MLE_bc"," Chao & Shen"," Chao et al. (2013)") + rownames(table1_exp) <- c(" MLE"," Chao et al. (2013)") + + SIMPSON=Simpson_Inci_index(X) + table2=round(SIMPSON[c(1:2),],5) + colnames(table2) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2) <- c(" MVUE"," MLE") + + table2_recip=round(SIMPSON[c(3:4),],5) + colnames(table2_recip) <- c("Estimate", "s.e.", paste("95%Lower"), paste("95%Upper")) + rownames(table2_recip) <- c(" MVUE"," MLE") + + + ############################################################ + #Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence", from=0, to=3, interval=0.25, B=50, conf=0.95)) + if(is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence", q=NULL, from=0, to=3, interval=0.25, B=50, conf=0.95))} + if(!is.null(q)){Hill <- reshapeChaoHill(ChaoHill(X, datatype = "incidence", q=q, from=0, to=3, interval=0.25, B=50, conf=0.95))} + + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Hill[14:26,4],Hill[1:13,4]) + #Chao.LCL <- Hill[14:26,3] - 1.96*Hill[14:26,4] + #Chao.UCL <- Hill[14:26,3] + 1.96*Hill[14:26,4] + #Emperical.LCL <- Hill[1:13,3] - 1.96*Hill[1:13,4] + #Emperical.UCL <- Hill[1:13,3] + 1.96*Hill[1:13,4] + #Hill<-cbind(Hill[1:13,1],Hill[14:26,3],Hill[1:13,3],Chao.LCL,Chao.UCL,Emperical.LCL,Emperical.UCL) + #Hill<-round(Hill,3) + #Hill <- data.frame(Hill) + q_length<-length(Hill[,1])/2 + + Chao.LCL <- Hill[(q_length+1):(2*q_length),3] - 1.96*Hill[(q_length+1):(2*q_length),4] + Chao.UCL <- Hill[(q_length+1):(2*q_length),3] + 1.96*Hill[(q_length+1):(2*q_length),4] + Emperical.LCL <- Hill[1:q_length,3] - 1.96*Hill[1:q_length,4] + Emperical.UCL <- Hill[1:q_length,3] + 1.96*Hill[1:q_length,4] + Hill<-cbind(Hill[1:q_length,1],Hill[(q_length+1):(2*q_length),3],Chao.LCL,Chao.UCL,Hill[1:q_length,3],Emperical.LCL,Emperical.UCL) + Hill<-round(Hill,3) + Hill <- data.frame(Hill) + #colnames(Hill)<-c("q","Chao","Empirical","Chao(s.e.)","Empirical(s.e.)") + colnames(Hill)<-c("q","ChaoJost","95%Lower","95%Upper","Empirical","95%Lower","95%Upper") + q_hill <- nrow(Hill) + rownames(Hill) <- paste(" ",1:q_hill) + #z <- list("BASIC.DATA"=BASIC.DATA,"HILL.NUMBERS"= Hill) + + z <- list("datatype"= type,"Basic_data"=BASIC.DATA,"Species_richness"=table0, + "Shannon_index"=table1,"Shannon_diversity"=table1_exp, + "Simpson_index"=table2,"Simpson_diversity"=table2_recip, + "Hill_numbers"= Hill) + } + class(z) <- c("spadeDiv") + return(z) +} + + + + + + + +# +# +########################################### +#' Estimation of two-assemblage similarity measures +#' +#' \code{SimilarityPair}: Estimation various similarity indices for two assemblages. The richness-based +#' indices include the classic two-community Jaccard and Sorensen indices; the abundance-based +#' indices include the Horn, Morisita-Horn, regional species-overlap, two-community Bray-Curtis and the +#' abundance-based Jaccard and Sorensen indices. Three types of data are supported: Type (1) +#' abundance data (datatype="abundance"), Type (2) incidence-frequency data +#' (datatype="incidence_freq"), and Type (2B) incidence-raw data (datatype="incidence_raw"); see +#' \code{SpadeR-package} details for data input formats. +#' @param X a matrix/data.frame of species abundances/incidences.\cr +#' @param datatype type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr +#' @param units number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data. \cr +#' @param nboot an integer specifying the number of replications. +#' @return a list of ten objects: \cr\cr +#' \code{$datatype} for showing the specified data types (abundance or incidence). \cr\cr +#' \code{$info} for summarizing data information. \cr\cr +#' \code{$Empirical_richness} for showing the observed values of the richness-based similarity indices +#' include the classic two-community Jaccard and Sorensen indices. \cr\cr +#' \code{$Empirical_relative} for showing the observed values of the equal-weighted similarity indices +#' for comparing species relative abundances including Horn, Morisita-Horn, regional overlap, +#' Chao-Jaccard and Chao-Sorensen abundance (or incidence) measures based on species relative abundances. \cr \cr +#' \code{$Empirical_WtRelative} for showing the observed value of the Horn similarity index for comparing +#' size-weighted species relative abundances based on Shannon entropy under equal-effort sampling. \cr\cr +#' \code{$Empirical_absolute} for showing the observed values of the similarity indices for comparing +#' absolute abundances. These measures include the Shannon-entropy-based measure, +#' Morisita-Horn and the regional overlap measures based on species absolute abundances, as well as the Bray-Curtis index. +#' All measures are valid only under equal-effort sampling. \cr\cr +#' The corresponding four objects for showing the estimated similarity indices are: +#' \code{$estimated_richness}, \code{$estimated_relative}, \code{$estimated_WtRelative} and \code{$estimated_Absolute}. \cr\cr +#' @examples +#' \dontrun{ +#' data(SimilarityPairData) +#' # Type (1) abundance data +#' SimilarityPair(SimilarityPairData$Abu,"abundance",nboot=200) +#' # Type (2) incidence-frequency data +#' SimilarityPair(SimilarityPairData$Inci,"incidence_freq",nboot=200) +#' # Type (2B) incidence-raw data +#' SimilarityPair(SimilarityPairData$Inci_raw,"incidence_raw",units=c(19,17),nboot=200) +#' } +#' @references +#' Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J. (2005). A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecology Letters, 8, 148-159.\cr\cr +#' Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +#' Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected +#' Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471. \cr\cr +#' Chiu, C. H., Jost, L. and Chao, A. (2014). Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecological Monographs, 84, 21-44.\cr\cr +#' @export + +SimilarityPair=function(X, datatype = c("abundance","incidence_freq", "incidence_raw"), units,nboot=200) +{ + + if(datatype=="abundance") + { + if(class(X)=="list"){X <- do.call(cbind,X)} + type <- "abundance" + info1 <- c("S.total"=sum(rowSums(X)>0), "n1"=sum(X[,1]), "n2"=sum(X[,2]), + "D1"=sum(X[,1]>0), "D2"=sum(X[,2]>0), "D12"=sum(X[,1]>0 & X[,2]>0), + "nboot"=nboot) + + info2 <- c("f[11]"=sum(X[,1]==1 & X[,2]==1), + "f[1+]"=sum(X[,1]==1 & X[,2]>0), "f[+1]"=sum(X[,1]>0 & X[,2]==1), + "f[2+]"=sum(X[,1]==2 & X[,2]>0), "f[+2]"=sum(X[,1]>0 & X[,2]==2),"f[22]"=sum(X[,1]==2 & X[,2]==2)) + info <- c(info1, info2) + ################################################################2016.07.08-(P.L.Lin) + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + temp <- list() + weight <- c(sum(X[,1])/(sum(X[,1])+sum(X[,2])), sum(X[,2])/(sum(X[,1])+sum(X[,2]))) + weight <- - sum(weight*log(weight)) / log(2) + mat <- Jaccard_Sorensen_Abundance_equ(datatype,X[, 1],X[, 2], nboot)[, c(1, 2)] + mat <- cbind(mat, mat[, 1]-1.96*mat[, 2], mat[, 1]+1.96*mat[, 2]) + MLE_Jaccard <- mat[1, ] + Est_Jaccard <- mat[2, ] + MLE_Sorensen <- mat[3, ] + Est_Sorensen <- mat[4, ] + mat2 <- Two_Horn_equ(X[,1], X[,2], method="all", weight="unequal", nboot = nboot) + MLE_Ee_Horn <- mat2$mle + MLE_Ee_Horn <- plus_CI(c(MLE_Ee_Horn[1],MLE_Ee_Horn[2])) + Est_Ee_Horn <- mat2$est + MLE_Ee_U12 <- plus_CI(c(weight*MLE_Ee_Horn[1],MLE_Ee_Horn[2])) + Est_Ee_U12 <- plus_CI(c(weight*Est_Ee_Horn[1],Est_Ee_Horn[2])) + mat3 <- Two_BC_equ(X[, 1],X[, 2], datatype="abundance", nboot) + MLE_Ee_Braycurtis <- mat3$mle + Est_Ee_Braycurtis <- mat3$est + mat4 <- SimilarityTwo(X,2,nboot,method="unequal weight") + MLE_Ee_C22 <- mat4$CqN[1, ] + Est_Ee_C22 <- mat4$CqN[2, ] + MLE_Ee_U22 <- mat4$UqN[1, ] + Est_Ee_U22 <- mat4$UqN[2, ] + mat5 <- Two_Horn_equ(X[,1], X[,2], method="all", weight="equal", nboot = nboot) + MLE_ew_Horn <- mat5$mle + Est_ew_Horn <- mat5$est + mat6 <- SimilarityTwo(X,2,nboot,method="equal weight") + MLE_ew_C22 <- mat6$CqN[1, ] + Est_ew_C22 <- mat6$CqN[2, ] + MLE_ew_U22 <- mat6$UqN[1, ] + Est_ew_U22 <- mat6$UqN[2, ] + #MLE_ew_Braycurtis <- plus_CI(MLE_Braycurtis_equ(X[,1],X[,2],w1=0.5)) + #Est_ew_Braycurtis <- plus_CI(KH_Braycurtis_equ(X[,1],X[,2],w1=0.5)) + MLE_ew_ChaoSoresen <- mat[11,] + Est_ew_ChaoSoresen <- mat[12, ] + MLE_ew_ChaoJaccard <- mat[9, ] + Est_ew_ChaoJaccard <- mat[10, ] + temp[[1]] <- rbind(MLE_Sorensen, MLE_Jaccard) + rownames(temp[[1]]) <- c("C02(q=0,Sorensen)","U02(q=0,Jaccard)") + temp[[2]] <- rbind(MLE_ew_Horn, MLE_ew_C22, MLE_ew_U22, MLE_ew_ChaoJaccard, MLE_ew_ChaoSoresen) + rownames(temp[[2]]) <- c("C12=U12(q=1,Horn)","C22(q=2,Morisita)","U22(q=2,Regional overlap)", + "ChaoJaccard","ChaoSorensen") + temp[[3]] <- t(as.matrix(MLE_Ee_Horn)) + rownames(temp[[3]]) <- c("Horn size weighted(q=1)") + temp[[4]] <- rbind(MLE_Ee_U12, MLE_Ee_C22, MLE_Ee_U22, MLE_Ee_Braycurtis) + rownames(temp[[4]]) <- c("C12=U12(q=1)","C22(Morisita)", "U22(Regional overlap)","Bray-Curtis") + temp[[5]] <- rbind(Est_Sorensen, Est_Jaccard) + rownames(temp[[5]]) <- c("C02(q=0,Sorensen)","U02(q=0,Jaccard)") + temp[[6]] <- rbind(Est_ew_Horn, Est_ew_C22, Est_ew_U22, Est_ew_ChaoJaccard, Est_ew_ChaoSoresen) + rownames(temp[[6]]) <- c("C12=U12(q=1,Horn)","C22(q=2,Morisita)","U22(q=2,Regional overlap)", + "ChaoJaccard","ChaoSorensen") + temp[[7]] <- t(as.matrix(Est_Ee_Horn)) + rownames(temp[[7]]) <- c("Horn size weighted(q=1)") + temp[[8]] <- rbind(Est_Ee_U12, Est_Ee_C22, Est_Ee_U22, Est_Ee_Braycurtis) + temp <- lapply(temp, FUN = function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") + return(x) + }) + rownames(temp[[8]]) <- c("C12=U12(q=1)","C22(Morisita)", "U22(Regional overlap)","Bray-Curtis") + z <- list("datatype"=type,"info"=info, "Empirical_richness"=temp[[1]], "Empirical_relative"=temp[[2]], "Empirical_WtRelative"=temp[[3]], + "Empirical_absolute"=temp[[4]], "estimated_richness"=temp[[5]], "estimated_relative"=temp[[6]], "estimated_WtRelative"=temp[[7]], "estimated_absolute"=temp[[8]]) + } + ##--------------------------------------------------------------- + if(datatype=="incidence_raw"){ + data <- X + t = units + if(ncol(data) != sum(t)) stop("Number of columns does not euqal to the sum of key in sampling units") + dat <- matrix(0, ncol = length(t), nrow = nrow(data)) + n <- 0 + for(i in 1:length(t)){ + dat[, i] <- as.integer(rowSums(data[,(n+1):(n+t[i])] ) ) + n <- n+t[i] + } + t <- as.integer(t) + dat <- apply(dat, MARGIN = 2, as.integer) + dat <- data.frame(rbind(t, dat),row.names = NULL) + y1 <- dat[,1] + y2 <- dat[,2] + X <- cbind(y1, y2) + type <- "incidence_freq" + X <- as.data.frame(X) + } + if(datatype=="incidence_freq") type <- "incidence_freq" + if(datatype=="incidence_freq" | type == "incidence_freq") + { + if(class(X)=="list"){X <- do.call(cbind,X)} + no.assemblage=length(X[1,]) + Y=X[-1,] + type="incidence" + info1 <- c("S.total"=sum(rowSums(Y)>0), "T1"=X[1,1], "T2"=X[1,2], "U1"=sum(Y[,1]), "U2"=sum(Y[,2]), + "D1"=sum(Y[,1]>0), "D2"=sum(Y[,2]>0), "D12"=sum(Y[,1]>0 & Y[,2]>0), + "nboot"=nboot) + + info2 <- c("Q[11]"=sum(Y[,1]==1 & Y[,2]==1), + "Q[1+]"=sum(Y[,1]==1 & Y[,2]>0), "Q[+1]"=sum(Y[,1]>0 & Y[,2]==1), + "Q[2+]"=sum(Y[,1]==2 & Y[,2]>0), "Q[+2]"=sum(Y[,1]>0 & Y[,2]==2), "Q[22]"=sum(Y[,1]==2 & Y[,2]==2)) + info <- c(info1, info2) + + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + temp <- list() + weight <- c(sum(Y[,1])/(sum(Y[,1])+sum(Y[,2])), sum(Y[,2])/(sum(Y[,1])+sum(Y[,2]))) + weight <- - sum(weight*log(weight)) / log(2) + mat <- Jaccard_Sorensen_Abundance_equ(datatype="incidence",X[, 1],X[, 2], nboot)[, c(1, 2)] + mat <- cbind(mat, mat[, 1]-1.96*mat[, 2], mat[, 1]+1.96*mat[, 2]) + MLE_Jaccard <- mat[1, ] + Est_Jaccard <- mat[2, ] + MLE_Sorensen <- mat[3, ] + Est_Sorensen <- mat[4, ] + mat2 <- Two_Horn_equ(X[,1], X[,2], datatype = "incidence", method="all", weight="unequal", nboot) + MLE_Ee_Horn <- mat2$mle + MLE_Ee_Horn <- plus_CI(c(MLE_Ee_Horn[1],MLE_Ee_Horn[2])) + Est_Ee_Horn <- mat2$est + MLE_Ee_U12 <- plus_CI(c(weight*MLE_Ee_Horn[1],MLE_Ee_Horn[2])) + Est_Ee_U12 <- plus_CI(c(weight*Est_Ee_Horn[1],Est_Ee_Horn[2])) + mat3 <- C2N_ee_se_inc(X, nboot) + MLE_Ee_C22 <- plus_CI(mat3[1,]) + Est_Ee_C22 <- plus_CI(mat3[3,]) + MLE_Ee_U22 <- plus_CI(mat3[2,]) + Est_Ee_U22 <- plus_CI(mat3[4,]) + mat4 <- Two_Horn_equ(X[,1], X[,2], datatype = "incidence", method="all", weight="equal", nboot) + MLE_ew_Horn <- mat4$mle + Est_ew_Horn <- mat4$est + mat5 <- SimilarityTwo(X, 2, nboot, method="equal weight", datatype="incidence") + MLE_ew_C22 <- mat5$CqN[1, ] + Est_ew_C22 <- mat5$CqN[2, ] + MLE_ew_U22 <- mat5$UqN[1, ] + Est_ew_U22 <- mat5$UqN[2, ] + MLE_ew_ChaoSoresen <- mat[11,] + Est_ew_ChaoSoresen <- mat[12, ] + MLE_ew_ChaoJaccard <- mat[9, ] + Est_ew_ChaoJaccard <- mat[10, ] + mat5 <- Two_BC_equ(X[, 1],X[, 2], datatype="incidence", nboot) + MLE_Ee_Braycurtis <- mat5$mle + Est_Ee_Braycurtis <- mat5$est + temp[[1]] <- rbind(MLE_Sorensen, MLE_Jaccard) + rownames(temp[[1]]) <- c("C02(q=0,Sorensen)","U02(q=0,Jaccard)") + temp[[2]] <- rbind(MLE_ew_Horn, MLE_ew_C22, MLE_ew_U22, MLE_ew_ChaoJaccard, MLE_ew_ChaoSoresen) + rownames(temp[[2]]) <- c("C12=U12(q=1,Horn)","C22(q=2,Morisita)","U22(q=2,Regional overlap)", + "ChaoJaccard","ChaoSorensen") + temp[[3]] <- t(as.matrix(MLE_Ee_Horn)) + rownames(temp[[3]]) <- c("Horn size weighted(q=1)") + temp[[4]] <- rbind(MLE_Ee_U12, MLE_Ee_C22, MLE_Ee_U22, MLE_Ee_Braycurtis) + rownames(temp[[4]]) <- c("C12=U12(q=1)","C22(Morisita)", "U22(Regional overlap)","Bray-Curtis") + temp[[5]] <- rbind(Est_Sorensen, Est_Jaccard) + rownames(temp[[5]]) <- c("C02(q=0,Sorensen)","U02(q=0,Jaccard)") + temp[[6]] <- rbind(Est_ew_Horn, Est_ew_C22, Est_ew_U22, Est_ew_ChaoJaccard, Est_ew_ChaoSoresen) + rownames(temp[[6]]) <- c("C12=U12(q=1,Horn)","C22(q=2,Morisita)","U22(q=2,Regional overlap)", + "ChaoJaccard","ChaoSorensen") + temp[[7]] <- t(as.matrix(Est_Ee_Horn)) + rownames(temp[[7]]) <- c("Horn size weighted(q=1)") + temp[[8]] <- rbind(Est_Ee_U12, Est_Ee_C22, Est_Ee_U22, Est_Ee_Braycurtis) + rownames(temp[[8]]) <- c("C12=U12(q=1)","C22(Morisita)", "U22(Regional overlap)","Bray-Curtis") + temp <- lapply(temp, FUN = function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") + return(x) + }) + z <- list("datatype"=type,"info"=info, "Empirical_richness"=temp[[1]], "Empirical_relative"=temp[[2]], "Empirical_WtRelative"=temp[[3]], + "Empirical_absolute"=temp[[4]], "estimated_richness"=temp[[5]], "estimated_relative"=temp[[6]], "estimated_WtRelative"=temp[[7]], "estimated_absolute"=temp[[8]]) + + } + class(z) <- c("spadeTwo") + return(z) +} + + +# +# +########################################### +#' Estimation of multiple-community similarity measures +#' +#' \code{SimilarityMult}: Estimation various \eqn{N}-community similarity indices. The richness-based indices +#' include the classic \eqn{N}-community Jaccard and Sorensen indices; the abundance-based indices include the Horn, Morisita-Horn, regional species-overlap, and the \eqn{N}-community Bray-Curtis indices. +#' Three types of data are supported: Type (1) abundance data (datatype="abundance"), Type (2) +#' incidence-frequency data (datatype="incidence_freq"), and Type (2B) incidence-raw data +#' (datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +#' @param X a matrix/data.frame of species abundances/incidences.\cr +#' @param datatype type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr +#' @param units number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data. \cr +#' @param q a specified order to use to compute pairwise similarity measures. If \code{q = 0}, this function computes the estimated pairwise richness-based Jaccard and +#' Sorensen similarity indices. +#' If \code{q = 1} and \code{goal=relative}, this function computes the estimated pairwise equal-weighted and size-weighted Horn indices based on Shannon entropy; +#' If \code{q = 1} and \code{goal=absolute}, this function computes the estimated pairwise Shannon-entropy-based measure for comparing absolute abundances. If \code{q = 2} and \code{goal=relative}, +#' this function computes the estimated pairwise Morisita-Horn and regional species-overlap indices based on species relative abundances. +#' If \code{q = 2} and \code{goal=absolute}, +#' this function computes the estimated pairwise Morisita-Horn and regional species-overlap indices based on species absolute abundances. +#' @param nboot an integer specifying the number of bootstrap replications. +#' @param goal a specified estimating goal to use to compute pairwise similarity measures:comparing species relative abundances (\code{goal=relative}) or comparing species absolute abundances (\code{goal=absolute}). \cr\cr +#' @return a list of fourteen objects: \cr\cr +#' \code{$datatype} for showing the specified data types (abundance or incidence).\cr\cr +#' \code{$info} for summarizing data information.\cr\cr +#' \code{$Empirical_richness} for showing the observed values of the richness-based similarity indices +#' include the classic \eqn{N}-community Jaccard and Sorensen indices. \cr\cr +#' \code{$Empirical_relative} for showing the observed values of the equal-weighted similarity indices +#' for comparing species relative abundances including Horn, Morisita-Horn and regional overlap measures. \cr \cr +#' \code{$Empirical_WtRelative} for showing the observed value of the Horn similarity index for comparing +#' size-weighted species relative abundances based on Shannon entropy under equal-effort sampling. \cr\cr +#' \code{$Empirical_absolute} for showing the observed values of the similarity indices for comparing +#' absolute abundances. These measures include the Shannon-entropy-based measure, Morisita-Horn and the regional species-overlap measures based on species absolute abundance, as well as the \eqn{N}-community Bray-Curtis index. +#' All measures are valid only under equal-effort sampling. \cr\cr +#' The corresponding four objects for showing the estimated similarity indices are: +#' \code{$estimated_richness}, \code{$estimated_relative}, \code{$estimated_WtRelative} and \code{$estimated_absolute}. \cr\cr +#' \code{$pairwise} and \code{$similarity.matrix} for showing respectively the pairwise dis-similarity +#' estimates (with related statistics) and the similarity matrix for various measures depending on the +#' diversity order \code{q} and the \code{goal} aspecified in the function arguments. \cr\cr +#' \code{$goal} for showing the goal specified in the argument goal (absolute or relative) used to compute pairwise similarity.\cr\cr +#' \code{$q} for showing which diversity order \code{q} specified to compute pairwise similarity. \cr\cr +#' @examples +#' \dontrun{ +#' data(SimilarityMultData) +#' # Type (1) abundance data +#' SimilarityMult(SimilarityMultData$Abu,"abundance",q=2,nboot=200,"relative") +#' # Type (2) incidence-frequency data +#' SimilarityMult(SimilarityMultData$Inci,"incidence_freq",q=2,nboot=200,"relative") +#' # Type (2B) incidence-raw data +#' SimilarityMult(SimilarityMultData$Inci_raw,"incidence_raw", +#' units=c(19,17,15),q=2,nboot=200,"relative") +#' } +#' @references +#' Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +#' Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471.\cr\cr +#' Chiu, C. H., Jost, L. and Chao, A. (2014). Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecological Monographs, 84, 21-44.\cr\cr +#' Gotelli, N. G. and Chao, A. (2013). Measuring and estimating species richness, species diver- sity, +#' and biotic similarity from sampling data. Encyclopedia of Biodiversity, 2nd Edition, Vol. 5, 195-211, Waltham, MA. +#' @export + + +SimilarityMult=function(X,datatype=c("abundance","incidence_freq", "incidence_raw"),units,q=2,nboot=200,goal="relative") +{ + method <- goal + if(datatype=="abundance"){ + if(class(X)=="list"){X <- do.call(cbind,X)} + type <- "abundance" + N <- no.community <- ncol(X) + temp <- c("N"=ncol(X), "S.total"=sum(rowSums(X)>0)) + n <- apply(X,2,sum) + D <- apply(X,2,function(x)sum(x>0)) + + if(N > 2){ + temp1 <- temp2 <- rep(0, N*(N-1)/2) + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + temp1[k] <- paste('D',i,j,sep="") + temp2[k] <- sum(X[,i]>0 & X[,j]>0) + k <- k + 1 + } + } + } + names(temp2) <- temp1 + names(n) <- paste('n',1:N, sep="") + names(D) <- paste('D',1:N, sep="") + info <- c(temp, n, D, temp2) + if(N == 3) info <- c(temp, n, D, temp2, D123=sum(X[,1]>0 & X[,2]>0 & X[,3]>0)) + info <- c(info, nboot=nboot) + ################################################################2016.07.11-(P.L.Lin) + temp <- list() + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + n <- apply(X = X, MARGIN = 2, FUN = sum) + weight <- n/sum(n) + weight <- - sum(weight*log(weight)) / log(N) + mat <- SimilarityMul(X, 0, nboot, method ="unequal weight") + MLE_Jaccard <- mat$UqN[1, ] + Est_Jaccard <- mat$UqN[2, ] + MLE_Sorensen <- mat$CqN[1, ] + Est_Sorensen <- mat$CqN[2, ] + mat2 <- Horn_Multi_equ(X, datatype="abundance", nboot, method=c("unequal")) + MLE_Ee_Horn <- mat2$mle + Est_Ee_Horn <- mat2$est + Est_Ee_U12 <- plus_CI(c(weight*Est_Ee_Horn[1], Est_Ee_Horn[2])) + MLE_Ee_U12 <- plus_CI(c(weight*MLE_Ee_Horn[1], MLE_Ee_Horn[2])) + mat3 <- BC_equ(X, datatype="abundance", nboot) + MLE_Ee_Braycurtis <- mat3$mle + Est_Ee_Braycurtis <- mat3$est + mat4 <- SimilarityMul(X,2,nboot,method="unequal weight") + MLE_Ee_C22 <- mat4$CqN[1, ] + Est_Ee_C22 <- mat4$CqN[2, ] + MLE_Ee_U22 <- mat4$UqN[1, ] + Est_Ee_U22 <- mat4$UqN[2, ] + mat5 <- Horn_Multi_equ(X, datatype="abundance", nboot, method=c("equal")) + MLE_ew_Horn <- mat5$mle + Est_ew_Horn <- mat5$est + mat6 <- SimilarityMul(X,2,nboot,method="equal weight") + MLE_ew_C22 <- mat6$CqN[1, ] + Est_ew_C22 <- mat6$CqN[2, ] + MLE_ew_U22 <- mat6$UqN[1, ] + Est_ew_U22 <- mat6$UqN[2, ] + temp[[1]] <- rbind(MLE_Sorensen, MLE_Jaccard) + rownames(temp[[1]]) <- c("C0N(q=0,Sorensen)","U0N(q=0,Jaccard)") + temp[[2]] <- rbind(MLE_ew_Horn, MLE_ew_C22, MLE_ew_U22) + rownames(temp[[2]]) <- c("C1N=U1N(q=1,Horn)","C2N(q=2,Morisita)","U2N(q=2,Regional overlap)") + temp[[3]] <- t(as.matrix(MLE_Ee_Horn)) + rownames(temp[[3]]) <- c("Horn size weighted(q=1)") + temp[[4]] <- rbind(MLE_Ee_U12, MLE_Ee_C22, MLE_Ee_U22, MLE_Ee_Braycurtis) + rownames(temp[[4]]) <- c("C1N=U1N(q=1)","C2N(Morisita)", "U2N(Regional overlap)","Bray-Curtis") + temp[[5]] <- rbind(Est_Sorensen, Est_Jaccard) + rownames(temp[[5]]) <- c("C0N(q=0,Sorensen)","U0N(q=0,Jaccard)") + temp[[6]] <- rbind(Est_ew_Horn, Est_ew_C22, Est_ew_U22) + rownames(temp[[6]]) <- c("C1N=U1N(q=1,Horn)","C2N(q=2,Morisita)","U2N(q=2,Regional overlap)") + temp[[7]] <- t(as.matrix(Est_Ee_Horn)) + rownames(temp[[7]]) <- c("Horn size weighted(q=1)") + temp[[8]] <- rbind(Est_Ee_U12, Est_Ee_C22, Est_Ee_U22, Est_Ee_Braycurtis) + rownames(temp[[8]]) <- c("C1N=U1N(q=1)","C2N(Morisita)", "U2N(Regional overlap)","Bray-Curtis") + temp <- lapply(temp, FUN = function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") + return(x) + }) + if(q == 0){ + temp_PC <- rep(0, N*(N-1)/2) + C02=matrix(0,choose(no.community,2),4) + U02=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal effort') + C02[k,] <- mat[1, ] + U02[k,] <- mat[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C02[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U02[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C02"=C02, "U02"=U02) + C_SM <- list("C02"=C_SM_1, "U02"=C_SM_2) + } + if(q == 1 & method=="relative"){ + temp_PC <- rep(0, N*(N-1)/2) + C12=matrix(0,choose(no.community,2),4) + Horn=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + C12[k,] <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal weight')[1, ] + Horn[k,] <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal effort')[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C12[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- Horn[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C12"=C12, "Horn"=Horn) + C_SM <- list("C12"=C_SM_1, "Horn"=C_SM_2) + } + if(q == 1 & method=="absolute"){ + temp_PC <- rep(0, N*(N-1)/2) + k=1 + C_SM_1=matrix(1,N,N) + C12=matrix(0,choose(no.community,2),4) + for(i in 1:(N-1)){ + for(j in (i+1):N){ + C12[k,] <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal effort')[1, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C12[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C12"=C12) + C_SM <- list("C12"=C_SM_1) + } + if(q == 2){ + temp_PC <- rep(0, N*(N-1)/2) + if(method=="absolute") method2 <- 'equal effort' + if(method=="relative") method2 <- 'equal weight' + C22=matrix(0,choose(no.community,2),4) + U22=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(X[,c(i,j)], q, nboot, method=method2) + C22[k,] <- mat[1, ] + U22[k,] <- mat[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C22[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U22[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C22"=C22, "U22"=U22) + C_SM <- list("C22"=C_SM_1,"U22"=C_SM_2) + } + Cqn_PC <- lapply(Cqn_PC, function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") ; rownames(x) <- temp_PC + return(x) + }) + z <- list("datatype"=datatype,"info"=info, "Empirical_richness"=temp[[1]], "Empirical_relative"=temp[[2]], "Empirical_WtRelative"=temp[[3]], + "Empirical_absolute"=temp[[4]], "estimated_richness"=temp[[5]], "estimated_relative"=temp[[6]], "estimated_WtRelative"=temp[[7]], "estimated_absolute"=temp[[8]], "pairwise"=Cqn_PC, "similarity.matrix"=C_SM, "goal"=method, "q"=q) + } + if(datatype == "incidence_raw"){ + data <- X + t = units + if(ncol(data) != sum(t)) stop("Number of columns does not euqal to the sum of key in sampling units") + dat <- matrix(0, ncol = length(t), nrow = nrow(data)) + n <- 0 + for(i in 1:length(t)){ + dat[, i] <- as.integer(rowSums(data[,(n+1):(n+t[i])] ) ) + n <- n+t[i] + } + t <- as.integer(t) + dat <- apply(dat, MARGIN = 2, as.integer) + X <- data.frame(rbind(t, dat),row.names = NULL) + if(ncol(X) <= 2) stop("Multiple Commumity measures is only for the data which has three community or more") + type = "incidence_freq" + } + if(datatype=="incidence_freq") type <- "incidence_freq" + if(datatype=="incidence_freq" | type == "incidence_freq"){ + if(class(X)=="list"){X <- do.call(cbind,X)} + type <- "incidence" + Y <- X + X <- X[-1,] + t <- as.vector(Y[1,]) + N <- no.community <- ncol(X) + temp <- c("N"=ncol(X), "S.total"=sum(rowSums(X)>0)) + n <- apply(X,2,sum) + D <- apply(X,2,function(x)sum(x>0)) + if(N > 2){ + temp1 <- temp2 <- rep(0, N*(N-1)/2) + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + temp1[k] <- paste('D',i,j,sep="") + temp2[k] <- sum(X[,i]>0 & X[,j]>0) + k <- k + 1 + } + } + } + names(temp2) <- temp1 + names(t) <- paste('T',1:N, sep="") + names(n) <- paste('u',1:N, sep="") + names(D) <- paste('D',1:N, sep="") + info <- c(temp, t, n, D, temp2) + if(N == 3) info <- c(temp, t, n, D, temp2, D123=sum(X[,1]>0 & X[,2]>0 & X[,3]>0)) + info <- unlist(c(info, nboot=nboot)) + ################################################################2016.07.20-(P.L.Lin) + temp <- list() + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + n <- apply(X = X, MARGIN = 2, FUN = sum) + weight <- n/sum(n) + weight <- - sum(weight*log(weight)) / log(N) + mat <- SimilarityMul(Y, 0, nboot, method ="unequal weight", datatype="incidence") + MLE_Jaccard <- mat$UqN[1, ] + Est_Jaccard <- mat$UqN[2, ] + MLE_Sorensen <- mat$CqN[1, ] + Est_Sorensen <- mat$CqN[2, ] + mat2 <- Horn_Multi_equ(Y, datatype="incidence", nboot, method=c("unequal")) + MLE_Ee_Horn <- mat2$mle + Est_Ee_Horn <- mat2$est + Est_Ee_U12 <- plus_CI(c(weight*Est_Ee_Horn[1], Est_Ee_Horn[2])) + MLE_Ee_U12 <- plus_CI(c(weight*MLE_Ee_Horn[1], MLE_Ee_Horn[2])) + mat3 <- BC_equ(Y, datatype="incidence", nboot) + MLE_Ee_Braycurtis <- mat3$mle + Est_Ee_Braycurtis <- mat3$est + mat4 <- C2N_ee_se_inc(Y, nboot) + MLE_Ee_C22 <- plus_CI(mat4[1, ]) + Est_Ee_C22 <- plus_CI(mat4[3, ]) + MLE_Ee_U22 <- plus_CI(mat4[2, ]) + Est_Ee_U22 <- plus_CI(mat4[4, ]) + mat5 <- Horn_Multi_equ(Y, datatype="incidence", nboot, method=c("equal")) + MLE_ew_Horn <- mat5$mle + Est_ew_Horn <- mat5$est + mat6 <- SimilarityMul(Y, 2, nboot, datatype = "incidence", method="equal weight") + MLE_ew_C22 <- mat6$CqN[1, ] + Est_ew_C22 <- mat6$CqN[2, ] + MLE_ew_U22 <- mat6$UqN[1, ] + Est_ew_U22 <- mat6$UqN[2, ] + temp[[1]] <- rbind(MLE_Sorensen, MLE_Jaccard) + rownames(temp[[1]]) <- c("C0N(q=0,Sorensen)","U0N(q=0,Jaccard)") + temp[[2]] <- rbind(MLE_ew_Horn, MLE_ew_C22, MLE_ew_U22) + rownames(temp[[2]]) <- c("C1N=U1N(q=1,Horn)","C2N(q=2,Morisita)","U2N(q=2,Regional overlap)") + temp[[3]] <- t(as.matrix(MLE_Ee_Horn)) + rownames(temp[[3]]) <- c("Horn size weighted(q=1)") + temp[[4]] <- rbind(MLE_Ee_U12, MLE_Ee_C22, MLE_Ee_U22, MLE_Ee_Braycurtis) + rownames(temp[[4]]) <- c("C1N=U1N(q=1)","C2N(Morisita)", "U2N(Regional overlap)","Bray-Curtis") + temp[[5]] <- rbind(Est_Sorensen, Est_Jaccard) + rownames(temp[[5]]) <- c("C0N(q=0,Sorensen)","U0N(q=0,Jaccard)") + temp[[6]] <- rbind(Est_ew_Horn, Est_ew_C22, Est_ew_U22) + rownames(temp[[6]]) <- c("C1N=U1N(q=1,Horn)","C2N(q=2,Morisita)","U2N(q=2,Regional overlap)") + temp[[7]] <- t(as.matrix(Est_Ee_Horn)) + rownames(temp[[7]]) <- c("Horn size weighted(q=1)") + temp[[8]] <- rbind(Est_Ee_U12, Est_Ee_C22, Est_Ee_U22, Est_Ee_Braycurtis) + rownames(temp[[8]]) <- c("C1N=U1N(q=1)","C2N(Morisita)", "U2N(Regional overlap)","Bray-Curtis") + temp <- lapply(temp, FUN = function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") + return(x) + }) + if(q == 0){ + temp_PC <- rep(0, N*(N-1)/2) + C02=matrix(0,choose(no.community,2),4) + U02=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(Y[,c(i,j)], q, nboot,datatype="incidence", method='equal effort') + C02[k,] <- mat[1, ] + U02[k,] <- mat[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C02[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U02[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C02"=C02, "U02"=U02) + C_SM <- list("C02"=C_SM_1, "U02"=C_SM_2) + } + if(q == 1 & method=="relative"){ + temp_PC <- rep(0, N*(N-1)/2) + C12=matrix(0,choose(no.community,2),4) + Horn=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + C12[k,] <- Cq2_est_equ(Y[,c(i,j)], q, nboot,datatype="incidence", method='equal weight')[1, ] + Horn[k,] <- Cq2_est_equ(Y[,c(i,j)], q, nboot,datatype="incidence", method='equal effort')[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C12[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- Horn[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C12"=C12, "Horn"=Horn) + C_SM <- list("C12"=C_SM_1, "Horn"=C_SM_2) + } + if(q == 1 & method=="absolute"){ + temp_PC <- rep(0, N*(N-1)/2) + k=1 + C_SM_1=matrix(1,N,N) + C12=matrix(0,choose(no.community,2),4) + for(i in 1:(N-1)){ + for(j in (i+1):N){ + C12[k,] <- Cq2_est_equ(Y[,c(i,j)], q, nboot,datatype="incidence", method='equal effort')[1, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C12[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C12"=C12) + C_SM <- list("C12"=C_SM_1) + } + if(q == 2){ + temp_PC <- rep(0, N*(N-1)/2) + if(method=="absolute") method2 <- 'equal effort' + if(method=="relative") method2 <- 'equal weight' + C22=matrix(0,choose(no.community,2),4) + U22=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(Y[,c(i,j)], q, nboot,datatype="incidence", method=method2) + C22[k,] <- mat[1, ] + U22[k,] <- mat[2, ] + temp_PC[k] <- paste("C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C22[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U22[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C22"=C22, "U22"=U22) + C_SM <- list("C22"=C_SM_1,"U22"=C_SM_2) + } + Cqn_PC <- lapply(Cqn_PC, function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") ; rownames(x) <- temp_PC + return(x) + }) + z <- list("datatype"=datatype,"info"=info, "Empirical_richness"=temp[[1]], "Empirical_relative"=temp[[2]], "Empirical_WtRelative"=temp[[3]], + "Empirical_absolute"=temp[[4]], "estimated_richness"=temp[[5]], "estimated_relative"=temp[[6]], "estimated_WtRelative"=temp[[7]], "estimated_absolute"=temp[[8]], "pairwise"=Cqn_PC, "similarity.matrix"=C_SM, "goal"=method, "q"=q) + } + class(z) <- c("spadeMult") + z +} + + +# +# +########################################### +#' Estimation of genetic differentiation measures +#' +#' \code{Genetics}: Estimation allelic differentiation among subpopulations based on multiple-subpopulation +#' genetics data. The richness-based indices include the classic Jaccard and Sorensen dissimilarity +#' indices; the abundance-based indices include the conventional Gst measure, Horn, Morisita-Horn +#' and regional species-differentiation indices. \cr\cr +#' Only Type (1) abundance data (datatype="abundance") is supported; input data for each sub-population +#' include sample frequencies in an empirical sample of individuals. When there are multiple subpopulations, input data consist of an allele-by-subpopulation frequency matrix. +#' @param X a matrix, or a data.frame of allele frequencies. +#' @param q a specified order to use to compute pairwise dissimilarity measures. If \code{q = 0}, this function computes the estimated pairwise Jaccard and Sorensen dissimilarity indices. +#' If \code{q = 1}, this function computes the estimated pairwise equal-weighted and size-weighted Horn indices; +#' If \code{q = 2}, this function computes the estimated pairwise Morisita-Horn and regional species-diffrentiation indices. +#' @param nboot an integer specifying the number of bootstrap replications. +#' @return a list of ten objects: \cr\cr +#' \code{$info} for summarizing data information.\cr\cr +#' \code{$Empirical_richness} for showing the observed values of the richness-based dis-similarity indices +#' including the classic Jaccard and Sorensen indices. \cr\cr +#' \code{$Empirical_relative} for showing the observed values of the equal-weighted dis-similarity +#' indices for comparing allele relative abundances including Gst, Horn, Morisita-Horn and regional differentiation measures. \cr \cr +#' \code{$Empirical_WtRelative} for showing the observed value of the dis-similarity index for +#' comparing size-weighted allele relative abundances, i.e., Horn size-weighted measure based on Shannon-entropy under equal-effort sampling. \cr\cr +#' The corresponding three objects for showing the estimated dis-similarity indies are: \cr +#' \code{$estimated_richness}, \code{$estimated_relative} and \code{$estimated_WtRelative}. \cr\cr +#' \code{$pairwise} and \code{$dissimilarity.matrix} for showing respectively the pairwise dis-similarity +#' estimates (with related statistics) and the dissimilarity matrix for various measures depending on +#' the diversity order \code{q} specified in the function argument. \cr\cr +#' \code{$q} for showing which diversity order \code{q} to compute pairwise dissimilarity. +#' @examples +#' \dontrun{ +#' # Type (1) abundance data +#' data(GeneticsDataAbu) +#' Genetics(GeneticsDataAbu,q=2,nboot=200) +#' } +#' @references +#' Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +#' Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471.\cr\cr +#' Jost, L. (2008). \eqn{G_{ST}} and its relatives do not measure differentiation. Molecular Ecology, 17, 4015-4026.\cr\cr +#' @export + + +Genetics=function(X,q=2,nboot=200) +{ + type <- "abundance" + N <- no.community <- ncol(X) + temp <- c("N"=ncol(X), "S.total"=sum(rowSums(X)>0)) + n <- apply(X,2,sum) + D <- apply(X,2,function(x)sum(x>0)) + + if(N > 2){ + temp1 <- temp2 <- rep(0, N*(N-1)/2) + k <- 1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + temp1[k] <- paste('D',i,j,sep="") + temp2[k] <- sum(X[,i]>0 & X[,j]>0) + k <- k + 1 + } + } + } + names(temp2) <- temp1 + names(n) <- paste('n',1:N, sep="") + names(D) <- paste('D',1:N, sep="") + info <- c(temp, n, D, temp2) + if(N == 3) info <- c(temp, n, D, temp2, D123=sum(X[,1]>0 & X[,2]>0 & X[,3]>0)) + info <- c(info, nboot=nboot) + temp <- list() + n <- apply(X = X, MARGIN = 2, FUN = sum) + weight <- n/sum(n) + weight <- - sum(weight*log(weight)) / log(N) + plus_CI <-function(x){ + if(x[1] >= 1) x[1] <- 1 + if(x[1] <= 0) x[1] <- 0 + c(x, max(0,x[1]-1.96*x[2]), min(1,x[1]+1.96*x[2])) + } + mat2 <- GST_se_equ(X,nboot) + MLE_ew_Gst <- mat2[1, ] + Est_ew_Gst <- mat2[2, ] + mat <- SimilarityMul(X,0,nboot,method="unequal weight") + MLE_Jaccard <- plus_CI(c(1-mat$UqN[1, 1],mat$UqN[1, 2])) + Est_Jaccard <- plus_CI(c(1-mat$UqN[2, 1],mat$UqN[2, 2])) + MLE_Sorensen <- plus_CI(c(1-mat$CqN[1, 1],mat$CqN[1, 2])) + Est_Sorensen <- plus_CI(c(1-mat$CqN[2, 1],mat$CqN[2, 2])) + mat3 <- Horn_Multi_equ(X, datatype="abundance", nboot, method=c("unequal")) + MLE_Ee_Horn <- mat3$mle + MLE_Ee_Horn <- plus_CI(c(1-MLE_Ee_Horn[1],MLE_Ee_Horn[2])) + Est_Ee_Horn <- mat3$est + Est_Ee_Horn <- plus_CI(c(1-Est_Ee_Horn[1],Est_Ee_Horn[2])) + mat4 <- SimilarityMul(X,2,nboot,method="equal weight") + mat5 <- Horn_Multi_equ(X, datatype="abundance", nboot, method=c("equal")) + MLE_ew_Horn <- mat5$mle + Est_ew_Horn <- mat5$est + MLE_ew_Horn <- plus_CI(c(1-MLE_ew_Horn[1],MLE_ew_Horn[2])) + Est_ew_Horn <- plus_CI(c(1-Est_ew_Horn[1],Est_ew_Horn[2])) + MLE_ew_C22 <- plus_CI(c(1-mat4$CqN[1, 1],mat4$CqN[1, 2])) + Est_ew_C22 <- plus_CI(c(1-mat4$CqN[2, 1],mat4$CqN[2, 2])) + MLE_ew_U22 <- plus_CI(c(1-mat4$UqN[1, 1],mat4$UqN[1, 2])) + Est_ew_U22 <- plus_CI(c(1-mat4$UqN[2, 1],mat4$UqN[2, 2])) + temp[[1]] <- rbind(MLE_Sorensen, MLE_Jaccard) + rownames(temp[[1]]) <- c("1-C0N(q=0,Sorensen)","1-U0N(q=0,Jaccard)") + temp[[2]] <- rbind(MLE_ew_Horn, MLE_ew_C22, MLE_ew_U22,MLE_ew_Gst) + rownames(temp[[2]]) <- c("1-C1N=1-U1N(q=1,Horn)","1-C2N(q=2,Morisita)","1-U2N(q=2,Regional overlap)","Gst") + temp[[3]] <- t(as.matrix(MLE_Ee_Horn)) + rownames(temp[[3]]) <- c("Horn size weighted(q=1)") + temp[[4]] <- rbind(Est_Sorensen, Est_Jaccard) + rownames(temp[[4]]) <- c("1-C0N(q=0,Sorensen)","1-U0N(q=0,Jaccard)") + temp[[5]] <- rbind(Est_ew_Horn, Est_ew_C22, Est_ew_U22, Est_ew_Gst) + rownames(temp[[5]]) <- c("1-C1N=1-U1N(q=1,Horn)","1-C2N(q=2,Morisita)","1-U2N(q=2,Regional overlap)","Gst") + temp[[6]] <- t(as.matrix(Est_Ee_Horn)) + rownames(temp[[6]]) <- c("Horn size weighted(q=1)") + temp <- lapply(temp, FUN = function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") + return(x) + }) + if(q == 0){ + temp_PC <- rep(0, N*(N-1)/2) + C02=matrix(0,choose(no.community,2),4) + U02=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(1,N,N) + C_SM_2=matrix(1,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + if(sum( X[,i]>0 & X[,j]>0)==0){ + mat <- rbind(c(0, 0), c(0 ,0)) + }else{ + mat <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal effort') + } + C02[k,] <- plus_CI(c(1-mat[1, 1],mat[1, 2])) + U02[k,] <- plus_CI(c(1-mat[2, 1],mat[2, 2])) + temp_PC[k] <- paste("1-C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C02[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U02[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C02"=C02, "U02"=U02) + C_SM <- list("C02"=C_SM_1, "U02"=C_SM_2) + } + if(q == 1){ + temp_PC <- rep(0, N*(N-1)/2) + C12=matrix(0,choose(no.community,2),4) + Horn=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(0,N,N) + C_SM_2=matrix(0,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal weight') + mat2 <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal effort') + C12[k,] <- plus_CI(c(1-mat[1, 1],mat[1, 2])) + Horn[k,] <- plus_CI(c(1-mat2[2, 1],mat2[2, 2])) + temp_PC[k] <- paste("1-C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C12[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- Horn[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C12"=C12, "Horn"=Horn) + C_SM <- list("C12"=C_SM_1, "Horn"=C_SM_2) + } + if(q == 2){ + temp_PC <- rep(0, N*(N-1)/2) + C22=matrix(0,choose(no.community,2),4) + U22=matrix(0,choose(no.community,2),4) + C_SM_1=matrix(0,N,N) + C_SM_2=matrix(0,N,N) + k=1 + for(i in 1:(N-1)){ + for(j in (i+1):N){ + mat <- Cq2_est_equ(X[,c(i,j)], q, nboot, method='equal weight') + C22[k,] <- plus_CI(c(1-mat[1, 1],mat[1, 2])) + U22[k,] <- plus_CI(c(1-mat[2, 1],mat[2, 2])) + temp_PC[k] <- paste("1-C",q,"2(",i,",",j,")", sep="") + C_SM_1[i,j] <- C_SM_1[j,i] <- C22[k,1] + C_SM_2[i,j] <- C_SM_2[j,i] <- U22[k,1] + k <- k+1 + } + } + Cqn_PC <- list("C22"=C22, "U22"=U22) + C_SM <- list("C22"=C_SM_1,"U22"=C_SM_2) + } + Cqn_PC <- lapply(Cqn_PC, function(x){ + colnames(x) <- c("Estimate", "s.e.", "95%.LCL", "95%.UCL") ; rownames(x) <- temp_PC + return(x) + }) + + z <- list("info"=info, "Empirical_richness"=temp[[1]], "Empirical_relative"=temp[[2]], "Empirical_WtRelative"=temp[[3]], + "estimated_richness"=temp[[4]], "estimated_relative"=temp[[5]], "estimated_WtRelative"=temp[[6]], "pairwise"=Cqn_PC, "dissimilarity_matrix"=C_SM, "q"=q) + class(z) <- c("spadeGenetic") + z +} \ No newline at end of file diff --git a/data/ChaoSharedData.rda b/data/ChaoSharedData.rda new file mode 100644 index 0000000000000000000000000000000000000000..15d12016ae0d651f8b9b90de88046164e68d7324 GIT binary patch literal 10121 zcmeHr=|9x*-)=clbM0)UA5|N!zl11h;}UtHYs7!mcFi84Z_W5*A}Q%fR}Xjn(PHGr zf*@N-nLW@Gpn2%n`X}q^1<#O*^0|aKL51)#xrKDB=>=7v0<(sk0!8rIb2u6X@K!|xrHF_Rj3c^7VnDs4)xw4_57jg&#kj) z%`;~8Bdz`V@@Nd}R%d)i4S(I}?>DNo#I!%;yOM+{i)+3k>&**2){G%~c`Ye#wGq_d=G<$%8tg9WvL;^pZm!{@? z?pR+!=yQS#2OOtZpLu0Bi+Xb(B$QSe?mhY8Kb@Q1yK6}82ba`846VBN1`DCe$DS2h z^kuezAE2mH+vuq-=Tu5}_Ql;^{RQ#4^Y`X=vT4;h5}g+Wj-HW%?&aRG`XNb|=2&^N z)T)^+gtP=Z-EawQ`C}lj7(rKd>BpNu(_6D5pE0~qZUVO>8CKgd=Rr*r0-u;wyaP)+ zyW(L>=EbShe9TF~1i9Jz=%+Vj*4{6Q!(B|{`_us8m3pMl11jLu+fQ)0=4SKrM#Y~8 z3Bq;3A`M0|hSCLRh8AX_>h3%<6m=Fi;-YV8ljp_#f6|)9{yxV3i=jZCnL6qemq7ZN z5F;?uLHIO7hBa6*TSJYjL1?Z@%BCzsh@uv2neV4N>zzl*3eJBAwW~-m=(p+P_#1`?B19 zOv;_7Cc*%qY9A9BMF(k4&#XlyI}DI}^;umplswxRw>Q7V1=4rOaq;aw6``cT+erTK zCcKTZVAqysQ^v&!xMAp9Y`vt$67){1?QW>qP&Ns3r?Gb@1rGX5=p}w8UkERzew~Ob zO{8d@&zn_9E`IHypRSF`H3*!z+WMV9!GHPJRUZI|H0n@1iQd{1i==nX>&R zL@d|=?n{rhKRP+7(3LsBe!yt^e4MfWupA&!>j|QbXsvfOugQvlJ&4;NkIq-=BFsol zcN&c;wLdwksxm~cWk3FGKfRTBT8a5*%jTBbRwmR~a}^u@Jxyk5^kV${>`}S8aG8#M z0zWX3I;4JKMq0Cu`fV4r4YdB}*9dW zT*!+sS8@uR1+zq5s!iTq2=K?;KDciJ6dyRG0(nSzRAEd_EZJz!at_BK3^OK4Z1Bd^ z2&?!*9FM93dqTk^tI0-7=(p{mMlBD4c%=muJia~PF6`7)I#3!uEH-@wLYZIKI0jvD z(CikIvFu?D7Tal&JEr5bK&*0;Tz@bdhwjFVOh4LtgTmSKf5^fb)4V7 z6%uPb#~H}ca!xPOg7bG**h@XDflZxQfZokmy59+j!<;Kjub*`QYD98e29ifeWVrd*epX(_ol|#N zjA6&c&naD;I@*=q{>(D2>m@^KX{vUbn$v68%<)`FRh%j8_Wo@Kmh0Z1@j}KCOc`O8 zoA0#Zr!oMvyl*w-($tbP#oAJEA7*Stx_W z$QPysM@9PNrRtu`SjXI3>&A zm99@KQppdR-&iPKD~ZV!XP?)u_R!Ea)Xn(kI zj#z5^ay$Pkp-7(d)pc>MZ8x1b9T1s5eHYFLTks#(R)t?Dp`4Ny61zJ4G<@!^3^Uuz z=)j%i(%D#H3_oi%R`ePg{yKJn%>#1=ECiV{R(Vr?oqv?_E9gqsIgg8$?~2p;2Xyp( zPx$B+7xnui=5r?bGor7AEbef!uFVBO)Xw8muPA*pWGzq{UUyT zz!ZAkm&{z^di8>N-GQ9y=L{I*kU*@V=)dLZe|?U4*|fLj?tqZYH&3JtYO#vsuxKuw z`KK4cuKLD2eRy-WA1`}tI}_dc9owuApS&#X*`|06e{+Za6Gntk6c{aBhkCUV`Lc>} zdbW_xDF3efqlzx)wm+WvJm0@6PMyp0+53^zmD$tJtuwU}P?-{~bXQ7Btm~=#);*=6 z81au@{BJ%6vEvg1R&03R@BiJEGAn8ONiNY_`}oQIDD;Q6gV{~S`jVeFxCBB?T+4GD zrvIG}{{q#XY#)kn7bxdc&`dSv4jUQO$jHz>V#HjzUNV-YbYC+~NEsUB$;!`N_WAR!eg=QU&p!(UYbdZ6iu3E_>FuLQj z|5O_%=7IkeFJG~KdF}X@{&vmJC;8a~LO#@NXKXaD$rpwVnsEoe^a45yR!tj~x5vZO z)4SzH^Z)oO`~tXg>sPEiIJSQ}5*rvErLEU0mjXED;|}^N0Ak@%I%pORu>`c#nW&o; zJovWTMrcTNYFU5X_G+;H;=8Xh->&%xXaUuG!rm|*kM|iN`Mc9Ex+^|jN=05gGzl0d z+-FjZ0Eku3(+Kq9%@{|br7xiZ%%0r8?4*$K{l+DhI;}kgj{5rWfQ;642 z-qUx{kBd`@Q#GMqM`7FayI=1Mo4Oy=+Zld!N3a$< z5ibvj7p#&ZM}qDPR*bdc=5EL&iWiQY%zO{Ga?b^`!{)_f{){gy?{Hsj)&E<6o&^J2 z5&Z%Wv&m_6sK7x~Omf6atP~86ec`GRZ30gI1*5zi*|O=34r$G-8spKg@tyWe5wo!N z2Ae;I8;G8O7y>mhp}>UlIs2aw(}sBSXLchNqGE(W_gn74c@5m1H*+;IORRPDPXqR> z;;T4!zbUH)vNnhBz4L@jrz&)ZPpG6D*oWOc?;!3U9Xa+`_Z5VM@>>wQ#=abSj1#em zt%rWi$dHwJyH(y^;vwcesj4^BAa>#U_Zx^#)-Mnak><+=iI|$Om3zYcUm=6A_lJu2 zL5r+ug2Ht^@N>%U6qj$>R~0HA8r{05K-4Hps1kZQl+Ek?X;OpZ#VcK(`|R5v;%tc& zF+yol$eaPEiNPEopiXJdAaPGH<7r5Y)%fb;1ow@CCb53&6(F^i^Gqc!coP9;w+AIZS`LJ&Dl;iMjwCx%Nlz!ylc0+kDT!tjh<%J-M+%6p% zV|F)I2AR!sGHbY!la;TB34@ZHcTSS+KcQtfjEA_}TRwa3@eB-<%eW4|%XUF&ODe1` zD@h?eUNRWpQJi|x+9omXt#B##Pr6)b+_<2A)L7P!3D>-9ikP32*FT2&$Wa?~JRH64 z3;wR2ZZ>J%ZJ*FzoLqBdBeZK)R5gTV^(DdIzG|%vfV(Yz|C@ale4YX;v#`9%oAmEVuK|8k-~q$) z)x5Z8(RICLuBmc82a7c-{qscp;~=$1laaPmBX`m-yh?tx#JiUo9Tp5HLhoO4 zw$E=$wj-kK^GWxkRgOa|jJDNTart!N(P*x?L#i31ZdtOH04pg zC8;zKi8LF9P6e*4%@upwe$WQGq=N!hy@<<|C-;L>#zL=cN;IwQl)gMeQ@R4OynvP> z1t@ih#(*Tec-m{@k)BUlLEv7^6tqL+@u;3&&FJjH6? zCs@Si$$r&nE^1~h9yDrGtv;nbiyj#7vF#&> z_ntjzRRk6hCdOWYk5-zHi>ZV%bU+|wWv49N?ORe!V2l@kJ;CiMuiYM}M1wTYmXh!H zQnK?GM`6wvl0;ZiQ?XOJgNXq?&}!AP9fQh1Qz-SCg#M7xLsY^O>L~r1Ds ze%v|&Xq=*tv-F+h}qE8|K@hxA^+GMmrW$F zTGmoHLkUp4b%vn9jFV4yP9QZ-K;vR=?OEJ@J%$hr~N&x#tetub9& zb4rUDKyHqW8V6@dVk-M(iVuoIQR}ltaD2$$Bo|4XHAFW4A~>ur7Kiq;QNj|}awTyi zg4SV$i`fegjse4}E^FBQQ1d<1d>Jyi6p_k~BmGQsIKV4vLfVtnJ;{wlw3V}?%qikzi5(d>Ky=A2NfkE*j>cWHqkF? zE8XzHwvCVk^!~Rw>n+sIR}??wWNpWC&Wk)&N!I4UJa4S;!J)U%M=NViPYbNK@^`)^ z;UMjqo8^!)ual>bH7ojGqBd-5yjSPF$nPb?$B&x1V*X<4$6#R6@E&N`Y3*QH!;}2| zPNtWDLQ<}zgWguLY5f4LsJ~yQt<{TXtKmgzcB5c08A&KQEIeErP#See#u(W{;Akpx zB{?$&vJAr=+qGj@yj9N{j>}&G^v@jVA(+B+Q7_8lzcR>#fFw*HC2c=4@Feuf$s>sB zwXI@=uNT4+Ga`_G{G=Tlc$UkGYX|qMZJibwH6ag&M{{4)(ilk~qj&5^Eon8jr;CFi z4IVUxZxPo>UPPDTm12GTq0&;>JW#Q#`4-aM@V?{o^I%#~w zlZk{vk}+O_3VE-RYP`r1L{;bFY$V!g%3-Aif}_L1F;bYI%uYL844Q1$o=^(KH@Qyu z7axu!+43T$VIQQ=(0&MenzEQ46kVSg)6^HNCJ|o1d8iFXfYI{N&UYqgI zraJxry;Dr68>%;IB8Isklf5A3B-_X8BS|k`UK|cg?=VZsD5g}Ot{7!`X_OANT7C3t z3P|!PUO^tF?aR+pYj#viKfXk3#_u?f)0!j~)fojJ+@9nV)jd7f%HOX2Hwp^TL>&Qm zKw#J>Nl@_{;(+B}@ahlatryU#BLI%l^7|}xO{mp+*E+zOyf##~7HP8jymkZ11HuI! z-(f371bgu3rn zeU%$4J!VJ(ZTJ@bW!F*rSnI}`{#KtU?6=^RaRMWbhi+WgGo6p$ldaO z9Eu#p;-kiPII>1tLKAX&kp^+zYk}r0c$!={d1nUG({P6NdrO1Pl6hM$79j( ziW0Q9ttfFQwWo-PsXq3$-LmVJFO{r@X+avq5yN42?T9CPgU-sUE+)hT58#3X0{oZA zJ{3}u4*CWugIgZ~hN0f|rlaaS)7DuPrQw0Zv^%F>dvd6?V&swc^#at1{XM&~Ve7_U zd%Q^*5~_HoH5oKDfP|thCBXXbW2- z2uwj=JOx?pShapjSUn0{o;u2WsfLX9fT0qRk_9LX^3b}OCbUDjs7w=b)g2&i$C^s) z85O9eShiyo%tn64c{QStF?a0qk#ZkS@xQ$Ue)Wr@_AR5z=E3(&$S-)Byqij28sPS9 zz#lzM00WQtdhYf79Xk?oo(2=OgI=Yg;qxytB~+}TfqUV$r_S}5e5mJv zF)zZNl4Hb?L@l0+HEXp1y71MYcdWN0RM7s&Xquomq1Ij1u#Zjg!Nb(YN?9f+%|Q;u zhqbZU5&8YdV+X#S5R)K|u! zpu-lCGmnk{V5|hC*dn+I8J2JC0QSa1bB$|C4i$a}L&C?$7{SNhJ)PCwX#^2mpzb|v zwY}9}TqX@>*N7%+d)mfP%frp4_Mz?G!qoHGQ=JnXp-&S5xYPhFb5U)}mjkO^yZA z#A9vXYztA~ zJEJ&lM~U(fl;2~;h3d5zY?kj4Yj{MfPk+q2yf6ck2YM5bi&Oh@{~4ec(SxUvHWHi1 zu%RWlG>|85X)@d77+^`|GANyqq&@0s4&v2RvF8&yJ_ZJ#WXG0y2t0WVwA6D;=e3(M zxMw%&fZM7LU8PAmO(BCulH7U?c)g2ht+6|hLsP8-0ypxJF|;jFK?haEpXY&29HkwF zCs8HP=tIWV!i3K=4^O5QPLa`G{8>1gpy?pUMXwXwo;rB%$O|{~e9Ac40UwIm^%L0Y zEq;|Bh_h-!30C`iQKTiCOqM2ga7n~s>D7U%(z9yV2UJFl?6;xLl>Mn{7}~@kndX1h zyZNwk50LX@8FJC8yC&1kxJ>fQ>p(7l7d78zoq^7=eGFXanX#r8&j#*UtX7nq+=~Z+ zi6a8N13K1Z$dVzRgdFM|C-jQ2iz0wk>oi{phSZ^=iJJO)K9kuiLye`Hu1^9hrBA1Z zI%~fn?1R9hSEC6NJ>j;2%?eo$Dd6V7J=%%6kIGRi)6|EbRqx74z-b{j%)3x`7|p+`#aSj=u^m+poqbv5yoZb`~}UHsc25 z^slf_Gi;mHsDV09uCc4uuZ(n3GVW=ede)N%57Zg~%za~YBxYV)$6C6=nSYz=triMf z0YcnNRPXbfHPvd3@*KbZk#d|s+V-Cow$7anZw!!_g4uJo8L+PQfEysu+C56aK~0TX zg6#DH-+%udCa2`uzP|8sH(xBarHUUeFk8OA>2|RZ#)3Yk6ade!%Iyr852|c;xC4UN z4FLV-mzQrJU-~|QhfRiF)dwsg4?SLx>5No499oaU=Ki#jGUk2Yr>od=91I21--r3$ z4nJ9LPc^I9z=BSMd7`|CG>}A-VtP z?O8{V+j3QAL1v?yT6W-0`%Fyxmd0^O@W0T$uUxmA=M#XN(0{98B&s}%1EX@u$(*+m zL^%qcLg`4)T!{2c4j1zI`rR~mk+G0Go*f76vDLt~9NxX;qWsNV2eeB9ivjfQ&k?ja zue(Rvx4H8Tjk2ju2LcLP z4J@s5+#DTt7)iB|s}-{KRSUVzAxI8mx{n3`c!RdZ%<@Rh-8Xb;rgR5vR;nb|TrSmJ zm~NkL!}P_RTgP-2+>Y7d1-Z3VG0lCRMc0~{ub&GOWuAv_=D%Be*ebgc-h|!wbgS7m zx{Zt3z&x7n)|#M}tnU9YNw?f?u{l+lQ6^*mnjQM=(ED*WR1I!^0>-gUnaq{xPfOiS zf2wOQ)aBVP<+AyfZXdQ4Qza!+=^LSSzx7UK_%rd#5tcqZ8Y?FXAF55E+#f7~De=O* zc380$d)@$>UT5=^wo8e(!g5Nd)VjO7A7{HWLD?I{XKw#F)VyMzQZ=MkCC+ToEULT? zFz8G02{Z_~D(ol7>>w;`TcK|He~HmWD3^5-#kFvNUAeDZgVAolk0D$?CQ4)$H~2XR zjsWq`Ps82Z-)Io7u!X&gbfRA;J>J~%`p6dPbaT*0o=w-k4?4%mz|cOkLa*BLNg-<6 zgHBfK%jKZ)xP^a{55;6Wt|FpLqZ&eUdq2#?hj4km`uEXO?d8caBmZ|Ss z=-^#wztFp!EM36GD$ZJ|8tdqg3w%6I4!30T9_qs{NB?E2dJb_LR0-zS4*{_2zk50U zb5KZ~JIQ8wHj%aBw_ub!be>UN^lmMGl8Ls@+Y_fIV@WptbEPIvO-?*F%u`Nex%#Ho z^&NjQnwmD7-O$OYvb)8|sz33pTrj{(`5o&-{vMME=X+CUe)IRD1qO2ruJ@}#Rx+J8 zleg53-l*$878I!>HKe$BKPz|sqZs`6gR-#HJh6-%KjAnl9qR%N4l)y(zE|RJ$3;o0 zTYh)}rboB=ywlYc7{=3oK4bZJXBY4%B;wQhv}mck z0tArfO$F?~3#lCZS*pqYKEYV6hTVKRZ#PlvU2$;s_nRK4E5uPzG1aJC2GB+I|A*|{ zV!FJ{QZ{ah>0rXu9&7IWz^+#dOTTH36lZ9tyx`7{bqDMXae=8S zjTP|yp@p`K#E!m$snu1)n;60Mpk@sv#K-)1K{k1)fOw&$>i0tM&x^MwXU)^x-d@Ko z*1qZer2D^nYt4|adkn1m6{$&&RT7mr20cr9tZBuKIW zM8?He^9H-$by?ngr(DpJQKdaYU+*)t9sl~EQ$Mk6t!5xy>pDK$wIF=U`&NKpdW`#Q zrq;r@(efPPuWMW>S={7_n>%@|DQs5jo9yqa`eK>gzntf_Y8|_oVQD(NaIt9Pjk@203rZvOrZ3Jc^VJlVlcDO* zdlIONg7@9cm_G)XKeDM|RyVr+^YzuOI&V%k3{2dii)SzgS@0qw{(DW^M3F7iG}7Q1 zacAJC;dJl7g!<0VmurVcpXK>~Ro_Y@#s5w@Cvd2zd5)Gm|_g~1#{>1-iVEg)Ek!Gnp32Td>t`-uk+1`e8Vl7w(0 zwO}L>Fnok?G5&4-AD+GX7g*nxdB5qdyKT2@={7RS^v!#nzTeFB&FjbS!`7Z%->WO7 zNR>+ZQL8GGey~(o)$};O`MG(z^YyTGba=kiyl85lMZH{6s((ZcP zt@WjO9(SuB=v-Dmo#`E7?zOYit5vge7#@As5AAmQmmdD;X}D%{NN+gAXIjJ3H8}Ly zn-XTT?1kDjNN6;KWa|F#1G)K@J zL30Gn6*O1STtRaM%@s6P&|E=t1v28H!+6!Dz=H^=h!OPpZDYSsc^l!8RrY#jM1~u;GDKq2AdlBibn^L{& zb6-!9q`LL0VN$LB#~Q}#)=#VyyL)X=on$d}i%Z(MQVHwB$|Tn@t!|;D<&tNPN%zAf zNS4oqxD1+WnqR-WvdBH&>%7I~wtbpbyJ7blXHAppHs~43Q%2GIqm(S?%}BqviqJ=+ zj$xSlEo;B#{rQpwcU1Rezw?&%eJ$%xFai}uV8RGQ7=Z^P&|m}>j6i}BI4}YQ zMqt1Q1Q>z;BG6w1_KQG%5x6e`^+jO52*ekG_ae|<#%@PXk2!v}^B3?CQ)Falr%zzBd503!fK;HwVP6Y2Ph>7TbBFWIbV obH8@C#B4>N34QWoB zN{X_MrKyFPFA#+-5fdfK6jBirMo36$1tgWn)^yIEv&a3mpLXx>p8NQHyw{Gn+U)lL zjgbRyi#S#{>QU_hr6+g4EzD21T)Db1`061=pYA)N*+}|dk8Oxb-x?HWZvFRSvp*H^ z%1h>n!@t!W-G!`EkgI$54D)(+Z+(}N+SJZe;I?f+<*DT*v0QD5kYkNX?i7wyz5>r@`k`Kd%FA(b`E%h^k?Ws8A z0ga*~`;X8oDhgdV-mGsXHeguV^mpGrJxqQ2=^`v`9@gGR%=zfFrgrarN#Thj&fnDj z;r?u)bJk>vX?ZX7&egcBNn!}`Bgd-q zfoPl20=0fvFvpL{~Fgqa#pboo;%(H{^wcgn?+&Z=CskgI-CuKlq=w?2? z4Q166&fdTij8tz}m0A`PKW(0z{_HI?GkK9`pc7?uN>os*OTYuT&g-D6^XFo}qQ1Xz zFTf|ie|Xs@jK01u3KZ^g&AG(SG4SE#HQ_FmT!%3Fb>|ZJR zuWX~IJ?6OR7R~0XxLIcD>4*stk!fDoWQmN;`z+daV*Er`X{)g7BwN^!yvPl0J6)!V z3S-v%50^MC^Bl0QTR$RF70rt8v!Ju!bJ=cvDx_d4V5W1tzy0#O9x`qj@e-jk0=k8J zi~6=1t!_sBw$u3RdO4LMx|*1Mz+T0wqsW0cf~2SknzkZS;#o!?%D_xXLcLr?tN(D3z!?qrxpwmpiv8aE1M*GTG3Xt21yfoQJG$GiPMSJ9`>~;eB4Z_A$ZKF!NiXb0coTSSl$8jzD zIJXvESxv!|9>K&JW24U8eqBF`5U~La?$k)^beS(=E@Ct* zsn1P_R~IUUB;y%LcJ5i!<8q~Sm9LyrINKwpglxyFJMX1UEe^ICK0cT`oiiw}>~Cn9 zvna(Xa8>(WMiX??$hnr$#*7TEN@44Z7v6LbfKjSMcO}q2pIUmEx&N7xSgGuK@T(H% zt%DtpNSPM$x<9k$LN464;p@vEG&evBTKKlosclfQy>YA2)cv~Ubks%PN@5$w&`ru5bdU+@9Zv2k z@6#)Qe$5cFKc+7Vt59zOI778cxzoaLB9+DC(5C(J0sbyNaR;$wuB!N)MSirYAIfjn z%WXONF72m5O-<%NVqQ{opN4IKw4B?+8C<%?Wq zMJyu*Xiz2hgb9i9U9KEmC=G@&0AD@S%pLgy!U=>4JyFcj0*6Qc$5ED0Q z|4(!q`Oenr|0V@7Bmn(nrvDDXJ#EXJ>0p{mu=&Lf^*8z+7G8cQB7C-?L7b%xc_f`#lEz^g0)PUo9Z(uwSK;A+3juO#TI`DBIxmy#m`xTZ0Npb+ zh$f;BPir7NgnnFonWC6aajr24DDM!v1pB^Z*7V`il<-~AvrXI}zePD;)H#-}G}hE2 zrY#1Vhi45aX7oDrZL~a51AlZ`^IJ~tnEP;Vg8}Jv#f-mZ6ZZDoyiL20sC9!!C`-nSl28-UGS)?>WXwTxC2Q(JvOfb0023h BaTfpp literal 0 HcmV?d00001 diff --git a/data/SimilarityMultData.rda b/data/SimilarityMultData.rda new file mode 100644 index 0000000000000000000000000000000000000000..2616a885245f5940b1bb1453f592e81b2a6b1672 GIT binary patch literal 7457 zcmV++9p2&}iwFP!000002JKzza@@9(rX|^qo%p7U?bwztaZZv`Nwl$~IA?3CcFVWK zsl=Cxb2+ttwm2M0!-zAS@FGk3wLkZXb{}G&V4q?C>=Pv0AOQjd$u_CzYXhz|;krVv%ffYuUKi;_%6(6Z?0{V7 zgzKG*>#T5{5w6oY*GkT{tR>r8)|J2ThDsf|Kk_Y~utZJpJjBJ~TZo8lcPc;nL3!;=Aqm zZbyFAuyv2U${wEY!(6eo=R&Cx`#INnyTzmtQ(O9X_~*|*|NIa5_#-*3oFeYzf25`m z6IvIVD5`_0hhdtoVN(h#M@F4p`_2?RdG#b<-Jb`sEzD(R(PVitnA<$3@oAB{hwi_cal|Ct;*p-s}G$osDFo$2kYKtCtRJtxRrl0LKQP3mz$OS;Q7 z`bqS2QA45x)QtOz`3glDYIRoW+!$#k?vKUp7<>%7KoPS%Uq2h1j3YEhW5M)}Fq z@<5Bi2KGJ+HM07c$(9@%C_G_wO^bcYssEy|*fDBP2Ny%Wig%e+VL&(d;9IqwPb-lIH3X9ktONy}j2)(3EXLa(#5o#dT=)wu4N zygT&S8vTBrem^bo8MK`HS~8TrMlUw@|E4AT{FG3}8Orw=rHQ^o3=7zp%kyiRP;QpJ zld}Ga-Y?U(oTrzJa{?d3ms)b%v%0W!g4pP~Kzm2yp$q}(qmSwR6D?{tvGsd`oJ}oh z!y1)Ic>Yr;n>@R&aa~*%WRS8*$bCUegY|osKhXr)47(X+tVo{(BvgN&zDqvuP#G+3 zBYGh={($nIr}Eb|K39=3__fBzraUi9>USB;PMJ-zGCrrD8S~#M(kJ&a%B>kdBwRTPi_{C(X)Jz zNN0M~GUO21WUODL{Fmu-^8M0;Iy0~(2DwZb?Af^t+BC?E#ioVa@5|D?7)ugd68Kph zx~3&DZ9PuP5yY4}i*GaM`K0nl-R0Oat6$DiK1w5r)Y6|?fh=vAapxvxD}+PH!~XbR zEzHA~z{Z~cSAljRYQVqzU(+?H7W1}m*7uu|+f=!ABd;ArpHE9K8>-c(Zj{}=w(6^` z^==EM)o<7OVbT3d+t5^h%xWiHv7dSUgKByx+MbizCDZf1*QJND;%e3UttwQNhqPUP zW!XVf`C(m<^)zo>(bl-XF2A9wT}(S)%v;W^<<81q6w{6eRpYrlzNYh}y7Z}9e<9j| zS;xys{AiDeEVMOFLLd+b2J}1Ap{sgodLMjI^YxQZ_B!#rs5`$adp$>$cx}Du{-qWj^q?*Eo`UZL!B-$l{Y^Ioc+lbdyXR&`E6b-Qc2eoe0@ zHGMukk6*QQ{<`PaqUW}Y^3D5x)%8(*{GJ5W?6)pd9pANGPi>EX5r`_#be%W6&3l?3UsM=r$oA8e zJr$MT_I{cbs@HE?bfq1tpsaHvv(8^l&$~sTdj0AG>H*OuRCPXE)pd-f<*SZaF^UecJujABpUcRbxzfJc)L)-dm(SKjDs8C(Mb)7F!T@PJB z)4%s<>USt*%bmBrWt}gndOg1Cc%dvbz2DZYgKp+rFpg%_-<_C@=1fqH1IP4a%Kh;9l z`=jgYPi_C5YFYA|uD9w?b)N$J4N)dk^@mOUKB+0F+7Hu~Tl75Uynk1uDF3wYSB-h) zR`ol!vd_UyE5GRP|7W#7)y9|V_E*+8FN$f$wd(O!S$?jndK7)Xtn%{8tr~U30aeY9 zvzDhCRL!fZM%%{Av{1EPZAVk*q_IB{T|(E_vrj@>>NzhKrJZ#H^@`{M@ONhj5oPDs zWv^Ev)IwGHdspnBuJ!Py*3DJJqPAa^^~0+6r`r~%RK2c=3 zYPx=HJ$KSppq{#cy3fh07H1>s1@;$0C19M55Ye`Ev9_V@^{*SKuX<6}_-|XkwynMh z1VSBPoQ)8ndVi^2zmw4R`qvHAS2xkNer;QQ5eP(kf$?;=f&GX;sDrMK$F{Yr?(_0( z(+{GXKszi4xSov=p?ZI*UOxn)Omx-W+tw~s{+=PC9l-uWh^YG9EZQ4_0PSElEV|zC zTz@yA`tLyMDi?vs0@~YRLzTZbg{TI&-#HmwZ=Y&FeRmUGkB8BZA`pv*dCz;C^!Lc> zUe8`MKh&1?7PY6gpy~QHEx#!&YP*Y~>2?32|4&0x{i@2}2`pPq+5b;Q+tKuRn6(^L z&jqM5e)CXQ{p#``dGYqNak*;tuZkR1P}Mrjw00od&8&ZKx@><{_usuPd)&iQriqptOmqVvVH@1OMENhoXHD;sUG_j%*`qW-OId0E*&f7x!d^>>h}*;CzdSXod# z53H)3w#c1Uep&NW)9sX@T73}NC>kf@`NMiqxBgV$H)tEWqL-@IkD9i-uAduGWu2kw z_NqSK%39B!RZh`zik8>Zx{K=lUKOg|7b_d8`TeF)bUe-K59*HBRhegIu z3RKZwQ}rlYt}6Ph3iJNnao*pn6h+(O_PT$ksA}ALQopI%e zXHokl4@Kvblge4NdA{j$Ix@ei`f6LeQP+M*)%~X0?@-%PpIQB1QTeJs_5HuP%PmVz zUE`rC|KD|aP0P)LYW3+lRP!@U=TY;%SG0Up>|K?5$f!FHR&6}gRZdgocNJ|tH-i0L z4OHzfRp-Z|P^~_y1NGG`RP_gSqiyv?AP|cK^iK!`2aX2>0)apv77x?zTdA_1jmSn@ z_Q_P0k5CJ3*-u?`6rDHP)*q^!>p&n7?E#+WK_Cza1OkCTAj(Hu&u1eLT?URDM3>Pv z`)*r(5$y)ro0=&5cVbmPpBHF{*=YLQNYn3^T~68MqTLn)>MKIm{{GCoe{ZB}xq0t( z{qH`ut-e)56@92u55#0t9lzjuwpvi_?{jp;4(f`t@jPgRMQ{JKP+d=5IiKAXIrE|| z^(+ci?NpU|ASMI*34uT$5C{aKUBLKF4b;8PHt+L!f$IxNbp5Ffd;Mjo9%pyu_Y_U*QFTnaPJ?kaLPXQ?Y1h|{XVrtM?Wr1z z+U~Ak-g!ZlycXxTBo?*ZO`+_(6#K7Uw9OCAyI!fv{-r8VZCxoZzRc_Qy7z%qvxB1g zDq3FCsH*;|wNo{yq90YxouFM6L*4U4Rprk+->4Q}b!8o`tn!*V&w7%56}7*4=lzq) zEBap5a+|i}qVLsZ_f_qKbmctCwDPMiziB(KOU|@@Y+gO69(UEfK2o&&vggrx{bPleM?z^m&J@04 zSgy<7;IG^eUomZSUmtn0Xo16aM*Eg$x|S(%CCVA<`y*@L)klHtcsk{{`)Jyx3VhHB zR?jj=o+Zhx<3e@4BZ{38hS_K>08d&wtQK$n-ZHfGTou9jt?G z$F>~|w@i4!7=Yf)txqDxzf#WY5~tksyAu77NZWJVp_!CO_~5m%)3^FoQliwIz;|re z5tq}mt=|H;0UpB{_5^)7<_O-mkG-v)J_LdOz)I;W(n3^}?jMgWBVE+(BwO6RbhQH4 z@tmP!_4TcRZaZNgC%Ha(Vi@=sO0C2B+_g zD92Ti1BCq+n7$jo|H)&)V?Y!31M1>EU?r`Udi!RA-#Loz6GQ2)$THRgBUx5l)LMei z9O-V~id!7G9;greyBmxQ-6s~oH28$JFSHV%!eq^7k0;<{{ZIvDWa#cbl;Z1#8F$Ub z6U&8H!k71~sOE75aQBI=A2MIA2T*V5)(p$JCGwel9UKsJiRAr79^%W)QQGBQd=ffh zm}WefD_AQ&W={?MF!F%9XoB(xd_Tp`ZYaa$m>~eSX2+c$KQ|3M=yNUd=cH*)Y`pwA zbh+dEmRx~Infl0)hW{1D0K;%(?gz*0%}b16jrMF_fmkgzSWllly}+K5C`Fe`80x`T z_bj=DA&k8tV_s!X9f*TQK6I`p7ZZ(EOMc;ySYyxPFRVWirqD=z+R z`yvmF4`Tq6EjU>4ck)2_;E6NpA45O}Q&8yfay{C8nG*_TeoJD7`n#53oua+0JvoNB zWpF%%K?!Xe^+8l7$s*66JA<+<{`y9O0Tk!hK}=jv5IvaU_G9O8=UEs9hF*I@lCT!B z9Y>KM1nW-h_HI!UaqdxHhBlncRz#jKr-|)5tCVEqmR$;Xh#)mQ) z=JVLyiEX+0%(aGe)a8t<8v@Hd%rU>4wk6jRys17i15%fCO8J*(p}S>c+at~i2B-@| z9_(Kp$CNqB_;XU2_n!r0+W~JuT4X^4R04+z0Nv99Y zka4iL2Z3Y!jzF;YMTzSI36hxs1erW>p64xxKwsJM_0g4eVOkt}hzfx*pSBf(~kehVx!R!_2w z-1Vtj2{xA`ai%Zwfo&ZyKOaE#`ms-1|6U*Z)I-QUaYLZ>O~Y~1L65*8qTY(gMZuY%;2iSk(gUz;X{@ats^ITKhdjvSbExk4Hx^Er4I-`ds}UV%d=s1;>e% zkt;9A3uGSfb=UB@xOW-Bk4OC~ePmlCSrLur0;MFp@aOblqJvVMB~4B5(U-z$OR8fq zguvex(#p#(OpnYo&|DX~FHROWU%&__ivs=x^SJVYuAvxYN^`lgE;7Qt)$7GiFR-Tr z-7_s-f#{(%0@vlGnL#0rE0+sxjH5}@x}2$7|evWpC0SiU+SJ0E_Ab{Hp!7OdTwHRm2r@{AvHW!88r&%bsj)mV3e`$ z9_q3gbqeO!{89KoGkFPN9=dT>F@i4q5JJ z66v}|a;9A8tkBVh=?E_8l0XCFjz?7Tacax$eNJ~sjx=y&Kg=1tP-*cJ=4-*MCXGnK zM}3~H2Vi17$K`V-q4I5=Fucgx3A>lUm$?^qGr>3xY-1p=>t9W<$UbO2L49kENFK+A zJE86Id}8c6z6p~Pc=%!BeSybwT$3%ZUzcdI>>Gr>FXRZRZVvUZ&%^efm6H+QmKfQ} zp)|B5uH;6Lw7Dkn(uh@N3)?v05ucn@hQ%C^AWPnZPS>Ba%Skbk!?^Z|(KK3gauG{c&9xDGFL-|Fc2q zq3@L=#P!#V_N5LxOT*>R5MdlkY5iCepld0520O8o(^W zvYk&K@9ZQdr90X0#CJlQW46S08z;~Jg2#GLw&@XlfP)>ubV6LZpWfElOt9sof1ni; zGDK;&$xBnCZBB1!`2#C9RG8?as>eexc17U*gpGT?LL@K$NQ{YUrWsSa!s!-WlNn)R z8uN9-&DV+jNw$FHC-zl7q)g6W#L~T9-X6w3qy?yg5NGt^sR^k{b9gJA4;EBP`w2=F(d(v#aBw|+2z0I}O!$LYHP zxbJkgZvNnq*dc1qo;f0xlbUPDe06J+Fo>HN;ohJW z6n!KyO-`aV-dG0?RGg;iQh*y1^H09L0fv1N>_CGo$i)vWAEq!fE4#P352#-e*!1y_CyA4Flhb;!?nYzPFXy%%_pN5sd1ek9A( zv`FF>ge3g#k0kjt?f66y^-*&BZ>0Ff;ZWnNVQW7^(ve;5wV&cSWaPlCB+RIroJ1CO z2ssREK)T=*f1*1ACOMiU9$dP_C%a%m5_w1-^ULoV$hm-diL`-)5Zic8xO zZn@wcE^UWP+u_o7xU?NEZHG(S<Tjc0wLktJV~!jAuZsczk=irj literal 0 HcmV?d00001 diff --git a/data/SimilarityPairData.rda b/data/SimilarityPairData.rda new file mode 100644 index 0000000000000000000000000000000000000000..174f7afc84622ff41167f429823c3d089f16d41a GIT binary patch literal 6332 zcmV;t7(?eDiwFP!000002JKy0bKFLfZjqF2$)k&hBvYgg+cV<{+uXIu(agT=hOV(4 zwvMo5d)_vx(IC4_9BmvF+fVzre{p}qzU(isHw!==C;)xpD4^?$XcLXhy0R*>GPA1X z@1Oqj%gg_~-0St$dh6@(U-^(&R0=6xn0PFfz zRI=B*0-tmE+AFm9X_Pyi|~0T`CNd{H%Xa|grCsgPreg6Y4u%!rqJwPH@5Ktemb_g1@eB6!rCh+UkWUNHmzM`%IS!6Js7qqM zL&r6^dg93A&PvlfF`#mgBpxWJ#HkaVR%3J`-fh>QslL>mejFPKo{R_B>e9b z&wLFPdLM zr_KbO&(Gm=TcG!P+_wCC@?GArCP7@!q}*omOy`jH`KnNc_9agbM0suq@91B#UsPXZ z9lzX5uS0RemDh;;T%?cN0*?r>ez8w^_2;!g9ic7ZdtSLoka@-VE^0TAM(LL)@DNZr z_Hx>x4)N85|9bMCe7`oK&lJ=tl9_ywMIO=` ztgOQ~bBso#2K>kWHQj@1vF!L3E#FkTP1RdBipDYT`&sE_L$&_Yjk3qrR(Z9x-)+IH z^6lC@th#+^8=9JrMb}AJ*3Yu`K{YwdJDzFdlIca?>ykrRezj`zRu!s>L)tFCverRU z@nKz%<+Q9{(N@2|F1w-Xx|nspShk);>s^$8Ud*Zws`_(9d`-tmb;(n;`O;_y7S)$& z{CYiVh)Ase^j~8(R7};uKMe0M|J@&js-UWPKsdpi{m^VNEyB{&{dHAB|zh&K5DEr>`yy)tEFIDf!EvlbY-BVC~-8J36rt?Wn z-%l^%S1r$9_x?KXz3qAVmMvd(e^l4sX;5wb)`hD5ySB@z?f$QZMipo}PLZKXKU9a7 z(_-kl+?tN7s;aN+zk_HCR$ZQ|j9-<0EQ+r7zwG>QTYk@^F(1m>-)3z`)pM${>a8m8 z(-l2h)x6lMV%dDQ#$xEYe3$iC+J>%-qijsGE2{hschP-**6(_Kl!K=H#-`el*H2UH zY2NnRK2M86_5RI@u8cz!lyz@p(fzCGeRp1{-oLt_<)G0eRCPaF)jUSi?W>MO{m7>B z3#!DcU4d%%5cK@!qiyE`%kBeB=W*(~FJINY-=^oEqHXiF>c6j8Rj4lCy6%^#E{Cq5 z>EC-a^*fZZ+g-N5W!*2SIv-!PzfcyMK5uL4Tx8z*o1RD2;;d9c?`?P^BLf zZOO|u4OP!uHPl4c^F4Lni&9Mvs`Y19w6)(F8X6iJX=r<{p|L!)9%*Pa53PsYf}T&c z(DnJ~dj6^Hzf&!%{ige^I#fNU(DSBICRDYDP5nNpDX2ORv(`KBeavP5U6FbDXDwef zmTkAH-?5c_4{p}>=l%WvqU%pJ{Zf7Xm8IwNVphFY-QOz9&Q;ZqdCQkoUs1hPqb@(7 zs{L`%`c#9eaaGl5o4(8nRr}R;G<8o}&!CKU+hD>hq;~|E8hs{jVEZUfo37=Cy78)zHvrFSI}1ZRmN_&`<|m)yKAt ztM2>qZIh2iH=);IIneoR4H2r(m+Jk~&?pmKUGHrhmn#3Bp+-BP=TAdK)%Rxgdeb1# z>tHdgy4>`={%%6`-+|OsuZBh*^mdvAD?Q_^6nAH zuglh(W$k>~<*M$F#$?o$FPV40n6>=0a??=OzE?Kdvfh{V>*uv^ZHvpwhPIdOMq7Ue zsoHv~+Yc)Xs>gv<)zg-CXKlZ%eX8kk%22I88u^&lPuBMj>qXu2Q~lhaZRm;^!cEMMw_7ZpdHYDwnm%L^g62B z-)qY~iK?-x<5mS#t^2mktLpJ&+vHg_+Afc__rC~T9oIByKU<>?=)7bMoyVIGS`PI> zwRcNRMl^tHgBrV-!wGc z|F+6)SuFZHxVrA=tm?cJVcz{@T0N__&o_NfN9I>mUTyO?cs(0y@Sh94jou~ryFS1+ z{J8=jczuC9e=M=Q2P_Vh&QHhKQfRvc1J^Rj1COfvXF=Ug!bT(5n#vd zyB2oPfgM`#Ei^0>1?a#zIPgaY3FF6)jy)IIqapUO2~|wuy-Rq5p@+{WaeewosIAYL zq9O+NJX(Sm*%gbJmU)64Uv669uw3T^`=*CYi7Qo)jZPeV;vpxrTpuMI_aDtVRDlmB z0r#=#_*iQ1mdG`7t#LBEyzY!V*DyV)nH`ZMFg*{$Jfb|}84fUfW5m5j4hmp)UD9Oe z9~fu^)1R(=@6pKg0@oj*U>HIj$g?v8y(jSuusgDFAWL>%V#AK<*|Io3d^B?V(TW^f zR%joXP{0_1+{}YdV#dEDoVO)TdFb~g`Wb2M*!65P-6G+GVk38e2RPlLG@ZaVHmsP- z?OXVl5FUWXaP7VzFV}RSyfyX@`pAX`gCWkyE0#iB6$-{9Y-Fptn`UF<%XTaDT;H`_ zJU|CSWVz8Cr@20P22yp<+==jF#*D zY@%sgqOOHDTLu{}-*7!fKa%N86V23P)Uq_7&nA=wY4~7^hO!X$GiBT}bSkZYjHJ#i zgqth@_-q;G^1adu3h}Ssr%=OSYYYDX4hGC4rW+lo)_pDu!HuBAI;2NL}?!a{tj+-I}H1iksYrcCt0V5kkA|RT9yc4J;K!(YtX6Fg^ph#5s7{@(l z9>D!47CNK09EH$tnAV(?vnTSI0|W*LrbM#-A`h`;rjs>!SD!?N7`7Q7_6jVJbWfEqgHNjRn@uTcVajxF8+CKd};&w#$YLf;ZE1<5toC>$Xl%Qe`bdp2d>q;FmD zgPZ^+*O#k_wH3<+&d6%)V_snUiLiyn;^XE+qzzrz#-OBR>^~BDKtGHjY_?!v!QRP= z^5GNL8H~XpgDohsc!d`2BbgH_rglqWh5mb3Se@c}*?KYtzhyYKK`CKu&H$RqC1FxO z*FAysONa)nZOU;o;vja}C0(~+az)qesdh2>0TaspOJp~Dl$SAEnp$SjCyQ|91i@~XANwNBxxbsrL;(+(ZSEA4^F@A=v{k8 z9p)1V+?cVs!(K-NOR7kfyvbNnB9>bBqqs_xzqse@J{w`t8;|B7wS(W1IDEqtZNP_V zve1@nxul9Rc7~9HF(}vpvPLFx6MVLsMmQLfSj{4V8tmuH+=*kk^UT9`qICtDwJWfk zz#j9fX<@mSU`>%@hNLgqkn)euB6CZ1+b6~e6x0KihxIS(al#x&`~@M*2hYNh<$|>! zBf{Xp_;QFVGRP{tmrDc*3QDm5>^zC_fDoR*$Y4{Gh4L%WO5SO-iRTwkP(W@;yNcqN{O;X zaa$lkFf#;a zletQJ0xc7bkXPlL032)PT+KPMSPQIplY~HX(!((k?>eMi;!yB2Ut7nc*$A`liRkc8}0~ zqADJQgJ3l+LL?~;FEscv*h#Tr=a&$hOg!m23iDHYDK?J;ai+Y9?Gm0i2?4!vp)bG6 zX(W-Ozzpnu?a~<4k0vVP`E3)9B+jbvGRq~~D~QP_E-52@1ED#!J|oitUxrosjWsD_xyi^QWTh1U=C^2Q z!6sn$qZl{B&bKBQn%E)^uxXq|Yn29d-%=f%oS5ACHaxZPDYf$XcWt|>@LyD>H2&95s)~3jkR%wBPeT!(d!9*?4PO*0in`1O+I@<5R zlO;QtH+IfpTL8Pr<+=GC__B^0JIATa$c^Wu0NDou;%g zNhm2R{CV;+H9?ullD4KlB!!}FOX_22gX3=rVdeGbrcZVn7_JA?$AZQE=b+&vDG*Fp z9ygvRITV9zX&x8WRZ2L){XTnpg}xmk-^9EF@k{K0>GHG8AQ0D+>xHrL#kNJ?f`%Y7 z6(e1vq(~ciis(7X9SkrmJ8<12g9K9-iYdkC!WRzYQ znBJruWN*l*$0nu54n5)y#1%>zA@2;y%>^NtGb6Mz74F3xk>Ro+Y^ULG$_*BHM%c*V zxb*^^$7d%|7D6W9q-3!2W`=tG1t!xL8#B;MwiUhr*X^840t^ZDld~w&3+k1<2nVUH z5IbZ(!cpcF>SdIYcjNfR=2j32)#2f7NJEA>PIg<@bATOMp1g+3Dy3~AaLKtW;wM2MR{+&JRc_! z6FQC(1k4<8O0wy&-ZMhhW%($VnB+CmJLNWKg^4yyS1>u(1R5-ErcvqFnIpFkINc>N z($JOdFlX>1p|KR^D`8np(n!LW*gOk`u*CYV$M;M^<;Oaady#h%_8^BZ_blvAig6TL z#!$}d-%PQ{HfTFVeQP=-j-%>MXvaLB7M3dXzCG-;^ zM#xNaB!?63w)b&CM0{6bq?toWwI#0fL6CH~Ch;a7tIQUSamYPBIjW4RIVDb(db34EY(ZJ|@%jq&Qh=AIhjO|@7 zG1bj1W9C#iJEB`MBWz3~o;Tcol{%lK39KM>uJUoh}`k!reWo zC^|I^NQ(XZ0Q8twltUkcf&{UVqCG}zB++9#XOtc>RpgY|W>YP0OQR9V2g)PKZ-w69%jskQ zF$3(+nY``I*I*+|ufWQ0z7CL0PO+mpA~%^e3%sYX(=6TL*cC{_=thkk65f3sjyyDN z$v=>}ie_`?O_WoPGTfQy_zouyLpz=(`oPLV2u8hoSp+xUL@#2u9iQ<#9yNDbPO2nkK?uU{zahb=X|WqciH(xSznkG3MXJWLVOwt?=*aK()(>nCaa`Dy zgdKIClSpERkR!JSS{IDs56OvuNv(GJcyGGIJkO z6ggRYSELR|QZ&d>wLOu`jv~xVVF(xwj-XJB&_) z*r~>ke5XdF!(m!t0@(yl)WOU~3KZ-dhv2_+>#_9{g7?NLY-auNXV)_ka{z90Qtp|? z?oY0Dd*#Q79!PG?n literal 0 HcmV?d00001 diff --git a/man/ChaoShared.Rd b/man/ChaoShared.Rd new file mode 100644 index 0000000..12ba9a3 --- /dev/null +++ b/man/ChaoShared.Rd @@ -0,0 +1,47 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{ChaoShared} +\alias{ChaoShared} +\title{Estimation of the number of shared species between two communities/assemblages} +\usage{ +ChaoShared(data, datatype = c("abundance", "incidence_freq", "incidence_raw"), + units, se = TRUE, nboot = 200, conf = 0.95) +} +\arguments{ +\item{data}{a matrix/data.frame of species abundances/incidences.\cr} + +\item{datatype}{type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr} + +\item{units}{number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data.\cr} + +\item{se}{a logical variable to calculate the bootstrap standard error and the associated confidence interval. \cr} + +\item{nboot}{an integer specifying the number of bootstrap replications. \cr} + +\item{conf}{a positive number \eqn{\le} 1 specifying the level of confidence interval.} +} +\value{ +a list of two objects: \cr\cr +\code{$Basic_data_information} for summarizing data information. \cr\cr +\code{$Estimation_results} for showing a table of various shared richess estimates, standard errors, and the associated confidence intervals. \cr\cr +} +\description{ +\code{ChaoShared}: Estimation of shared species richness between two communities/assemblages based on +three types of data: Type (1) abundance data (datatype="abundance"), Type (2) incidence-frequency +data (datatype="incidence_freq"), and Type (2B) incidence-raw data (datatype="incidence\cr +_raw"); see \code{SpadeR-package} details for data input formats. +} +\examples{ +data(ChaoSharedData) +# Type (1) abundance data +ChaoShared(ChaoSharedData$Abu,"abundance",se=TRUE,nboot=200,conf=0.95) +# Type (2) incidence-frequency data +ChaoShared(ChaoSharedData$Inci,"incidence_freq",se=TRUE,nboot=200,conf=0.95) +# Type (2B) incidence-raw data +ChaoShared(ChaoSharedData$Inci_raw,"incidence_raw",units=c(16,17),se=TRUE,nboot=200,conf=0.95) +} +\references{ +Chao, A., Hwang, W.-H., Chen, Y.-C. and Kuo. C.-Y. (2000). Estimating the number of shared species in two communities. Statistica Sinica, 10, 227-246.\cr\cr +Pan, H.-Y., Chao, A. and Foissner, W. (2009). A non-parametric lower bound for the number of species shared by multiple communities. Journal of Agricultural, Biological and Environmental Statistics, 14, 452-468. +} + diff --git a/man/ChaoSpecies.Rd b/man/ChaoSpecies.Rd new file mode 100644 index 0000000..1c08e7d --- /dev/null +++ b/man/ChaoSpecies.Rd @@ -0,0 +1,53 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{ChaoSpecies} +\alias{ChaoSpecies} +\title{Estimation of species richness in a community} +\usage{ +ChaoSpecies(data, datatype = c("abundance", "abundance_freq_count", + "incidence_freq", "incidence_freq_count", "incidence_raw"), k = 10, + conf = 0.95) +} +\arguments{ +\item{data}{a matrix/data.frame of species abundances/incidences.\cr} + +\item{datatype}{type of input data, "abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count" or "incidence_raw". \cr} + +\item{k}{the cut-off point (default = 10), which separates species into "abundant" and "rare" groups for abundance data for the estimator ACE; it separates species into "frequent" and +"infrequent" groups for incidence data for the estimator ICE.} + +\item{conf}{a positive number \eqn{\le} 1 specifying the level of confidence interval.} +} +\value{ +a list of three objects: \cr\cr +\code{$Basic_data_information} and \code{$Rare_species_group}/\code{$Infreq_species_group} for summarizing data information. \cr\cr +\code{$Species_table} for showing a table of various species richness estimates, standard errors, and the associated confidence intervals. \cr\cr +} +\description{ +\code{ChaoSpecies}: Estimation of species richness in a single community based on five types of data: +Type (1) abundance data (datatype="abundance"), Type (1A) abundance-frequency counts \cr +(datatype="abundance_freq_count"), Type (2) incidence-frequency data (datatype = +"incidence_freq"), Type (2A) incidence-frequency counts (datatype="incidence_freq_count"), and +Type (2B) incidence-raw data (datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +} +\examples{ +data(ChaoSpeciesData) +# Type (1) abundance data +ChaoSpecies(ChaoSpeciesData$Abu,"abundance",k=10,conf=0.95) +# Type (1A) abundance-frequency counts data +ChaoSpecies(ChaoSpeciesData$Abu_count,"abundance_freq_count",k=10,conf=0.95) +# Type (2) incidence-frequency data +ChaoSpecies(ChaoSpeciesData$Inci,"incidence_freq",k=10,conf=0.95) +# Type (2A) incidence-frequency counts data +ChaoSpecies(ChaoSpeciesData$Inci_count,"incidence_freq_count",k=10,conf=0.95) +# Type (2B) incidence-raw data +ChaoSpecies(ChaoSpeciesData$Inci_raw,"incidence_raw",k=10,conf=0.95) +} +\references{ +Chao, A., and Chiu, C. H. (2012). Estimation of species richness and shared species richness. In N. Balakrishnan (ed). Methods and Applications of Statistics in the Atmospheric and Earth Sciences. p.76-111, Wiley, New York.\cr\cr +Chao, A., and Chiu, C. H. (2016). Nonparametric estimation and comparison of species richness. Wiley Online Reference in the Life Science. In: eLS. John Wiley and Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0026329.\cr\cr +Chao, A., and Chiu, C. H. (2016). Species richness: estimation and comparison. Wiley StatsRef: Statistics Reference Online. 1-26.\cr\cr +Chiu, C. H., Wang Y. T., Walther B. A. and Chao A. (2014). An improved non-parametric lower bound of species richness via the Good-Turing frequency formulas. Biometrics, 70, 671-682. \cr\cr +Gotelli, N. G. and Chao, A. (2013). Measuring and estimating species richness, species diver- sity, and biotic similarity from sampling data. Encyclopedia of Biodiversity, 2nd Edition, Vol. 5, 195-211, Waltham, MA. \cr\cr +} + diff --git a/man/DivDemoData.Rd b/man/DivDemoData.Rd new file mode 100644 index 0000000..a9aadc1 --- /dev/null +++ b/man/DivDemoData.Rd @@ -0,0 +1,34 @@ +\name{DiversityData} +\alias{DiversityData} +\docType{data} +\title{ + Data for Function Diversity +} +\description{ + There are five data sets: \cr\cr + 1. Type (1) abundance data (\code{DiversityData$Abu}) \cr\cr + The data include a column of the observed tree abundances/frequencies from an old-growth rain forest in Costa Rica (Chao et al. 2005, 2008). There were 69 tree species among 557 individuals. \cr\cr + 2. Type (1A) abundance-frequency counts data (\code{DiversityData$Abu_count}) \cr\cr + The data consist of the observed beetles species abundance-frequency counts collected from the Osa old-growth forest site in Costa Rica (Janzen, 1973). There were 112 species among 237 individuals. The input abundance-frequency counts data are arranged as = (1 84 2 10 3 4 4 3 ... 42 1); each number needs to be separated by at least one blank space or separated by rows. Here the first pair (1, 84) indicates that there are 84 singletons, the second pair (2, 10) indicates there are 10 doubletons, and so on, with the last pair (42, 1) indicating that there is one species that is represented by 42 individuals. \cr\cr + 3. Type (2) incidence-frequency data (\code{DiversityData$Inci}) \cr\cr + The single-column data include the observed incidence-based frequencies of tropical rainforest ants collected by Berlese extraction of soil samples (217 sampling units) in Costa Rica (Longino et al. 2002). In the input data, the entry in the first row denotes the number of sampling units (217); the subsequent 117 rows denote species incidence frequencies of the observed species. \cr\cr + 4. Type (2A) incidence-frequency counts data (\code{DiversityData$Inci_freq_count}) \cr\cr + The seed-bank data consist of the observed species incidence-based frequency counts of seedlings that germinated from soil samples (Butler and Chazdon, 1998); here each soil sample is regarded as a sampling unit. A total of 34 species of seedlings were found in the 121 soil samples. The incidence frequency counts are read as = (121 1 3 2 2 3 3 ... 61 1); each number needs to be separated by at least one blank space or by separated by rows. The first entry, indicating that there are 121 soil samples, is followed by the 18 pairs (1, 3), (2, 2), (3, 3), (4, 3), (5, 1), (6, 5), and so on, up to (61, 1). Here (1, 3) indicates that there are 3 unique species, (2, 2) indicates there are 2 duplicate species, and so on, with + (61, 1) indicating that there is one species found in 61 soil samples. \cr\cr + 5. Type (2B) incidence-raw data (\code{DiversityData$Inci_raw}) \cr\cr + The data consist of raw incidence data of the seed-bank records, described above for the incidence frequency counts data. The input data include a 34 x 121 (species-by-sampling-unit) matrix. For each element of the matrix, "1" means a detection and "0" means a non-detection. + + +} +\usage{data(DiversityData)} + +\references{ + Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J. (2005). A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecology Letters, 8, 148-159. \cr\cr + Chao, A., Jost, L., Chiang, S.-C., Jiang, Y.-H. and Chazdon, R. L. (2008). A Two-stage probabilistic approach to multiple-community similarity indices. Biometrics, 64, 1178-1186. \cr\cr + Janzen, D. H. (1973) Sweep samples of tropical foliage insects: description of study sites, with data on species abundances and size distributions . Ecology, 54, 659-686. \cr\cr + Longino, J. T., Coddington, J. A. and Colwell, R. K. (2002). The ant fauna of a tropical rain forest: estimating species richness three different ways. Ecology, 83, 689-702. \cr\cr + Butler, B. J., and Chazdon, R. L. (1998). Species richness, spatial variation, and abundance of the + soil seed bank of a secondary tropical rain forest. Biotropica, 30, 214-222. \cr\cr +} + +\keyword{datasets} diff --git a/man/Diversity.Rd b/man/Diversity.Rd new file mode 100644 index 0000000..9fd47bc --- /dev/null +++ b/man/Diversity.Rd @@ -0,0 +1,56 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{Diversity} +\alias{Diversity} +\title{Estimation of species diversity (Hill numbers)} +\usage{ +Diversity(data, datatype = c("abundance", "abundance_freq_count", + "incidence_freq", "incidence_freq_count", "incidence_raw"), q = NULL) +} +\arguments{ +\item{data}{a matrix/data.frame of species abundances/incidences.\cr} + +\item{datatype}{type of input data, "abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count" or "incidence_raw". \cr} + +\item{q}{a vector of nonnegative numbers specifying the diversity orders for which Hill numbers will be estimated. If \code{NULL}, then +Hill numbers will be estimated at order q from 0 to 3 with increments of 0.25.} +} +\value{ +a list of seven objects: \cr\cr +\code{$Basic_data} for summarizing data information. \cr\cr +\code{$Species_richness} for showing various species richness estimates along with related statistics. \cr\cr +\code{$Shannon_index} and \code{$Shannon_diversity} for showing various Shannon index/diversity estimates. \cr\cr +\code{$Simpson_index} and \code{$Simpson_diversity} for showing two Simpson index/diversity estimates. \cr\cr +\code{$Hill_numbers} for showing Hill number (diversity) estimates of diversity orders specified in the argument \code{q}. \cr\cr +} +\description{ +\code{Diversity}: Estimating a continuous diversity profile in one community including species rich- +ness, Shannon diversity and Simpson diversity). This function also supplies plots of empirical and +estimated continuous diversity profiles. Various estimates for Shannon entropy and the Gini- +Simpson index are also computed. All five types of data are supported: Type (1) abundance data +(datatype="abundance"), Type (1A) abundance-frequency counts +(datatype="abundance_freq_count"), Type (2) incidence-frequency data (datatype = +"incidence_freq"), Type (2A) incidence-frequency counts (datatype="incidence_freq_count"), and +Type (2B) incidence-raw data (datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +} +\examples{ +\dontrun{ +data(DiversityData) +# Type (1) abundance data +Diversity(DiversityData$Abu,"abundance",q=c(0,0.5,1,1.5,2)) +# Type (1A) abundance-frequency counts data +Diversity(DiversityData$Abu_count,"abundance_freq_count",q=seq(0,3,by=0.5)) +# Type (2) incidence-frequency data +Diversity(DiversityData$Inci,"incidence_freq",q=NULL) +# Type (2A) incidence-frequency counts data +Diversity(DiversityData$Inci_freq_count,"incidence_freq_count",q=NULL) +# Type (2B) incidence-raw data +Diversity(DiversityData$Inci_raw,"incidence_raw",q=NULL) +} +} +\references{ +Chao, A., and Chiu, C. H. (2012). Estimation of species richness and shared species richness. In N. Balakrishnan (ed). Methods and Applications of Statistics in the Atmospheric and Earth Sciences. p.76-111, Wiley, New York.\cr\cr +Chao, A. and Jost, L. (2015). Estimating diversity and entropy profiles via discovery rates of new species. Methods in Ecology and Evolution, 6, 873-882.\cr\cr +Chao, A., Wang, Y. T. and Jost, L. (2013). Entropy and the species accumulation curve: a novel estimator of entropy via discovery rates of new species. Methods in Ecology and Evolution 4, 1091-1110.\cr\cr +} + diff --git a/man/Genetics.Rd b/man/Genetics.Rd new file mode 100644 index 0000000..497ae6c --- /dev/null +++ b/man/Genetics.Rd @@ -0,0 +1,54 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{Genetics} +\alias{Genetics} +\title{Estimation of genetic differentiation measures} +\usage{ +Genetics(X, q = 2, nboot = 200) +} +\arguments{ +\item{X}{a matrix, or a data.frame of allele frequencies.} + +\item{q}{a specified order to use to compute pairwise dissimilarity measures. If \code{q = 0}, this function computes the estimated pairwise Jaccard and Sorensen dissimilarity indices. +If \code{q = 1}, this function computes the estimated pairwise equal-weighted and size-weighted Horn indices; +If \code{q = 2}, this function computes the estimated pairwise Morisita-Horn and regional species-diffrentiation indices.} + +\item{nboot}{an integer specifying the number of bootstrap replications.} +} +\value{ +a list of ten objects: \cr\cr +\code{$info} for summarizing data information.\cr\cr +\code{$Empirical_richness} for showing the observed values of the richness-based dis-similarity indices +including the classic Jaccard and Sorensen indices. \cr\cr +\code{$Empirical_relative} for showing the observed values of the equal-weighted dis-similarity +indices for comparing allele relative abundances including Gst, Horn, Morisita-Horn and regional differentiation measures. \cr \cr +\code{$Empirical_WtRelative} for showing the observed value of the dis-similarity index for +comparing size-weighted allele relative abundances, i.e., Horn size-weighted measure based on Shannon-entropy under equal-effort sampling. \cr\cr +The corresponding three objects for showing the estimated dis-similarity indies are: \cr +\code{$estimated_richness}, \code{$estimated_relative} and \code{$estimated_WtRelative}. \cr\cr +\code{$pairwise} and \code{$dissimilarity.matrix} for showing respectively the pairwise dis-similarity +estimates (with related statistics) and the dissimilarity matrix for various measures depending on +the diversity order \code{q} specified in the function argument. \cr\cr +\code{$q} for showing which diversity order \code{q} to compute pairwise dissimilarity. +} +\description{ +\code{Genetics}: Estimation allelic differentiation among subpopulations based on multiple-subpopulation +genetics data. The richness-based indices include the classic Jaccard and Sorensen dissimilarity +indices; the abundance-based indices include the conventional Gst measure, Horn, Morisita-Horn +and regional species-differentiation indices. \cr\cr +Only Type (1) abundance data (datatype="abundance") is supported; input data for each sub-population +include sample frequencies in an empirical sample of individuals. When there are multiple subpopulations, input data consist of an allele-by-subpopulation frequency matrix. +} +\examples{ +\dontrun{ +# Type (1) abundance data +data(GeneticsDataAbu) +Genetics(GeneticsDataAbu,q=2,nboot=200) +} +} +\references{ +Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471.\cr\cr +Jost, L. (2008). \eqn{G_{ST}} and its relatives do not measure differentiation. Molecular Ecology, 17, 4015-4026.\cr\cr +} + diff --git a/man/GeneticsDataAbu.Rd b/man/GeneticsDataAbu.Rd new file mode 100644 index 0000000..4e798f9 --- /dev/null +++ b/man/GeneticsDataAbu.Rd @@ -0,0 +1,18 @@ +\name{GeneticsDataAbu} +\alias{GeneticsDataAbu} +\docType{data} +\title{ +Human allele frequency data for Function Genetics +} +\description{ +The data taken from Rosenberg et al. (2002) consist of allele frequencies from four human +subpopulations (BiakaPyg, Palestin, Bedouin and Druze). The data are formatted as an allele +(row) by subpopulation (column) matrix file. Entries in each row denote the frequencies of each +allele found in the four subpopulations. The data include an observed allele frequency table with 27 rows and 4 columns. +} +\usage{data(GeneticsDataAbu)} + +\references{ +Rosenberg, N. A., Pritchard, J. K., Weber, J. L., Cann, H. M., Kidd, K. K., Zhivotovsky, L. A. and Feldman, M. W. (2002). Genetic structure of human populations. Science, 298, 2381-2385. +} +\keyword{datasets} diff --git a/man/MulComSimDemoData.Rd b/man/MulComSimDemoData.Rd new file mode 100644 index 0000000..beb0f75 --- /dev/null +++ b/man/MulComSimDemoData.Rd @@ -0,0 +1,24 @@ +\name{SimilarityMultData} +\alias{SimilarityMultData} +\docType{data} +\title{ + Data for Function SimilarityMult +} +\description{ + There are three data sets: \cr\cr +1. Type (1) abundance data (\code{SimilarityMultData$Abu}) \cr\cr +The data include the observed species frequencies of three communities: seedlings (column 1), saplings (column 2) and trees (column 3) collected from an old-growth rain forest; see Chao et al. (2005, 2008). The three entries in each row are the observed frequency (or abundance) of each species from the three communities. \cr\cr +2. Type (2) incidence-frequency data (\code{SimilarityMultData$Inci}) \cr\cr + The data include the observed incidence frequencies of tropical rainforest ants using three sampling techniques: (a) Berlese extraction of soil samples (217 samples), (b) fogging samples from canopy fogging (459 samples), and (c) Malaise trap samples for flying and crawling insects (62 samples); The data were collected in Costa-Rica (Longino et al. 2002). The three entries in the first row of the input data denote the number of sampling units (217, 459 and 62). Beginning with the second row, the three numbers in each row denotes incidence frequencies (the total number of detections) in the samples based on three sampling techniques. \cr\cr +3. Type (2B) incidence-raw data (\code{SimilarityMultData$Inci_raw}) \cr\cr + The data include the observed soil ciliate species detection/non-detection data for a total of 51 soil samples from three areas of Namibia, Africa: Etosha Pan (19 samples), Central Namib Desert (17 samples) and Southern Namib Desert (15 samples). The raw detection/non-detection data include 365 x 51 matrix of 0's and 1's (0 denotes a non-detection and 1 denotes a detection). +} +\usage{data(SimilarityMultData)} + +\references{ + Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J. (2005). A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecology Letters, 8, 148-159.\cr\cr + Chao, A., Jost, L., Chiang, S.-C., Jiang, Y.-H. and Chazdon, R. L. (2008). A Two-stage probabilistic approach to multiple-community similarity indices. Biometrics, 64, 1178-1186. \cr\cr + Longino, J. T., Coddington, J. A. and Colwell, R. K. (2002). The ant fauna of a tropical rain forest: estimating species richness three different ways. Ecology, 83, 689-702. \cr\cr + Foissner, W., Agatha, S. and Berger, H. (2002). Soil Ciliates (Protozoa, Ciliophora) from Namibia (Southwest Africa), with emphasis on two contrasting environments, the Etosha Region and the Namib Desert. Denisia, 5, 1-1459. +} +\keyword{datasets} diff --git a/man/SharedSpecDemoData.Rd b/man/SharedSpecDemoData.Rd new file mode 100644 index 0000000..4ca3685 --- /dev/null +++ b/man/SharedSpecDemoData.Rd @@ -0,0 +1,25 @@ +\name{ChaoSharedData} +\alias{ChaoSharedData} +\docType{data} +\title{ + Data for Function ChaoShared +} +\description{ + There are three data sets: \cr\cr +1. Type (1) abundance data (\code{ChaoSharedData$Abu}) \cr\cr + The data consist of the observed bird abundances/frequencies collected from two estuaries (Chao et al. 2000). For each species (row), the entry of the first column is the observed species frequency from Estuary I, and the second column is the observed species frequency from Estuary II. The species checklist includes 201 species, so the entry data includes a matrix of 201 rows and 2 columns. \cr\cr + +2. Type (2) incidence-frequency data (\code{ChaoSharedData$Inci}) \cr\cr + The data consist of bird incidence (detection/non-detection) frequencies observed in 2015 (by 16 teams) and 2016 (by 17 teams) in the Hong Kong Bird Race. Each team is regarded as a sampling unit. Unlike the abundance data, the numbers of sampling units (16 and 17 for these data) are specified in the first row. Beginning with the second row, the entry of the first column is the observed incidence frequency (the total number of detections among all teams) of a given species in 2015, and the entry of the second column is the observed incidence frequency of the same species in 2016. A 280-species checklist was used, thus the input data consist of 281 rows (the first entry records the number of sampling units) and 2 columns. \cr\cr +3. Type (2B) incidence-raw data (\code{ChaoSharedData$Inci_raw}) \cr\cr + The data consist of raw detection/non-detection records of bird species in 2015 (by 16 teams) and 2016 (by 17 teams) in the Hong Kong Bird Race. A 280-species checklist was used. The raw data consist of a 280 x 33 (species-by-sampling-unit) matrix with element 1's (detection) or 0's (non-detection). Each row of the matrix refers to the detection/non-detection records of the same species so that the information about shared species can be computed. The first 16 columns of the matrix denote the species detection/non-detection data by 16 teams in 2015, and the next 17 columns denote the species detection/non-detection data by 17 teams in 2016. +} +\usage{data(ChaoSharedData)} + +\references{ + Chao, A., Hwang, W.-H., Chen, Y.-C. and Kuo. C.-Y. (2000). Estimating the number of shared species in two communities. Statistica Sinica, 10, 227-246. \cr\cr + World Wildlife Fund (WWF) for Nature, Hong Kong. Bird Bird Race. \cr + http://www.wwf.org.hk/en/getinvolved/hkbbr/. Assessed on July 26, 2016 +} + +\keyword{datasets} diff --git a/man/SimilarityMult.Rd b/man/SimilarityMult.Rd new file mode 100644 index 0000000..c442e75 --- /dev/null +++ b/man/SimilarityMult.Rd @@ -0,0 +1,76 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{SimilarityMult} +\alias{SimilarityMult} +\title{Estimation of multiple-community similarity measures} +\usage{ +SimilarityMult(X, datatype = c("abundance", "incidence_freq", + "incidence_raw"), units, q = 2, nboot = 200, goal = "relative") +} +\arguments{ +\item{X}{a matrix/data.frame of species abundances/incidences.\cr} + +\item{datatype}{type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr} + +\item{units}{number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data. \cr} + +\item{q}{a specified order to use to compute pairwise similarity measures. If \code{q = 0}, this function computes the estimated pairwise richness-based Jaccard and +Sorensen similarity indices. +If \code{q = 1} and \code{goal=relative}, this function computes the estimated pairwise equal-weighted and size-weighted Horn indices based on Shannon entropy; +If \code{q = 1} and \code{goal=absolute}, this function computes the estimated pairwise Shannon-entropy-based measure for comparing absolute abundances. If \code{q = 2} and \code{goal=relative}, +this function computes the estimated pairwise Morisita-Horn and regional species-overlap indices based on species relative abundances. +If \code{q = 2} and \code{goal=absolute}, +this function computes the estimated pairwise Morisita-Horn and regional species-overlap indices based on species absolute abundances.} + +\item{nboot}{an integer specifying the number of bootstrap replications.} + +\item{goal}{a specified estimating goal to use to compute pairwise similarity measures:comparing species relative abundances (\code{goal=relative}) or comparing species absolute abundances (\code{goal=absolute}). \cr\cr} +} +\value{ +a list of fourteen objects: \cr\cr +\code{$datatype} for showing the specified data types (abundance or incidence).\cr\cr +\code{$info} for summarizing data information.\cr\cr +\code{$Empirical_richness} for showing the observed values of the richness-based similarity indices +include the classic \eqn{N}-community Jaccard and Sorensen indices. \cr\cr +\code{$Empirical_relative} for showing the observed values of the equal-weighted similarity indices +for comparing species relative abundances including Horn, Morisita-Horn and regional overlap measures. \cr \cr +\code{$Empirical_WtRelative} for showing the observed value of the Horn similarity index for comparing +size-weighted species relative abundances based on Shannon entropy under equal-effort sampling. \cr\cr +\code{$Empirical_absolute} for showing the observed values of the similarity indices for comparing +absolute abundances. These measures include the Shannon-entropy-based measure, Morisita-Horn and the regional species-overlap measures based on species absolute abundance, as well as the \eqn{N}-community Bray-Curtis index. +All measures are valid only under equal-effort sampling. \cr\cr +The corresponding four objects for showing the estimated similarity indices are: +\code{$estimated_richness}, \code{$estimated_relative}, \code{$estimated_WtRelative} and \code{$estimated_absolute}. \cr\cr +\code{$pairwise} and \code{$similarity.matrix} for showing respectively the pairwise dis-similarity +estimates (with related statistics) and the similarity matrix for various measures depending on the +diversity order \code{q} and the \code{goal} aspecified in the function arguments. \cr\cr +\code{$goal} for showing the goal specified in the argument goal (absolute or relative) used to compute pairwise similarity.\cr\cr +\code{$q} for showing which diversity order \code{q} specified to compute pairwise similarity. \cr\cr +} +\description{ +\code{SimilarityMult}: Estimation various \eqn{N}-community similarity indices. The richness-based indices +include the classic \eqn{N}-community Jaccard and Sorensen indices; the abundance-based indices include the Horn, Morisita-Horn, regional species-overlap, and the \eqn{N}-community Bray-Curtis indices. +Three types of data are supported: Type (1) abundance data (datatype="abundance"), Type (2) +incidence-frequency data (datatype="incidence_freq"), and Type (2B) incidence-raw data +(datatype="incidence_raw"); see \code{SpadeR-package} details for data input formats. +} +\examples{ +\dontrun{ +data(SimilarityMultData) +# Type (1) abundance data +SimilarityMult(SimilarityMultData$Abu,"abundance",q=2,nboot=200,"relative") +# Type (2) incidence-frequency data +SimilarityMult(SimilarityMultData$Inci,"incidence_freq",q=2,nboot=200,"relative") +# Type (2B) incidence-raw data +SimilarityMult(SimilarityMultData$Inci_raw,"incidence_raw", +units=c(19,17,15),q=2,nboot=200,"relative") +} +} +\references{ +Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471.\cr\cr +Chiu, C. H., Jost, L. and Chao, A. (2014). Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecological Monographs, 84, 21-44.\cr\cr +Gotelli, N. G. and Chao, A. (2013). Measuring and estimating species richness, species diver- sity, +and biotic similarity from sampling data. Encyclopedia of Biodiversity, 2nd Edition, Vol. 5, 195-211, Waltham, MA. +} + diff --git a/man/SimilarityPair.Rd b/man/SimilarityPair.Rd new file mode 100644 index 0000000..98cebf5 --- /dev/null +++ b/man/SimilarityPair.Rd @@ -0,0 +1,64 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/spader.R +\name{SimilarityPair} +\alias{SimilarityPair} +\title{Estimation of two-assemblage similarity measures} +\usage{ +SimilarityPair(X, datatype = c("abundance", "incidence_freq", + "incidence_raw"), units, nboot = 200) +} +\arguments{ +\item{X}{a matrix/data.frame of species abundances/incidences.\cr} + +\item{datatype}{type of input data, "abundance", "incidence_freq" or "incidence_raw". \cr} + +\item{units}{number of sampling units in each community. For \code{datatype = "incidence_raw"}, users must specify the number of sampling units taken from each community. This argument is not needed for "abundance" and "incidence_freq" data. \cr} + +\item{nboot}{an integer specifying the number of replications.} +} +\value{ +a list of ten objects: \cr\cr +\code{$datatype} for showing the specified data types (abundance or incidence). \cr\cr +\code{$info} for summarizing data information. \cr\cr +\code{$Empirical_richness} for showing the observed values of the richness-based similarity indices +include the classic two-community Jaccard and Sorensen indices. \cr\cr +\code{$Empirical_relative} for showing the observed values of the equal-weighted similarity indices +for comparing species relative abundances including Horn, Morisita-Horn, regional overlap, +Chao-Jaccard and Chao-Sorensen abundance (or incidence) measures based on species relative abundances. \cr \cr +\code{$Empirical_WtRelative} for showing the observed value of the Horn similarity index for comparing +size-weighted species relative abundances based on Shannon entropy under equal-effort sampling. \cr\cr +\code{$Empirical_absolute} for showing the observed values of the similarity indices for comparing +absolute abundances. These measures include the Shannon-entropy-based measure, +Morisita-Horn and the regional overlap measures based on species absolute abundances, as well as the Bray-Curtis index. +All measures are valid only under equal-effort sampling. \cr\cr +The corresponding four objects for showing the estimated similarity indices are: +\code{$estimated_richness}, \code{$estimated_relative}, \code{$estimated_WtRelative} and \code{$estimated_Absolute}. \cr\cr +} +\description{ +\code{SimilarityPair}: Estimation various similarity indices for two assemblages. The richness-based +indices include the classic two-community Jaccard and Sorensen indices; the abundance-based +indices include the Horn, Morisita-Horn, regional species-overlap, two-community Bray-Curtis and the +abundance-based Jaccard and Sorensen indices. Three types of data are supported: Type (1) +abundance data (datatype="abundance"), Type (2) incidence-frequency data +(datatype="incidence_freq"), and Type (2B) incidence-raw data (datatype="incidence_raw"); see +\code{SpadeR-package} details for data input formats. +} +\examples{ +\dontrun{ +data(SimilarityPairData) +# Type (1) abundance data +SimilarityPair(SimilarityPairData$Abu,"abundance",nboot=200) +# Type (2) incidence-frequency data +SimilarityPair(SimilarityPairData$Inci,"incidence_freq",nboot=200) +# Type (2B) incidence-raw data +SimilarityPair(SimilarityPairData$Inci_raw,"incidence_raw",units=c(19,17),nboot=200) +} +} +\references{ +Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J. (2005). A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecology Letters, 8, 148-159.\cr\cr +Chao, A., and Chiu, C. H. (2016). Bridging the variance and diversity decomposition approaches to beta diversity via similarity and differentiation measures. Methods in Ecology and Evolution, 7, 919-928. \cr\cr +Chao, A., Jost, L., Hsieh, T. C., Ma, K. H., Sherwin, W. B. and Rollins, L. A. (2015). Expected +Shannon entropy and Shannon differentiation between subpopulations for neutral genes under the finite island model. Plos One, 10:e0125471. \cr\cr +Chiu, C. H., Jost, L. and Chao, A. (2014). Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. Ecological Monographs, 84, 21-44.\cr\cr +} + diff --git a/man/SpecDemoData.Rd b/man/SpecDemoData.Rd new file mode 100644 index 0000000..99a8164 --- /dev/null +++ b/man/SpecDemoData.Rd @@ -0,0 +1,32 @@ +\name{ChaoSpeciesData} +\alias{ChaoSpeciesData} +\docType{data} +\title{ + Data for Function ChaoSpecies +} +\description{ + There are five data sets: \cr\cr + 1. Type (1) abundance data (\code{ChaoSpeciesData$Abu}) \cr\cr + The data consist of 25 birds abundances/frequencies in a sample (Magurran, 1988, p.152). Their observed frequencies are respectively 752, 276, 194, 126, 121, 97, 95, 83, 72, 44, 39, 16, 15, 13, 9, 9, 9, 8, 7, 4, 2, 2, 1, 1, 1. \cr\cr + 2. Type (1A) abundance-frequency counts data (\code{ChaoSpeciesData$Abu_count}) \cr\cr + The data consist of the observed species abundance distribution of endangered and rare vascular plant species in the central portion of the southern Appalachian region (Miller and Wiegert, 1989). A total of 188 species were recorded out of 1008 individuals compiled over a span of 150 years of field observations. The data are read as: (1 61 2 35 3 18 4 12 ... 67 1); each number needs to be separated by at least one blank space or by separated by rows. Here the first pair (1, 61) indicates that there are 61 singletons, the second pair (2, 35) indicates there are 35 doubletons, and so on, with the last pair (67, 1) indicating that there is one species that is represented by 67 individuals. \cr\cr + 3. Type (2) incidence-frequency data (\code{ChaoSpeciesData$Inci}) \cr\cr + The data include seed-bank records taken from Butler and Chazdon (1998). There were 121 soil samples (each soil sample is regarded as a sampling unit) and species of seedlings that germinated from each soil sample were recorded. A total of 34 species of seedlings were found in the 121 soil samples. In the input data, the entry in the first row denotes the number of sampling units. Then, beginning with the second row, each row records the species incidence frequency (i.e., the number of soil samples in which the seedlings were found) of a given species in all 121 soil samples. The ordering of data entries does not affect the analysis. \cr\cr + 4. Type (2A) incidence-frequency counts data (\code{ChaoSpeciesData$Inci_freq_count}) \cr\cr + The data consist of cottontail capture-recapture data provided in Edwards and Eberhardt (1967) to illustrate that species richness estimation can be applied to estimate the size of a population. An "individual" animal in capture-recapture studies corresponds to a "species" in the richness estimation. A total of 142 captures were recorded for 76 distinct rabbits in 18 trapping nights. For these data, the incidence frequency counts (\eqn{Q_1} to \eqn{Q_7}) were 43, 16, 8, 6, 0, 2, 1. The input data are read as follows: + (18 1 43 2 16 3 8 4 6 6 2 7 1); each number needs to be separated by at least one blank space or separated by rows. Here the pair (1, 43) indicates that there are 43 unique species, the next pair (2, 16) indicates there are 16 duplicate species, and so on. \cr\cr + 5. Type (2B) incidence-raw data (\code{ChaoSpeciesData$Inci_raw}) \cr\cr + In the cottontail capture-recapture experiments conducted by Edwards and Eberhardt (1967), a total of 76 distinct individuals (regarded as 76 "species") were found in 18 trapping nights. The incidence-raw data consist of a capture/non-capture matrix (where "1" means a capture and "0" means a non-capture) with 76 rows and 18 columns. If we regard this capture-recapture matrix as a species-by-sampling-unit matrix, then species richness estimation can be applied to estimate the size of the cottontail population. +} +\usage{data(ChaoSpeciesData)} + +\references{ + Magurran, A. E. (1988). Ecological Diversity and Its Measurement. + Princeton University Press, Princeton, New Jersey. \cr\cr + Miller, R. I. and Wiegert, R. G. (1989). Documenting completeness, species-area relations, and the species-abundance distribution of a regional flora. Ecology, 70, 16-22. \cr\cr + Butler, B. J., and Chazdon, R. L. (1998). Species richness, spatial variation, and abundance of + the soil seed bank of a secondary tropical rain forest. Biotropica, 30, 214-222.\cr\cr + Edwards, W. R. and Eberhardt, L. (1967). Estimating cottontail abundance from live trapping data. The Journal of Wildlife Management, 31, 87-96. +} + +\keyword{datasets} diff --git a/man/TwoComSimDemoData.Rd b/man/TwoComSimDemoData.Rd new file mode 100644 index 0000000..3cded9e --- /dev/null +++ b/man/TwoComSimDemoData.Rd @@ -0,0 +1,25 @@ +\name{SimilarityPairData} +\alias{SimilarityPairData} +\docType{data} +\title{ + Data for Function SimilarityPair +} +\description{ + There are three data sets: \cr\cr + 1. Type (1) abundance data (\code{SimilarityPairData$Abu}) \cr\cr + The data include the observed species frequencies of two communities: seedlings (column 1), and trees (column 2) collected from an old-growth rain forest; see Chao et al. (2005, 2008). The two entries in each row are the observed frequency (or abundance) of each species from the two communities. (These data are subset of \code{SimilarityMultData$Abu} used in the function SimilarityMult.) \cr\cr + 2. Type (2) incidence-frequency data (\code{SimilarityPairData$Inci}) \cr\cr + The data include the observed incidence frequencies of tropical rainforest ants based on two sampling techniques: (a) Berlese extraction of soil samples (217 samples), and (b) Malaise trap samples for flying and crawling insects (62 samples); see Longino et al. (2002). The two entries in first row of the input data denote the number of sampling units (217 and 62). Beginning with the second row, the two numbers in each row denotes incidence frequencies (the total number of detections) in the soil samples based on the two sampling techniques. (These data are subset of \code{SimilarityMultData$Inci} used in the function SimilarityMult.) \cr\cr + 3. Type (2B) incidence-raw data (\code{SimilarityPairData$Inci_raw}) \cr\cr + The data include the observed soil ciliate species detection/non-detection data for a total of 36 soil samples from two areas of Namibia, Africa: Etosha Pan (19 samples), and Central Namib Desert (17 samples). The raw detection/non-detection data include 365 x 36 matrix of 0's and 1's (0 denotes a non-detection and 1 denotes a detection). (These data are subset of \code{SimilarityMultData$Inci_raw} used in the function SimilarityMult.) +} +\usage{data(SimilarityPairData)} + +\references{ + Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J. (2005). A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecology Letters, 8, 148-159.\cr\cr + Chao, A., Jost, L., Chiang, S.-C., Jiang, Y.-H. and Chazdon, R. L. (2008). A Two-stage probabilistic approach to multiple-community similarity indices. Biometrics, 64, 1178-1186.\cr\cr + Longino, J. T., Coddington, J. A. and Colwell, R. K. (2002). The ant fauna of a tropical rain forest: estimating species richness three different ways. Ecology, 83, 689-702. \cr\cr + Foissner, W., Agatha, S. and Berger, H. (2002) Soil Ciliates (Protozoa, Ciliophora) from Namibia (Southwest Africa), with emphasis on two Contrasting environments, the Etosha Region and the Namib Desert. Denisia, 5, 1-1459. +} + +\keyword{datasets} diff --git a/man/spader-package.Rd b/man/spader-package.Rd new file mode 100644 index 0000000..a6482bb --- /dev/null +++ b/man/spader-package.Rd @@ -0,0 +1,49 @@ +\name{SpadeR-package} +\alias{SpadeR-package} +\alias{SpadeR} +\docType{package} +\title{Species-richness prediction and diversity estimation with R +} +\description{ +Provides simple functions to compute various biodiversity indices and related (dis)similarity measures based on individual-based (abundance) data or sampling-unit-based (incidence) data taken from one or multiple communities/assemblages. \cr\cr + +This package contains six main functions: \cr\cr +1. \code{ChaoSpecies} (estimating species richness for one community). \cr\cr +2. \code{Diversity} (estimating a continuous diversity profile and various diversity indices in one community including species richness, Shannon +diversity and Simpson diversity). This function also features plots of empirical and estimated continuous diversity profiles. \cr\cr +3. \code{ChaoShared} (estimating the number of shared species between two communities). \cr\cr +4. \code{SimilartyPair} (estimating various similarity indices between two assemblages). Both richness- and abundance-based two-community similarity indices are included. \cr\cr +5. \code{SimilarityMult} (estimating various similarity indices among \eqn{N} communities). Both richness- and abundance-based \eqn{N}-community similarity indices are included. \cr\cr +6. \code{Genetics} (estimating allelic dissimilarity/differentiation among sub-populations based on multiple-subpopulation genetics data). \cr\cr +Except for the \code{Genetics} function, there are at least three types of data are supported for each function. \cr\cr +} +\details{ +Data are generally classified as abundance data and incidence data and there are five types of data input formats options (datatype="abundance", "abundance_freq_count", "incidence_freq", "incidence_freq_count", "incidence_raw"). \cr\cr + + \describe{ + \item{A.}{Individual-based abundance data when a sample of individuals is taken from each community.} + \bold{Type (1) abundance data} (datatype = "abundance"): Input data consist of species (in rows) by community (in columns) matrix. The entries of each row are the observed abundances of a species in \eqn{N} communities. \cr\cr + \bold{Type (1A) abundance-frequency counts data} only for a single community (datatype = "abundance_freq_count"): input data are arranged as (1 \eqn{f_1 \ 2 \ f_2 \ ... \ r \ f_r})(each number needs to be separated by at least one blank space or separated by rows), where \eqn{r} denotes the maximum frequency and \eqn{f_k} denotes the number of species represented by exactly \eqn{k} individuals/times in the sample. Here the data (\eqn{f_1, f_2, ..., f_r}) are referred to as "abundance-frequency counts". + \item{B.}{Sampling-unit-based incidence data when a number of sampling units are randomly taken from each community. Only the incidence (detection/non-detection) of species is recorded in each sampling unit. There are three data formats options.} \cr\cr + \bold{Type (2) incidence-frequency data} (datatype="incidence_freq"): The first row of the input data must be the number of sampling units in each community. Beginning with the second row, input data consist of species (in rows) by community (in columns) matrix. The entries of each row are the observed incidence frequencies (the number of detections or the number of sampling units in which a species are detected) of a species in \eqn{N} communities. \cr\cr + \bold{Type (2A) incidence-frequency counts data} only for a single community (datatype="incidence \cr _freq_count"): input data are arranged as (\eqn{T \ 1 \ Q_1 \ 2 \ Q_2 \ ... \ r \ Q_r}) (each number needs to be separated by at least one blank space or separated by rows), where \eqn{Q_k} denotes the number of species that were detected in exactly \eqn{k} sampling units, while \eqn{r} denotes the number of sampling units in which the most frequent species were found. The first entry must be the total number of sampling units, \eqn{T}. The data (\eqn{Q_1, Q_2, ..., Q_r}) are referred to as "incidence frequency counts". \cr\cr + \bold{Type (2B) incidence-raw data} (datatype="incidence_raw"): Data consist of a species-by-sampling-unit incidence (detection/non-detection) matrix; typically "1" means a detection and "0" means a non-detection. Each row refers to the detection/non-detection record of a species in \eqn{T} sampling units. Users must specify the number of sampling units in the function argument "units". The first \eqn{T_1} columns of the input matrix denote species detection/non-detection data based on the \eqn{T_1} sampling units from Community 1, and the next \eqn{T_2} columns denote the detection/non-detection data based on the \eqn{T_2} sampling units from Community 2, and so on, and the last \eqn{T_N} columns denote the detection/non-detection data based on \eqn{T_N} sampling units from Community \eqn{N}, \eqn{T_1 + T_2 + ... + T_N = T}. \cr +} + +An Online version of SpadeR is also available for users without an R background: \cr +http://chao.stat.nthu.edu.tw/wordpress/software_download/softwarespader_online/. \cr +In the detailed Online SpadeR User's Guide, we illustrate all the running procedures in an easily +accessible way through numerical examples with proper interpretations of portions of the output. +All the data of those illustrative examples are included in this package. \cr\cr + +functions: ChaoSpecies, Diversity, ChaoShared, SimilarityPair, SimilarityMult, Genetics +} +\author{ +Anne Chao, K. H. Ma, T. C. Hsieh and Chun-Huo Chiu + +Maintainer: Anne Chao + +} +\keyword{ package } + +