Commit

version 0.1.3
clarkevansteenderen authored and cran-robot committed Feb 7, 2022
1 parent bb25217 commit 88450e5
Showing 29 changed files with 1,393 additions and 1,318 deletions.
16 changes: 9 additions & 7 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,23 +1,25 @@
Package: BinMat
Type: Package
Title: Processes Binary Data Obtained from Fragment Analysis
Version: 0.1.2
Title: Processes Binary Data Obtained from Fragment Analysis (Such as
AFLPs, ISSRs, and RFLPs)
Version: 0.1.3
Authors@R: person("Clarke", "van Steenderen", email = "vsteenderen@gmail.com",
role = c("aut", "cre"), comment = c(ORCID = "0000-0002-4219-446X"))
Description: A molecular genetics tool that processes binary data from fragment analysis, such as inter-simple sequence repeats (ISSRs) and amplified fragment length polymorphism (AFLP). It consolidates replicate sample pairs, outputs summary statistics, and produces hierarchical clustering trees and nMDS plots. This package was developed from the M.Sc. thesis entitled "A genetic analysis of the species and intraspecific lineages of Dactylopius Costa (Hemiptera:Dactylopiidae)" (van Steenderen, 2019, Rhodes University, Department of Zoology and Entomology, Center for Biological Control (CBC) <https://www.ru.ac.za/centreforbiologicalcontrol/>, Grahamstown, South Africa), <doi:10.13140/RG.2.2.28470.86083>. The GUI version of this package is available on the R Shiny online server at: <https://clarkevansteenderen.shinyapps.io/BINMAT/> , or it is accessible via GitHub by typing: shiny::runGitHub("BinMat", "CJMvS") into the console in R. Please see the vignette supplied with the package for a worked example, and detailed explanations of functions.
Description: A molecular genetics tool that processes binary data from fragment analysis. It consolidates replicate sample pairs, outputs summary statistics, and produces hierarchical clustering trees and nMDS plots. This package was developed from the M.Sc. thesis entitled "A genetic analysis of the species and intraspecific lineages of Dactylopius Costa (Hemiptera:Dactylopiidae)" (van Steenderen, 2019, Rhodes University, Department of Zoology and Entomology, Center for Biological Control (CBC) <https://www.ru.ac.za/centreforbiologicalcontrol/>, Grahamstown, South Africa), <doi:10.13140/RG.2.2.28470.86083>. The GUI version of this package is available on the R Shiny online server at: <https://clarkevansteenderen.shinyapps.io/BINMAT/> , or it is accessible via GitHub by typing: shiny::runGitHub("BinMat", "CJMvS") into the console in R.
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.0.2
RoxygenNote: 7.1.2
Depends: R (>= 3.0)
Imports: pvclust (>= 2.0), magrittr, MASS (>= 7.3), stats (>= 3.4.0),
graphics (>= 3.4.0), base (>= 3.4.0)
graphics (>= 3.4.0), base (>= 3.4.0), ggpubr (>= 0.4.0), tibble
(>= 3.1.4)
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2020-03-06 17:53:36 UTC; clarke
Packaged: 2022-02-07 08:00:58 UTC; s1000334
Author: Clarke van Steenderen [aut, cre]
(<https://orcid.org/0000-0002-4219-446X>)
Maintainer: Clarke van Steenderen <vsteenderen@gmail.com>
Repository: CRAN
Date/Publication: 2020-03-06 18:20:06 UTC
Date/Publication: 2022-02-07 08:10:09 UTC
56 changes: 28 additions & 28 deletions MD5
@@ -1,36 +1,36 @@
c3c5e603eadbf21ad163bf62180cfd36 *DESCRIPTION
ccf880aa567aab03603c1666b342dabd *DESCRIPTION
ddbeeb32dd02f10bd1d99a4494c4e192 *NAMESPACE
078180d085ca97c033e6289db5e04f4b *R/BinMatInput_ordination.R
082e0ca4ed12ccedc0d553e6cd29d873 *R/BinMatInput_reps.R
cfcf931474bc6c4bb028f7c9df2ce8f4 *R/check_data.R
733037d1264f7b8abb95bed40ed7dd8c *R/BinMatInput_ordination.R
47487718699dc10a65f10b434af64f98 *R/BinMatInput_reps.R
4ec7d2185ec4a8990ac13daf369cea40 *R/check_data.R
fedd397e38070517a149c174c2259e4a *R/consolidate.R
c49a809c04a650b10c98c920d78b63d0 *R/errors.R
4dd23b492eb7f0dad07e57c78935cb57 *R/errors.R
de87dddf47b0b4d0b2bd96d4bcff4fd8 *R/group_names.R
5b5b69e2a0f30a477817a45ca732166b *R/nmds.R
36835ad82e7d87067698504df1e7ccb9 *R/peakRemove.R
0759a1e7ef7b6ba7d3284cb8000d5bfb *R/peaks_consolidated.R
83eb445f1d6494cf4d48ecd95ceed98e *R/peaks_replicates.R
fa657ae34e4f0cce38d4f3ee587e0143 *R/scree.R
098a7f43822051515812735c797de625 *R/shepard.R
fb531c1ec001d1ed5470159ef4fb395c *R/upgma.R
6a99094965ca95c1d941d06b4b0d1e9d *build/vignette.rds
4632f13fc01e415503527cca4715c8f2 *R/nmds.R
3b5e29d12b637efbcf6a7ba9fac88e3a *R/peakRemove.R
a271dfa8b01fa6cd62092182594cbdad *R/peaks_consolidated.R
3237d3bfcef8fab5707c926b5f85380f *R/peaks_replicates.R
2a5f074965089cb6c27f2a53e9760ca9 *R/scree.R
6d5fc44c6ffe366925f90b5a85d2096f *R/shepard.R
02d4411f71120f7304a51fd8a8454060 *R/upgma.R
f0f46601079db3b4afdc555abd66dd70 *build/vignette.rds
a9bf62bcc5b7057d08b7050aa330f97f *data/BinMatInput_ordination.rda
f4959bc1b00cd58e0953cc6653d5c191 *data/BinMatInput_reps.rda
75a5f9380939bf25ced1af2439b30d47 *inst/doc/BinMat.R
373cf0ec38f29706a275214e0f805351 *inst/doc/BinMat.Rmd
1cfe6068ac9bb0e6c9896ffdadb1415c *inst/doc/BinMat.html
94b1f63003bf13329d7178502fc9a488 *man/BinMatInput_ordination.Rd
9aa70b4d098169b497ac0ac77c67d377 *man/BinMatInput_reps.Rd
2a5ffde37f48e0f2ba893fc8e5fe3359 *man/check.data.Rd
bbf9afef94f136d0f31d7562e9ce2275 *inst/doc/BinMat.R
bc5bc9fddd5e6e1fa4519f342ccb9afa *inst/doc/BinMat.Rmd
9fc4ad109bbd3e4d9ad50783f9556e2a *inst/doc/BinMat.html
d56e9be99a6601ca67569789d41f8616 *man/BinMatInput_ordination.Rd
75f18c1c416a63df7b204dba336a366a *man/BinMatInput_reps.Rd
7b6d01567ba5562a6f15a68b0b093536 *man/check.data.Rd
15e19aa2662e7be9219d77ec747a00b0 *man/consolidate.Rd
787b15e3d26326a6ac8abeacd1cde0e9 *man/errors.Rd
7f74098399e3e1fcde2e6ad3807272bc *man/errors.Rd
e2ee5ac37b210342fba90c4f6b40c082 *man/group.names.Rd
564fe135e1a27bde9d6d7d6dd5a8b0fe *man/nmds.Rd
c55cdd4b10c8e1322c880f61f476eceb *man/peak.remove.Rd
10e2b7f803ed4b1df735ebc9f7839128 *man/peaks.consolidated.Rd
0672df0ac30e4af122f04b85bf2dbb36 *man/peaks.original.Rd
97a6fd815ba7347a51657d68beb2c6d7 *man/scree.Rd
3b38d759ae4e116ac34b6ec378a7eedd *man/shepard.Rd
2b33a1da8d3796ada9fbe8b65a61a1d1 *man/upgma.Rd
373cf0ec38f29706a275214e0f805351 *vignettes/BinMat.Rmd
00266c04f007c996470739f2c473e9c8 *man/nmds.Rd
aa78fd00595121a36d42897de779505b *man/peak.remove.Rd
e7e49b1464fd4f9d317ec4e0b3fba90e *man/peaks.consolidated.Rd
692eaa1923b9579b3c850462ba9cdcce *man/peaks.original.Rd
57eb9c7031779018cd537ff61d8392c8 *man/scree.Rd
e3fbb4e39814d1a410351ce384f67382 *man/shepard.Rd
730e6b240744b5ccdd11c264635d6f47 *man/upgma.Rd
bc5bc9fddd5e6e1fa4519f342ccb9afa *vignettes/BinMat.Rmd
af8503436cc5b48444ef79deddfbb5a5 *vignettes/logo.png
32 changes: 16 additions & 16 deletions R/BinMatInput_ordination.R
@@ -1,16 +1,16 @@
#' Example input data containing a consolidated binary matrix with grouping information
#' @docType data
#'
#' @usage data(BinMatInput_ordination)
#'
#' @format A dataframe with columns for loci, and rows for samples. Grouping information is in the second column.
#' @examples data(BinMatInput_ordination)
#' mat = BinMatInput_ordination
#' group.names(mat)
#' scree(mat)
#' shepard(mat)
#' clrs = c("red", "green", "black")
#' shp = c(16,16,16)
#' nmds(mat, colours = clrs, shapes = shp, labs = TRUE)

"BinMatInput_ordination"
#' Example input data containing a consolidated binary matrix with groups
#' @docType data
#'
#' @usage data(BinMatInput_ordination)
#'
#' @format A dataframe with columns for loci, and rows of replicate pairs. Grouping information is in the second column.
#' @examples data(BinMatInput_ordination)
#' mat = BinMatInput_ordination
#' group.names(mat)
#' scree(mat)
#' shepard(mat)
#' clrs = c("red", "green", "black")
#' shp = c(16,16,16)
#' nmds(mat, colours = clrs, shapes = shp, labs = TRUE)

"BinMatInput_ordination"
38 changes: 19 additions & 19 deletions R/BinMatInput_reps.R
Original file line number Diff line number Diff line change
@@ -1,19 +1,19 @@
#' Example input data containing a binary matrix comprising replicate sample pairs
#'
#' @docType data
#'
#' @usage data(BinMatInput_reps)
#'
#' @format A dataframe with columns for loci, and rows for replicate sample pairs.
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' check.data(mat)
#' cons = consolidate(mat)
#' pks = peaks.consolidated(cons)
#' err = errors(cons)
#' rem = peak.remove(cons, 4)
#' clust = upgma(cons)
#'
#'
"BinMatInput_reps"
#' Example input data containing a binary matrix comprising replicate pairs
#'
#' @docType data
#'
#' @usage data(BinMatInput_reps)
#'
#' @format A dataframe with columns for loci, and rows of replicate pairs.
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' check.data(mat)
#' cons = consolidate(mat)
#' pks = peaks.consolidated(cons)
#' err = errors(cons)
#' rem = peak.remove(cons, 4)
#' clust = upgma(cons)
#'
#'
"BinMatInput_reps"
50 changes: 25 additions & 25 deletions R/check_data.R
@@ -1,25 +1,25 @@
#' @title Checks binary matrix for unwanted characters.
#'
#' @description Checks for unwanted values (other than 1, 0, and ?) in the data set.
#'
#' @param x A CSV file containing replicate sample pairs of binary data.
#'
#' @return Index positions where unwanted values occur (row, column).
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' check.data(mat)
#'
#' @export

check.data = function(x){
row.names(x) <- x[[1]]
x[,1] <- NULL
x[,] <- sapply(x[,], as.numeric)
answer = which(x != 0 & x != 1 & x != "?", arr.ind = TRUE)
if(length(answer) > 0) message(answer)
else {message("None found.")}

}


#' @title Checks binary matrix for unwanted characters.
#'
#' @description Checks for unwanted values (other than 1, 0, and ?).
#'
#' @param x A CSV file containing replicate pairs of binary data.
#'
#' @return Index positions where unwanted values occur (row, column).
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' check.data(mat)
#'
#' @export

check.data = function(x){
row.names(x) <- x[[1]]
x[,1] <- NULL
x[,] <- sapply(x[,], as.numeric)
answer = which(x != 0 & x != 1 & x != "?", arr.ind = T)
if(length(answer) > 0) print(answer)
else {writeLines("None found.")}

}
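
The check described in the roxygen block above (flag any cell that is not 1, 0, or ?) can also be sketched by comparing cells as character strings, which keeps "?" intact instead of letting as.numeric() turn it into NA. This is only an illustrative sketch: find_unwanted and the toy data frame are hypothetical names, not part of BinMat, and the sketch assumes, as check.data does, that the first column holds sample names.

```r
# Hypothetical helper (not part of BinMat): report cells that are not "0", "1" or "?".
# Assumes the first column of x holds sample names and the rest hold loci.
find_unwanted <- function(x) {
  allowed <- c("0", "1", "?")
  # Convert every locus column to character so that "?" is preserved
  cells <- do.call(cbind, lapply(x[-1], as.character))
  bad <- is.na(cells) | !(cells %in% allowed)
  dim(bad) <- dim(cells)            # %in% drops dimensions, so restore them
  # (row, col) positions; col is counted after dropping the name column
  which(bad, arr.ind = TRUE)
}

# Toy input (made up for illustration): one stray "2" in the second locus
toy <- data.frame(sample = c("A_rep1", "A_rep2"),
                  L1 = c("1", "?"),
                  L2 = c("0", "2"),
                  stringsAsFactors = FALSE)
res <- find_unwanted(toy)
print(res)
```

The result has the same (row, column) layout that which(..., arr.ind = TRUE) produces in check.data, so it could be printed or passed to message() in the same way.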


94 changes: 48 additions & 46 deletions R/errors.R
@@ -1,46 +1,48 @@
#' @title Calculates Jaccard and Euclidean error rates.
#'
#' @description Calculates the Jaccard and Euclidean error rates for the dataset. Jaccard's error does not treat shared absences of bands as biologically meaningful. Jaccard Error = (f10 + f01)/(f10 + f01 + f11) and Euclidean Error = (f10 + f01)/(f10 + f01 + f11 + f00). At each locus, f01 and f10 indicate a case where a 0 was present in one replicate and a 1 in the other, f11 indicates the shared presence of a band in both replicates, and f00 indicates a shared absence. For example, if a replicate pair comprises Rep1 = 00101 and Rep2 = 01100, Jaccard Error = (1+1)/(1+1+1) = 2/3 = 0.67 and Euclidean Error = (1+1)/(1+1+1+2) = 2/5 = 0.4.
#'
#' @param x Consolidated binary matrix.
#'
#' @return JE (Jaccard Error), EE (Euclidean Error), and standard deviations.
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' cons = consolidate(mat)
#' errors(cons)
#'
#' @export

errors = function(x){

mismatch_err = matrix(nrow=nrow(x), ncol = 1)
jacc_err = matrix(nrow=nrow(x), ncol = 1)


for(i in 1:nrow(x)) {
# find the number of 1s, 0s and question marks
ones = length(which(x[i,] == 1))
zeroes = length(which(x[i,] == 0))
questions = length(which(x[i,] == "?"))
sum_bands = ones + questions

mismatch_err[i,] = (questions/(questions + ones + zeroes))
jacc_err[i,] = (questions/(questions + ones))

}

error_table = data.frame("Errors" = matrix(ncol = 1, nrow = 8))
error_table[1,] = "Average Euclidean Error:"
error_table[2,] = round(base::mean(mismatch_err[,1]),4)
error_table[3,] = "Euclidean error St. dev:"
error_table[4,] = round(stats::sd(mismatch_err[,1]),4)
error_table[5,] = "Average Jaccard:"
error_table[6,] = round(base::mean(jacc_err[,1]),4)
error_table[7,] = "Jaccard error St.dev:"
error_table[8,] = round(stats::sd(jacc_err[,1]),4)

return(error_table)

}
#' @title Calculates Jaccard and Euclidean error rates.
#'
#' @description Calculates the Jaccard and Euclidean error rates for the dataset. Jaccard's error does not treat shared absences of bands as biologically meaningful. JE = (f10 + f01)/(f10 + f01 + f11) and EE = (f10 + f01)/(f10 + f01 + f11 + f00). At each locus, f01 and f10 indicate a case where a 0 was present in one replicate and a 1 in the other, f11 indicates the shared presence of a band in both replicates, and f00 indicates a shared absence. For example, if a replicate pair comprises Rep1 = 00101 and Rep2 = 01100, JE = (1+1)/(1+1+1) = 2/3 = 0.67 and EE = (1+1)/(1+1+1+2) = 2/5 = 0.4.
#'
#' @param x Consolidated binary matrix.
#'
#' @return JE (Jaccard Error), EE (Euclidean Error), and standard deviations.
#'
#' @examples data(BinMatInput_reps)
#' mat = BinMatInput_reps
#' cons = consolidate(mat)
#' errors(cons)
#'
#' @export

errors = function(x){

mismatch_err = matrix(nrow=nrow(x), ncol = 1)
jacc_err = matrix(nrow=nrow(x), ncol = 1)


for(i in 1:nrow(x)) {
# find the number of 1s, 0s and question marks
ones = length(which(x[i,] == 1))
zeroes = length(which(x[i,] == 0))
questions = length(which(x[i,] == "?"))
sum_bands = ones + questions

mismatch_err[i,] = (questions/(questions + ones + zeroes))
jacc_err[i,] = (questions/(questions + ones))

}

error_table = data.frame("Errors" = matrix(ncol = 2, nrow = 4))
error_table[1,1] = "Average Euclidean Error:"
error_table[1,2] = round(base::mean(mismatch_err[,1]),4)
error_table[2,1] = "Euclidean error St. dev:"
error_table[2,2] = round(stats::sd(mismatch_err[,1]),4)
error_table[3,1] = "Average Jaccard:"
error_table[3,2] = round(base::mean(jacc_err[,1]),4)
error_table[4,1] = "Jaccard error St.dev:"
error_table[4,2] = round(stats::sd(jacc_err[,1]),4)

colnames(error_table) = c("Metric", "Value")

return(error_table)

}
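
The worked example in the roxygen description (Rep1 = 00101, Rep2 = 01100) can be reproduced with a short sketch that applies the JE and EE formulas directly to one replicate pair. The helper name pair_errors and its string-based input are assumptions made for illustration; errors() itself takes a consolidated matrix and, judging from the code above, counts "?" cells per row as the mismatches (f10 + f01).

```r
# Hypothetical helper (not part of BinMat): Jaccard and Euclidean error for a
# single replicate pair, each replicate given as a string of 0s and 1s.
pair_errors <- function(rep1, rep2) {
  a <- strsplit(rep1, "")[[1]]
  b <- strsplit(rep2, "")[[1]]
  stopifnot(length(a) == length(b))

  mismatches <- sum(a != b)              # f10 + f01
  f11 <- sum(a == "1" & b == "1")        # band present in both replicates
  f00 <- sum(a == "0" & b == "0")        # band absent in both replicates

  c(JE = mismatches / (mismatches + f11),
    EE = mismatches / (mismatches + f11 + f00))
}

# Replicate pair from the description: 2 mismatches, 1 shared presence,
# 2 shared absences, so JE = 2/3 and EE = 2/5
result <- pair_errors("00101", "01100")
print(round(result, 2))
```

Because f00 appears only in the EE denominator, EE can never exceed JE for the same pair; the two are equal only when there are no shared absences.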