Commit

update threshold vs cutoff terminology
ejanalysis committed May 25, 2023
1 parent f9fa88f commit 47f6a0f
Showing 11 changed files with 56 additions and 56 deletions.
14 changes: 7 additions & 7 deletions R/RR.if.address.top.x.R
@@ -20,16 +20,16 @@
#' @param d.pct Demographic percentage, as fraction, defining what fraction of population in each place (row) is in demographic group of interest. Required.
#' @param popcounts Numeric vector of counts of total population in each place
#' @param d.pct.us xxxxx
#' @param or.tied Logical value, optional, TRUE by default, in which case ties of ranking variable with a cutoff value (value >= cutoff) are included in places within that bin.
#' @param or.tied Logical value, optional, TRUE by default, in which case ties of ranking variable with a threshold value (value >= threshold) are included in places within that bin.
#' @param if.multiply.e.by Optional, 0 by default. Specifies the number that environmental indicator values would be multiplied by
#' in the scenario where some places are addressed. Zero means those top-ranked places would have the environmental indicator set to zero,
#' while 0.9 would mean a 10 percent cut in the environmental indicator value.
#' @param zones Subsets of places such as States
#' @param mycuts optional vector of cutoff values to analyze. Default is c(50,80,90:100)
#' @param mycuts optional vector of threshold values to analyze. Default is c(50,80,90:100)
#' @param silent optional logical, default is TRUE, while FALSE means more information is printed
#' @return Returns a list of results:
#'
#' 1. rrs data.frame, one column per environmental indicator, one row per cutoff value
#' 1. rrs data.frame, one column per environmental indicator, one row per threshold value
#' 2. rrs2 data.frame, Relative risks 2
#' 3. state.tables A list
#' 4. worst.as.pct Worst as percent, vector as long as number of environmental indicators
@@ -60,9 +60,9 @@ RR.if.address.top.x <- function(rank.by.df, e.df, d.pct, popcounts, d.pct.us, or
for (mycut.i in 1:length(mycuts) ) {

if (or.tied) {
worst.x <- ( rank.by.df[ , i] >= mycuts[mycut.i]) # those above given EJ %ile cutoff, or specified cutoff
worst.x <- ( rank.by.df[ , i] >= mycuts[mycut.i]) # those above given EJ %ile threshold, or specified threshold
} else {
worst.x <- ( rank.by.df[ , i] > mycuts[mycut.i]) # those above given EJ %ile cutoff, or specified cutoff
worst.x <- ( rank.by.df[ , i] > mycuts[mycut.i]) # those above given EJ %ile threshold, or specified threshold
}
worst.x[is.na(worst.x)] <- FALSE

@@ -83,8 +83,8 @@ RR.if.address.top.x <- function(rank.by.df, e.df, d.pct, popcounts, d.pct.us, or
cat('\n')
}

# Now for a calculated cutoff, not the specified cutoff:
# Calculate the cutoff that will get RR==1, using specified rank.by.df and if.multiply.e.by, etc.:
# Now for a calculated threshold, not the specified threshold:
# Calculate the threshold that will get RR==1, using specified rank.by.df and if.multiply.e.by, etc.:

# DOES NOT WORK YET ***

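
The `or.tied` branch in the hunk above decides whether places sitting exactly on a threshold count as top-ranked. The following is a minimal sketch of that selection step only, using made-up percentile values; `select_top` and `ranks` are hypothetical names for illustration and are not part of the package.

```r
# Hypothetical helper illustrating the or.tied logic; not the package's function.
select_top <- function(ranks, threshold, or.tied = TRUE) {
  top <- if (or.tied) ranks >= threshold else ranks > threshold
  top[is.na(top)] <- FALSE  # places with a missing rank are never selected
  top
}

ranks <- c(50, 80, 95, NA, 80)          # made-up percentiles on a 0-100 scale
select_top(ranks, 80)                   # ties at 80 are included
select_top(ranks, 80, or.tied = FALSE)  # ties at 80 are excluded
```

Treating `NA` as "not selected" mirrors the `worst.x[is.na(worst.x)] <- FALSE` line in the diff.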
2 changes: 1 addition & 1 deletion R/assign.map.bins.R
@@ -2,7 +2,7 @@
#'
#' @description Takes a vector (not matrix or df) of values and returns a vector of bin numbers.
#' For creating color-coded maps (choropleths), assign each place (e.g., each row of a single column) to a bin.
#' Each bin represents one map color, and is defined by cutoff values.
#' Each bin represents one map color, and is defined by threshold values.
#' @details
#' The default bins 0-11 are defined as follows: \cr
#' bin 0: PCTILE=NA \cr
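
As a rough illustration of how threshold values can define map bins, the sketch below uses base R's `findInterval()`. The threshold values and the left-closed interval convention are assumptions chosen for the example; the package's actual default bin definitions are not reproduced here. Only the "missing values go to bin 0" rule is taken from the documentation above.

```r
# Hypothetical thresholds, for illustration only.
thresholds <- c(0, 50, 80, 90, 95)
pctile <- c(12, 50, 79.9, 93, 100, NA)

# Bin k means thresholds[k] <= value < thresholds[k + 1] (last bin is open-ended).
bins <- findInterval(pctile, thresholds)
bins[is.na(bins)] <- 0  # missing percentiles are assigned to bin 0
bins
```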
18 changes: 9 additions & 9 deletions R/flagged.R
@@ -1,10 +1,10 @@
#' @title Which rows have a value above cutoff in at least one column
#' @title Which rows have a value above threshold in at least one column
#'
#' @description Flags rows that have values above some cutoff in at least one column.
#' @description Flags rows that have values above some threshold in at least one column.
#' @details The use of na.rm=TRUE in this function means it will always ignore NA values in a given place and take the max of the valid (non-NA) values instead of returning NA when there is an NA in that row
#' @param df Data.frame with numeric values to be checked against the threshold.
#' @param cutoff Number that is the threshold that must be (met or) exceeded for a row to be flagged. Optional, default is 0.80
#' @param or.tied Logical, optional, default is TRUE, in which case a value equal to the cutoff also flags the row.
#' @param threshold Number that is the threshold that must be (met or) exceeded for a row to be flagged. Optional, default is 0.80
#' @param or.tied Logical, optional, default is TRUE, in which case a value equal to the threshold also flags the row.
#' @return Returns a logical vector or data.frame the shape of df
#' @examples
#' set.seed(999)
@@ -14,17 +14,17 @@
#' a <- cbind(any.over.0.8=x, round(places,2))
#' a[order(a[,1]),]
#' @export
flagged <- function(df, cutoff=0.80, or.tied=TRUE) {
flagged <- function(df, threshold=0.80, or.tied=TRUE) {

# CREATE A LOGICAL VECTOR THAT IS TRUE FOR EACH ROW OF DATA FRAME WHERE AT LEAST ONE VALUE IN ROW IS >= CUTOFF (or just >CUTOFF if above.only=TRUE)
# CREATE A LOGICAL VECTOR THAT IS TRUE FOR EACH ROW OF DATA FRAME WHERE AT LEAST ONE VALUE IN ROW IS >= threshold (or just > threshold if or.tied=FALSE)

# *** Be careful to check if percentiles are 0-100 or 0-1 !!! ***
if (cutoff <= 1 & any(df > 1, na.rm = TRUE) ) {warning('Cutoff is <=1 so it might be a percentage as fraction, but some of data are >1 so may percentages as 0-100 not as fraction')}
if (threshold <= 1 & any(df > 1, na.rm = TRUE) ) {warning('threshold is <=1 so it might be a percentage as a fraction, but some of the data are >1 so they may be percentages as 0-100, not fractions')}

if (or.tied) {
flag <- do.call(pmax, c(df, na.rm=TRUE)) >= cutoff
flag <- do.call(pmax, c(df, na.rm=TRUE)) >= threshold
} else {
flag <- do.call(pmax, c(df, na.rm=TRUE)) > cutoff
flag <- do.call(pmax, c(df, na.rm=TRUE)) > threshold
}
return(flag)
}
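
The row-flagging logic above can be tried in isolation. This is a standalone sketch that mirrors the `pmax` approach from the diff without the scale warning; `flag_rows` and `places` are hypothetical names and the data are made up.

```r
# Standalone sketch of the row-flagging logic; flag_rows is a hypothetical name.
flag_rows <- function(df, threshold = 0.80, or.tied = TRUE) {
  # Largest non-NA value in each row (NA only if the whole row is NA)
  row_max <- do.call(pmax, c(df, na.rm = TRUE))
  if (or.tied) row_max >= threshold else row_max > threshold
}

places <- data.frame(a = c(0.10, 0.90, 0.50), b = c(0.80, 0.20, NA))
flag_rows(places)                   # row 1 is flagged by a tie at 0.80
flag_rows(places, or.tied = FALSE)  # only row 2 is flagged
```

Because `na.rm = TRUE` is passed to `pmax`, row 3 is judged on its one valid value (0.50) rather than returning `NA`.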
30 changes: 15 additions & 15 deletions R/flagged.by.R
@@ -1,29 +1,29 @@
#' @title flagged.by
#' @description Flag which cells are at or above some cutoff.
#' @details For a matrix with a few cols of related data, find which cells are at/above (or below) some cutoff.
#' Returns a logical matrix, with TRUE for each cell that is at/above the cutoff.
#' Can be used in EJ analysis as 1st step in identifying places (rows) where some indicator(s) is/are at/above a cutoff, threshold value.
#' @description Flag which cells are at or above some threshold.
#' @details For a matrix with a few cols of related data, find which cells are at/above (or below) some threshold.
#' Returns a logical matrix, with TRUE for each cell that is at/above the threshold.
#' Can be used in EJ analysis as 1st step in identifying places (rows) where some indicator(s) is/are at/above a threshold value.
#'
#' @param x Data.frame or matrix of numbers to be compared to cutoff value.
#' @param cutoff Numeric. The threshold or cutoff to which numbers are compared. Default is arithmetic mean of row. Usually one number, but can be a vector of same length as number of rows, in which case each row can use a different cutoff.
#' @param or.tied Logical. Default is FALSE, which means we check if number in x is greater than the cutoff (>). If TRUE, check if greater than or equal (>=).
#' @param below Logical. Default is FALSE. If TRUE, uses > or >= cutoff. If FALSE, uses < or <= cutoff.
#' @param x Data.frame or matrix of numbers to be compared to threshold value.
#' @param threshold Numeric. The threshold to which numbers are compared. Default is arithmetic mean of row. Usually one number, but can be a vector of same length as number of rows, in which case each row can use a different threshold.
#' @param or.tied Logical. Default is FALSE, which means we check if number in x is greater than the threshold (>). If TRUE, check if greater than or equal (>=).
#' @param below Logical. Default is FALSE. If TRUE, uses < or <= threshold. If FALSE, uses > or >= threshold.
#' @param ... optional additional parameters to pass to [analyze.stuff::cols.above.which()]
#' @return Returns a logical matrix the same size as x.
#' @seealso cols.above.which, another name for the exact same function.
#' @seealso cols.above.count or cols.above.pct to see, for each row, count or fraction of columns with numbers at/above/below cutoff.
#' @seealso flagged.only.by to find cells that are the only one in the row that is at/above/below the cutoff.
#' @seealso cols.above.count or cols.above.pct to see, for each row, count or fraction of columns with numbers at/above/below threshold.
#' @seealso flagged.only.by to find cells that are the only one in the row that is at/above/below the threshold.
#' @seealso rows.above.count, rows.above.pct, rows.above.which
#' @examples
#' out <- flagged.by(x<-data.frame(a=1:10, b=rep(7,10), c=7:16), cutoff=7)
#' out <- flagged.by(x<-data.frame(a=1:10, b=rep(7,10), c=7:16), threshold=7)
#' x; out # default is or.tied=FALSE
#' out <- flagged.by(data.frame(a=1:10, b=rep(7,10), c=7:16), cutoff=7, or.tied=TRUE, below=TRUE)
#' out <- flagged.by(data.frame(a=1:10, b=rep(7,10), c=7:16), threshold=7, or.tied=TRUE, below=TRUE)
#' out
#' out <- flagged.by(data.frame(a=1:10, b=rep(7,10), c=7:16) )
#' # Compares each number in each row to the row's mean.
#' out
#' @note Future work: these functions could have wts, na.rm, & allow cutoffs or benchmarks as a vector (not just 1 number), & have benchnames.
#' @note Future work: these functions could have wts, na.rm, & allow cutpoints or benchmarks as a vector (not just 1 number), & have benchnames.
#' @export
flagged.by <- function(x, cutoff, or.tied, below, ...) {
cols.above.which(x=x, cutoff=cutoff, or.tied=or.tied, ...)
flagged.by <- function(x, threshold, or.tied, below, ...) {
analyze.stuff::cols.above.which(x=x, cutoff=threshold, or.tied=or.tied, below=below, ...)
}
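
To make the documented behavior concrete, here is a self-contained sketch of the cell-wise comparison with `or.tied`, `below`, and the row-mean default. It is a hypothetical stand-in written for illustration, not the `analyze.stuff::cols.above.which()` implementation, and `flag_cells` is an invented name.

```r
# Hypothetical stand-in for the cell-wise comparison described above.
flag_cells <- function(x, threshold = rowMeans(x, na.rm = TRUE),
                       or.tied = FALSE, below = FALSE) {
  x <- as.matrix(x)
  # A single threshold is recycled over every cell. A vector with one value
  # per row is recycled down each column, so row i is compared to threshold[i].
  if (below) {
    if (or.tied) x <= threshold else x < threshold
  } else {
    if (or.tied) x >= threshold else x > threshold
  }
}

x <- data.frame(a = 1:3, b = c(7, 7, 7), c = c(7, 8, 9))
flag_cells(x, threshold = 7)                               # strictly above 7
flag_cells(x, threshold = 7, or.tied = TRUE, below = TRUE) # at or below 7
flag_cells(x)                                              # each row vs. its own mean
```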
32 changes: 16 additions & 16 deletions R/flagged.only.by.R
@@ -1,34 +1,34 @@
#' @title flagged.only.by
#' @description Flag which cells are the only one in the row that is at/above a cutoff (find rows that meet only 1 of several criteria).
#' @details For a data.frame with a few cols of related data, find which cells are the only one in the row that is at/above some cutoff.
#' @description Flag which cells are the only one in the row that is at/above a threshold (find rows that meet only 1 of several criteria).
#' @details For a data.frame with a few cols of related data, find which cells are the only one in the row that is at/above some threshold.
#' This can find rows that meet only 1 of several criteria, for example.
#' Returns a logical matrix or data.frame, with TRUE for each cell that meets the test.
#' Can be used in EJ analysis in identifying places (rows) that were only flagged because one of the indicator(s) is at/above a cutoff, threshold value.
#' Can be used in EJ analysis in identifying places (rows) that were only flagged because one of the indicator(s) is at/above a threshold value.
#' For example, if there were four criteria to be met in flagging a location, this function identifies
#' places that met only one of the criteria, and can show which one was met.
#'
#' @param x Data.frame or matrix of numbers to be compared to cutoff value.
#' @param cutoff The numeric threshold or cutoff to which numbers are compared. Default is 8! Usually one number, but can be a vector of same length as number of rows, in which case each row can use a different cutoff.
#' @param or.tied Logical. Default is FALSE, which means we check if number in x is greater than the cutoff (>). If TRUE, check if greater than or equal (>=).
#' @param below Logical. Default is FALSE. If TRUE, uses > or >= cutoff. If FALSE, uses < or <= cutoff.
#' @param x Data.frame or matrix of numbers to be compared to threshold value.
#' @param threshold The numeric threshold to which numbers are compared. Default is 8! Usually one number, but can be a vector of same length as number of rows, in which case each row can use a different threshold.
#' @param or.tied Logical. Default is FALSE, which means we check if number in x is greater than the threshold (>). If TRUE, check if greater than or equal (>=).
#' @param below Logical. Default is FALSE. If TRUE, uses < or <= threshold. If FALSE, uses > or >= threshold.
#' @return Returns a logical matrix the same size as x.
#' @seealso flagged.by or cols.above.which to see which cells are at/above/below some cutoff
#' @seealso cols.above.count to see, for each row, how many columns are at/above some cutoff
#' @seealso cols.above.percent to see, for each row, what fraction of columns are at/above some cutoff
#' @seealso flagged.by or cols.above.which to see which cells are at/above/below some threshold
#' @seealso cols.above.count to see, for each row, how many columns are at/above some threshold
#' @seealso cols.above.percent to see, for each row, what fraction of columns are at/above some threshold
#' @keywords EJ
#' @examples
#' out <- flagged.only.by(x<-data.frame(a=1:10, b=rep(7,10), c=7:16), cutoff=7)
#' out <- flagged.only.by(x<-data.frame(a=1:10, b=rep(7,10), c=7:16), threshold=7)
#' x; out # default is or.tied=FALSE
#' out <- flagged.only.by(data.frame(a=1:10, b=rep(7,10), c=7:16), cutoff=7,
#' out <- flagged.only.by(data.frame(a=1:10, b=rep(7,10), c=7:16), threshold=7,
#' or.tied=TRUE, below=TRUE)
#' out
#' out <- flagged.only.by(data.frame(a=1:10, b=rep(7,10), c=7:16) )
#' # Compares each number in each row to the default cutoff.
#' # Compares each number in each row to the default threshold.
#' out
#' @note Future work: these functions could have wts, na.rm, & allow cutoffs or benchmarks as a vector (not just 1 number), & have benchnames.
#' @note Future work: these functions could have wts, na.rm, & allow cutpoints or benchmarks as a vector (not just 1 number), & have benchnames.
#' @export
flagged.only.by <- function(x, cutoff=8, or.tied=FALSE, below=FALSE) {
above <- flagged.by(df.bins=df.bins, cutoff=cutoff, or.tied=or.tied)
flagged.only.by <- function(x, threshold=8, or.tied=FALSE, below=FALSE) {
above <- flagged.by(x=x, threshold=threshold, or.tied=or.tied, below=below)
onlyby <- above
onlyby[rowSums(onlyby) > 1, ] <- 0
return(onlyby)
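
The "only one criterion met" step can be sketched on its own, starting from a logical matrix of flagged cells. `only_flagged_by` and the input matrix are hypothetical; unlike the code in the diff, this sketch assigns `FALSE` rather than `0` so that the result stays logical.

```r
# Hypothetical sketch: keep flags only in rows where exactly one column is flagged.
only_flagged_by <- function(above) {
  above <- as.matrix(above)
  above[rowSums(above) > 1, ] <- FALSE  # clear rows flagged by more than one column
  above
}

above <- rbind(c(TRUE,  FALSE, FALSE),  # flagged by one column: kept
               c(TRUE,  TRUE,  FALSE),  # flagged by two columns: cleared
               c(FALSE, FALSE, FALSE))  # not flagged at all: unchanged
only_flagged_by(above)
```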
2 changes: 1 addition & 1 deletion R/make.bin.pctile.cols.R
@@ -3,7 +3,7 @@
#' @description This function just combines [make.pctile.cols()] and [make.bin.cols()].
#' Takes a data.frame of values and returns a data.frame (or matrix) of percentiles,
#' showing the percentile of a value within all values in its column, as well as bin numbers,
#' showing what bin each falls into, based on specified cutoffs defining bins. \cr\cr
#' showing what bin each falls into, based on specified cutpoints defining bins. \cr\cr
#' ** Work in progress/ not fully tested, e.g., need to test if all code below works with both as.df=TRUE and as.df=FALSE
#' @param raw.data.frame Data.frame of values
#' @param weights Optional Numeric vector of weights to create weighted percentiles, such as population-weighted quantiles. Unweighted if not specified. Vector same length as number of rows in data.frame.
6 changes: 3 additions & 3 deletions ejanalysis.html
@@ -1155,7 +1155,7 @@ <h3>Value</h3>

<h3>Note</h3>

<p>Future work: these functions could have wts, na.rm, &amp; allow cutoffs or benchmarks as a vector (not just 1 number), &amp; have benchnames.
<p>Future work: these functions could have wts, na.rm, &amp; allow cutpoints or benchmarks as a vector (not just 1 number), &amp; have benchnames.
</p>


@@ -1245,7 +1245,7 @@ <h3>Value</h3>

<h3>Note</h3>

<p>Future work: these functions could have wts, na.rm, &amp; allow cutoffs or benchmarks as a vector (not just 1 number), &amp; have benchnames.
<p>Future work: these functions could have wts, na.rm, &amp; allow cutpoints or benchmarks as a vector (not just 1 number), &amp; have benchnames.
</p>


@@ -2414,7 +2414,7 @@ <h3>Description</h3>
<p>This function just combines <code><a href="#make.pctile.cols">make.pctile.cols()</a></code> and <code><a href="#make.bin.cols">make.bin.cols()</a></code>.
Takes a data.frame of values and returns a data.frame (or matrix) of percentiles,
showing the percentile of a value within all values in its column, as well as bin numbers,
showing what bin each falls into, based on specified cutoffs defining bins. <br><br>
showing what bin each falls into, based on specified cutpoints defining bins. <br><br>
** Work in progress/ not fully tested, e.g., need to test if all code below works with both as.df=TRUE and as.df=FALSE
</p>

2 changes: 1 addition & 1 deletion man/assign.map.bins.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/flagged.by.Rd

2 changes: 1 addition & 1 deletion man/flagged.only.by.Rd

2 changes: 1 addition & 1 deletion man/make.bin.pctile.cols.Rd
