Sexcoding by smasongarrison · Pull Request #100 · R-Computing-Lab/ggpedigree

smasongarrison · 2025-12-19T17:58:57Z

This pull request significantly improves how the package handles sex coding and ID types in pedigree data, especially for non-standard or inconsistent inputs. It adds more informative error messages, expands and refactors tests for edge cases, introduces new configuration options, and improves robustness in data handling and plotting. Below are the most important changes grouped by theme:

Improved Error Handling and Messaging

Added more informative error messages when pedigree object construction fails due to incorrect or non-standard sex coding, guiding users to use configuration options like code_male for non-standard codes (e.g., "M"/"F", "1"/"2") [1] [2].
Updated error and warning messages in pedigree.sexrepair to clarify when all sex values are unknown or when a large proportion are unknown, suggesting configuration fixes.

Expanded and Refactored Testing

Added a comprehensive new test file (test-ggpedigree_datainputs.R) that systematically tests combinations of sex codes, ID types (numeric/character), and missing parent encodings, checking for correct error/warning handling and round-trip data invariance.
Refactored and removed redundant tests in test-joss_review.R, consolidating input validation coverage into the new, more thorough test suite.
Expanded test coverage in existing tests to include more explicit code_male, code_female, and code_na settings [1] [2] [3] [4].

Configuration and Documentation Enhancements

Added a new code_unknown parameter to getDefaultPlotConfig and its documentation, allowing users to specify codes for unknown sex values [1] [2] [3] [4] [5].
Updated tests to reflect the new default configuration parameter count.

Robustness in Data Handling

Improved coercion logic in transformPed to ensure consistent ID types for momID and dadID, preventing class mismatches that could cause errors [1] [2].
Adjusted vertical layout logic in .adjustSpacing to ensure y_pos and y_fam start at zero for more consistent plotting.
Modified y-axis scaling in ggPedigree.core to ensure the minimum y-value is respected.

Miscellaneous

Minor improvements to documentation and code formatting [1] [2] [3] [4].

These changes collectively make the package more user-friendly, robust, and reliable when handling diverse and potentially inconsistent pedigree data inputs.

Eliminated print statements that output the classes of ped$spouseID, ped$personID, ped$momID, and ped$dadID. This cleans up the function by removing unnecessary debug output.

Refactored and expanded test cases in test-joss_review.R to thoroughly check consistent handling of string and numeric IDs, including various missing value representations. Updated NEWS.md to document the repair of inconsistent ID handling.

Updates sex code handling in alignPedigreeWithRelations and pedigree.sexrepair to better support non-standard and numeric codes. Enhances tests to cover new config options and warning handling for sex code edge cases.

Wrapped ggpedigree plotting code in df_warroses.R with if (FALSE) to prevent execution during data processing. Improved formatting, argument alignment, and logic in test-ggpedigree_datainputs.R for better readability and consistency, including clearer handling of expect_warnings and test cases.

Introduces a new code_unknown parameter to getDefaultPlotConfig for specifying unknown sex codes. Updates pedigree.sexrepair to provide clearer error and warning messages when unknown or invalid sex values are detected. Adjusts related test to account for the new parameter.

Added more informative error messages when constructing pedigree objects fails due to incorrect or non-standard sex coding. Expanded and refactored tests for various sex code and ID input scenarios, including comprehensive cross-product testing of input types, missing values, and configuration options. Updated documentation to include new config option 'code_unknown'.

Sets scale_y_reverse limits to use minimum y position and normalizes y positions to start at zero in .adjustSpacing. Also updates test to clarify error handling for character IDs.

Adjust y-axis handling and spacing in pedigree plots

…gree into sexcoding

Introduces a new option to scale point sizes based on pedigree size in plotting functions. Updates documentation, function signatures, and tests to support the new parameter and ensure correct behavior.

Copilot

Pull request overview

This pull request significantly improves sex coding and ID type handling in pedigree data. It adds more informative error messages when sex values are incorrectly specified, expands test coverage with comprehensive input validation tests, introduces new configuration options (code_unknown, point_scale_by_pedigree), and improves robustness in data type coercion and plotting. The changes collectively make the package more user-friendly when handling diverse pedigree data formats.

Key Changes

Enhanced error messaging for sex coding issues with guidance on using configuration options
Added comprehensive test suite for various sex codes, ID types, and missing value combinations
Introduced automatic point size scaling based on pedigree size
Improved ID type consistency handling to prevent mismatches between numeric and character IDs

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
tests/testthat/test-joss_review.R	Removed redundant test cases that are now covered by the more comprehensive test suite
tests/testthat/test-ggpedigree_datainputs.R	Added extensive new test suite systematically testing combinations of sex codes, ID types, and missing parent encodings
tests/testthat/test-ggPedigree.R	Added explicit `code_female` parameter to existing tests for consistency
tests/testthat/test-defaultPlotConfig.R	Updated to reflect new parameter count (162) and added `pedigree_size` argument to buildPlotConfig calls
R/kinship2_pedigree.X	New file with alternative pedigree implementation (.X extension is problematic)
R/kinship2_pedigree.R	Improved sex repair logic to handle numeric strings and updated error messages
R/ggpedigreeHelpers.R	Added y-position normalization to ensure minimum value starts at zero
R/ggpedigree.R	Enhanced ID type coercion for consistency, added NA sex handling, improved spacing, and integrated pedigree_size parameter
R/ggPedigreeInteractive.R	Added pedigree_size parameter to buildPlotConfig call
R/calcCoordinates.R	Wrapped pedigree construction in tryCatch with improved error messages
R/defaultPlotConfig.R	Added `code_unknown` and `point_scale_by_pedigree` parameters with automatic point size scaling logic
man/getDefaultPlotConfig.Rd	Updated documentation for new parameters
man/buildPlotConfig.Rd	Added `pedigree_size` parameter documentation
data-raw/df_warroses.R	Wrapped example plotting code in `if(FALSE)` block
NEWS.md	Updated with summary of changes including error message improvements and test expansion

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-19T23:04:42Z

+        stop("Error in constructing pedigree object. Please check that the you've
+           correctly specificed the sex of individuals. Setting code_male may help if non-standard codes are used (e.g., 'M'/'F'; '1,2').")


There are two spelling errors in this error message: "the you've" should be "that you've" and "specificed" should be "specified".

@copilot open a new pull request to apply changes based on this feedback

Copilot · 2025-12-19T23:04:43Z

+        stop("Error in constructing pedigree object. Please check that the you've
+           correctly specificed the sex of individuals. Setting code_male may help if non-standard codes are used (e.g., 'M'/'F'; '1,2').")


There are two spelling errors in this error message: "the you've" should be "that you've" and "specificed" should be "specified". This is identical to the error message on lines 277-278.

Copilot · 2025-12-19T23:04:43Z

+#' Create a pedigree or pedigreeList object
+#'
+#' @param id Identification variable for individual
+#' @param dadid Identification variable for father. Founders' parents should be coded
+#' to NA, or another value specified by missid.
+#' @param momid Identification variable for mother. Founders' parents should be coded
+#' to NA, or another value specified by missid.
+#' @param sex Gender of individual noted in `id'. Either character ("male","female",
+#' "unknown","terminated") or numeric (1="male", 2="female", 3="unknown", 4="terminated")
+#' data is allowed.  For character data the string may be truncated, and of arbitrary case.
+#' @param affected A variable indicating affection status.  A multi-column matrix
+#' can be used to give the status with respect to multiple traits. Logical, factor, and
+#' integer types are converted to 0/1 representing unaffected and affected, respectively.
+#' NAs are considered missing.
+#' @param status Censor/Vital status (0="censored", 1="dead")
+#' @param relation A matrix with 3 required columns (id1, id2, code) specifying special
+#' relationship between pairs of individuals. Codes: 1=Monozygotic twin,  2=Dizygotic twin,
+#' 3=Twin of unknown zygosity, 4=Spouse. (The last is necessary in order to place a marriage
+#' with no children into the plot.) If famid is given in the call to create pedigrees, then
+#' famid needs to be in the last column of the relation matrix. Note for tuples of >= 3 with
+#' a mixture of zygosities, the plotting is limited to showing pairwise zygosity of adjacent
+#' subjects, so it is only necessary to specify the pairwise zygosity, in the order the subjects
+#' are given or appear on the plot.
+#' @param famid An optional vector of family identifiers.  If it is present the result will
+#' contain individual pedigrees for each family in the set, which can be extacted using
+#' subscripts. Individuals need to have a unique id \emph{within} family.
+#' @param missid The founders are those with no father or mother in the pedigree. The dadid
+#' and momid values for these subjects will either be NA or the value of this variable. The
+#' default for missid is 0 if the id variable is numeric, and "" (empty string) otherwise.
+#' @param x pedigree object in print and subset methods
+#' @param max_message_n max number of individuals to list in error messages
+#' @param ... optional arguments passed to internal functions
+#' @param drop logical, used in subset function for dropping dimensionality
+#' @return An object of class \code{pedigree} or \code{pedigreeList} Containing the following items:
+#'  famid id findex mindex sex  affected status relation
+#' @author Terry Therneau
+#' @name pedigree
+#' @export
+
+
+pedigree <- function(id, dadid, momid,
+                     sex, affected,
+                     status, relation,
+                     famid, missid, max_message_n = 5) {
+  n <- length(id)
+  ## Code transferred from noweb to markdown vignette.
+  ## Sections from the noweb/vignettes are noted here with
+  ## Doc: Error and Data Checks
+
+  pedigree.idcheck(
+    id = id,
+    momid = momid,
+    dadid = dadid,
+    sex = sex
+  )
+
+
+  id <- pedigree.idrepair(id = id)
+
+  # momid <- pedigree.idrepair(id = momid)
+  # dadid <- pedigree.idrepair(id = dadid)
+
+  sex <- pedigree.sexrepair(sex = sex)
+
+  if (missing(missid)) {
+    missid <- pedigree.makemissingid(id = id)
+  }
+
+  nofather <- (is.na(dadid) | dadid == missid)
+  nomother <- (is.na(momid) | momid == missid)
+
+  if (!missing(famid)) {
+    if (any(is.na(famid))) stop("The family id cannot contain missing values")
+    if (is.factor(famid) || is.character(famid)) {
+      if (length(grep("^ *$", famid)) > 0) {
+        stop("The family id cannot be a blank or empty string")
+      }
+    }
+    # Make a temporary new id from the family and subject pair
+    oldid <- id
+    id <- paste(as.character(famid), as.character(id), sep = "/")
+    dadid <- paste(as.character(famid), as.character(dadid), sep = "/")
+    momid <- paste(as.character(famid), as.character(momid), sep = "/")
+  }
+  has_famid <- !missing(famid)
+
+  if (any(duplicated(id))) {
+    duplist <- id[duplicated(id)]
+    msg.n <- min(length(duplist), max_message_n)
+    stop(paste("Duplicate subject id:", duplist[1:msg.n]))
+  }
+  findex <- match(dadid, id, nomatch = 0)
+  if (any(sex[findex] != "male")) {
+    who <- unique((id[findex])[sex[findex] != "male"])
+    msg.n <- 1:min(max_message_n, length(who)) # Don't list a zillion
+    stop(paste(
+      "Id not male, but is a father:",
+      paste(who[msg.n], collapse = " ")
+    ))
+  }
+
+  if (any(findex == 0 & !nofather)) {
+    who <- dadid[which(findex == 0 & !nofather)]
+    msg.n <- 1:min(max_message_n, length(who)) # Don't list a zillion
+    stop(paste(
+      "Value of 'dadid' not found in the id list",
+      paste(who[msg.n], collapse = " ")
+    ))
+  }
+
+  mindex <- match(momid, id, nomatch = 0)
+
+  if (any(sex[mindex] != "female")) {
+    who <- unique((id[mindex])[sex[mindex] != "female"])
+    msg.n <- 1:min(max_message_n, length(who))
+    stop(paste(
+      "Id not female, but is a mother:",
+      paste(who[msg.n], collapse = " ")
+    ))
+  }
+
+  if (any(mindex == 0 & !nomother)) {
+    who <- momid[which(mindex == 0 & !nomother)]
+    msg.n <- 1:min(max_message_n, length(who)) # Don't list a zillion
+    stop(paste(
+      "Value of 'momid' not found in the id list",
+      paste(who[msg.n], collapse = " ")
+    ))
+  }
+
+  if (any(mindex == 0 & findex != 0) || any(mindex != 0 & findex == 0)) {
+    who <- id[which((mindex == 0 & findex != 0) | (mindex != 0 & findex == 0))]
+    msg.n <- 1:min(max_message_n, length(who)) # Don't list a zillion
+    stop(paste(
+      "Subjects must have both a father and mother, or have neither",
+      paste(who[msg.n], collapse = " ")
+    ))
+  }
+
+  if (!missing(famid)) {
+    if (any(famid[mindex] != famid[mindex > 0])) {
+      who <- (id[mindex > 0])[famid[mindex] != famid[mindex > 0]]
+      msg.n <- 1:min(max_message_n, length(who))
+      stop(paste(
+        "Mother's family != subject's family",
+        paste(who[msg.n], collapse = " ")
+      ))
+    }
+    if (any(famid[findex] != famid[findex > 0])) {
+      who <- (id[findex > 0])[famid[findex] != famid[findex > 0]]
+      msg.n <- 1:min(max_message_n, length(who))
+      stop(paste(
+        "Father's family != subject's family",
+        paste(who[msg.n], collapse = " ")
+      ))
+    }
+  }
+  ## Doc: Creation of Pedigrees
+  if (missing(famid)) {
+    temp <- list(id = id, findex = findex, mindex = mindex, sex = sex)
+  } else {
+    temp <- list(
+      famid = famid, id = oldid, findex = findex, mindex = mindex,
+      sex = sex
+    )
+  }
+
+  if (!missing(affected)) {
+    temp$affected <- pedigree.process_affected(affected, n)
+  }
+
+
+  if (!missing(status)) {
+    temp$status <- pedigree.process_status(status, n)
+  }
+
+  if (!missing(relation)) {
+    temp$relation <- pedigree.process_relation(
+      relation = relation,
+      has_famid = has_famid,
+      famid = if (has_famid) famid else NULL,
+      id = id,
+      momid = momid,
+      dadid = dadid,
+      sex = sex
+    )
+  }
+  ## Doc: Finish
+  if (missing(famid)) {
+    class(temp) <- "pedigree"
+  } else {
+    class(temp) <- "pedigreeList"
+  }
+  temp
+}
+
+#' @keywords internal
+#' @noRd
+pedigree.process_relation <- function(relation,
+                                      has_famid,
+                                      famid,
+                                      id,
+                                      momid,
+                                      dadid,
+                                      sex) {
+  rel_parsed <- pedigree.parse_relation(relation,
+    has_famid = has_famid
+  )
+  id1 <- rel_parsed$id1
+  id2 <- rel_parsed$id2
+  rel_code <- pedigree.coerce_relation_code(rel_parsed$code)
+  rel_famid <- rel_parsed$famid
+
+  # Ensure everyone in the relationship is in the pedigree
+  if (has_famid) {
+    key1 <- paste(as.character(rel_famid), as.character(id1), sep = "/")
+    key2 <- paste(as.character(rel_famid), as.character(id2), sep = "/")
+  } else {
+    key1 <- id1
+    key2 <- id2
+  }
+
+  indx1 <- match(key1, id, nomatch = 0L)
+  indx2 <- match(key2, id, nomatch = 0L)
+
+  if (any(indx1 == 0L | indx2 == 0L)) {
+    stop("Subjects in relationships that are not in the pedigree")
+  }
+  if (any(indx1 == indx2)) {
+    who <- indx1[indx1 == indx2]
+    stop(paste("Subject", id[who], "is their own spouse or twin"))
+  }
+
+  # Twin consistency checks
+  numeric_code <- as.numeric(rel_code)
+  if (any(numeric_code < 3L)) {
+    twins <- numeric_code < 3L
+    if (any(momid[indx1[twins]] != momid[indx2[twins]])) {
+      stop("Twins found with different mothers")
+    }
+    if (any(dadid[indx1[twins]] != dadid[indx2[twins]])) {
+      stop("Twins found with different fathers")
+    }
+  }
+
+  # MZ twin sex check
+  if (any(rel_code == "MZ twin")) {
+    mztwins <- rel_code == "MZ twin"
+    if (any(sex[indx1[mztwins]] != sex[indx2[mztwins]])) {
+      stop("MZ twins with different sexes")
+    }
+  }
+
+  if (has_famid == TRUE) {
+    data.frame(
+      famid = rel_famid,
+      indx1 = indx1,
+      indx2 = indx2,
+      code  = rel_code
+    )
+  } else {
+    data.frame(
+      indx1 = indx1,
+      indx2 = indx2,
+      code  = rel_code
+    )
+  }
+}
+
+pedigree.parse_relation <- function(relation, has_famid) {
+  if (is.matrix(relation)) {
+    expected_cols <- if (has_famid) 4L else 3L
+    if (ncol(relation) != expected_cols) {
+      if (has_famid) {
+        stop("Relation matrix must have 3 columns + famid")
+      } else {
+        stop("Relation matrix must have 3 columns: id1, id2, code")
+      }
+    }
+    id1 <- relation[, 1]
+    id2 <- relation[, 2]
+    code <- relation[, 3]
+    fam <- if (has_famid) relation[, 4] else NULL
+  } else if (is.data.frame(relation)) {
+    id1 <- relation$id1
+    id2 <- relation$id2
+    code <- relation$code
+    fam <- if (has_famid) relation$famid else NULL
+
+    if (is.null(id1) || is.null(id2) || is.null(code) ||
+      (has_famid && is.null(fam))) {
+      if (has_famid) {
+        stop("Relation data must have id1, id2, code, and family id")
+      } else {
+        stop("Relation data frame must have id1, id2, and code")
+      }
+    }
+  } else {
+    if (has_famid) {
+      stop("Relation argument must be a matrix or a dataframe")
+    } else {
+      stop("Relation argument must be a matrix or a list")
+    }
+  }
+
+  list(
+    id1 = id1,
+    id2 = id2,
+    code = code,
+    famid = fam
+  )
+}
+
+#'
+#' @keywords internal
+#' @noRd
+#' @inheritParams pedigree
+
+pedigree.idcheck <- function(id, momid, dadid, sex) {
+  n <- length(id)
+  if (length(momid) != n) stop("Mismatched lengths, id and momid")
+  if (length(dadid) != n) stop("Mismatched lengths, id and dadid")
+  if (length(sex) != n) stop("Mismatched lengths, id and sex")
+  if (any(is.na(id))) stop("Missing value for the id variable")
+}
+
+#' @keywords internal
+#' @noRd
+#' @inheritParams pedigree
+
+pedigree.idrepair <- function(id) {
+  if (!is.numeric(id)) {
+    id <- as.character(id)
+    if (length(grep("^ *$", id)) > 0) {
+      stop("A blank or empty string is not allowed as the id variable")
+    }
+  }
+  id
+}
+
+#' @keywords internal
+#' @noRd
+#' @inheritParams pedigree
+
+pedigree.sexrepair <- function(sex, codes = c("male", "female", "unknown", "terminated")
+                               ) {
+  # Allow for character/numeric
+  # Allow for character/numeric/factor in the sex variable
+  if (is.factor(sex)) {
+    sex <- as.character(sex)
+  }
+
+  if (is.character(sex)) {
+
+    if(all(unique(sex) %in% codes, na.rm = TRUE)){
+    sex <- charmatch(casefold(sex, upper = FALSE), codes,
+      nomatch = 3
+    )
+    } else if(all(unique(sex) %in% c("M","F","U", "T", NA), na.rm = TRUE)){
+
+      sex <- charmatch(casefold(sex, upper = FALSE), c("M", "F", "U", "T"),
+                       nomatch = 3
+      )
+
+    } else if (all(unique(suppressWarnings(as.numeric(sex))
+                          ) %in% c(0:4), na.rm = TRUE)) {
+      sex <- as.numeric(sex)
+  } else {
+    print(unique(sex))
+    #  stop("Invalid values for 'sex'")
+    sex <- charmatch(casefold(sex, upper = FALSE), codes,
+                     nomatch = 3
+    )
+  }
+  }
+  # assume either 0/1/2/4 =  female/male/unknown/term, or 1/2/3/4
+  #  if only 1/2 assume no unknowns
+  if (min(sex) == 0) {
+    sex <- sex + 1
+  }
+  sex <- ifelse(sex < 1 | sex > 4, 3, sex)
+  if (all(sex > 2)) {
+    stop("Invalid values for 'sex'. All values coded as Unknown")
+  } else if (mean(sex == 3) > 0.25) {
+    warning("More than 25% of the sex values are 'unknown'")
+  }
+  sex <- factor(sex, 1:4, labels = codes)
+
+  sex
+}
+
+
+#' @keywords internal
+#' @noRd
+pedigree.process_status <- function(status, n) {
+  if (length(status) != n) {
+    stop("Wrong length for affected")
+  }
+  if (is.logical(status)) status <- as.integer(status)
+  if (any(status != 0L & status != 1L)) {
+    stop("Invalid status code")
+  }
+  status
+}
+
+#' @keywords internal
+#' @noRd
+pedigree.process_affected <- function(affected, n) {
+  if (is.matrix(affected)) {
+    if (nrow(affected) != n) stop("Wrong number of rows in affected")
+    if (is.logical(affected)) affected <- 1L * affected
+  } else {
+    if (length(affected) != n) {
+      stop("Wrong length for affected")
+    }
+
+    if (is.logical(affected)) affected <- as.numeric(affected)
+    if (is.factor(affected)) affected <- as.numeric(affected) - 1
+  }
+  if (max(affected, na.rm = TRUE) > min(affected, na.rm = TRUE)) {
+    affected <- affected - min(affected, na.rm = TRUE)
+  }
+  if (!all(affected == 0 | affected == 1 | is.na(affected))) {
+    stop("Invalid code for affected status")
+  }
+  affected
+}
+
+
+## Doc: Subscripting a pedigree
+#' @rdname pedigree
+#' @export
+"[.pedigreeList" <- function(x, ..., drop = FALSE) {
+  if (length(list(...)) != 1) stop("Only 1 subscript allowed")
+  ufam <- unique(x$famid)
+  if (is.factor(..1) || is.character(..1)) {
+    indx <- ufam[match(..1, ufam)]
+  } else {
+    indx <- ufam[..1]
+  }
+
+  if (any(is.na(indx))) {
+    stop(paste("Family", (..1[is.na(indx)])[1], "not found"))
+  }
+
+  keep <- which(x$famid %in% indx) # which rows to keep
+  for (i in c("id", "famid", "sex")) {
+    x[[i]] <- (x[[i]])[keep]
+  }
+
+  kept.rows <- (1:length(x$findex))[keep]
+  x$findex <- match(x$findex[keep], kept.rows, nomatch = 0)
+  x$mindex <- match(x$mindex[keep], kept.rows, nomatch = 0)
+
+  # optional components
+  if (!is.null(x$status)) x$status <- x$status[keep]
+  if (!is.null(x$affected)) {
+    if (is.matrix(x$affected)) {
+      x$affected <- x$affected[keep, , drop = FALSE]
+    } else {
+      x$affected <- x$affected[keep]
+    }
+  }
+  if (!is.null(x$relation)) {
+    keep <- !is.na(match(x$relation$famid, indx))
+    if (any(keep)) {
+      x$relation <- x$relation[keep, , drop = FALSE]
+      ## Update twin id indexes
+      x$relation$indx1 <- match(x$relation$indx1, kept.rows, nomatch = 0)
+      x$relation$indx2 <- match(x$relation$indx2, kept.rows, nomatch = 0)
+      ## If only one family chosen, remove famid
+      if (length(indx) == 1) {
+        x$relation$famid <- NULL
+      }
+    } else {
+      x$relation <- NULL
+    } # No relations matrix elements for this family
+  }
+
+  if (length(indx) == 1) {
+    class(x) <- "pedigree"
+  } # only one family chosen
+  else {
+    class(x) <- "pedigreeList"
+  }
+  x
+}
+
+#' @rdname pedigree
+#' @export
+"[.pedigree" <- function(x, ..., drop = FALSE) {
+  if (length(list(...)) != 1) stop("Only 1 subscript allowed")
+  if (is.character(..1) || is.factor(..1)) {
+    i <- match(..1, x$id)
+  } else {
+    i <- (1:length(x$id))[..1]
+  }
+
+  if (any(is.na(i))) paste("Subject", ..1[which(is.na(i))][1], "not found")
+
+  z <- list(
+    id = x$id[i], findex = match(x$findex[i], i, nomatch = 0),
+    mindex = match(x$mindex[i], i, nomatch = 0),
+    sex = x$sex[i]
+  )
+  if (!is.null(x$affected)) {
+    if (is.matrix(x$affected)) {
+      z$affected <- x$affected[i, , drop = FALSE]
+    } else {
+      z$affected <- x$affected[i]
+    }
+  }
+  if (!is.null(x$famid)) z$famid <- x$famid[i]
+
+
+  if (!is.null(x$relation)) {
+    indx1 <- match(x$relation$indx1, i, nomatch = 0)
+    indx2 <- match(x$relation$indx2, i, nomatch = 0)
+    keep <- (indx1 > 0 & indx2 > 0) # keep only if both id's are kept
+    if (any(keep)) {
+      z$relation <- x$relation[keep, , drop = FALSE]
+      z$relation$indx1 <- indx1[keep]
+      z$relation$indx2 <- indx2[keep]
+    }
+  }
+
+  if (!is.null(x$hints)) {
+    temp <- list(order = x$hints$order[i])
+    if (!is.null(x$hints$spouse)) {
+      indx1 <- match(x$hints$spouse[, 1], i, nomatch = 0)
+      indx2 <- match(x$hints$spouse[, 2], i, nomatch = 0)
+      keep <- (indx1 > 0 & indx2 > 0) # keep only if both id's are kept
+      if (any(keep)) {
+        temp$spouse <- cbind(
+          indx1[keep], indx2[keep],
+          x$hints$spouse[keep, 3]
+        )
+      }
+    }
+    z$hints <- temp
+  }
+
+  if (any(z$findex == 0 & z$mindex > 0) | any(z$findex > 0 & z$mindex == 0)) {
+    stop("A subpedigree cannot choose only one parent of a subject")
+  }
+  class(z) <- "pedigree"
+  z
+}
+#' @keywords internal
+#' @noRd
+pedigree.coerce_relation_code <- function(code) {
+  levels <- c("MZ twin", "DZ twin", "UZ twin", "spouse")
+
+  if (is.factor(code)) {
+    code <- as.character(code)
+  }
+
+  if (is.numeric(code)) {
+    if (any(code < 1L | code > 4L | is.na(code))) {
+      stop("Invalid relationship code")
+    }
+    factor(code, levels = 1:4, labels = levels)
+  } else {
+    idx <- match(code, levels)
+    if (any(is.na(idx))) {
+      stop("Invalid relationship code")
+    }
+    factor(idx, levels = 1:4, labels = levels)
+  }
+}
+#' @rdname pedigree
+#' @method print pedigree
+print.pedigree <- function(x, ...) {
+  cat("Pedigree object with", length(x$id), "subjects")
+  if (!is.null(x$famid)) {
+    cat(", family id=", x$famid[1], "\n")
+  } else {
+    cat("\n")
+  }
+  cat("Bit size=", kinship2_bitSize(x)$bitSize, "\n")
+}
+
+#' @rdname pedigree
+#' @method print pedigreeList
+print.pedigreeList <- function(x, ...) {
+  cat(
+    "Pedigree list with", length(x$id), "total subjects in",
+    length(unique(x$famid)), "families\n"
+  )
+}
+
+
+pedigree.makemissingid <- function(id) {
+  ## Doc:  Errors2
+
+  if (is.numeric(id)) {
+    0
+  } else {
+    ""
+  }
+}


A new file "R/kinship2_pedigree.X" has been added that appears to contain an alternative implementation of pedigree functions. The .X extension is unusual for R source files and may cause issues with R package building, as R typically expects .R files. If this is meant to be temporary or experimental code, consider using a different approach like a feature branch or moving it outside the R/ directory. If it's meant to replace the existing kinship2_pedigree.R file, it should be properly renamed with a .R extension.

Copilot · 2025-12-19T23:04:43Z

    #  debug = TRUE,
    config = list(
      code_male = 0,
+      code_female=1,


Missing space after the equals sign. Should be code_female = 1 for consistent formatting with the rest of the codebase.

Copilot · 2025-12-19T23:04:45Z

-    )
+    # if all numeric strings, convert to numeric
+    if (length(unique(suppressWarnings(as.numeric(sex)))) > 1 &&
+      all(is.numeric(unique(suppressWarnings(as.numeric(sex)))), na.rm = TRUE)) {


The condition on line 354-355 has a logical issue. It checks if length(unique(suppressWarnings(as.numeric(sex)))) > 1 AND all(is.numeric(...)). However, as.numeric() always returns a numeric vector, so is.numeric() will always be TRUE. The intended check is likely whether the conversion from character to numeric produces valid numbers (i.e., not all NAs). Consider using !all(is.na(suppressWarnings(as.numeric(sex)))) instead of all(is.numeric(...), na.rm = TRUE) to check if the conversion was successful.

Suggested change

all(is.numeric(unique(suppressWarnings(as.numeric(sex)))), na.rm = TRUE)) {

!all(is.na(suppressWarnings(as.numeric(sex))))) {

Copilot · 2025-12-19T23:04:45Z

+      sex <- as.numeric(sex)
+  } else {
+    print(unique(sex))
+    #  stop("Invalid values for 'sex'")


There's a commented-out debugging print statement that should be removed for production code. If debugging is needed, consider using the config$debug flag that's already established in the codebase.

Suggested change

# stop("Invalid values for 'sex'")

Copilot · 2025-12-19T23:04:46Z

 .adjustSpacing <- function(ds, config) {
+  # set shift y to have min at zero
+  min_y <- min(ds$y_pos, na.rm = TRUE)
+  if (min_y > 0) {


The condition on line 100 checks if min_y > 0, but the goal stated in the comment is to "have min at zero". This logic seems inverted - if the minimum is already at or below zero, no adjustment would be needed. If the minimum is above zero, shifting down makes sense. However, if the minimum is negative, it should also be shifted up to zero. Consider changing the condition to if (min_y != 0) to handle both cases correctly.

Suggested change

if (min_y > 0) {

if (min_y != 0) {

Copilot · 2025-12-19T23:04:46Z

+  expect_silent(buildPlotConfig(default_config, config, pedigree_size = 1))
+  expect_warning(buildPlotConfig(default_config, config))


Line 93 adds a test that expects a warning when pedigree_size is not provided to buildPlotConfig. However, this test appears to expect a warning for all cases where pedigree_size is missing, but according to the implementation in R/defaultPlotConfig.R (lines 768-774), the warning should only occur when point_scale_by_pedigree is TRUE and pedigree_size is NULL. Since the test uses the default config which has point_scale_by_pedigree = TRUE by default, this test should pass, but the test name and structure don't make this dependency clear. Consider adding a comment explaining why this warning is expected.

Copilot · 2025-12-19T23:04:46Z

 test_that("buildPlotConfig returns a list", {
  default_config <- getDefaultPlotConfig(function_name = "ggrelatednessmatrix")
  custom <- list(point_size = 2)
-
-  result <- buildPlotConfig(default_config, custom)
+  pedigree_size <- 3
+  result <- buildPlotConfig(default_config, custom, pedigree_size = pedigree_size)
+  expect_equal(result$point_size, 2 / sqrt(pedigree_size), tolerance = 1e-8)


The test on line 132 expects that when point_scale_by_pedigree is TRUE (the default), the point size will be scaled by dividing by sqrt(pedigree_size). However, this test is for "ggrelatednessmatrix" function which may have different defaults. The behavior might be inconsistent across different function types. Consider verifying that this scaling behavior is intended for all functions or adding assertions about the point_scale_by_pedigree setting.

Copilot · 2025-12-19T23:04:47Z

    #  debug = TRUE,
    config = list(
      code_male = 0,
+      code_female=1,


Missing space after the equals sign. Should be code_female = 1 for consistent formatting with the rest of the codebase.

Copilot · 2025-12-19T23:06:22Z

@smasongarrison I've opened a new pull request, #101, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: smasongarrison <6001608+smasongarrison@users.noreply.github.com>

Fix spelling errors in pedigree construction error messages

smasongarrison and others added 18 commits December 12, 2025 12:55

Remove debug print statements from calcConnections.R

257d629

Eliminated print statements that output the classes of ped$spouseID, ped$personID, ped$momID, and ped$dadID. This cleans up the function by removing unnecessary debug output.

rename

2a0a7d5

improve sex code handling

615e01e

Updates sex code handling in alignPedigreeWithRelations and pedigree.sexrepair to better support non-standard and numeric codes. Enhances tests to cover new config options and warning handling for sex code edge cases.

Merge branch 'dev' into sexcoding

19d32f7

Merge branch 'main' into sexcoding

c05c4b2

Update test-ggpedigree_datainputs.R

a1bae3b

Adjust y-axis handling and spacing in pedigree plots

578f26c

Sets scale_y_reverse limits to use minimum y position and normalizes y positions to start at zero in .adjustSpacing. Also updates test to clarify error handling for character IDs.

Update test-ggpedigree_datainputs.R

8b83f95

Update test-ggpedigree_datainputs.R

a6d2145

Merge branch 'sexcoding' into fix-topmargin

4137501

Merge pull request #99 from R-Computing-Lab/fix-topmargin

a4ecc2c

Adjust y-axis handling and spacing in pedigree plots

Merge branch 'fix-topmargin' into sexcoding

1523b7d

Merge branch 'sexcoding' of https://github.com/R-Computing-Lab/ggpedi…

7ac68cc

…gree into sexcoding

Add point scaling by pedigree size to plot config

0ba1f03

Introduces a new option to scale point sizes based on pedigree size in plotting functions. Updates documentation, function signatures, and tests to support the new parameter and ensure correct behavior.

smasongarrison force-pushed the sexcoding branch 2 times, most recently from 43b803b to 91f5f96 Compare December 19, 2025 22:07

Update test-ggpedigree_datainputs.R

17639d5

smasongarrison force-pushed the sexcoding branch from 91f5f96 to 17639d5 Compare December 19, 2025 22:40

smasongarrison requested a review from Copilot December 19, 2025 22:58

Copilot started reviewing on behalf of smasongarrison December 19, 2025 22:59 View session

Copilot AI reviewed Dec 19, 2025

View reviewed changes

Initial plan

4468c10

Copilot AI mentioned this pull request Dec 19, 2025

Fix spelling errors in pedigree construction error messages #101

Merged

Copilot AI and others added 2 commits December 19, 2025 23:08

Fix spelling errors in error message

4b179f8

Co-authored-by: smasongarrison <6001608+smasongarrison@users.noreply.github.com>

Merge pull request #101 from R-Computing-Lab/copilot/sub-pr-100

1b3bf99

Fix spelling errors in pedigree construction error messages

smasongarrison merged commit 854c940 into dev Dec 20, 2025
5 checks passed

smasongarrison deleted the sexcoding branch December 20, 2025 17:34

		stop("Error in constructing pedigree object. Please check that the you've
		correctly specificed the sex of individuals. Setting code_male may help if non-standard codes are used (e.g., 'M'/'F'; '1,2').")

	all(is.numeric(unique(suppressWarnings(as.numeric(sex)))), na.rm = TRUE)) {
	!all(is.na(suppressWarnings(as.numeric(sex))))) {

		expect_silent(buildPlotConfig(default_config, config, pedigree_size = 1))
		expect_warning(buildPlotConfig(default_config, config))

Conversation

smasongarrison commented Dec 19, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Key Changes

Reviewed changes

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

smasongarrison Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI commented Dec 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants