Merged
57 commits
42ae819
Update test-makeLinks.R
smasongarrison Apr 13, 2026
8bf36d6
Initial plan
Copilot Apr 13, 2026
b6de8f3
Add simulatePedigrees() function to simulate multiple families at once
Copilot Apr 13, 2026
34f818b
Address code review: input validation, use rbindlist, fix test assert…
Copilot Apr 13, 2026
f3910a3
Fix minor punctuation in man/simulatePedigrees.Rd documentation
Copilot Apr 13, 2026
1c193a4
feat: remap IDs to sequential integers in simulatePedigrees()
Copilot Apr 13, 2026
32684f6
Squashed commit of the following:
smasongarrison Apr 13, 2026
b33820a
smarter id handling
smasongarrison Apr 13, 2026
4455fc7
pipes!
smasongarrison Apr 13, 2026
c2fa605
Update v2_pedigree.Rmd
smasongarrison Apr 13, 2026
217e71f
Merge pull request #150 from R-Computing-Lab/copilot/feature-simulate…
smasongarrison Apr 15, 2026
58bcd20
refactor
smasongarrison Apr 16, 2026
da43ec1
condenseMatrixSlots ADD
smasongarrison Apr 16, 2026
4828352
documentation
smasongarrison Apr 17, 2026
caadf62
Split the simulation file into multiple scripts
smasongarrison Apr 17, 2026
b58cc66
docs
smasongarrison Apr 17, 2026
9ea72b4
white space
smasongarrison May 20, 2026
18293c2
gedcom reader fix
smasongarrison May 20, 2026
35455bd
Import stats::setNames; docs and test tweaks
smasongarrison May 20, 2026
dc49c7f
another record
smasongarrison May 20, 2026
ebc711e
Add pedigree matrix docs and author entry
smasongarrison May 22, 2026
e70790e
clean up royal 92
smasongarrison May 22, 2026
9f61013
Support optional mtdna and mit_val=NULL in slicing
smasongarrison May 23, 2026
0c6df99
add focal collumn
smasongarrison May 23, 2026
2647d0b
Create test-ped2focal.R
smasongarrison May 23, 2026
732aade
Add ped2focal/ped2addFocal docs & exports
smasongarrison May 23, 2026
b9b3b38
Update test-sliceFamilies.R
smasongarrison May 23, 2026
49ced9f
docs
smasongarrison May 23, 2026
524223d
Update readGedcom.R
smasongarrison May 23, 2026
833d084
Update readGedcom.R
smasongarrison May 24, 2026
0193e5c
Chain standardize and parse dates
smasongarrison May 24, 2026
3c2824d
styler
smasongarrison May 24, 2026
38755b7
Add generational distance (ped2genDist)
smasongarrison May 24, 2026
425d07c
Clean up docs and comments in ped2genDist.R
smasongarrison May 24, 2026
811d89f
seq_len
smasongarrison May 24, 2026
c707385
clean up names
smasongarrison May 24, 2026
885bf7e
clean up names
smasongarrison May 24, 2026
e78e211
Use full title for Countess of Strathmore
smasongarrison May 25, 2026
389694f
move some build component helpers to own file
smasongarrison May 26, 2026
f7ed177
Merge branch 'dev_main' of https://github.com/R-Computing-Lab/BGmisc …
smasongarrison May 26, 2026
90c6433
cleanup
smasongarrison May 26, 2026
e193850
add more to family tree
smasongarrison May 26, 2026
bcdc0fe
another batch
smasongarrison May 26, 2026
6753454
Update royal92 dataset dates and overrides
smasongarrison May 26, 2026
9f7c3bc
150-299
smasongarrison May 26, 2026
60ef8ca
300-600
smasongarrison May 26, 2026
c63e3d1
600-900
smasongarrison May 26, 2026
16c19a3
handling duplicate, and handling 900-1050
smasongarrison May 26, 2026
e851fc7
1051-1300
smasongarrison May 26, 2026
89393b5
next batch
smasongarrison May 26, 2026
04b3f27
1551-1800
smasongarrison May 26, 2026
d6b0068
1801 through 2050
smasongarrison May 26, 2026
3d485f6
2050-2300
smasongarrison May 26, 2026
2a407c4
2300-2550
smasongarrison May 26, 2026
9356a3a
2551 through 2800
smasongarrison May 26, 2026
b25debf
final batch
smasongarrison May 26, 2026
9acc8a0
added names
smasongarrison May 26, 2026
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -36,3 +36,5 @@ CITATION.cff$
^test-clean\.X
^vignettes/articles$
^.claude$
^\.positai$
^\.claude$
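
The two `.claude` entries above differ only in the escaped dot. `.Rbuildignore` patterns are Perl-style regular expressions matched case-insensitively, so an unescaped `.` matches any single character. A minimal sketch of the difference, calling `grepl()` directly rather than the build tooling itself; the file names are made up for illustration:

```r
# Hypothetical top-level names, chosen only to illustrate the regex difference.
paths <- c(".claude", "xclaude", ".positai")

# Unescaped dot: any single character may precede "claude".
loose <- grepl("^.claude$", paths, perl = TRUE, ignore.case = TRUE)

# Escaped dot: only a literal "." may precede "claude".
strict <- grepl("^\\.claude$", paths, perl = TRUE, ignore.case = TRUE)

data.frame(path = paths, loose = loose, strict = strict)
```

Both patterns exclude the real `.claude` directory; the escaped form simply avoids accidentally excluding a differently named file.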
2 changes: 2 additions & 0 deletions .gitignore
@@ -55,3 +55,5 @@ revdep/
/.claude
vignettes/understanding_relatedness.Rmd
vignettes/understanding_relatedness.Xmd
.positai
*.tmp.*
5 changes: 3 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: BGmisc
Title: An R Package for Extended Behavior Genetics Analysis
Version: 1.7.0.1.1
Version: 1.8.0
Authors@R: c(
person("S. Mason", "Garrison", , "garrissm@wfu.edu", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-4804-6003")),
@@ -59,4 +59,5 @@ Config/testthat/edition: 3
Encoding: UTF-8
Language: en-US
LazyData: true
RoxygenNote: 7.3.3
Config/roxygen2/version: 8.0.0
RoxygenNote: 8.0.0
10 changes: 10 additions & 0 deletions NAMESPACE
@@ -25,20 +25,28 @@ export(dropLink)
export(findLeaves)
export(fitComponentModel)
export(fitPedigreeModel)
export(getGenDist)
export(getWikiTreeSummary)
export(identifyComponentModel)
export(inferRelatedness)
export(makeInbreeding)
export(makeTwins)
export(ped2add)
export(ped2addFocal)
export(ped2ce)
export(ped2cn)
export(ped2cnFocal)
export(ped2com)
export(ped2fam)
export(ped2focal)
export(ped2gen)
export(ped2genDist)
export(ped2genDistFocal)
export(ped2genFocal)
export(ped2graph)
export(ped2maternal)
export(ped2mit)
export(ped2mitFocal)
export(ped2paternal)
export(readGed)
export(readGedcom)
@@ -49,6 +57,7 @@ export(related_coef)
export(repairIDs)
export(repairSex)
export(simulatePedigree)
export(simulatePedigrees)
export(sliceFamilies)
export(summariseFamilies)
export(summariseMatrilines)
@@ -68,3 +77,4 @@ importFrom(stats,na.omit)
importFrom(stats,pnorm)
importFrom(stats,qnorm)
importFrom(stats,runif)
importFrom(stats,setNames)
21 changes: 13 additions & 8 deletions NEWS.md
@@ -1,12 +1,17 @@
# BGmisc NEWS

# Development version:
## BGmisc 1.7.0.1.1
* Optimized sliceFamilies to be more abstract
## BGmisc 1.8.0
* Optimized gedcom reader, com2links for speed and memory usage, with a focus on large pedigrees
* Fixed bug in gedcom reader that resulted in document records being added to the final person in the pedigree
* Optimized sliceFamilies to be more abstract, and no longer require mtdna
* Created `.require_openmx()` to make it easier to use OpenMx functions without making OpenMx a dependency
* Smarter string ID handling for ped2id
* Fixed how different-sized matrices are handled by `com2links()`
* Added alignPhenToMatrix function to align phenotypic data to the order of the relatedness matrix
* Added `simulatePedigrees()` function to easily simulate multiple families at once and return them as a single combined data frame
* Refactor openmx wrapper functions
* Added `ped2focal()` core function and component-specific wrappers (`ped2addFocal()`, `ped2mitFocal()`/`ped2mtFocal()`, `ped2cnFocal()`, `ped2genFocal()`) to compute relatedness between all pedigree members and a single focal individual, appending the result as a new column on the pedigree data frame. `ped2focal()` is a general function that can be used with any relatedness method, while the component-specific wrappers provide convenient shortcuts for common use cases. Note that individuals excluded via `keep_ids` are coded as `NA`; all others receive their computed value with genuine zeros made explicit.
* Added `getGenDist()`, `ped2genDistFocal()`, and `ped2genDist()` for computing generational distance between individuals. Supports five methods: `rank` (absolute generation-number difference via `ped2gen`), `path` (minimum parent-child steps through any shared ancestor), `mrca_min` (total steps via the most recent common ancestor), `mrca_max` (total steps via the most distant common ancestor), and `mrca_all` (aggregation across all common ancestors — strategy to be defined). Output forms cover single pairs, a focal column appended to the pedigree, and a full n×n pairwise matrix.
* Optimized `countPatternRows()` in the GEDCOM reader to use `fixed = TRUE` string matching and a pre-extracted column vector, reducing redundant work across 31 pattern passes
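
The `rank` and `path` methods described in the generational-distance entry above can be illustrated on a toy pedigree. This is only a sketch of one reading of those descriptions, not the package's implementation: the IDs, the hand-built `anc` matrix, and the helpers `rank_dist()` and `path_dist()` are all hypothetical, and the `mrca_*` methods are not covered.

```r
# Toy pedigree (hypothetical IDs): g is the grandparent, p1 and p2 are g's
# children, c1 is p1's child and c2 is p2's child (c1 and c2 are first cousins).
ids <- c("g", "p1", "p2", "c1", "c2")

# anc[i, j] = parent-child steps from i up to ancestor j; NA = not an ancestor.
anc <- matrix(NA_integer_, 5, 5, dimnames = list(ids, ids))
diag(anc) <- 0L
anc["p1", "g"] <- 1L
anc["p2", "g"] <- 1L
anc["c1", "p1"] <- 1L
anc["c2", "p2"] <- 1L
anc["c1", "g"] <- 2L
anc["c2", "g"] <- 2L

# "rank"-style distance: absolute difference in generation number.
gen <- c(g = 1L, p1 = 2L, p2 = 2L, c1 = 3L, c2 = 3L)
rank_dist <- function(a, b) abs(gen[[a]] - gen[[b]])

# "path"-style distance: fewest steps up from a and back down to b
# through any shared ancestor (an individual counts as its own ancestor).
path_dist <- function(a, b) {
  total <- anc[a, ] + anc[b, ] # NA wherever the ancestor is not shared
  if (all(is.na(total))) NA_integer_ else min(total, na.rm = TRUE)
}

rank_dist("c1", "c2") # 0: same generation
path_dist("c1", "c2") # 4: c1 -> p1 -> g -> p2 -> c2
path_dist("c1", "p1") # 1: p1 is itself the shared ancestor
```

The cousin pair shows why the two methods can disagree: cousins sit in the same generation, yet four parent-child steps separate them.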

# BGmisc 1.7.0.0
* Fixed bug in parList
@@ -19,7 +24,7 @@
* Allow confidence intervals for pedigree mx wrappers

# BGmisc 1.6.0.1
## CRAN submission
* CRAN submission
* Add OpenMx pedigree model builders and docs
* Added vignette for OpenMx pedigree model builders
* Add option for MZ twins in the additive genetic matrix
@@ -41,14 +46,14 @@
* Tweaked how sex coding is handled to allow for unknown sex

# BGmisc 1.5.1
## CRAN submission
* CRAN submission
* partially refactored summarizePedigree to be more modular
* added compression control to ped2com
* Minor copy editing


# BGmisc 1.5.0
## CRAN submission
* CRAN submission
* Removed ASOIAF dataset from BGmisc, now in ggpedigree
* Enhancing potter family tree
* updated tests to handle the transition of ASOIAF data to ggpedigree
@@ -79,7 +84,7 @@
# BGmisc 1.4.2
* Added twinIDs for potter and asoiaf pedigrees
* Added twinID to simulatePedigree function, and extended to include MZ, DZ, and SS twins.
* Added a few more tests for simulatePedigree
* Added additional tests for simulatePedigree
* Added function to easily add new person to a pedigree
* Updated ASOIAF pedigree to reduce missing parents
* Added a few more tests for simulatePedigree helpers
152 changes: 36 additions & 116 deletions R/buildComponent.R
@@ -86,7 +86,6 @@
force_symmetric = force_symmetric
)


#------
# Checkpointing
#------
@@ -106,7 +105,8 @@
"additive",
"common nuclear",
"mitochondrial",
"mtdna", "mitochondria"
"mtdna", "mitochondria",
"distance"
)
)

@@ -194,10 +194,8 @@
config = config
)


## B. Resume loop from the next uncomputed index


# Construct sparse matrix
# Garbage collection if gc is TRUE
if (config$gc == TRUE) {
@@ -259,7 +257,6 @@
# TODO merge twin columns
# --- Step 2: Compute Relatedness Matrix ---


if (config$resume == TRUE && file.exists(checkpoint_files$ram_checkpoint)) {
if (config$verbose == TRUE) cat("Resuming: Loading completed RAM matrix...\n")
r <- readRDS(checkpoint_files$ram_checkpoint)
@@ -287,6 +284,17 @@
}
maxCount <- config$max_gen + 1

# Ancestor distance matrix: ancDist[i, j] = min parent-child steps from i
# up to ancestor j; NA = j is not an ancestor of i; diagonal = 0 (self).
# Only allocated for the "distance" component; other components ignore it.
if (config$component == "distance") {
ancDist <- matrix(NA_integer_,
nrow = config$nr, ncol = config$nr,
dimnames = list(ped$ID, ped$ID)
)
diag(ancDist) <- 0L
}

if (config$verbose == TRUE) {
cat("About to do RAM path tracing\n")
}
@@ -304,6 +312,21 @@
while (mtSum != 0 && count < maxCount) {
r <- r + newIsPar
gen <- gen + (Matrix::rowSums(newIsPar) > 0)

# Record first-hit ancestor distances. At this point newIsPar = A^(count+1),
# so the step count for these entries is count + 1.
if (config$component == "distance") {
Ak_t <- methods::as(newIsPar, "TsparseMatrix")
ri <- Ak_t@i + 1L # 0-based → 1-based row (child)
ci <- Ak_t@j + 1L # 0-based → 1-based col (ancestor)
if (length(ri) > 0L) {
fresh <- is.na(ancDist[cbind(ri, ci)])
if (any(fresh)) {
ancDist[cbind(ri[fresh], ci[fresh])] <- count + 1L
}
}
}

newIsPar <- newIsPar %*% isPar
mtSum <- sum(newIsPar)
count <- count + 1
@@ -341,6 +364,13 @@
verbose_message = "Subsetting generation component to %d target individuals\n"
)
return(gen)
} else if (config$component == "distance") { # no need to do the rest
if (!is.null(config$keep_ids)) {
keep_idx <- match(as.character(config$keep_ids), rownames(ancDist))
keep_idx <- keep_idx[!is.na(keep_idx)]
ancDist <- ancDist[keep_idx, , drop = FALSE]
}
return(ancDist)
} else {
if (config$verbose == TRUE) {
cat("Completed RAM path tracing\n")
@@ -599,7 +629,6 @@
}
}


if (force_symmetric == TRUE) {
Matrix::forceSymmetric(do.call(rbind, blocks))
} else {
@@ -611,58 +640,12 @@
result
}

#' Initialize checkpoint files
#' @inheritParams ped2com
#' @keywords internal

initializeCheckpoint <- function(config = list(
verbose = FALSE,
saveable = FALSE,
resume = FALSE,
save_path = "checkpoint/"
)) {
# Define checkpoint files
# Ensure save path exists
if (config$saveable == TRUE && !dir.exists(config$save_path)) {
if (config$verbose == TRUE) cat("Creating save path...\n")
dir.create(config$save_path, recursive = TRUE)
} else if (config$resume == TRUE && !dir.exists(config$save_path)) {
stop("Cannot resume from checkpoint. Save path does not exist.")
}

checkpoint_files <- list(
parList = file.path(config$save_path, "parList.rds"),
lens = file.path(config$save_path, "lens.rds"),
isPar = file.path(config$save_path, "isPar.rds"),
iss = file.path(config$save_path, "iss.rds"),
jss = file.path(config$save_path, "jss.rds"),
isChild = file.path(config$save_path, "isChild.rds"),
r_checkpoint = file.path(config$save_path, "r_checkpoint.rds"),
gen_checkpoint = file.path(config$save_path, "gen_checkpoint.rds"),
newIsPar_checkpoint = file.path(
config$save_path,
"newIsPar_checkpoint.rds"
),
mtSum_checkpoint = file.path(config$save_path, "mtSum_checkpoint.rds"),
ram_checkpoint = file.path(config$save_path, "ram_checkpoint.rds"),
r2_checkpoint = file.path(config$save_path, "r2_checkpoint.rds"),
tcrossprod_checkpoint = file.path(
config$save_path,
"tcrossprod_checkpoint.rds"
),
tcrossprod_ids = file.path(config$save_path, "tcrossprod_ids.rds"),
count_checkpoint = file.path(config$save_path, "count_checkpoint.rds"),
final_matrix = file.path(config$save_path, "final_matrix.rds")
)

checkpoint_files
}

#' Assign parent values based on component type
#' @inheritParams ped2com
.assignParentValue <- function(component) {
# Set parent values depending on the component type
if (component %in% c("generation", "additive")) {
if (component %in% c("generation", "additive", "distance")) {
parVal <- .5
} else if (component %in%
c("common nuclear", "mitochondrial", "mtdna", "mitochondria")) {
@@ -673,30 +656,6 @@
parVal
}

#' Load or compute a checkpoint
#' @param file The file path to load the checkpoint from.
#' @param compute_fn The function to compute the checkpoint if it doesn't exist.
#' @param config A list containing configuration parameters such as `resume`, `verbose`, and `saveable`.
#' @param message_resume Optional message to display when resuming from a checkpoint.
#' @param message_compute Optional message to display when computing the checkpoint.
#' @param compress a logical specifying whether saving to a named file is to use "gzip" compression, or one of "gzip", "bzip2", "xz" or "zstd" to indicate the type of compression to be used. Ignored if file is a connection.
#' @return The loaded or computed checkpoint.
#' @keywords internal
loadOrComputeCheckpoint <- function(file, compute_fn,
config, message_resume = NULL,
message_compute = NULL,
compress = TRUE) {
if (config$resume == TRUE && file.exists(file)) {
if (config$verbose == TRUE && !is.null(message_resume)) cat(message_resume)
readRDS(file)
} else {
if (config$verbose == TRUE && !is.null(message_compute)) cat(message_compute)
result <- compute_fn()
if (config$saveable == TRUE) saveRDS(result, file = file, compress = compress)
result
}
}

#' Load or compute the isPar matrix
#' @inheritParams loadOrComputeCheckpoint
#' @inheritParams ped2com
@@ -858,45 +817,6 @@
}
list_of_adjacencies
}

CodeFactor check notice, R/buildComponent.R#L820 and #L821: Remove trailing blank lines. (trailing_blank_lines_linter)
#' Subset output to requested IDs
#' @inheritParams ped2com
#' @param component A component to subset.
#' @param keep_ids Character vector of IDs to retain.
#' @param available_ids Character vector of IDs available in \code{x}.
#' @param verbose_message Character. Message prefix to print when \code{config$verbose == TRUE}.
#' @param drop logical. Passed to \code{[} when subsetting matrices.
#' @keywords internal
.subsetKeepIds <- function(component, keep_ids = NULL, available_ids, config,
verbose_message = "Subsetting to %d target individuals\n",
drop = FALSE) {
if (is.null(keep_ids)) {
return(component)
}

idx <- match(keep_ids, available_ids)
missing <- keep_ids[is.na(idx)]

if (length(missing) > 0) {
warning(
length(missing), " keep_ids not found in pedigree and will be dropped: ",
paste(Matrix::head(missing, 5), collapse = ", "),
if (length(missing) > 5) " ..." else ""
)
}

idx <- idx[!is.na(idx)]

if (config$verbose == TRUE) {
cat(sprintf(verbose_message, length(idx)))
}
# consequence is missing data
if (is.matrix(component) || methods::is(component, "Matrix")) {
component <- component[idx, , drop = drop]
} else {
component <- component[idx]
}

CodeFactor check notice, R/buildComponent.R#L822: Remove trailing blank lines. (trailing_blank_lines_linter)
component
}
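
The first-hit bookkeeping that the `"distance"` component adds to the path-tracing loop in this file can be tried in isolation. The sketch below is a simplified stand-in, not the package code: it uses a dense base-R matrix instead of the sparse `Matrix` classes, and the IDs and the names `A`, `Ak`, `step`, and `anc_dist` are hypothetical.

```r
# Toy pedigree (hypothetical IDs); A[i, j] = 1 when j is a parent of i.
ids <- c("gp", "par", "kid")
A <- matrix(0, 3, 3, dimnames = list(ids, ids))
A["par", "gp"] <- 1
A["kid", "par"] <- 1

# anc_dist[i, j] = fewest parent-child steps from i up to ancestor j.
anc_dist <- matrix(NA_integer_, 3, 3, dimnames = list(ids, ids))
diag(anc_dist) <- 0L

Ak <- A # A^1: direct parents
step <- 1L
while (sum(Ak) != 0 && step <= nrow(A)) {
  # (child, ancestor) index pairs reachable in exactly `step` steps.
  hit <- which(Ak != 0, arr.ind = TRUE)
  # Record only cells that are still NA, so the first (shortest) hit wins.
  fresh <- is.na(anc_dist[hit])
  anc_dist[hit[fresh, , drop = FALSE]] <- step
  Ak <- Ak %*% A # advance to A^(step + 1)
  step <- step + 1L
}

anc_dist
```

Writing only into cells that are still `NA` is what makes each recorded value a minimum: a longer route to the same ancestor, for example through inbreeding loops, arrives at a later power of `A` and is ignored.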