Add HERS dataset linear regression example with interactions, interspersed with birthweight analysis (Vittinghoff Ch. 4) #381

Open
Copilot wants to merge 21 commits into main from copilot/add-example-model-hers-dataset

Contributor

Copilot AI commented Apr 7, 2026

Integrates a HERS dataset linear regression example into the "Linear Models Overview" chapter, interspersed with the existing birthweight analysis to mirror the narrative flow of Vittinghoff Ch. 4.

Changes Made

HERS content subfiles (all at repo root, for easy repositioning)

  • _sec_hers_data.qmd — HERS data introduction: data loading, scatter plot (faceted by HT with label_both), pairwise key-variables plot, and data notation (H, B, V variables)
  • _sec_hers_lm_parallel.qmd — Parallel lines model: LDL ~ HT + BMI; fitted-values plot faceted by HT
  • _sec_hers_lm_interact.qmd — Interaction model: LDL ~ HT + BMI + statins + BMI:statins; plot faceted by HT
  • _sec_hers_lm_gof.qmd — Goodness of fit: AIC/BIC table and deviance for both HERS models
  • _sec_hers_lm_diagnostics.qmd — Diagnostics: residuals vs fitted and QQ plots for both hers_lm1 and hers_lm2, all faceted by HT with label_both
  • _sec_hers_lm_model_selection.qmd — Model selection: LRT comparing the two HERS models

Structure in _sec_linreg_understanding.qmd

  • HERS data intro (_sec_hers_data.qmd) inserted directly after the birthweight dataset section
  • HERS parallel model included after the birthweight parallel lines coefficient interpretation
  • HERS interaction model included after the birthweight interaction coefficient comparison table

"Assessing model fit" section (_sec_linreg_assess.qmd)

  • HERS goodness-of-fit examples added after the birthweight GOF subsection
  • HERS diagnostic examples added after the birthweight Diagnostics subsection
  • HERS model selection examples added after the birthweight Model Selection subsection

Other

  • Interaction model uses BMI:statins (not BMI:HT); statin use $V$ added to data notation
  • All HERS plots faceted by HT using facet_wrap(~HT, labeller = label_both)
  • Factor reference levels explicitly set: relevel(ref = "placebo") for HT, relevel(ref = "no") for statins
  • GGally::ggpairs() key-variables panel added to the HERS data tabset, showing LDL, HT, BMI, statins, and age colored by treatment group
  • inst/WORDLIST deduplicated and sorted (105 unique words after merging with main)
  • Deleted unreferenced standalone _sec_exm_hers_lm.qmd

Testing

  • ✅ Lint passes (lintr::lint() on all changed .qmd files)
  • ✅ Spell check passes (spelling::spell_check_package())

Copilot AI and others added 2 commits April 7, 2026 20:05
Copilot AI changed the title from "[WIP] Add an example model with the HERS dataset for linear regression" to "Add HERS dataset linear regression example (Vittinghoff Ch. 4)" Apr 7, 2026
Copilot AI requested a review from d-morrison April 7, 2026 20:08
@d-morrison
Owner

add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

@d-morrison
Owner

@copilot add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

Copilot AI and others added 2 commits April 7, 2026 20:41
Contributor Author

Copilot AI commented Apr 7, 2026

@copilot add a model with interactions, like we did for the birthweight data. intersperse the HERS analysis with the corresponding sections of the birthweight analysis

Done in commit 27418f9. The HERS analysis is now interspersed with the birthweight sections:

  • 2.2.4 Motivating example: hers data — data loading, scatter plot, and notation (H = HT treatment, B = BMI)
  • 2.2.5 Parallel lines regression for hers data — LDL ~ HT + BMI, mirroring the birthweight parallel lines model
  • 2.2.7 Interactions in hers data — LDL ~ HT + BMI + HT:BMI, added right after the birthweight interaction section (2.2.6)

Each HERS section follows the same structure as its birthweight counterpart: model equation, parameter table, and fitted-values plot.
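
As a sketch of what these two fits estimate, assuming $h$ is a 0/1 indicator for HT (placebo as reference) and $b$ is BMI; the coefficient symbol $\beta_{HB}$ is a label chosen here for illustration and may not match the chapter:

$$\begin{aligned}
\text{parallel lines (2.2.5):} \quad \mu(h,b) &= \beta_0 + \beta_H h + \beta_B b \\
\text{interaction (2.2.7):} \quad \mu(h,b) &= \beta_0 + \beta_H h + \beta_B b + \beta_{HB}(h \cdot b)
\end{aligned}$$

In the first model both treatment arms share the BMI slope $\beta_B$ and differ only by the intercept shift $\beta_H$; in the second, the HT arm's slope is $\beta_B + \beta_{HB}$.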

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • glmnet.stanford.edu
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R (dns block)
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R unam… (dns block)
  • olsrr.rsquaredacademy.com
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R (dns block)
    • Triggering command: /opt/R/4.5.3/lib/R/bin/exec/R /opt/R/4.5.3/lib/R/bin/exec/R --no-echo --no-restore --file=/opt/quarto/share/rmd/rmd.R unam… (dns block)
  • packagemanager.posit.co
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a159b3cb76 (dns block)
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a143be690e (dns block)
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/Rtmpz21USI/renv-download-config-26a1777d9204 (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title from "Add HERS dataset linear regression example (Vittinghoff Ch. 4)" to "Add HERS dataset linear regression example with interactions, interspersed with birthweight analysis (Vittinghoff Ch. 4)" Apr 7, 2026
@d-morrison
Owner

@copilot add interaction between bmi and statin use to the HERS model, and remove the interaction between bmi and HT

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot add interaction between bmi and statin use to the HERS model, and remove the interaction between bmi and HT

Done in commit f7d07a6. The interaction model is now LDL ~ HT + BMI + statins + BMI:statins:

  • Removed HT:BMI interaction
  • Added statins as a main effect and BMI:statins as the interaction term
  • Added statin use ($U$/$V$) to the data notation section
  • Updated the scatter plot to colour/shape by statin use (no vs yes)

The model equation is now:
$$\mu(h,v,b) = \beta_0 + \beta_H h + \beta_V v + \beta_B b + \beta_{VB}(v \cdot b)$$
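
One way to read this equation, as a sketch assuming $h$ and $v$ are 0/1 indicators (with "placebo" and "no" as the reference levels) and $b$ is BMI, is to write the mean separately for each statin group:

$$\begin{aligned}
\mu(h, 0, b) &= \beta_0 + \beta_H h + \beta_B b \\
\mu(h, 1, b) &= (\beta_0 + \beta_V) + \beta_H h + (\beta_B + \beta_{VB})\, b
\end{aligned}$$

Under that coding, $\beta_B$ is the BMI slope among non-users of statins, $\beta_B + \beta_{VB}$ is the slope among users, and $\beta_H$ shifts the intercept by the same amount in both groups, so the HT effect does not depend on BMI in this model.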

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • rspm-sync.rstudio.com
    • Triggering command: /usr/bin/curl /usr/bin/curl --config /tmp/RtmpdqGxja/renv-download-config-2677336aef2b (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@d-morrison
Owner

@copilot resolve merge conflicts

d-morrison removed their request for review April 10, 2026 18:54
…ORDLIST duplicates

Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/53ac613d-5e7f-468f-87c4-ed701a5c4220

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 14, 2026

@copilot apply changes based on the comments in this thread

Done in commit 8303639:

  • Deleted _sec_exm_hers_lm.qmd — the file was not referenced by any {{< include >}} directive or rendered page, and the standalone HERS section was already superseded by the interspersed _sec_hers_lm_parallel.qmd and _sec_hers_lm_interact.qmd subfiles.
  • Fixed inst/WORDLIST — reduced from 85 entries (21 duplicates) to 64 unique entries, sorted alphabetically.

@github-actions
Contributor

PR Preview Action v1.8.1
🚀 Preview available at: https://d-morrison.github.io/rme/pr-preview/pr-381/

Built to branch gh-pages at 2026-04-14 20:16 UTC.

@d-morrison
Owner

@copilot resolve the merge conflicts in this pull request

Owner

d-morrison left a comment

@copilot

…odel-hers-dataset

# Conflicts:
#	inst/WORDLIST

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot resolve the merge conflicts in this pull request

Done in commit 0bde169. The only conflict was in inst/WORDLIST — merged both sets (ours with HERS-specific terms like LDL, drinkany, drk, nw, smk; main's with new math/stats terms like bmatrix, idempotent, varepsilon, etc.), deduplicated, and sorted. Result: 105 unique words. Spell check passes.

Copilot AI and others added 2 commits April 16, 2026 08:33
- Extract HERS data intro into new _sec_hers_data.qmd
- Update _sec_hers_lm_parallel.qmd: remove intro, add HT faceting to fitted values plot
- Update _sec_hers_lm_interact.qmd: add HT faceting to interaction model plot
- Insert HERS data include in _sec_linreg_understanding.qmd after birthweight data notation
- Create _sec_hers_lm_gof.qmd with AIC/BIC/deviance/LRT examples
- Create _sec_hers_lm_diagnostics.qmd with residuals and QQ plots faceted by HT
- Create _sec_hers_lm_model_selection.qmd with LRT model comparison
- Update _sec_linreg_assess.qmd to include new HERS assess subfiles

All HERS scatter plots now use facet_wrap(~HT, labeller = label_both).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
LRT is already covered in _sec_hers_lm_model_selection.qmd.
The GOF file should focus on AIC, BIC, and deviance only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot

Done in commits c46e981 and f9d6ec1:

  • HERS data intro moved: Extracted into _sec_hers_data.qmd and inserted directly after the birthweight dataset section (before "Parallel lines regression")
  • All HERS graphs faceted by HT: Added facet_wrap(~HT, labeller = label_both) to scatter plot in _sec_hers_data.qmd, fitted-values plot in _sec_hers_lm_parallel.qmd, and interaction model plot in _sec_hers_lm_interact.qmd
  • All HERS content in subfiles: _sec_hers_data.qmd, _sec_hers_lm_parallel.qmd, _sec_hers_lm_interact.qmd, _sec_hers_lm_gof.qmd, _sec_hers_lm_diagnostics.qmd, _sec_hers_lm_model_selection.qmd
  • HERS examples for all "Assessing model fit" subsections:
    • Goodness of fit: AIC/BIC table and deviance for hers_lm1 vs hers_lm2 (_sec_hers_lm_gof.qmd)
    • Diagnostics: Residuals vs fitted and QQ plot, both faceted by HT (_sec_hers_lm_diagnostics.qmd)
    • Model selection: LRT comparing the two HERS models (_sec_hers_lm_model_selection.qmd)
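
For reference, a sketch of the quantities these subfiles report, assuming hers_lm1 and hers_lm2 are Gaussian linear models fit to the same $n$ observations, with maximized log-likelihoods $\ell_1$ and $\ell_2$ and $p$ estimated parameters per model (the symbols $\ell$, $p$, $n$, and $\Lambda$ are notation introduced here, not taken from the chapter):

$$\begin{aligned}
\text{AIC} &= 2p - 2\ell, \qquad \text{BIC} = p \log n - 2\ell \\
H_0 &: \beta_V = \beta_{VB} = 0 \\
\Lambda &= 2(\ell_2 - \ell_1) \;\overset{\text{approx.}}{\sim}\; \chi^2_2 \quad \text{under } H_0
\end{aligned}$$

The two degrees of freedom correspond to the two coefficients ($\beta_V$ and $\beta_{VB}$) that hers_lm2 adds to hers_lm1, assuming statin use is a two-level factor.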

Contributor

Copilot AI left a comment

Pull request overview

This PR integrates a HERS dataset linear regression example into the “Linear Models Overview” chapter, interleaving it with the existing birthweight analysis to follow the narrative flow of Vittinghoff Ch. 4.

Changes:

  • Adds new HERS-focused Quarto subfiles covering data intro, parallel-lines and interaction models, GOF, diagnostics, and model selection.
  • Inserts the new HERS subfiles into the “Understanding…” and “Assessing model fit” sections via {{< include ... >}}.
  • Updates inst/WORDLIST to reflect new terminology and deduplicate/sort entries.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| inst/WORDLIST | Updates spelling whitelist for new HERS terms and general cleanup. |
| _subfiles/Linear-models-overview/_sec_linreg_understanding.qmd | Includes new HERS data + modeling sections into the narrative flow. |
| _subfiles/Linear-models-overview/_sec_linreg_assess.qmd | Includes new HERS GOF/diagnostics/model-selection examples after birthweight sections. |
| _sec_hers_data.qmd | New HERS dataset intro: load data, show table/plot, define notation. |
| _sec_hers_lm_parallel.qmd | New HERS parallel-lines model fit + coefficient table + fitted-values overlay plot. |
| _sec_hers_lm_interact.qmd | New HERS interaction model fit + interpretation + fitted overlay plot. |
| _sec_hers_lm_gof.qmd | New AIC/BIC and deviance comparisons for the HERS models. |
| _sec_hers_lm_diagnostics.qmd | New residual-vs-fitted and QQ diagnostics plots (currently for hers_lm1). |
| _sec_hers_lm_model_selection.qmd | New LRT comparison (lrtest) between hers_lm1 and hers_lm2. |

Comment thread _sec_hers_data.qmd Outdated

::: notes

{{< include _sec_hers_intro.qmd >}}
Comment thread _sec_hers_data.qmd
Comment on lines +93 to +97
::: notes
"Placebo" is the **reference level** for the categorical variable $T$,
and "no" is the **reference level** for statin use $U$.
The choice of reference level is arbitrary;
it only affects the interpretation of the intercept and corresponding indicator coefficients.
Comment thread _sec_hers_lm_diagnostics.qmd Outdated
@@ -0,0 +1,34 @@
### Diagnostics for `hers` models
@d-morrison
Owner

@copilot apply changes based on the comments in this thread

also, add a graph of the key variables of the HERS data, including the treatment variable, the outcome variable, and the most important covariates

Copilot AI and others added 2 commits April 16, 2026 19:36
…ariables plot

Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/a89643f9-3164-469b-9b64-621cc53426a0

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

also, add a graph of the key variables of the HERS data, inclu...

Done in commits f61a83b and 9d7eb27:

PR review fixes:

  • Fixed include path: {{< include _sec_hers_intro.qmd >}} → {{< include _subfiles/shared/_sec_hers_intro.qmd >}}
  • Added explicit relevel() calls: relevel(ref = "placebo") for HT and relevel(ref = "no") for statins (both were already in correct order from as_factor(), but now explicit)
  • Expanded diagnostics into two subsections — one for hers_lm1 (parallel-lines model) and one for hers_lm2 (interaction model), each with residuals vs fitted and QQ plots, both faceted by HT

Key variables graph:

  • Added a "Key variables" tab in the HERS data panel-tabset with a GGally::ggpairs() plot of LDL (outcome), HT (treatment), BMI, statins, and age — colored by HT treatment group

d-morrison marked this pull request as ready for review April 19, 2026 00:34

Development

Successfully merging this pull request may close these issues.

add an example model with the HERS dataset for linear regression

3 participants