Fix extract_strata() stripping parentheses from strata level labels #338

Merged
ddsjoberg merged 7 commits into main from fix/extract-strata-parentheses
Apr 29, 2026

Conversation

@ddsjoberg
Collaborator

Summary

Fixes the greedy regex in extract_strata() that destroyed parentheses in factor level values. For example, a strata level "Drug (B)" was truncated to "B".
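
To make the failure concrete, here is a minimal base-R sketch of the nested gsub() expression quoted under Changes, applied to made-up strings. The helper name old_strip() is hypothetical and only wraps that expression; it is not the package's actual function.

```r
# Hypothetical wrapper around the old nested gsub() expression from this PR.
old_strip <- function(x) {
  # Inner call deletes every ")"; outer call greedily deletes everything
  # up to and including the last "(".
  gsub(".*\\(", "", gsub("\\)", "", x))
}

old_strip("strata(trt)")  # "trt": the intended use
old_strip("Drug (B)")     # "B": the level label is truncated
```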

Closes ddsjoberg/gtsummary#2388

Changes

The two gsub(".*\\(", "", gsub("\\)", "", ...)) calls were intended to strip strata() wrappers from term labels and strata strings. Because the inner call deletes every ")" and the outer call greedily deletes everything up to the last "(", they also truncated level values that contain parentheses. Replaced with targeted patterns:

  • Term labels: sub("^strata\\(\\s*([^,)]+).*\\)$", "\\1", x_terms), which matches only a full strata(...) wrapper and keeps its first argument
  • Strata strings: gsub("strata\\(([^)]+)\\)", "\\1", i), which replaces each strata(varname) with varname and leaves other parentheses alone
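
As a rough illustration, the two replacement patterns can be tried in base R. The helper names strip_term_label() and strip_strata_string() are hypothetical wrappers around the expressions listed above, and the input strings are assumptions about what term labels and strata strings look like, not output captured from the package.

```r
# Hypothetical wrappers around the two patterns introduced in this PR.
strip_term_label <- function(x_terms) {
  # Matches only when the whole label is a strata(...) call and keeps
  # the first argument (everything before the first "," or ")").
  sub("^strata\\(\\s*([^,)]+).*\\)$", "\\1", x_terms)
}

strip_strata_string <- function(i) {
  # Rewrites each strata(varname) to varname; parentheses that are not
  # part of a strata() wrapper are left in place.
  gsub("strata\\(([^)]+)\\)", "\\1", i)
}

strip_term_label("strata(trt)")                     # "trt"
strip_term_label("strata(trt, shortlabel = TRUE)")  # "trt"
strip_term_label("trt")                             # "trt" (no wrapper, unchanged)

strip_strata_string("strata(trt)=Drug (B)")         # "trt=Drug (B)"
strip_strata_string("trt=Drug (B)")                 # "trt=Drug (B)" (unchanged)
```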

Testing

Added a test case with parentheses in level labels ("Male (M)", "Female (F)"). Verified no regressions with normal levels, = in levels, unstratified models, multiple stratification variables, and quantile mode.
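
The scenarios listed above can be approximated without fitting a model by running the strata-string pattern over hand-written inputs. This is only a sketch: check_strip() is a hypothetical helper, and the strings are illustrative guesses at strata string shapes rather than the fixtures used in the package's test suite.

```r
# Hypothetical helper wrapping the strata-string pattern from this PR.
check_strip <- function(i) gsub("strata\\(([^)]+)\\)", "\\1", i)

# Names are illustrative inputs; values are the outputs we expect.
cases <- c(
  "trt=Drug A"                                   = "trt=Drug A",            # normal level
  "strata(sex)=Male (M)"                         = "sex=Male (M)",          # parentheses in level
  "strata(grp)=a=b"                              = "grp=a=b",               # "=" in level
  "strata(sex)=Female (F), strata(trt)=Drug (B)" = "sex=Female (F), trt=Drug (B)"  # two variables
)

actual <- vapply(names(cases), check_strip, character(1), USE.NAMES = FALSE)

# Stops with an error if any input is not transformed as expected.
stopifnot(identical(actual, unname(cases)))
```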

@ddsjoberg ddsjoberg requested a review from Melkiades April 13, 2026 22:21
@github-actions
Contributor

github-actions Bot commented Apr 13, 2026

Unit Tests Summary

  1 files  193 suites   1m 23s ⏱️
191 tests 153 ✅ 38 💤 0 ❌
705 runs  638 ✅ 67 💤 0 ❌

Results for commit b8b121b.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions Bot commented Apr 13, 2026

Unit Test Performance Difference

Additional test case details
Test Suite                        Status  Time on main  ±Time  Test Case
--------------------------------  ------  ------------  -----  -----------------------------------------------------------------
ard_categorical_ci.survey.design  💚      3.04          -1.03  ard_categorical_ci_data_
ard_summary.survey.design         💚      24.64         -1.66  unstratified_ard_summary.survey.design_works
ard_survival_survfit              👶                    +0.01  ard_survival_survfit_preserves_parentheses_in_strata_level_labels
ard_survival_survfit              👶                    +0.00  extract_strata_returns_safely_and_warns_on_0_row_datasets

Results for commit bd51179

♻️ This comment has been updated with latest results.

@ddsjoberg ddsjoberg removed the request for review from Melkiades April 13, 2026 22:27
The greedy regex gsub('.*\\(', ...) in extract_strata() was intended
to strip strata() wrappers but also destroyed parentheses in factor
level values, e.g. 'Drug (B)' became 'B'.

Replace with targeted patterns that only strip the strata() wrapper.

Fixes ddsjoberg/gtsummary#2388

Co-authored-by: Ona <no-reply@ona.com>
@ddsjoberg ddsjoberg force-pushed the fix/extract-strata-parentheses branch from 71ee5ed to bdbf89a Compare April 13, 2026 22:30
@github-actions
Contributor

github-actions Bot commented Apr 13, 2026

Code Coverage Summary

Filename                                Stmts    Miss  Cover    Missing
------------------------------------  -------  ------  -------  -------------------------------------------
R/add_total_n.survey.design.R              10       0  100.00%
R/ard_aod_wald_test.R                      77       2  97.40%   93, 96
R/ard_attributes.survey.design.R            2       0  100.00%
R/ard_car_anova.R                          45       2  95.56%   62, 65
R/ard_car_vif.R                            68       1  98.53%   93
R/ard_categorical_ci.R                    323       1  99.69%   100
R/ard_categorical_ci.survey.design.R      124       1  99.19%   180
R/ard_continuous_ci.R                      28       1  96.43%   38
R/ard_continuous_ci.survey.design.R       138       0  100.00%
R/ard_continuous.survey.design.R          284      20  92.96%   61-66, 100, 191, 200, 351, 382-383, 434-442
R/ard_effectsize_cohens_d.R               103       2  98.06%   69, 122
R/ard_effectsize_hedges_g.R                91       2  97.80%   68, 120
R/ard_emmeans_contrast.R                   99       0  100.00%
R/ard_emmeans_emmeans.R                    97       0  100.00%
R/ard_incidence_rate.R                    104       0  100.00%
R/ard_missing.survey.design.R              89       7  92.13%   45-50, 63
R/ard_regression_basic.R                   31       1  96.77%   61
R/ard_regression.R                         87       0  100.00%
R/ard_smd_smd.R                            69       5  92.75%   57, 83-86
R/ard_stats_anova.R                        95       0  100.00%
R/ard_stats_aov.R                          46       0  100.00%
R/ard_stats_chisq_test.R                   40       1  97.50%   39
R/ard_stats_fisher_test.R                  43       1  97.67%   42
R/ard_stats_kruskal_test.R                 36       1  97.22%   38
R/ard_stats_mantelhaen_test.R              67       1  98.51%   45
R/ard_stats_mcnemar_test.R                 80       2  97.50%   63, 106
R/ard_stats_mood_test.R                    49       1  97.96%   45
R/ard_stats_oneway_test.R                  39       0  100.00%
R/ard_stats_poisson_test.R                 76       1  98.68%   59
R/ard_stats_prop_test.R                    85       1  98.82%   43
R/ard_stats_t_test_onesample.R             41       0  100.00%
R/ard_stats_t_test.R                      112       2  98.21%   65, 111
R/ard_stats_wilcox_test_onesample.R        42       0  100.00%
R/ard_stats_wilcox_test.R                  99       2  97.98%   65, 117
R/ard_survey_svychisq.R                    38       1  97.37%   44
R/ard_survey_svyranktest.R                 54       1  98.15%   44
R/ard_survey_svyttest.R                    53       1  98.11%   42
R/ard_survival_survdiff.R                  89       0  100.00%
R/ard_survival_survfit_diff.R              76       0  100.00%
R/ard_survival_survfit.R                  243       5  97.94%   234-238
R/ard_tabulate_abnormal.R                  76       0  100.00%
R/ard_tabulate_max.R                       51       7  86.27%   54-59, 74
R/ard_tabulate_value.survey.design.R       80       9  88.75%   40-45, 62, 167, 172
R/ard_tabulate.survey.design.R            407      15  96.31%   70-75, 90, 243-246, 290, 335, 535, 549
R/construction_helpers.R                  106      10  90.57%   160-175, 189, 248
R/deprecated.R                             34      34  0.00%    28-86
R/proportion_ci.R                         203       1  99.51%   463
TOTAL                                    4329     142  96.72%

Diff against main

Filename                    Stmts    Miss  Cover
------------------------  -------  ------  -------
R/ard_survival_survfit.R       +7       0  +0.06%
TOTAL                          +7       0  +0.01%

Results for commit: b8b121b

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

@ddsjoberg ddsjoberg requested a review from Melkiades April 28, 2026 19:09
@ddsjoberg
Collaborator Author

@Melkiades can you review this PR?

Comment thread R/ard_survival_survfit.R
Comment thread R/ard_survival_survfit.R
Contributor

@Melkiades Melkiades left a comment

Looks very good to me! The new regex works perfectly. Thanks @ddsjoberg !!

@ddsjoberg
Collaborator Author

Thank you for the review @Melkiades !!

@ddsjoberg ddsjoberg merged commit 25395de into main Apr 29, 2026
33 checks passed
@ddsjoberg ddsjoberg deleted the fix/extract-strata-parentheses branch April 29, 2026 14:17
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 29, 2026

Development

Successfully merging this pull request may close these issues.

tbl_survfit ignores text before parentheses in variables levels

2 participants