
Unify compact top-loci export and SuSiE-Inf as the option for univariate pipeline #504

Merged
gaow merged 16 commits into StatFunGen:main from xueweic:main
May 31, 2026


@xueweic xueweic commented May 30, 2026

Unify SuSiE top-loci output: one helper, one table

Major

  • Replace build_top_loci_long / _wide / _export trio with single build_top_loci() returning one 22-column table.
  • postprocess_finemapping_fits() runs one method for-loop and rbinds per-method rows; format_finemapping_output() exposes only top_loci.
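The single-loop, rbind-per-method pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `build_rows()` is a hypothetical stand-in for `build_top_loci()`, the toy `fits` list and its values are made up, and only three of the 22 columns are shown.

```r
# Hypothetical stand-in for build_top_loci(): one row per retained variant.
build_rows <- function(method, fit) {
  data.frame(
    variant = fit$variant,
    pip     = fit$pip,
    method  = method,
    stringsAsFactors = FALSE
  )
}

# Toy fits keyed by method name (illustrative values, not real results).
fits <- list(
  susie     = list(variant = c("chr1:100:A:G", "chr1:200:C:T"), pip = c(0.91, 0.40)),
  susie_inf = list(variant = "chr1:100:A:G", pip = 0.88)
)

# One loop over methods, then a single rbind into one table.
per_method <- lapply(names(fits), function(m) build_rows(m, fits[[m]]))
top_loci <- do.call(rbind, per_method)
print(top_loci)
```

The same variant appears once per method, which matches the row-uniqueness convention stated later in this description.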

Minor

  • build_top_loci() keeps the old long-builder input shape; data_x / data_y / other_quantities stay default-NULL.
  • other_quantities reserves region and condition_id (alongside dropped_samples).
  • FineMappingResult S4 slot keeps legacy shape via back-compat alias inside susie_wrapper.R; AllClasses.R / AllMethods.R / vcf_writer.R unchanged.
  • Adds hard gating tests: 22-col order + dtypes, <method>_<idx> CS strings, <method>_0 PIP-only, per-method independent CS numbering, cs_95_purity = 0 rule, overlapping CS within/across methods, single top_loci field, removed-trio absence.

Final top_loci (22 cols)

| Group | Columns |
| --- | --- |
| variant | #chr, start, end, a1, a2, variant |
| phenotype | gene, event |
| marginal | n, maf, beta, se |
| fine-mapping | pip, posterior_effect_mean, posterior_effect_se, cs_95, cs_70, cs_50, cs_95_purity |
| run-level | method, grange_start, grange_end |
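A column-order check in the spirit of the gating tests might look like the sketch below. The column names and their order are taken from the list above; the helper name `has_top_loci_columns()` and the empty placeholder table are assumptions for illustration, not the package's test code.

```r
# The 22 columns, in the order listed in this description.
expected_cols <- c(
  "#chr", "start", "end", "a1", "a2", "variant",
  "gene", "event",
  "n", "maf", "beta", "se",
  "pip", "posterior_effect_mean", "posterior_effect_se",
  "cs_95", "cs_70", "cs_50", "cs_95_purity",
  "method", "grange_start", "grange_end"
)

# Hypothetical helper: TRUE only if names and order match exactly.
has_top_loci_columns <- function(top_loci) {
  identical(names(top_loci), expected_cols)
}

# Empty placeholder table carrying the expected header.
empty <- data.frame(matrix(nrow = 0, ncol = length(expected_cols)))
names(empty) <- expected_cols

ok      <- has_top_loci_columns(empty)
shuffled <- has_top_loci_columns(empty[, rev(expected_cols)])
```

Here `ok` is `TRUE` and `shuffled` is `FALSE`, since `identical()` is sensitive to column order as well as names.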

Conventions

  • cs_95 / cs_70 / cs_50 are character "<method>_<idx>" (e.g. "susie_1", "susie_inf_2"); PIP-only retained variants use "<method>_0". Each method numbers CS independently from 1.
  • cs_95_purity = 0.95-coverage purity for the row's (method, cs_95); "<method>_0" rows carry 0.
  • Row uniqueness: (variant, gene, method, cs_membership) — overlapping CS within a method and the same variant across methods both produce separate rows.
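The labeling and purity conventions above can be sketched in a few lines. This is a minimal illustration, not the package code: `cs_label()` and `cs_purity()` are hypothetical helper names, encoding "not in any credible set" as `NA` is an assumption, and the purity values are made up.

```r
# Build "<method>_<idx>" labels; index 0 marks PIP-only variants.
# Assumption: a variant outside every credible set has cs_index = NA.
cs_label <- function(method, cs_index) {
  idx <- ifelse(is.na(cs_index), 0L, as.integer(cs_index))
  paste0(method, "_", idx)
}

# Look up purity by label; labels without an entry ("<method>_0") get 0.
cs_purity <- function(label, purity_by_label) {
  out <- unname(purity_by_label[label])
  out[is.na(out)] <- 0
  out
}

labels <- cs_label(c("susie", "susie", "susie_inf"), c(1, NA, 2))
purity <- cs_purity(labels, c(susie_1 = 0.93, susie_inf_2 = 0.87))
```

With these inputs, `labels` is `c("susie_1", "susie_0", "susie_inf_2")` and `purity` is `c(0.93, 0, 0.87)`. Because each method carries its own prefix, independent numbering from 1 per method cannot produce colliding labels.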

SuSiE-Inf as the option for univariate pipeline

Univariate fine-mapping can now bypass SuSiE-inf initialization when ordinary SuSiE fitting is explicitly requested. The SuSiE-inf path remains the default for compatibility, and TWAS-weight generation continues to require the existing SuSiE-inf workflow.

xueweic added 3 commits May 29, 2026 22:49
This update adds two fine-mapping output improvements while preserving existing default behavior.

First, univariate fine-mapping can now bypass SuSiE-inf initialization when ordinary SuSiE fitting is explicitly requested. The SuSiE-inf path remains the default for compatibility, and TWAS-weight generation continues to require the existing SuSiE-inf workflow.

Second, SuSiE wrapper output now includes a unified top-loci export table alongside the existing wrapper-facing summaries. The export keeps posterior effect summaries separate from marginal association fields, preserves credible-set membership including overlapping credible sets, and leaves existing top_loci and top_loci_long contracts unchanged.

Focused regression tests cover the direct SuSiE path and the unified top-loci export contract.
Unify compact top-loci export at one wrapper boundary
Major:
- build_top_loci_long now emits the full annotated long: posterior
  conditional_effect / conditional_effect_se, cs_purity, per-fit n /
  variant_number / gene_id, and caller-supplied region / event_ID.
- Adds build_top_loci_export(long): pure projection of the annotated
  long to the fixed 21-column compact export schema; stops on missing
  required columns rather than silently filling NA.
Minor:
- build_top_loci_long gains 3 default-NULL params (data_x, data_y,
  other_quantities); existing callers omitting them are unchanged.
- .postprocess_finemapping_fit_common passes those through unchanged.
- other_quantities reserves two subfield names for the unified export:
  region, condition_id (alongside the existing dropped_samples use).
- .empty_top_loci_long extended with the new columns so empty and
  populated long share one schema.
- Adds focused unit tests covering the new columns, the per-fit
  constants, the export helper schema, and the missing-column error.
Loader, univariate_analysis_pipeline, and format_finemapping_output
are unchanged. transMap-side adoption is out of scope of this commit.
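The "stop on missing required columns rather than silently filling NA" behavior described in this commit message can be sketched as a pure projection. The function name `project_export()`, the error wording, and the two-column `required` vector are assumptions for illustration; the actual 21-column schema is not reproduced here.

```r
# Hypothetical projection helper: select required columns or fail loudly.
project_export <- function(long, required) {
  missing_cols <- setdiff(required, names(long))
  if (length(missing_cols) > 0) {
    stop("missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  long[, required, drop = FALSE]
}

# Toy annotated table (illustrative values only).
long <- data.frame(
  variant = "chr1:100:A:G", pip = 0.91, extra = 1,
  stringsAsFactors = FALSE
)

out <- project_export(long, c("variant", "pip"))
err <- tryCatch(
  project_export(long, c("variant", "cs_purity")),
  error = function(e) conditionMessage(e)
)
```

Failing early keeps a schema drift in the upstream table from turning into an export column that is entirely `NA`.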
Comment thread R/univariate_pipeline.R Outdated
coverage = c(0.95, 0.7, 0.5),
min_abs_corr = 0.8,
finemapping_extra_opts = list(refine = TRUE),
fit_susie_inf = TRUE,

Could you rename this to add_susie_inf? Otherwise users might think we can only fit one model at a time rather than both.

@gaow gaow merged commit d3af845 into StatFunGen:main May 31, 2026
5 checks passed