Refactor: consolidate BIDS, add orchestration layer, extract longitudinal #267

Merged
nx10 merged 8 commits into main from refactor/bids-consolidation on Apr 6, 2026

@nx10 nx10 commented Apr 3, 2026

Major architecture refactor that introduces clean separation of concerns across four layers:

  1. bids/ - BIDS naming contracts (discover, resolve, export per workflow)
  2. orchestration/ - Pipeline loops (filter, iterate, discover -> process -> export)
  3. workflows/ - Processing step chains (computation only)
  4. cli/ - Arg parsing only (delegates to orchestration)

What changed

New rbc.bids/ package (moved from core/bids/ + core/bids2table.py + cli/query.py):

  • builder.py - Bids class
  • _schema.py - auto-generated entity validation
  • query.py - bids2table wrappers
  • session.py - session loading, iteration, grouping constants
  • anatomical.py - discover + export
  • functional.py - discover + resolve + export, with FunctionalInputs TypedDict
  • metrics.py - resolve + export, with MetricsInputs TypedDict
  • qc.py - resolve + export, with QCInputs TypedDict
  • longitudinal.py - resolve + export for longitudinal workflows
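As a rough illustration of the "resolve returns a TypedDict" pattern listed above, the sketch below shows how a typed mapping of resolved paths might look. Only the name `QCInputs` comes from this PR; the keys, the `resolve_qc_sketch` helper, and the file name patterns are hypothetical and chosen purely for illustration.

```python
from pathlib import Path
from typing import TypedDict


class QCInputs(TypedDict):
    """Illustrative keys only; the real QCInputs fields are not shown in this PR."""

    bold: Path
    motion: Path


def resolve_qc_sketch(derivatives_dir: Path, stem: str) -> QCInputs:
    """Build the paths a QC step would read (hypothetical naming scheme)."""
    return {
        "bold": derivatives_dir / f"{stem}_bold.nii.gz",
        "motion": derivatives_dir / f"{stem}_motion.txt",
    }


inputs = resolve_qc_sketch(Path("/out"), "sub-01_task-rest")
# A type checker flags inputs["mtoion"]; a plain dict[str, Path] would not.
print(inputs["motion"])
```

The benefit over a plain `dict` is that misspelled or missing keys are caught by static analysis rather than at runtime.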

New rbc.orchestration/ package:

  • __init__.py - Filters dataclass (participant/session/task filters)
  • anatomical.py - run() + process_session()
  • functional.py - run() + process_session() (returns outputs for downstream use)
  • metrics.py - run() + process_run()
  • qc.py - run()
  • all.py - run() composing per-session stages with in-memory output passing
  • longitudinal.py - run() + process_anat() + process_func()
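The filter, iterate, process, export loop that these modules own could be sketched as follows. `Filters`, `run()`, and `process_session()` are names from this PR; the field names, the `Session` tuple, and the callables passed in are assumptions made for a self-contained example, not the actual rbc signatures.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, NamedTuple, Optional


class Session(NamedTuple):
    """Hypothetical stand-in for one discovered participant/session pair."""

    participant: str
    session: str


@dataclass(frozen=True)
class Filters:
    """Optional include-lists; None keeps everything (field names are assumed)."""

    participants: Optional[FrozenSet[str]] = None
    sessions: Optional[FrozenSet[str]] = None

    def keeps(self, item: Session) -> bool:
        if self.participants is not None and item.participant not in self.participants:
            return False
        if self.sessions is not None and item.session not in self.sessions:
            return False
        return True


def process_session(item: Session, compute: Callable[[Session], dict]) -> dict:
    """Run the computation-only workflow step for one session."""
    return compute(item)


def run(
    discovered: Iterable[Session],
    filters: Filters,
    compute: Callable[[Session], dict],
    export: Callable[[Session, dict], None],
) -> int:
    """Filter, process, and export each session; return how many were handled."""
    handled = 0
    for item in discovered:
        if not filters.keeps(item):
            continue
        outputs = process_session(item, compute)
        export(item, outputs)
        handled += 1
    return handled
```

Because `process_session()` returns its outputs rather than only writing them to disk, a composing module such as `all.py` can pass them straight to the next stage in memory.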

Simplified CLI modules: Each CLI is now ~40-60 lines (Args dataclass + main() that calls orchestration.run() + register_command()). Zero BIDS or workflow logic.
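A thin CLI module of this shape might look like the sketch below. `Args`, `main()`, and `register_command()` are the pieces named above; the program name, the flags, and the `orchestration_run` stub are assumptions made so the example runs on its own.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Args:
    """Parsed CLI arguments (field names are illustrative)."""

    bids_dir: str
    participant: Optional[List[str]] = None


def orchestration_run(bids_dir: str, participants: Optional[List[str]]) -> None:
    """Hypothetical stand-in for an orchestration run() function."""
    print(f"processing {bids_dir} for {participants or 'all participants'}")


def main(args: Args) -> int:
    """No BIDS or workflow logic here: everything is handed to orchestration."""
    orchestration_run(args.bids_dir, args.participant)
    return 0


def register_command(subparsers) -> None:
    """Attach this command's arguments and handler to the top-level parser."""
    parser = subparsers.add_parser("anatomical", help="Run the anatomical pipeline")
    parser.add_argument("bids_dir")
    parser.add_argument("--participant", nargs="+", default=None)
    parser.set_defaults(handler=lambda ns: main(Args(ns.bids_dir, ns.participant)))


def build_parser() -> argparse.ArgumentParser:
    """Assemble a top-level parser with one registered subcommand."""
    parser = argparse.ArgumentParser(prog="rbc")  # program name is assumed
    register_command(parser.add_subparsers(dest="command", required=True))
    return parser
```

Keeping the handler a one-line delegation is what lets the CLI tests patch the orchestration module instead of mocking BIDS or workflow internals.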

Deleted

  • src/rbc/core/bids/ (entire package)
  • src/rbc/core/bids2table.py
  • src/rbc/cli/query.py

Bugs fixed

  • Atlas names with underscores (schaefer_200) caused BIDS validation errors
  • Regressor names with hyphens (36-parameter) caused the same issue
  • QC resolve used .mat extension but functional export writes .txt
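The first two bugs stem from BIDS entity labels being restricted to letters and digits. The commit message further down names a `bids_safe_label()` helper as the fix; its implementation is not shown in this PR, so the version below is only one plausible sketch that strips every non-alphanumeric character.

```python
import re

# BIDS entity labels may contain only letters and digits.
_NON_ALNUM = re.compile(r"[^0-9A-Za-z]")


def bids_safe_label(label: str) -> str:
    """Drop characters that are not valid in a BIDS label (assumed behavior)."""
    cleaned = _NON_ALNUM.sub("", label)
    if not cleaned:
        raise ValueError(f"label {label!r} has no alphanumeric characters")
    return cleaned


print(bids_safe_label("schaefer_200"))   # schaefer200
print(bids_safe_label("36-parameter"))   # 36parameter
```

Stripping is lossy (two distinct names can collapse to the same label), so the real helper may well use a different mapping.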

Tests

  • Moved: test_bids.py, test_bids2table.py, test_query.py to tests/unit/bids/
  • New: tests/unit/bids/test_exports.py (11 tests with real Bids instances)
  • Updated: all CLI test mock patches target orchestration modules

Closes #259, closes #266.

nx10 added 5 commits April 3, 2026 15:13
Move all BIDS-related code from scattered locations into a cohesive
rbc.bids/ package and extract per-workflow export/resolve functions
from CLI modules.

Structural changes:
- core/bids/ -> bids/builder.py (Bids class), bids/_schema.py (auto-generated)
- core/bids2table.py -> bids/query.py (load_table, find_file, etc.)
- New bids/anatomical.py, functional.py, metrics.py, qc.py with
  export_*() and resolve_*() functions extracted from CLI modules
- Tests moved: test_bids.py, test_bids2table.py -> tests/unit/bids/
- New tests/unit/bids/test_exports.py with real Bids instance tests

Bug fixes caught during refactor:
- Atlas names with underscores (schaefer_200) caused BIDS validation
  errors; now sanitized via bids_safe_label() inside export functions
- Regressor names with hyphens (36-parameter) had the same issue
- QC resolve used .mat extension but functional export writes .txt

The CLI modules are now thin orchestration layers that call shared
export/resolve functions instead of duplicating BIDS naming logic.

Unifies the pattern across all resolve functions: resolve_qc already
returned QCInputs (TypedDict), now resolve_functional returns
FunctionalInputs and resolve_metrics returns MetricsInputs. This gives
callers proper type safety when accessing resolved paths.

Part of #259.

Moves SessionTables, load_session, iter_session_files, and the BIDS
grouping constants (SUB_SES_QUERY, ANAT_GROUP_ENTITIES,
FUNC_GROUP_ENTITIES) from cli/ into bids/session.py so that cli/
exclusively handles CLI concerns.

Constants are renamed to drop the leading underscore since they are
now public API in the bids package.

Closes #259.

Adds discover_anatomical(), discover_functional(), and
discover_derivative_runs() to the bids package, removing BIDS-specific
iteration logic (entity extraction, row-to-path conversion, DataFrame
filtering/grouping) from CLI modules.

CLI modules now call discovery functions and receive structured
NamedTuples (AnatomicalRun, FunctionalRun, DerivativeRun) instead
of manually iterating DataFrames.
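A minimal sketch of the discovery pattern described in this commit message, using a list of dicts in place of the DataFrame so it runs without extra dependencies. `FunctionalRun` and `discover_functional()` are names from the PR; the fields, the row layout, and the sort order are assumptions.

```python
from typing import Dict, Iterator, List, NamedTuple


class FunctionalRun(NamedTuple):
    """Hypothetical fields; the real FunctionalRun may carry more entities."""

    task: str
    run: str
    path: str


def discover_functional(rows: List[Dict[str, str]]) -> Iterator[FunctionalRun]:
    """Yield one FunctionalRun per BOLD row, ordered by (task, run)."""
    bold_rows = [row for row in rows if row.get("suffix") == "bold"]
    for row in sorted(bold_rows, key=lambda r: (r["task"], r.get("run", ""))):
        yield FunctionalRun(task=row["task"], run=row.get("run", ""), path=row["path"])
```

Callers then write `for run in discover_functional(rows): ...` and access `run.task` or `run.path` by name, with no entity extraction or row-to-path conversion at the call site.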

github-actions Bot commented Apr 3, 2026

Coverage

Coverage Report
| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| rbc | | | | |
| __init__.py | 1 | 0 | 100% | |
| context.py | 25 | 8 | 68% | 70, 75–77, 79–80, 93–94 |
| metadata.py | 56 | 0 | 100% | |
| rbc/bids | | | | |
| __init__.py | 9 | 0 | 100% | |
| _schema.py | 585 | 4 | 99% | 776, 782, 790, 1101 |
| anatomical.py | 22 | 0 | 100% | |
| builder.py | 72 | 7 | 90% | 233–235, 362, 364–365, 386 |
| functional.py | 42 | 4 | 90% | 46–49 |
| longitudinal.py | 25 | 0 | 100% | |
| metrics.py | 23 | 0 | 100% | |
| qc.py | 17 | 0 | 100% | |
| query.py | 68 | 42 | 38% | 100–104, 118–122, 124–125, 127–132, 134, 136, 150, 156–162, 195, 204, 206–213, 222, 254, 263–264 |
| session.py | 47 | 0 | 100% | |
| rbc/cli | | | | |
| __init__.py | 1 | 0 | 100% | |
| all.py | 35 | 0 | 100% | |
| anatomical.py | 17 | 0 | 100% | |
| base.py | 37 | 1 | 97% | 51 |
| functional.py | 25 | 0 | 100% | |
| longitudinal.py | 23 | 0 | 100% | |
| main.py | 43 | 0 | 100% | |
| metrics.py | 32 | 0 | 100% | |
| qc.py | 25 | 0 | 100% | |
| rbc/core | | | | |
| __init__.py | 3 | 0 | 100% | |
| common.py | 21 | 8 | 61% | 42–44, 58, 61–64 |
| fileops.py | 27 | 4 | 85% | 69–72 |
| fsl2itk.py | 42 | 0 | 100% | |
| nifti.py | 192 | 5 | 97% | 236–237, 244–245, 524 |
| niwrap.py | 56 | 1 | 98% | 58 |
| rbc/core/anatomical | | | | |
| __init__.py | 4 | 0 | 100% | |
| registration.py | 14 | 4 | 71% | 45, 137, 152, 167 |
| segmentation.py | 24 | 8 | 66% | 61, 71–73, 89, 111, 122, 138 |
| rbc/core/functional | | | | |
| __init__.py | 13 | 0 | 100% | |
| coregistration.py | 7 | 2 | 71% | 44, 55 |
| despiking.py | 7 | 3 | 57% | 32, 36–37 |
| distortion.py | 130 | 40 | 69% | 269–271, 321, 324, 332–335, 341–346, 349, 352–353, 356, 365–369, 375–376, 387–388, 391, 397, 443–444, 447, 455, 461, 470–471, 474, 484, 491 |
| erosion.py | 32 | 1 | 96% | 50 |
| initialization.py | 9 | 4 | 55% | 35, 42–43, 63 |
| masking.py | 34 | 25 | 26% | 53, 55–56, 58–59, 62–65, 69, 91, 134, 183, 197, 208, 223, 233, 249, 258, 271, 285, 296, 306, 319, 328 |
| motion.py | 57 | 37 | 35% | 62, 64–67, 69, 71, 73, 76–77, 83–84, 86–87, 95–97, 99, 102, 105, 107, 124–125, 135–138, 159, 169, 171–172, 175, 177–179, 181, 183 |
| nuisance.py | 81 | 60 | 25% | 78, 80–85, 87, 89–90, 93, 96–98, 100–102, 104–110, 112–116, 118, 163, 165, 167, 170–171, 173–175, 178, 181–182, 185–187, 190–193, 197, 203–205, 207, 235, 243, 269, 278, 307, 316, 322 |
| regressors.py | 89 | 6 | 93% | 163, 193, 325–328 |
| resampling.py | 54 | 43 | 20% | 37–42, 74–76, 78, 80–81, 85–87, 90, 93, 105, 107–108, 112, 114, 150–152, 154–155, 159, 161–162, 167–169, 173–174, 177, 180, 187, 199, 201–202, 207, 209 |
| timing.py | 16 | 11 | 31% | 46–47, 49–53, 58, 60, 66–67 |
| rbc/core/longitudinal | | | | |
| __init__.py | 1 | 0 | 100% | |
| transform.py | 46 | 7 | 84% | 106–107, 165–168, 170 |
| rbc/core/metrics | | | | |
| __init__.py | 3 | 0 | 100% | |
| alff.py | 90 | 1 | 98% | 265 |
| reho.py | 66 | 0 | 100% | |
| smoothing.py | 7 | 3 | 57% | 36, 42–43 |
| standardization.py | 27 | 11 | 59% | 64, 66–68, 70, 72–75, 77–78 |
| timeseries.py | 57 | 1 | 98% | 120 |
| rbc/core/qc | | | | |
| __init__.py | 6 | 0 | 100% | |
| dvars.py | 26 | 0 | 100% | |
| motion.py | 31 | 0 | 100% | |
| registration.py | 41 | 0 | 100% | |
| xcp.py | 41 | 0 | 100% | |
| rbc/orchestration | | | | |
| __init__.py | 19 | 2 | 89% | 64, 67 |
| all.py | 47 | 0 | 100% | |
| anatomical.py | 41 | 2 | 95% | 47–48 |
| functional.py | 45 | 0 | 100% | |
| longitudinal.py | 60 | 1 | 98% | 146 |
| metrics.py | 54 | 5 | 90% | 36, 39, 68, 76–77 |
| qc.py | 41 | 0 | 100% | |
| rbc/workflows | | | | |
| __init__.py | 10 | 0 | 100% | |
| anatomical.py | 45 | 19 | 57% | 76–85, 87, 154–157, 159–162 |
| functional.py | 98 | 56 | 42% | 113–114, 122, 127–128, 136, 201–202, 205–206, 209–210, 213, 216–219, 229–231, 241–242, 245, 248, 251–252, 260–261, 268–269, 276–277, 284–285, 288–290, 293–296, 308–309, 321, 325–328, 331–332, 341–342, 350, 357, 426, 432 |
| metrics.py | 40 | 17 | 57% | 83–84, 89–90, 93–96, 99–102, 106–109, 111 |
| qc.py | 51 | 31 | 39% | 85, 87–88, 91, 94, 97–101, 104, 113–115, 119–122, 124–127, 129–130, 132–134, 137, 154, 161, 163 |
| rbc_resources | | | | |
| __init__.py | 30 | 0 | 100% | |
| TOTAL | 3165 | 484 | 84% | |

Tests Skipped Failures Errors Time
763 0 💤 0 ❌ 0 🔥 11.812s ⏱️
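For readers checking the coverage report above, the Cover column can be reproduced from the statement and miss counts. The figures in the report look rounded down rather than to the nearest integer (for example, common.py has 13 of 21 statements covered, about 61.9%, and is shown as 61%); that rounding rule is inferred from the numbers, not documented in this PR.

```python
def coverage_percent(statements: int, missed: int) -> int:
    """Whole-number coverage, rounded down (rounding rule inferred from the report)."""
    if statements == 0:
        return 100
    return (100 * (statements - missed)) // statements


print(coverage_percent(3165, 484))  # TOTAL row: 84
print(coverage_percent(68, 42))     # rbc/bids/query.py: 38
print(coverage_percent(21, 8))      # rbc/core/common.py: 61
```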

Creates src/rbc/orchestration/ with per-workflow run() functions that
own the full pipeline loop: filtering, sub/ses iteration, discovery,
processing, and export. CLI modules are now thin arg-parsing wrappers
that delegate to orchestration.run().

New modules:
- orchestration/__init__.py: Filters dataclass
- orchestration/anatomical.py: run() + process_session()
- orchestration/functional.py: run() + process_session()
- orchestration/metrics.py: run() + process_run()
- orchestration/qc.py: run()
- orchestration/all.py: run() composing per-session stages
- orchestration/longitudinal.py: run() + process_anat() + process_func()
- bids/longitudinal.py: resolve + export for longitudinal workflow

CLI modules now contain only: Args dataclass, main() (setup runner +
call orchestration.run()), and register_command().

Closes #266.
@nx10 nx10 changed the title from "Consolidate BIDS code and extract discovery/export/resolve from CLIs" to "Refactor: consolidate BIDS, add orchestration layer, extract longitudinal" Apr 3, 2026
Orchestration run() functions now handle runner setup (init_runner)
and workflow start/complete logging. CLI modules are reduced to pure
arg parsing: construct Filters + RunnerConfig, call run(), return 0.

Introduces RunnerConfig dataclass and init_runner() in orchestration.
Moves _DEFAULT_ENV_VARS from cli/__init__.py to orchestration/.
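The division of labor in this commit (CLI builds a config, orchestration sets up the runner and logs start and completion) might be sketched as below. `RunnerConfig` and `init_runner()` are names from the commit message; the fields, the use of the standard `logging` module as the "runner", and the logger name are assumptions, since the real runner setup is not shown here.

```python
import logging
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RunnerConfig:
    """Hypothetical fields; only the class name comes from the PR."""

    verbose: bool = False
    env: Dict[str, str] = field(default_factory=dict)


def init_runner(config: RunnerConfig) -> logging.Logger:
    """Stand-in for runner setup: configure a logger from the config."""
    logger = logging.getLogger("rbc.sketch")  # logger name is assumed
    logger.setLevel(logging.DEBUG if config.verbose else logging.INFO)
    return logger


def run(config: RunnerConfig, sessions: List[str]) -> int:
    """Own setup and start/complete logging so the CLI does not have to."""
    logger = init_runner(config)
    logger.info("workflow started")
    for session in sessions:
        logger.debug("processing %s", session)
    logger.info("workflow complete")
    return len(sessions)
```

With this split, a CLI `main()` reduces to constructing the config, calling `run()`, and returning 0.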
@nx10 nx10 added the refactor label (Alterations of code that do not affect function) Apr 3, 2026
@nx10 nx10 requested review from jpillai00 and kaitj April 3, 2026 21:34

@kaitj kaitj left a comment

A few questions, one small documentation change, and a few notes to revisit; otherwise this largely looks good.

Comment thread scripts/generate_bids_tools.py Outdated
Comment on lines +70 to +89
aex.save(
    _require_file(outputs.brain_mask, "brain_mask"),
    suffix=Suffix.MASK,
    desc="T1w",
)
aex.save(
    _require_file(outputs.csf_mask, "csf_mask"),
    suffix=Suffix.MASK,
    desc="csf",
)
aex.save(
    _require_file(outputs.gm_mask, "gm_mask"),
    suffix=Suffix.MASK,
    desc="gm",
)
aex.save(
    _require_file(outputs.wm_mask, "wm_mask"),
    suffix=Suffix.MASK,
    desc="wm",
)

We should revisit whether these files are still needed. I think they were originally included because they were needed to compute regressors in template space (which I don't think is the case anymore).

func_q: Bids,
tpl_q: Bids,
func_df: pl.DataFrame,
tpl_df: pl.DataFrame,

The more I look at this, the more I wonder whether we should rename this variable to distinguish "standard" templates (e.g. MNI, OASIS) from the "longitudinal template", or would that be more confusing?

extension=".txt",
extra={"from": "bold", "to": "T1w", "mode": "image"},
),
"sbref": func_q.expect(func_df, suffix=Suffix.SBREF, without=["space"]),

Oh cool, first time I noticed the without argument 🚀
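For context on the `without` argument noted above: its exact semantics in `Bids.expect()` are not shown in this thread. One plausible reading is "match only files that do not carry these entities", which the standalone sketch below illustrates with plain dicts. The function, the row layout, and the error type are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def expect(
    rows: Iterable[Dict[str, Optional[str]]],
    suffix: str,
    without: Iterable[str] = (),
) -> str:
    """Return the single path with this suffix that lacks every entity in `without`."""
    excluded: List[str] = list(without)
    matches = [
        row
        for row in rows
        if row.get("suffix") == suffix
        and all(row.get(entity) is None for entity in excluded)
    ]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one {suffix} file, found {len(matches)}")
    return str(matches[0]["path"])


rows = [
    {"suffix": "sbref", "space": "MNI152", "path": "sbref_mni.nii.gz"},
    {"suffix": "sbref", "space": None, "path": "sbref_native.nii.gz"},
]
print(expect(rows, "sbref", without=["space"]))  # sbref_native.nii.gz
```

Under this reading, `without=["space"]` selects the native-space file even when a resampled copy with a `space` entity sits next to it.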

Comment thread src/rbc/bids/longitudinal.py Outdated
Comment thread src/rbc/orchestration/__init__.py Outdated
- Resolve merge conflicts (keep orchestration layer, discard old inline logic)
- Fix docstring paths in generate_bids_tools.py (#1)
- Remove redundant extension=Extension.NII_GZ in longitudinal export (#5)
- Clarify verbose docstring in RunnerConfig (#6)
@nx10 nx10 merged commit 8942e83 into main Apr 6, 2026
8 checks passed
@nx10 nx10 deleted the refactor/bids-consolidation branch April 6, 2026 19:09
@kaitj kaitj mentioned this pull request Apr 7, 2026
Successfully merging this pull request may close these issues.

Add orchestration layer between CLI and workflows
Move BIDS discovery/query logic from cli/query.py to bids/

2 participants