Reformatting: Annotate subworkflow, OLGA module, Patient subworkflow by dltamayo · Pull Request #75 · KarchinLab/TCRtoolkit

dltamayo · 2026-03-23T17:46:21Z

No description provided.

- parallelize concatenation of files to avoid loading all of them in memory; might remove RESOLVE_SAMPLESHEET
- improve calculation of log10_pgen, joining of sample repertoires + olga pgen values
- simplify calculation of min/max olga log10_pgen values
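As a rough illustration of the first point, concatenating per-sample tables without holding them all in memory can be sketched as below. This is not the PR's implementation (the file overview describes the pipeline as using Nextflow's `collectFile`); the function name `concat_tables` and the assumption that every input shares the same one-line header are hypothetical.

```python
def _with_newline(line):
    """Return the line terminated by a newline, so rows from adjacent files never fuse."""
    return line if line.endswith(b"\n") else line + b"\n"


def concat_tables(inputs, output):
    """Concatenate delimited text files that share a header, one line at a time.

    Hypothetical helper: only the header of the first file is written, and
    each input is streamed, so memory use does not grow with input size.
    """
    with open(output, "wb") as out:
        for index, path in enumerate(inputs):
            with open(path, "rb") as src:
                header = src.readline()
                if index == 0 and header:
                    out.write(_with_newline(header))
                for line in src:
                    out.write(_with_newline(line))
```

Because each file is read line by line, peak memory stays near the size of the longest row rather than the size of the full dataset.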

- Add patient workflow to run on samples grouped by patient
- Move modules such as compare_calc, giana, gliph to patient to reduce scope of algorithms for large datasets
- Refactored compare/patient calc to vectorize operations

Further cleaning up of code and updating documentation will be required.
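To make the patient-level scoping concrete, here is a minimal sketch of grouping samples by patient and computing a per-patient overlap matrix. It uses plain Python sets rather than the vectorized operations this PR describes, and the overlap metric (count of shared unique CDR3 sequences) is an assumption; the actual metric in `bin/patient_calc.py` is not shown here. The function names and the `(meta, cdr3s)` input shape are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def group_by_patient(samples):
    """Map patient id -> {sample id: set of CDR3 sequences}.

    `samples` is an iterable of (meta, cdr3s) pairs, where meta is a dict
    with 'patient' and 'sample' keys (an assumed shape for illustration).
    """
    grouped = defaultdict(dict)
    for meta, cdr3s in samples:
        grouped[meta["patient"]][meta["sample"]] = set(cdr3s)
    return dict(grouped)


def overlap_matrix(sample_sets):
    """Return a symmetric nested dict of shared unique CDR3 counts.

    The diagonal holds each sample's own number of unique sequences.
    """
    names = sorted(sample_sets)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a in names:
        matrix[a][a] = len(sample_sets[a])
    for a, b in combinations(names, 2):
        shared = len(sample_sets[a] & sample_sets[b])
        matrix[a][b] = matrix[b][a] = shared
    return matrix
```

Restricting the pairwise comparison to samples within one patient keeps the number of pairs small even when the full cohort is large, which is the scope reduction the description refers to.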

github-actions · 2026-03-23T17:49:51Z

Unit Test Results

10 tests 10 ✅ 2m 49s ⏱️
2 suites 0 💤
1 files 0 ❌

Results for commit 85cd394.

♻️ This comment has been updated with latest results.

Copilot

Pull request overview

This PR adds a new patient-level analysis stage to the Nextflow pipeline and refactors the annotate → OLGA plumbing so downstream stages (sample/compare/patient) can reuse shared intermediate artifacts.

Changes:

Introduce a patient workflow level (schema + Cirro form + new PATIENT subworkflow and patient modules/scripts).
Refactor ANNOTATE to produce per-sample processed CDR3 files and a concatenated/sorted CDR3 table, and to emit OLGA-derived stats for reuse downstream.
Rework OLGA sample merge + histogram inputs to use shared OLGA stats and simplify sample-level OLGA steps.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 10 comments.

File	Description
workflows/tcrtoolkit.nf	Adds `patient` stage wiring and guards; updates call graph (ANNOTATE/SAMPLE/PATIENT/COMPARE).
subworkflows/local/annotate.nf	Refactors annotate subworkflow to use `ANNOTATE_PROCESS`, concatenation via `collectFile`, and emits `olga_stats`.
subworkflows/local/sample.nf	Updates SAMPLE inputs and OLGA calls to use shared `olga_stats`; removes deprecated OLGA max-writing.
subworkflows/local/patient.nf	New patient-level subworkflow: group samples by `meta.patient`, concatenate, run patient metrics + clustering tools.
subworkflows/local/compare.nf	Simplifies compare stage to TCR sharing + OLGA merge only (removes GIANA/GLIPH2 here).
modules/local/annotate/main.nf	Adds `ANNOTATE_PROCESS` module to standardize per-sample CDR3 extraction.
modules/local/olga/main.nf	Adds log10 computation earlier, emits OLGA stats, refactors sample merge to a streaming join, and updates histogram calc inputs.
modules/local/patient/main.nf	New processes for patient concatenation and patient overlap matrix calculation.
bin/patient_calc.py	New vectorized patient overlap calculator writing per-patient matrices.
modules/local/compare/gliph2.nf	Makes GLIPH2 outputs patient-scoped (input tuple includes patient).
modules/local/compare/giana.nf	Makes GIANA outputs patient-scoped and changes logging/output set.
nextflow_schema.json	Extends `workflow_level` validation to allow `patient`.
nextflow.config	Changes default `olga_chunk_length`.
modules/local/sample/sample_calc.nf	Removes the `stub:` block for `SAMPLE_CALC`.
.cirro/process-input.json	Wires new `patient_lvl` form value into params.
.cirro/process-form.json	Adds patient toggle; updates `olga_chunk_length` type/default.
.cirro/preprocess.py	Adds patient flag into workflow_level construction.
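For readers unfamiliar with the "streaming join" mentioned for `modules/local/olga/main.nf`, the general idea can be sketched as a single-pass merge of two key-sorted inputs. This is an illustrative sketch only, not the module's code: the function names, the `(cdr3, value)` row shape, and the choice to return `None` for a zero pgen are all assumptions.

```python
import math


def log10_pgen(pgen):
    """Return log10 of a generation probability, or None when pgen is not positive."""
    return math.log10(pgen) if pgen > 0 else None


def merge_join(sample_rows, pgen_rows):
    """Join two iterables of (cdr3, value) pairs, both sorted ascending by cdr3.

    Yields (cdr3, count, log10_pgen) for keys present in both inputs.
    Keys are assumed unique on the pgen side; the sample side may repeat keys.
    Each input is consumed once, so neither table is held fully in memory.
    """
    pgen_iter = iter(pgen_rows)
    current = next(pgen_iter, None)
    for cdr3, count in sample_rows:
        # Advance the pgen side until it reaches or passes the current key.
        while current is not None and current[0] < cdr3:
            current = next(pgen_iter, None)
        if current is not None and current[0] == cdr3:
            yield cdr3, count, log10_pgen(current[1])
```

This only works if both inputs are sorted on the join key, which is consistent with ANNOTATE producing a concatenated and sorted CDR3 table as described above.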

Copilot · 2026-03-23T17:56:02Z

+    // Running patient analysis
+    if (levels.contains('patient')) {
+        PATIENT( ANNOTATE.out.processed_samples )
+    }


New patient workflow level is introduced (new subworkflow + new modules and significant OLGA plumbing changes), but the nf-test suite currently only asserts a minimal sample/compare run succeeds. Consider adding at least one nf-test case that runs with workflow_level=patient (and optionally sample,patient) to catch the ANNOTATE dependency and verify expected patient outputs are produced.

Will add nf-test later

Copilot · 2026-03-23T17:56:03Z

@@ -17,12 +17,4 @@ process SAMPLE_CALC {
    """
    sample_calc.py -s '${sample_meta.sample}' -c ${count_table}
    """


This repo uses stub: blocks in several modules (e.g. modules/local/sample/sample_plot.nf, modules/local/compare/compare_plot.nf). Removing the stub: section from SAMPLE_CALC makes -stub-run / lightweight test runs inconsistent and may break workflows that rely on stubbing heavy steps. Consider restoring a minimal stub that creates the expected output files for this process.

Suggested change

-    """
+    """
+    stub:
+    """
+    touch sample_stats_${sample_meta.sample}.csv
+    touch v_family_${sample_meta.sample}.csv
+    touch d_family_${sample_meta.sample}.csv
+    touch j_family_${sample_meta.sample}.csv
+    """

Will introduce stubs later

dltamayo added 8 commits February 6, 2026 11:00

Remove stub

ca76cf3

Remove nf-schema plugin version

513cdd7

Make Annotate run as part of sample/compare

1a6284d

Debug Cirro form

5faba54

Decrease olga chunk size

e8c69fa

Re-add nf-schema version

2f30cbe

dimalvovs requested a review from Copilot March 23, 2026 17:49

Copilot started reviewing on behalf of dimalvovs March 23, 2026 17:50

Copilot AI reviewed Mar 23, 2026

Address Copilot feedback

85cd394

dltamayo merged commit 2eea4a9 into main Mar 27, 2026
3 checks passed

dltamayo merged 9 commits into main from dltamayo-dev

2 participants
