Reformatting: Annotate subworkflow, OLGA module, Patient subworkflow #75

Merged
dltamayo merged 9 commits into main from dltamayo-dev on Mar 27, 2026

@dltamayo
Collaborator

- Parallelize concatenation of files to avoid loading all of them into memory; RESOLVE_SAMPLESHEET might be removed
- Improve the calculation of log10_pgen and the joining of sample repertoires with OLGA pgen values
- Simplify the calculation of min/max OLGA log10_pgen values
- Add a patient workflow to run on samples grouped by patient
- Move modules such as compare_calc, giana, and gliph to the patient workflow to reduce the scope of the algorithms on large datasets
- Refactor compare/patient calc to vectorize operations

Further code cleanup and documentation updates will be required.
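
As an illustration of the log10_pgen and min/max items above, here is a minimal Python sketch that computes log10 values and their range in a single pass. It is not the code from this PR: the function name `log10_pgen_stats` and the choice to map non-positive pgen values to `None` are assumptions made for the example.

```python
import math


def log10_pgen_stats(pgens):
    """Return (log10 values, min, max) from one pass over pgen floats.

    Hypothetical helper: pgen values <= 0 have no logarithm, so they are
    stored as None and skipped when tracking min/max (an assumption of
    this sketch, not confirmed pipeline behaviour).
    """
    logs = []
    lowest, highest = math.inf, -math.inf
    seen_positive = False
    for pgen in pgens:
        if pgen > 0:
            value = math.log10(pgen)
            lowest = min(lowest, value)
            highest = max(highest, value)
            seen_positive = True
            logs.append(value)
        else:
            logs.append(None)
    if not seen_positive:
        # No usable values: report the range as undefined.
        return logs, None, None
    return logs, lowest, highest
```

Tracking the running minimum and maximum while the values are computed means the range can be emitted once and reused downstream without a second scan of the table.
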
@dimalvovs requested a review from Copilot on March 23, 2026, 17:49

github-actions Bot commented Mar 23, 2026

Unit Test Results

10 tests   10 ✅  2m 49s ⏱️
 2 suites   0 💤
 1 files     0 ❌

Results for commit 85cd394.

♻️ This comment has been updated with latest results.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new patient-level analysis stage to the Nextflow pipeline and refactors the annotate → OLGA plumbing so downstream stages (sample/compare/patient) can reuse shared intermediate artifacts.

Changes:

  • Introduce a patient workflow level (schema + Cirro form + new PATIENT subworkflow and patient modules/scripts).
  • Refactor ANNOTATE to produce per-sample processed CDR3 files and a concatenated/sorted CDR3 table, and to emit OLGA-derived stats for reuse downstream.
  • Rework OLGA sample merge + histogram inputs to use shared OLGA stats and simplify sample-level OLGA steps.
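
The grouping and concatenation idea behind the patient level can be sketched in Python as follows. In the PR itself this is handled in Nextflow, so this is only an analogy: `group_by_patient` and `iter_concat` are hypothetical names, and the sketch assumes every per-sample table starts with the same header line.

```python
from collections import defaultdict


def group_by_patient(samples):
    """Group (meta, path) pairs by meta['patient'], keeping input order."""
    groups = defaultdict(list)
    for meta, path in samples:
        groups[meta["patient"]].append(path)
    return dict(groups)


def _with_newline(line):
    """Make sure a line ends with a newline so rows never run together."""
    return line if line.endswith("\n") else line + "\n"


def iter_concat(sources):
    """Yield the lines of several tables in turn, keeping one header.

    Each source is any iterable of text lines (an open file works), so
    only one line is held in memory at a time. Assumes the first line of
    every non-empty source is an identical header.
    """
    header_written = False
    for source in sources:
        lines = iter(source)
        header = next(lines, None)
        if header is None:
            continue  # empty input contributes nothing
        if not header_written:
            yield _with_newline(header)
            header_written = True
        for line in lines:
            yield _with_newline(line)
```

To write a per-patient file, the open sample files for one patient can be passed as `sources` and the result streamed out with `out.writelines(iter_concat(files))`.
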

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| workflows/tcrtoolkit.nf | Adds patient stage wiring and guards; updates call graph (ANNOTATE/SAMPLE/PATIENT/COMPARE). |
| subworkflows/local/annotate.nf | Refactors annotate subworkflow to use ANNOTATE_PROCESS, concatenation via collectFile, and emits olga_stats. |
| subworkflows/local/sample.nf | Updates SAMPLE inputs and OLGA calls to use shared olga_stats; removes deprecated OLGA max-writing. |
| subworkflows/local/patient.nf | New patient-level subworkflow: group samples by meta.patient, concatenate, run patient metrics + clustering tools. |
| subworkflows/local/compare.nf | Simplifies compare stage to TCR sharing + OLGA merge only (removes GIANA/GLIPH2 here). |
| modules/local/annotate/main.nf | Adds ANNOTATE_PROCESS module to standardize per-sample CDR3 extraction. |
| modules/local/olga/main.nf | Adds log10 computation earlier, emits OLGA stats, refactors sample merge to a streaming join, and updates histogram calc inputs. |
| modules/local/patient/main.nf | New processes for patient concatenation and patient overlap matrix calculation. |
| bin/patient_calc.py | New vectorized patient overlap calculator writing per-patient matrices. |
| modules/local/compare/gliph2.nf | Makes GLIPH2 outputs patient-scoped (input tuple includes patient). |
| modules/local/compare/giana.nf | Makes GIANA outputs patient-scoped and changes logging/output set. |
| nextflow_schema.json | Extends workflow_level validation to allow patient. |
| nextflow.config | Changes default olga_chunk_length. |
| modules/local/sample/sample_calc.nf | Removes the stub: block for SAMPLE_CALC. |
| .cirro/process-input.json | Wires new patient_lvl form value into params. |
| .cirro/process-form.json | Adds patient toggle; updates olga_chunk_length type/default. |
| .cirro/preprocess.py | Adds patient flag into workflow_level construction. |
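
For readers unfamiliar with the "streaming join" mentioned for modules/local/olga/main.nf, the general technique can be sketched in Python as a sorted merge join. This is an illustration under stated assumptions, not the implementation in the PR: `merge_join` is a hypothetical name, both inputs are assumed to be sorted ascending by CDR3 under the same ordering, and the pgen side is assumed to have unique keys.

```python
def merge_join(sample_rows, pgen_rows):
    """Left-join two key-sorted streams without loading either in memory.

    sample_rows: iterable of (cdr3, value), sorted by cdr3, keys may repeat.
    pgen_rows:   iterable of (cdr3, log10_pgen), sorted by cdr3, unique keys.
    Yields (cdr3, value, log10_pgen), with None when no pgen row matches.
    Both orderings must agree with Python string comparison (assumption).
    """
    pgen_iter = iter(pgen_rows)
    current = next(pgen_iter, None)
    for cdr3, value in sample_rows:
        # Advance the pgen stream until it catches up with this key.
        while current is not None and current[0] < cdr3:
            current = next(pgen_iter, None)
        if current is not None and current[0] == cdr3:
            yield cdr3, value, current[1]
        else:
            yield cdr3, value, None
```

Because each stream is read once from front to back, memory use stays constant regardless of table size, which is the property that makes a shared sorted CDR3 table useful for this kind of join.
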

Comment thread modules/local/olga/main.nf
Comment thread modules/local/olga/main.nf
Comment thread modules/local/olga/main.nf
Comment thread workflows/tcrtoolkit.nf
Comment on lines +110 to +113
// Running patient analysis
if (levels.contains('patient')) {
    PATIENT( ANNOTATE.out.processed_samples )
}

Copilot AI Mar 23, 2026

A new patient workflow level is introduced (new subworkflow, new modules, and significant OLGA plumbing changes), but the nf-test suite currently only asserts that a minimal sample/compare run succeeds. Consider adding at least one nf-test case that runs with workflow_level=patient (and optionally sample,patient) to catch the ANNOTATE dependency and verify that the expected patient outputs are produced.

Collaborator Author

Will add nf-test later

Comment thread workflows/tcrtoolkit.nf Outdated
Comment thread workflows/tcrtoolkit.nf
Comment thread workflows/tcrtoolkit.nf Outdated
@@ -17,12 +17,4 @@ process SAMPLE_CALC {
    """
    sample_calc.py -s '${sample_meta.sample}' -c ${count_table}
    """

Copilot AI Mar 23, 2026

This repo uses stub: blocks in several modules (e.g. modules/local/sample/sample_plot.nf, modules/local/compare/compare_plot.nf). Removing the stub: section from SAMPLE_CALC makes -stub-run / lightweight test runs inconsistent and may break workflows that rely on stubbing heavy steps. Consider restoring a minimal stub that creates the expected output files for this process.

Suggested change
-    """
+    """
+    stub:
+    """
+    touch sample_stats_${sample_meta.sample}.csv
+    touch v_family_${sample_meta.sample}.csv
+    touch d_family_${sample_meta.sample}.csv
+    touch j_family_${sample_meta.sample}.csv
+    """

Collaborator Author

Will introduce stubs later

Comment thread modules/local/compare/giana.nf Outdated
Comment thread subworkflows/local/annotate.nf
@dltamayo merged commit 2eea4a9 into main on Mar 27, 2026
3 checks passed