# Combining MAF Files with pyMut

This notebook demonstrates how to combine two MAF files using the `combine_pymutations` function from pyMut.

## Example: Combining TCGA LAML and PAAD-TP MAF files

We'll load `tcga_laml.maf.gz` and `PAAD-TP.final_analysis_set.maf.gz`, combine them, and save the result to `output/combined_2maf_output.maf`.


In [1]:
# Import necessary functions
from pyMut.input import read_maf
from pyMut.combination import combine_pymutations


In [2]:
# Load the first MAF file (TCGA LAML)
maf1_path = "../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz"
pymut1 = read_maf(path=maf1_path, assembly="37")


2025-08-01 02:01:59,559 | INFO | pyMut.input | Starting MAF reading: ../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz
2025-08-01 02:01:59,560 | INFO | pyMut.input | Loading from cache: ../../../src/pyMut/data/examples/MAF/.pymut_cache/tcga_laml.maf_8bfbda65c4b23428.parquet
2025-08-01 02:01:59,591 | INFO | pyMut.input | Cache loaded successfully in 0.03 seconds
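
To sanity-check an input independently of pyMut, a gzipped MAF can be read as a plain tab-separated table. The sketch below uses only the standard library and counts rows per sample. The helper name `count_variants_per_sample` is hypothetical, and the code assumes the file carries the standard `Tumor_Sample_Barcode` column and that any comment lines start with `#`.

```python
import csv
import gzip
from collections import Counter


def count_variants_per_sample(maf_path):
    """Count rows per Tumor_Sample_Barcode in a gzipped, tab-separated MAF file.

    Hypothetical helper for illustration; it is not part of pyMut.
    """
    counts = Counter()
    with gzip.open(maf_path, mode="rt", newline="") as handle:
        # Skip comment lines (for example "#version 2.4") that may precede the header.
        rows = (line for line in handle if not line.startswith("#"))
        reader = csv.DictReader(rows, delimiter="\t")
        for row in reader:
            counts[row["Tumor_Sample_Barcode"]] += 1
    return counts
```

Calling `count_variants_per_sample(maf1_path)` should return one entry per sample, which can be compared with the sample count pyMut reports when combining.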


In [3]:
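# Load the second MAF file (intended: PAAD-TP).
# NOTE: this path repeats tcga_laml.maf.gz, so the run below combines LAML with itself;
# point maf2_path at PAAD-TP.final_analysis_set.maf.gz to combine the two cohorts.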
maf2_path = "../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz"
pymut2 = read_maf(path=maf2_path, assembly="37")


2025-08-01 02:01:59,667 | INFO | pyMut.input | Starting MAF reading: ../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz
2025-08-01 02:01:59,668 | INFO | pyMut.input | Loading from cache: ../../../src/pyMut/data/examples/MAF/.pymut_cache/tcga_laml.maf_8bfbda65c4b23428.parquet
2025-08-01 02:01:59,684 | INFO | pyMut.input | Cache loaded successfully in 0.02 seconds


In [4]:
# Combine the two PyMutation instances
combined_pymut = combine_pymutations(pymut1, pymut2)


2025-08-01 02:01:59,714 | INFO | pyMut.combination | Starting combination of PyMutation instances: ../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz and ../../../src/pyMut/data/examples/MAF/tcga_laml.maf.gz
2025-08-01 02:01:59,715 | INFO | pyMut.combination | Assembly check passed: both instances have assembly 37
2025-08-01 02:01:59,715 | INFO | pyMut.combination | Combined samples: 193 from first instance + 193 from second instance = 193 unique samples
2025-08-01 02:01:59,763 | INFO | pyMut.combination | Created unique variant identifiers for 2091 variants in first instance and 2091 variants in second instance
2025-08-01 02:01:59,764 | INFO | pyMut.combination | Column analysis: 217 common columns, 0 unique to first instance, 0 unique to second instance
2025-08-01 02:01:59,765 | INFO | pyMut.combination | Found 2091 unique variants across both instances
2025-08-01 02:02:00,211 | INFO | pyMut.combination | Processed 217 common columns, 0 columns unique to first instance, and 0 colu
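
The counts in this log (193 + 193 samples giving 193 unique samples, and 2091 + 2091 variants giving 2091 unique variants) are what a set union produces when both inputs hold the same data. The sketch below illustrates that arithmetic with plain Python sets. It is not pyMut's implementation: the function names and the choice of `chrom`, `pos`, `ref`, and `alt` as the variant key are assumptions made for the example.

```python
def variant_key(record):
    """Build a hashable identifier for a variant (assumed key fields)."""
    return (record["chrom"], record["pos"], record["ref"], record["alt"])


def summarize_union(records_a, samples_a, records_b, samples_b):
    """Report how many samples and variants survive a union of two inputs."""
    keys_a = {variant_key(r) for r in records_a}
    keys_b = {variant_key(r) for r in records_b}
    return {
        "unique_samples": len(set(samples_a) | set(samples_b)),
        "unique_variants": len(keys_a | keys_b),
        "shared_variants": len(keys_a & keys_b),
    }


# Combining an input with itself adds nothing new.
records = [
    {"chrom": "1", "pos": 100, "ref": "A", "alt": "T"},
    {"chrom": "2", "pos": 200, "ref": "G", "alt": "C"},
]
print(summarize_union(records, ["S1", "S2"], records, ["S1", "S2"]))
```

With two distinct cohorts, the unique counts would be expected to exceed those of either input, and `shared_variants` would show how much the two overlap.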

In [5]:
# Save the combined result to the output folder
output_path = "output/combined_2maf_output.maf"
combined_pymut.to_maf(output_path)


2025-08-01 02:02:00,228 | INFO | pyMut.output | Starting MAF export to: output/combined_2maf_output.maf
2025-08-01 02:02:00,230 | INFO | pyMut.output | Starting to process 2091 variants from 193 samples
2025-08-01 02:02:00,234 | INFO | pyMut.output | Processing sample 1/193: TCGA-AB-2994 (0.5%)
2025-08-01 02:02:00,245 | INFO | pyMut.output | Sample TCGA-AB-2994: 12 variants found
2025-08-01 02:02:00,263 | INFO | pyMut.output | Processing sample 3/193: TCGA-AB-2974 (1.6%)
2025-08-01 02:02:00,275 | INFO | pyMut.output | Sample TCGA-AB-2974: 8 variants found
2025-08-01 02:02:00,312 | INFO | pyMut.output | Processing sample 6/193: TCGA-AB-2879 (3.1%)
2025-08-01 02:02:00,324 | INFO | pyMut.output | Sample TCGA-AB-2879: 5 variants found
2025-08-01 02:02:00,359 | INFO | pyMut.output | Processing sample 9/193: TCGA-AB-2941 (4.7%)
2025-08-01 02:02:00,370 | INFO | pyMut.output | Sample TCGA-AB-2941: 5 variants found
2025-08-01 02:02:00,406 | INFO | pyMut.output | Processing sample 12/193: TCGA-A
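
The export cell writes to a relative `output/` folder. This notebook does not show whether `to_maf` creates a missing directory, so one cautious option is to create it beforehand. The helper below is a hypothetical convenience built on `pathlib`; it is not part of pyMut.

```python
from pathlib import Path


def prepare_output_path(output_path):
    """Create the parent directory of output_path if needed and return it as a Path."""
    path = Path(output_path)
    # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

For example, `output_path = prepare_output_path("output/combined_2maf_output.maf")` could run before the export call, assuming `to_maf` accepts a path-like object or its string form (`str(output_path)`).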