# NIfTI Preprocessing Tutorial - Center and flip your data
This notebook demonstrates a full preprocessing pipeline for 3D labelmaps stored as NIfTI files:

### **Pipeline steps**
1. **Automatic selection of a reference scan**
2. **Computation of center of mass (CoM)**
3. **Z-flip detection based on affine matrix**
4. **Intensity conversion to `uint8`**
5. **Recentering relative to the reference CoM**
6. **Saving corrected files**
7. **Logging operations**

This version of the notebook is written for clarity and reproducibility, with open-source publication in mind.


In [1]:
import numpy as np
import nibabel as nib
from scipy.ndimage import center_of_mass
import os
from datetime import datetime


## 📁 Input/Output Paths

In the next cell, specify the **path to the folder that contains your 3D labelmaps**  
(e.g., binary masks or segmentation labelmaps). The file listing only picks up **gzip-compressed NIfTI files** (`.nii.gz`); uncompressed `.nii` files in the folder are ignored.  

If your files use a different format, you may adapt the loading section accordingly.

The processed labelmaps (centered, and flipped if needed) are saved automatically in `<your_folder>/labels_registered/`.

This folder is created if it does not already exist.
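
If the folder also holds uncompressed `.nii` files, the listing step can be adapted. The sketch below is one possible variant, not the notebook's own code: `list_nifti_files` and `NIFTI_SUFFIXES` are hypothetical names introduced here for illustration. It uses `pathlib`, keeps only regular files (so the `labels_registered/` subfolder is skipped), and returns names in sorted order like the original listing.

```python
from pathlib import Path

# Assumption: both compressed and uncompressed NIfTI files should be accepted.
NIFTI_SUFFIXES = (".nii", ".nii.gz")

def list_nifti_files(folder):
    """Return the sorted names of NIfTI files located directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(NIFTI_SUFFIXES)
    )
```

`nib.load` reads both `.nii` and `.nii.gz`, so the rest of the pipeline should not need changes for this variant.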


In [2]:
# Input folder containing NIfTI files
folder_path = "../../data/data_DIASEM/renamed_data/RF/test"

# Output folder for processed files
save_folder = os.path.join(folder_path, "labels_registered")
os.makedirs(save_folder, exist_ok=True)

# List all NIfTI files
nii_files = sorted([f for f in os.listdir(folder_path) if f.endswith(".nii.gz")])
if len(nii_files) == 0:
    raise RuntimeError(f"No NIfTI files found in {folder_path}")

print(f"Found {len(nii_files)} NIfTI files.")


Found 6 NIfTI files.


## 📝 Logging the Preprocessing Steps

To ensure full reproducibility, this notebook automatically logs all operations  
(flip detection, centering translations, saving paths, etc.) into a text file: `<output_folder>/processing_log.txt`. The file is recreated each time the logging cell is run, so it always describes the latest run.

This makes it easy to:
- track how each NIfTI file was processed,
- verify which volumes were flipped,
- check translation vectors,
- document your preprocessing pipeline.

The following cell sets up a lightweight logging function used throughout the notebook.
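
As an alternative to a hand-written `log()` function, the same behavior (timestamped lines sent to both the console and a file) could be obtained with Python's standard `logging` module. The sketch below is only an illustration under that assumption: `make_logger` and the logger name `"nifti_preprocessing"` are hypothetical and do not appear elsewhere in this notebook.

```python
import logging

def make_logger(log_path):
    """Build a logger that writes timestamped lines to the console and to `log_path`."""
    logger = logging.getLogger("nifti_preprocessing")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger

    # Remove handlers left over from a previous run of the cell
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    formatter = logging.Formatter("[%(asctime)s] %(message)s",
                                  datefmt="%Y-%m-%d %H:%M:%S")
    # mode="w" starts a fresh log file, like the notebook's own logging cell
    for handler in (logging.StreamHandler(),
                    logging.FileHandler(log_path, mode="w")):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger

# Usage (hypothetical path):
# logger = make_logger("processing_log.txt")
# logger.info("Processing started.")
```

The function defined in the next cell is simpler and has no hidden state, which is a reasonable choice for a short tutorial.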



In [3]:
# Log file to document operations
log_path = os.path.join(save_folder, "processing_log.txt")

def log(message):
    """Append a message to the log file and print it."""
    timestamp = datetime.now().strftime("[%Y-%m-%d %H:%M:%S]")
    line = f"{timestamp} {message}"
    print(line)
    with open(log_path, "a") as f:
        f.write(line + "\n")

# Start log
with open(log_path, "w") as f:
    f.write("NIfTI Preprocessing Log\n")
    f.write("========================\n\n")

log("Processing started.")


[2025-12-04 14:37:25] Processing started.


## 🎯 Automatic Reference Selection

To center all volumes consistently, we need a **reference scan** from which the target center of mass (CoM) is computed.

You may:
- manually specify a reference file (set `ref_file_name = "file.nii.gz"`), or  
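- leave it to **automatic mode**, in which case the first NIfTI file in the folder (in alphabetical order) is used.

Automatic mode guarantees that a reference is always defined, even if none is set by hand. If a manually specified file cannot be found, the notebook logs a warning and falls back to the first file.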

The next cell:
1. selects the reference file,
2. loads it,
3. computes its center of mass in millimeters.
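
To make the third step concrete, here is a small NumPy-only sketch of what it computes: an intensity-weighted center of mass in voxel indices, then a mapping to world coordinates through the affine. `center_of_mass_mm` is a hypothetical helper written for this illustration, and the toy volume, voxel size and origin are made-up values. The notebook itself relies on `scipy.ndimage.center_of_mass` and `nib.affines.apply_affine`.

```python
import numpy as np

def center_of_mass_mm(data, affine):
    """Intensity-weighted center of mass of `data`, in voxel and world coordinates."""
    data = np.asarray(data, dtype=float)
    total = data.sum()
    if total == 0:
        raise ValueError("Volume is empty: the center of mass is undefined.")
    # One index grid per axis (fine for a toy volume, memory-hungry for large ones)
    grids = np.indices(data.shape)
    com_voxel = np.array([(g * data).sum() / total for g in grids])
    # Affine mapping of a single point: rotation/scaling part, then translation
    com_mm = affine[:3, :3] @ com_voxel + affine[:3, 3]
    return com_voxel, com_mm

# Toy volume: a 2x2x2 cube of ones inside a 4x4x4 grid
data = np.zeros((4, 4, 4))
data[1:3, 1:3, 1:3] = 1

affine = np.diag([2.0, 2.0, 2.0, 1.0])  # assumed 2 mm isotropic voxels
affine[:3, 3] = [-10.0, 5.0, 0.0]       # arbitrary origin

com_voxel, com_mm = center_of_mass_mm(data, affine)
print(com_voxel)  # [1.5 1.5 1.5]
print(com_mm)     # [-7.  8.  3.]
```

Because the center of mass is weighted by voxel values, higher label values pull it toward their region in a multi-label map. If that is not wanted, one option is to compute it on a binarized copy such as `data > 0`.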


In [5]:
# Optional manual reference (set to None to auto-select)
ref_file_name = None

if ref_file_name is None:
    log("No reference file specified -> Using the FIRST NIfTI file as reference.")
    ref_file_name = nii_files[0]

ref_file_path = os.path.join(folder_path, ref_file_name)

# Safety check
if not os.path.exists(ref_file_path):
    log(f"WARNING: Reference file {ref_file_name} not found!")
    log("Falling back to first file in the folder.")
    ref_file_name = nii_files[0]
    ref_file_path = os.path.join(folder_path, ref_file_name)

log(f"Using reference file: {ref_file_name}")

# Load reference scan
img_ref = nib.load(ref_file_path)
data_ref = img_ref.get_fdata()
affine_ref = img_ref.affine

# Center of mass in mm
ref_com_voxel = np.array(center_of_mass(data_ref))
ref_com_mm = nib.affines.apply_affine(affine_ref, ref_com_voxel)

log(f"Reference center of mass (mm): {ref_com_mm}")


[2025-12-04 14:37:34] No reference file specified -> Using the FIRST NIfTI file as reference.
[2025-12-04 14:37:34] Using reference file: TC152_RF.nii.gz
[2025-12-04 14:37:59] Reference center of mass (mm): [-40.94887565  33.82758946  98.90511242]


## ⚙️ Processing All NIfTI Files

This is the core of the preprocessing pipeline.  
For each NIfTI file in the folder, the following steps are performed:

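1. **Z-axis flip detection**  
   Some 3D volumes are stored with an inverted Z orientation.  
   A simple heuristic flags them: if the Z translation of the affine (`affine[2, 3]`) is at or below -250, the voxel array is flipped along its third axis. The threshold is empirical and will likely need adjusting for other datasets.

2. **Conversion to `uint8`**  
   Stores the labels as 8-bit integers, which reduces file size. This assumes that label values are integers between 0 and 255.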

3. **Center of Mass computation**  
   Used to determine how far the volume is from the reference position.

4. **Recenter the volume**  
   The translation part of the affine is shifted by `ref_com_mm - com_mm`, so the volume's CoM lands on the reference CoM. The voxel data are not resampled.

5. **Save to the `labels_registered/` folder**

All operations are logged to ensure full reproducibility and transparency.
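
The recentering step can be checked on a toy affine. The sketch below is illustrative only: `recenter_affine` is a hypothetical helper, not a function from this notebook or from nibabel, and the affine, CoM and target values are made up. It shows that adding `target - com_mm` to the affine's translation column makes the CoM voxel map exactly onto the target position.

```python
import numpy as np

def recenter_affine(affine, com_voxel, target_mm):
    """Return a translated copy of `affine` such that `com_voxel` maps to `target_mm`."""
    affine = np.array(affine, dtype=float)  # work on a copy
    com_mm = affine[:3, :3] @ com_voxel + affine[:3, 3]
    translation = np.asarray(target_mm, dtype=float) - com_mm
    affine[:3, 3] += translation
    return affine, translation

affine = np.diag([2.0, 2.0, 2.0, 1.0])   # assumed 2 mm isotropic voxels
affine[:3, 3] = [-10.0, 5.0, 0.0]        # arbitrary origin
com_voxel = np.array([1.5, 1.5, 1.5])    # CoM in voxel indices
target_mm = np.array([0.0, 0.0, 10.0])   # stand-in for the reference CoM

new_affine, translation = recenter_affine(affine, com_voxel, target_mm)
print(translation)  # [ 7. -8.  7.]

# The CoM voxel now maps onto the target position
print(new_affine[:3, :3] @ com_voxel + new_affine[:3, 3])
```

Only the last column of the affine changes, so voxel size and orientation are preserved.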


In [6]:
inversed_files = []

for file_name in nii_files:

    log(f"\n--- Processing {file_name} ---")

    file_path = os.path.join(folder_path, file_name)

    # Load image
    img = nib.load(file_path)
    data = img.get_fdata()
    affine = img.affine.copy()

    # ---------------------------
    # 1. Z-FLIP DETECTION
    # ---------------------------
    tz = affine[2, 3]  # Z translation

    if tz <= -250:  # heuristic threshold
        data = np.flip(data, axis=2)
        inversed_files.append(file_name)
        log("Applied Z-flip.")
    else:
        log("No flip required.")

    # ---------------------------
    # 2. Convert to UINT8
    # ---------------------------
    data = data.astype(np.uint8)
    log("Converted to uint8.")

    # ---------------------------
    # 3. Compute sample CoM
    # ---------------------------
    com_voxel = np.array(center_of_mass(data))
    com_mm = nib.affines.apply_affine(affine, com_voxel)
    log(f"Center of mass (mm): {com_mm}")

    # ---------------------------
    # 4. Recenter relative to reference
    # ---------------------------
    translation = ref_com_mm - com_mm
    affine[:3, 3] += translation
    log(f"Applied translation: {translation}")

    # ---------------------------
    # 5. Save corrected file
    # ---------------------------
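    save_path = os.path.join(save_folder, file_name)
    # When a header is passed, nibabel takes the on-disk dtype from it,
    # so the header dtype must be set for the uint8 conversion to reach the file.
    header = img.header.copy()
    header.set_data_dtype(np.uint8)
    img_new = nib.Nifti1Image(data, affine, header)
    nib.save(img_new, save_path)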

    log(f"Saved corrected file to {save_path}")


[2025-12-04 14:37:59] 
--- Processing TC152_RF.nii.gz ---
[2025-12-04 14:38:10] No flip required.
[2025-12-04 14:38:11] Converted to uint8.
[2025-12-04 14:38:23] Center of mass (mm): [-40.94887565  33.82758946  98.90511242]
[2025-12-04 14:38:23] Applied translation: [0. 0. 0.]
[2025-12-04 14:38:30] Saved corrected file to ../../data/data_DIASEM/renamed_data/RF/test/labels_registered/TC152_RF.nii.gz
[2025-12-04 14:38:30] 
--- Processing TT129_RF.nii.gz ---
[2025-12-04 14:38:40] No flip required.
[2025-12-04 14:38:41] Converted to uint8.
[2025-12-04 14:38:52] Center of mass (mm): [-17.57421485 -21.20790062  38.92218388]
[2025-12-04 14:38:52] Applied translation: [-23.3746608   55.03549008  59.98292854]
[2025-12-04 14:38:59] Saved corrected file to ../../data/data_DIASEM/renamed_data/RF/test/labels_registered/TT129_RF.nii.gz
[2025-12-04 14:38:59] 
--- Processing ZS153_RF.nii.gz ---
[2025-12-04 14:39:09] Applied Z-flip.
[2025-12-04 14:39:10] Converted to uint8.
[2025-12-04 14:39:22] Center
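
The `astype(np.uint8)` cast in the processing cell does not raise an error when values fall outside 0 to 255 or are not whole numbers, so unexpected labels could be corrupted silently. A defensive variant is sketched below. `to_uint8_labels` is a hypothetical helper introduced here, and raising an error (rather than clipping, for instance) is an assumption about the desired behavior.

```python
import numpy as np

def to_uint8_labels(data):
    """Cast a label volume to uint8 after checking that the cast is lossless."""
    data = np.asarray(data)
    rounded = np.rint(data)
    # NaN values also fail this check, since NaN is never close to anything
    if not np.allclose(data, rounded):
        raise ValueError("Non-integer values found; this does not look like a labelmap.")
    if rounded.min() < 0 or rounded.max() > 255:
        raise ValueError("Label values fall outside the uint8 range 0-255.")
    return rounded.astype(np.uint8)
```

In the loop, `data = to_uint8_labels(data)` could replace the plain cast, so that a problematic file stops the run instead of being saved with altered labels.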

## ✅ Summary of the Preprocessing

After all files have been processed, we display a summary including:

- the reference file used,
- the reference center of mass (in mm),
- the number of flipped files,
- the list of volumes that required flipping (if any).

A detailed `processing_log.txt` file is available in your output folder,  
containing timestamps and a full record of every step applied to each volume.

This makes the preprocessing pipeline suitable for:
- scientific publications,
- reproducible experiments,
- open-source distribution,
- dataset preparation.


In [7]:
log("\n=== Processing Complete ===")
log(f"Reference file: {ref_file_name}")
log(f"Reference center of mass (mm): {ref_com_mm}")
log(f"Number of flipped files: {len(inversed_files)}")

if len(inversed_files) > 0:
    log(f"Files flipped: {inversed_files}")
else:
    log("No files were flipped.")

print("\nProcessing finished. See log file for details:")
print(log_path)


[2025-12-04 14:40:54] 
=== Processing Complete ===
[2025-12-04 14:40:54] Reference file: TC152_RF.nii.gz
[2025-12-04 14:40:54] Reference center of mass (mm): [-40.94887565  33.82758946  98.90511242]
[2025-12-04 14:40:54] Number of flipped files: 1
[2025-12-04 14:40:54] Files flipped: ['ZS153_RF.nii.gz']

Processing finished. See log file for details:
../../data/data_DIASEM/renamed_data/RF/test/labels_registered/processing_log.txt
