<div align="center">

# DICOM to NIFTI Converter

This notebook converts DICOM files to NIFTI format with configurable parameters.<br>
It is designed for the CSI Predictor project, where it handles chest X-ray image conversion.

</div>

## Features
- Batch conversion of DICOM files to NIFTI format
- Configurable parameters for different conversion needs
- Metadata preservation options
- Error handling and progress tracking
- Support for various DICOM file structures


## Configuration Parameters

Define all the conversion parameters here. Modify these as needed for different conversion tasks.


In [None]:
import os
import logging
from pathlib import Path
from typing import Optional, Tuple, List, Dict, Any
import warnings

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


In [None]:
# ===== CONFIGURATION PARAMETERS =====
# Modify these parameters as needed for your conversion task

# Input and output directories
INPUT_DICOM_DIR = r"C:\path\to\dicom\files"  # Change this to your DICOM directory
OUTPUT_NIFTI_DIR = r"C:\path\to\nifti\output"  # Change this to your output directory

# Conversion parameters
RESIZE_IMAGES = False  # Whether to resize images during conversion
TARGET_SIZE = (512, 512)  # Target size if resize is enabled
DYNAMIC_RANGE = 'full'  # Dynamic range handling ('full' is the only option for now)
KEEP_HEADER = False  # Whether to preserve DICOM metadata in NIFTI header
PRESERVE_DIRECTORY_STRUCTURE = True  # Maintain input directory structure in output

# Processing options
NORMALIZE_INTENSITIES = False  # Whether to normalize pixel intensities
CONVERT_TO_FLOAT32 = True  # Convert to float32 for better compatibility
COMPRESSION = True  # Enable NIFTI compression (.nii.gz)
SKIP_EXISTING = True  # Skip files that already exist in output directory

# File filtering
DICOM_EXTENSIONS = ['.dcm', '.dicom', '']  # DICOM file extensions to process ('' matches files with no extension)
RECURSIVE_SEARCH = True  # Search for DICOM files recursively in subdirectories

print("Configuration parameters loaded successfully!")
print(f"Input directory: {INPUT_DICOM_DIR}")
print(f"Output directory: {OUTPUT_NIFTI_DIR}")


## Required Libraries

Install and import all necessary libraries for DICOM to NIFTI conversion.
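
Before running the import cell, it can help to see which packages are missing. The sketch below is one way to do that using only the standard library. `REQUIRED_PACKAGES` and `missing_packages` are hypothetical names introduced for illustration, not part of the notebook; the mapping reflects the fact that Pillow is installed as `pillow` but imported as `PIL`.

```python
import importlib.util
from typing import Dict, List

# Maps pip package names to import names (Pillow is imported as PIL)
REQUIRED_PACKAGES: Dict[str, str] = {
    "pydicom": "pydicom",
    "nibabel": "nibabel",
    "numpy": "numpy",
    "pillow": "PIL",
    "tqdm": "tqdm",
}


def missing_packages(required: Dict[str, str]) -> List[str]:
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```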


In [None]:
# Install required packages if not already installed
# Uncomment the following lines if you need to install packages
# !pip install pydicom nibabel numpy pillow tqdm

try:
    import pydicom
    import nibabel as nib
    import numpy as np
    from PIL import Image
    from tqdm import tqdm
    import json
    import shutil
    
    print("All required libraries imported successfully!")
    print(f"PyDICOM version: {pydicom.__version__}")
    print(f"NiBabel version: {nib.__version__}")
    print(f"NumPy version: {np.__version__}")
    
except ImportError as e:
    print(f"Import error: {e}")
    print("Please install missing packages using: pip install pydicom nibabel numpy pillow tqdm")


## DICOM to NIFTI Conversion Function

Main conversion function with all the configurable parameters.


In [None]:
def dicom_to_nifti(
    import_folder: str,
    export_folder: str,
    resize: bool = False,
    target_size: Optional[Tuple[int, int]] = None,
    dynamic_range: str = 'full',
    keep_header: bool = False,
    preserve_directory_structure: bool = True,
    normalize_intensities: bool = False,
    convert_to_float32: bool = True,
    compression: bool = True,
    skip_existing: bool = True,
    dicom_extensions: Optional[List[str]] = None,
    recursive_search: bool = True,
    progress_callback: Optional[callable] = None
) -> Dict[str, Any]:
    """
    Convert DICOM files to NIFTI format with configurable parameters.
    
    Args:
        import_folder: Path to directory containing DICOM files
        export_folder: Path to directory where NIFTI files will be saved
        resize: Whether to resize images during conversion
        target_size: Target size (width, height) if resize is True
        dynamic_range: Dynamic range handling ('full' is currently the only option)
        keep_header: Whether to preserve DICOM metadata in NIFTI header
        preserve_directory_structure: Maintain input directory structure in output
        normalize_intensities: Whether to normalize pixel intensities to [0, 1]
        convert_to_float32: Convert pixel data to float32 for better compatibility
        compression: Enable NIFTI compression (.nii.gz)
        skip_existing: Skip files that already exist in output directory
        dicom_extensions: List of DICOM file extensions to process
        recursive_search: Search for DICOM files recursively in subdirectories
        progress_callback: Optional callback function for progress updates
    
    Returns:
        Dictionary containing conversion statistics and results
    """
    
    # Input validation
    import_path = Path(import_folder)
    export_path = Path(export_folder)
    
    if not import_path.exists():
        raise ValueError(f"Import folder does not exist: {import_folder}")
    
    if not import_path.is_dir():
        raise ValueError(f"Import path is not a directory: {import_folder}")
    
    # Create export directory if it doesn't exist
    export_path.mkdir(parents=True, exist_ok=True)
    
    # Set default values
    if dicom_extensions is None:
        dicom_extensions = ['.dcm', '.dicom', '']
    
    if resize and target_size is None:
        target_size = (512, 512)
    
    # Initialize statistics
    stats = {
        'total_files_found': 0,
        'successfully_converted': 0,
        'skipped_existing': 0,
        'failed_conversions': 0,
        'errors': [],
        'converted_files': []
    }
    
    logger.info("Starting DICOM to NIFTI conversion")
    logger.info(f"Input: {import_folder}")
    logger.info(f"Output: {export_folder}")
    logger.info(f"Parameters: resize={resize}, keep_header={keep_header}, compression={compression}")
    
    # Find all DICOM files
    dicom_files = _find_dicom_files(import_path, dicom_extensions, recursive_search)
    stats['total_files_found'] = len(dicom_files)
    
    if not dicom_files:
        logger.warning(f"No DICOM files found in {import_folder}")
        return stats
    
    logger.info(f"Found {len(dicom_files)} DICOM files to process")
    
    # Process each DICOM file
    for i, dicom_file in enumerate(tqdm(dicom_files, desc="Converting DICOM files")):
        try:
            # Determine output path
            if preserve_directory_structure:
                relative_path = dicom_file.relative_to(import_path)
                output_file = export_path / relative_path.with_suffix('.nii.gz' if compression else '.nii')
            else:
                output_file = export_path / f"{dicom_file.stem}.nii{'.gz' if compression else ''}"
            
            # Create output directory if needed
            output_file.parent.mkdir(parents=True, exist_ok=True)
            
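            # Skip if file already exists and skip_existing is True
            if skip_existing and output_file.exists():
                stats['skipped_existing'] += 1
                # Report progress for skipped files too, so the callback sees every file
                if progress_callback:
                    progress_callback(i + 1, len(dicom_files), dicom_file)
                continue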
            
            # Convert the file
            success = _convert_single_dicom(
                dicom_file, output_file, resize, target_size, dynamic_range,
                keep_header, normalize_intensities, convert_to_float32
            )
            
            if success:
                stats['successfully_converted'] += 1
                stats['converted_files'].append(str(output_file))
                logger.debug(f"Converted: {dicom_file} -> {output_file}")
            else:
                stats['failed_conversions'] += 1
                
        except Exception as e:
            error_msg = f"Error converting {dicom_file}: {str(e)}"
            logger.error(error_msg)
            stats['errors'].append(error_msg)
            stats['failed_conversions'] += 1
        
        # Call progress callback if provided
        if progress_callback:
            progress_callback(i + 1, len(dicom_files), dicom_file)
    
    # Log final statistics
    logger.info("Conversion completed!")
    logger.info(f"Total files found: {stats['total_files_found']}")
    logger.info(f"Successfully converted: {stats['successfully_converted']}")
    logger.info(f"Skipped existing: {stats['skipped_existing']}")
    logger.info(f"Failed conversions: {stats['failed_conversions']}")
    
    if stats['errors']:
        logger.warning(f"Errors encountered: {len(stats['errors'])}")
        for error in stats['errors'][:5]:  # Show first 5 errors
            logger.warning(f"  {error}")
        if len(stats['errors']) > 5:
            logger.warning(f"  ... and {len(stats['errors']) - 5} more errors")
    
    return stats


## Helper Functions

Supporting functions for DICOM file discovery and individual file conversion.
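
As a cheap pre-filter before a full parse, a file can be checked for the DICOM Part 10 signature: a 128-byte preamble followed by the four bytes `DICM`. The sketch below shows that check with the standard library only. `has_dicm_magic` is a hypothetical helper, not one of the notebook's functions, and it assumes files follow the Part 10 layout; DICOM data written without a preamble would be rejected by it even though a parser may still be able to read it.

```python
from pathlib import Path


def has_dicm_magic(file_path: Path) -> bool:
    """Return True if the file has the 'DICM' marker after a 128-byte preamble."""
    try:
        with open(file_path, "rb") as f:
            f.seek(128)  # skip the preamble
            return f.read(4) == b"DICM"
    except OSError:
        # Unreadable or missing files are treated as non-DICOM
        return False
```

Because it reads only four bytes per file, a check like this scales better on large mixed directories than parsing every candidate, at the cost of missing preamble-less files.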


In [None]:
def _find_dicom_files(directory: Path, extensions: List[str], recursive: bool) -> List[Path]:
    """
    Find all DICOM files in the specified directory.
    
    Args:
        directory: Directory to search
        extensions: List of file extensions to consider as DICOM
        recursive: Whether to search recursively
    
    Returns:
        List of Path objects pointing to DICOM files
    """
    dicom_files = []
    
    if recursive:
        search_pattern = "**/*"
    else:
        search_pattern = "*"
    
    for file_path in directory.glob(search_pattern):
        if file_path.is_file():
            # Check if file has a DICOM extension or try to read as DICOM
            if any(file_path.suffix.lower() == ext.lower() for ext in extensions):
                dicom_files.append(file_path)
            elif _is_dicom_file(file_path):
                dicom_files.append(file_path)
    
    return sorted(dicom_files)


def _is_dicom_file(file_path: Path) -> bool:
    """
    Check if a file is a valid DICOM file by attempting to read it.
    
    Args:
        file_path: Path to the file to check
    
    Returns:
        True if the file is a valid DICOM file, False otherwise
    """
    try:
        pydicom.dcmread(file_path, stop_before_pixels=True)
        return True
    except Exception:
        return False


def _convert_single_dicom(
    dicom_file: Path,
    output_file: Path,
    resize: bool,
    target_size: Optional[Tuple[int, int]],
    dynamic_range: str,
    keep_header: bool,
    normalize_intensities: bool,
    convert_to_float32: bool
) -> bool:
    """
    Convert a single DICOM file to NIFTI format.
    
    Args:
        dicom_file: Path to DICOM file
        output_file: Path for output NIFTI file
        resize: Whether to resize the image
        target_size: Target size if resizing
        dynamic_range: Dynamic range handling (accepted for future use; not yet applied)
        keep_header: Whether to preserve DICOM metadata
        normalize_intensities: Whether to normalize intensities
        convert_to_float32: Whether to convert to float32
    
    Returns:
        True if conversion was successful, False otherwise
    """
    try:
        # Read DICOM file
        dicom_data = pydicom.dcmread(dicom_file)
        
        # Extract pixel data (check the tag first so decoding errors are not masked)
        if 'PixelData' not in dicom_data:
            logger.warning(f"No pixel data found in {dicom_file}")
            return False
        
        pixel_array = dicom_data.pixel_array.copy()
        
        # Handle different pixel representations
        if hasattr(dicom_data, 'PixelRepresentation') and dicom_data.PixelRepresentation == 1:
            # Signed integer
            pixel_array = pixel_array.astype(np.int16)
        
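        # Apply window/level if available
        if hasattr(dicom_data, 'WindowCenter') and hasattr(dicom_data, 'WindowWidth'):
            window_center = dicom_data.WindowCenter
            window_width = dicom_data.WindowWidth
            # Both tags may hold several values; use the first window in that case
            if isinstance(window_center, pydicom.multival.MultiValue):
                window_center = window_center[0]
            if isinstance(window_width, pydicom.multival.MultiValue):
                window_width = window_width[0]
            window_center = float(window_center)
            window_width = float(window_width)
            
            # Apply windowing
            min_val = window_center - window_width / 2
            max_val = window_center + window_width / 2
            pixel_array = np.clip(pixel_array, min_val, max_val)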
        
        # Convert to float32 if requested
        if convert_to_float32:
            pixel_array = pixel_array.astype(np.float32)
        
        # Normalize intensities if requested
        if normalize_intensities:
            pixel_min = pixel_array.min()
            pixel_max = pixel_array.max()
            if pixel_max > pixel_min:
                pixel_array = (pixel_array - pixel_min) / (pixel_max - pixel_min)
        
        # Resize if requested
        if resize and target_size:
            if pixel_array.ndim == 2:
                # Resize in PIL's 32-bit float mode ('F') so the intensity range
                # is preserved rather than quantized to 8 bits
                original_dtype = pixel_array.dtype
                pil_image = Image.fromarray(pixel_array.astype(np.float32))
                pil_resized = pil_image.resize(target_size, Image.LANCZOS)
                pixel_array = np.array(pil_resized)
                
                # Restore integer data types, clipping any Lanczos overshoot
                if not convert_to_float32 and np.issubdtype(original_dtype, np.integer):
                    info = np.iinfo(original_dtype)
                    pixel_array = np.clip(np.rint(pixel_array), info.min, info.max).astype(original_dtype)
        
        # Ensure 3D array for NIFTI (add singleton dimension if 2D)
        if pixel_array.ndim == 2:
            pixel_array = pixel_array[:, :, np.newaxis]
        
        # Create NIFTI image
        nifti_img = nib.Nifti1Image(pixel_array, affine=np.eye(4))
        
        # Preserve DICOM metadata in NIFTI header if requested
        if keep_header:
            _preserve_dicom_metadata(dicom_data, nifti_img)
        
        # Save NIFTI file
        nib.save(nifti_img, output_file)
        
        return True
        
    except Exception as e:
        logger.error(f"Failed to convert {dicom_file}: {str(e)}")
        return False


def _preserve_dicom_metadata(dicom_data, nifti_img):
    """
    Preserve important DICOM metadata in the NIFTI header.
    
    Args:
        dicom_data: PyDICOM dataset
        nifti_img: NiBabel NIFTI image object
    """
    header = nifti_img.header
    
    # Preserve pixel spacing if available
    if hasattr(dicom_data, 'PixelSpacing'):
        pixel_spacing = dicom_data.PixelSpacing
        header.set_zooms((float(pixel_spacing[0]), float(pixel_spacing[1]), 1.0))
    
    # Store important DICOM metadata as extensions
    metadata = {}
    
    # Common DICOM tags to preserve
    tags_to_preserve = [
        'PatientID', 'PatientName', 'StudyInstanceUID', 'SeriesInstanceUID',
        'SOPInstanceUID', 'StudyDate', 'SeriesDate', 'Modality',
        'SliceThickness', 'KVP', 'ExposureTime', 'XRayTubeCurrent',
        'WindowCenter', 'WindowWidth', 'RescaleIntercept', 'RescaleSlope'
    ]
    
    for tag in tags_to_preserve:
        if hasattr(dicom_data, tag):
            try:
                value = getattr(dicom_data, tag)
                # Convert to string for JSON serialization
                metadata[tag] = str(value)
            except Exception:
                continue
    
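    # Add metadata as a JSON-encoded NIFTI header extension
    if metadata:
        try:
            metadata_json = json.dumps(metadata, indent=2)
            # Store the JSON under the standard 'comment' extension code
            extension = nib.nifti1.Nifti1Extension('comment', metadata_json.encode('utf-8'))
            header.extensions.append(extension)
        except Exception as e:
            logger.warning(f"Could not store DICOM metadata in NIFTI header: {e}")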


## Run Conversion

Execute the DICOM to NIFTI conversion with the configured parameters.


In [None]:
# Define a progress callback function
def progress_callback(current: int, total: int, current_file: Path):
    """Progress callback for conversion updates."""
    if current % 10 == 0 or current == total:  # Update every 10 files or at the end
        print(f"Progress: {current}/{total} ({100*current/total:.1f}%) - Current: {current_file.name}")

# Verify that input directory exists before running conversion
if not os.path.exists(INPUT_DICOM_DIR):
    print(f"⚠️  Input directory does not exist: {INPUT_DICOM_DIR}")
    print("Please update the INPUT_DICOM_DIR variable in the configuration section above.")
else:
    print(f"✅ Input directory found: {INPUT_DICOM_DIR}")
    print(f"✅ Output directory will be: {OUTPUT_NIFTI_DIR}")
    print("")
    print("Ready to run conversion. Execute the next cell to start the process.")


In [None]:
# Execute the conversion
if os.path.exists(INPUT_DICOM_DIR):
    print("🚀 Starting DICOM to NIFTI conversion...")
    print("="*60)
    
    # Run the conversion
    conversion_results = dicom_to_nifti(
        import_folder=INPUT_DICOM_DIR,
        export_folder=OUTPUT_NIFTI_DIR,
        resize=RESIZE_IMAGES,
        target_size=TARGET_SIZE,
        dynamic_range=DYNAMIC_RANGE,
        keep_header=KEEP_HEADER,
        preserve_directory_structure=PRESERVE_DIRECTORY_STRUCTURE,
        normalize_intensities=NORMALIZE_INTENSITIES,
        convert_to_float32=CONVERT_TO_FLOAT32,
        compression=COMPRESSION,
        skip_existing=SKIP_EXISTING,
        dicom_extensions=DICOM_EXTENSIONS,
        recursive_search=RECURSIVE_SEARCH,
        progress_callback=progress_callback
    )
    
    print("="*60)
    print("🎉 Conversion completed!")
    print(f"📊 Results summary:")
    print(f"   • Total files found: {conversion_results['total_files_found']}")
    print(f"   • Successfully converted: {conversion_results['successfully_converted']}")
    print(f"   • Skipped (already exist): {conversion_results['skipped_existing']}")
    print(f"   • Failed conversions: {conversion_results['failed_conversions']}")
    
    if conversion_results['errors']:
        print(f"   • Errors: {len(conversion_results['errors'])}")
        print("\n❌ First few errors:")
        for error in conversion_results['errors'][:3]:
            print(f"     {error}")
    
    # Save conversion log
    log_file = Path(OUTPUT_NIFTI_DIR) / "conversion_log.json"
    with open(log_file, 'w') as f:
        json.dump(conversion_results, f, indent=2)
    print(f"\n📝 Detailed log saved to: {log_file}")
    
else:
    print("❌ Cannot run conversion: Input directory does not exist.")
    print("Please update the INPUT_DICOM_DIR variable in the configuration section.")


## Verification and Quality Check

Verify the converted NIFTI files and check their properties.
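
A quick structural check is also possible without loading any image data. The sketch below assumes the outputs are single-file NIfTI-1 images (`.nii` or `.nii.gz`), whose 348-byte header starts with the integer 348 and ends with a four-byte magic string. `looks_like_nifti1` is a hypothetical helper introduced for illustration; it confirms only that the header is plausible, not that the pixel data is intact.

```python
import gzip
import struct
import zlib
from pathlib import Path


def looks_like_nifti1(file_path: Path) -> bool:
    """Check the NIfTI-1 header size field and magic string without nibabel."""
    opener = gzip.open if file_path.name.endswith(".gz") else open
    try:
        with opener(file_path, "rb") as f:
            header = f.read(348)  # NIfTI-1 headers are 348 bytes long
    except (OSError, EOFError, zlib.error):
        # Missing, unreadable, or corrupt (e.g. truncated gzip) files fail the check
        return False
    if len(header) < 348:
        return False
    # sizeof_hdr is the first int32 and must equal 348 in either byte order
    little = struct.unpack("<i", header[:4])[0]
    big = struct.unpack(">i", header[:4])[0]
    # The magic string occupies the last four header bytes
    return 348 in (little, big) and header[344:348] in (b"n+1\x00", b"ni1\x00")
```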


In [None]:
def verify_nifti_files(nifti_directory: str, sample_count: int = 5) -> None:
    """
    Verify a sample of converted NIFTI files and display their properties.
    
    Args:
        nifti_directory: Directory containing NIFTI files
        sample_count: Number of files to sample for verification
    """
    nifti_path = Path(nifti_directory)
    
    if not nifti_path.exists():
        print(f"❌ NIFTI directory does not exist: {nifti_directory}")
        return
    
    # Find NIFTI files
    nifti_files = list(nifti_path.rglob("*.nii*"))
    
    if not nifti_files:
        print(f"❌ No NIFTI files found in {nifti_directory}")
        return
    
    print(f"✅ Found {len(nifti_files)} NIFTI files")
    print(f"🔍 Verifying {min(sample_count, len(nifti_files))} sample files:")
    print("="*80)
    
    # Sample files for verification (slicing already handles short lists)
    sample_files = nifti_files[:sample_count]
    
    for i, nifti_file in enumerate(sample_files, 1):
        try:
            # Load NIFTI file
            nifti_img = nib.load(nifti_file)
            data = nifti_img.get_fdata()
            
            print(f"📁 File {i}: {nifti_file.name}")
            print(f"   📏 Shape: {data.shape}")
            print(f"   🔢 Data type (on disk): {nifti_img.get_data_dtype()}")
            print(f"   📊 Value range: [{data.min():.2f}, {data.max():.2f}]")
            print(f"   📐 Voxel spacing: {nifti_img.header.get_zooms()}")
            print(f"   💾 File size: {nifti_file.stat().st_size / 1024:.1f} KB")
            print("")
            
        except Exception as e:
            print(f"❌ Error loading {nifti_file.name}: {str(e)}")
            print("")
    
    print("✅ Verification completed!")

# Run verification if NIFTI files exist
if os.path.exists(OUTPUT_NIFTI_DIR):
    verify_nifti_files(OUTPUT_NIFTI_DIR, sample_count=5)
else:
    print("No NIFTI files to verify yet. Run the conversion first.")


## Usage Examples and Tips

### Basic Usage
1. Update the `INPUT_DICOM_DIR` and `OUTPUT_NIFTI_DIR` variables in the configuration section
2. Adjust other parameters as needed for your specific use case
3. Run all cells in order

### Parameter Guidelines
- **resize**: Set to `True` if you need standardized image sizes for your model
- **keep_header**: Set to `True` to preserve DICOM metadata for clinical analysis
- **compression**: Recommended to keep `True` for smaller file sizes
- **normalize_intensities**: Useful for machine learning preprocessing

### CSI Predictor Integration
For the CSI Predictor project, you might want to:
1. Set `resize=True` with `target_size=(512, 512)` for model compatibility
2. Use `normalize_intensities=True` for consistent preprocessing
3. Keep `compression=True` to save storage space
4. Set `keep_header=True` if you need patient metadata for analysis

### Troubleshooting
- If conversion fails, check the error messages in the output
- Ensure DICOM files are not corrupted
- Verify sufficient disk space for output files
- Check file permissions for both input and output directories
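
The disk-space and permission checks above can be scripted before a long batch run. The following is a minimal sketch using only the standard library. `preflight_check` and its `min_free_bytes` default of 1 GiB are assumptions made for illustration, not part of the notebook; choose a threshold that matches the size of your dataset.

```python
import os
import shutil
from pathlib import Path
from typing import List


def preflight_check(input_dir: str, output_dir: str, min_free_bytes: int = 1024**3) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    in_path, out_path = Path(input_dir), Path(output_dir)

    if not in_path.is_dir():
        problems.append(f"Input directory not found: {in_path}")
    elif not os.access(in_path, os.R_OK):
        problems.append(f"Input directory is not readable: {in_path}")

    # Walk up to the nearest existing ancestor, since the output directory may not exist yet
    existing = out_path.resolve()
    while not existing.exists() and existing != existing.parent:
        existing = existing.parent

    if not os.access(existing, os.W_OK):
        problems.append(f"No write permission at: {existing}")
    elif shutil.disk_usage(existing).free < min_free_bytes:
        problems.append(f"Less than {min_free_bytes} bytes free at: {existing}")

    return problems
```

Calling `preflight_check(INPUT_DICOM_DIR, OUTPUT_NIFTI_DIR)` and printing each returned message before starting the conversion surfaces these problems early instead of partway through a batch.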

### Additional Parameters
The conversion function includes several other useful parameters:

- **preserve_directory_structure**: Maintains the folder structure from input to output
- **convert_to_float32**: Converts pixel data to float32 for better ML compatibility
- **skip_existing**: Skips files that already exist (useful for resuming interrupted conversions)
- **recursive_search**: Searches subdirectories for DICOM files
- **dicom_extensions**: List of file extensions to consider as DICOM files
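
To make the effect of `preserve_directory_structure` and `compression` concrete, the sketch below reproduces the output-path logic in isolation. `output_path_for` is a hypothetical helper written for illustration, and the paths are made-up examples; `PurePosixPath` is used so the result is the same on any operating system.

```python
from pathlib import PurePosixPath


def output_path_for(dicom_file, import_root, export_root,
                    preserve_structure=True, compression=True):
    """Derive the NIFTI output path for one DICOM file."""
    suffix = ".nii.gz" if compression else ".nii"
    if preserve_structure:
        # Mirror the sub-folders found below the import root
        relative = dicom_file.relative_to(import_root)
        return export_root / relative.with_suffix(suffix)
    # Flat layout: every output lands directly in the export root
    return export_root / f"{dicom_file.stem}{suffix}"


src = PurePosixPath("/data/dicom/patient01/study1/image.dcm")
root = PurePosixPath("/data/dicom")
out = PurePosixPath("/data/nifti")

print(output_path_for(src, root, out, preserve_structure=True))
# /data/nifti/patient01/study1/image.nii.gz
print(output_path_for(src, root, out, preserve_structure=False))
# /data/nifti/image.nii.gz
```

With the flat layout, two input files that share a name in different sub-folders map to the same output path, so one would be skipped or overwritten depending on `skip_existing`. Keeping `preserve_directory_structure=True` avoids that collision.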

### Example Function Call with Custom Parameters
```python
# Custom conversion with specific settings
results = dicom_to_nifti(
    import_folder=r"C:\path\to\dicoms",
    export_folder=r"C:\path\to\niftis",
    resize=True,
    target_size=(256, 256),
    keep_header=True,
    normalize_intensities=True,
    compression=True
)
```
