<div align="center">

# ArchiMed Images V2.0 - Clean & Modular

**Enhanced lung segmentation with separate left/right detection**  
*Organized with external function files for better maintainability*

</div>

## ✨ Key Improvements in V2.0
- **🗂️ Modular Design** - Functions organized in external `.py` files
- **📝 Format Selection** - Choose between PNG or NIFTI output formats  
- **🫁 Advanced Segmentation** - Separate left/right lung detection
- **🔧 Enhanced Processing** - Low-threshold detection with multiple fallback sensitivity levels
- **📊 Better Visualization** - Color-coded overlays and comprehensive reporting

## 📁 Output Files
For each input image, the pipeline creates:
- `{file_id}.png` or `{file_id}.nii.gz` - Processed image in the selected format
- `{file_id}_left_lung_mask.png` - Left lung only  
- `{file_id}_right_lung_mask.png` - Right lung only
- `{file_id}_combined_mask.png` - Both lungs combined
- `{file_id}_overlay.png` - Color-coded visualization
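
The naming scheme above can be sketched as a small helper. This is only an illustration: `expected_outputs`, `FORMAT_EXTENSIONS`, and `MASK_SUFFIXES` are hypothetical names, not part of the `functions` package, and the directory names in the demo are placeholders.

```python
from pathlib import Path

# Extension per output format, taken from the file list above.
FORMAT_EXTENSIONS = {"png": ".png", "nifti": ".nii.gz"}

# Mask and overlay suffixes, also taken from the file list above.
MASK_SUFFIXES = ("_left_lung_mask", "_right_lung_mask", "_combined_mask", "_overlay")


def expected_outputs(file_id, images_dir, masks_dir, output_format="png"):
    """Return the paths the pipeline is expected to write for one image."""
    extension = FORMAT_EXTENSIONS[output_format.lower()]
    image_path = Path(images_dir) / f"{file_id}{extension}"
    mask_paths = [Path(masks_dir) / f"{file_id}{suffix}.png" for suffix in MASK_SUFFIXES]
    return [image_path, *mask_paths]


paths = expected_outputs("patient_001", "images", "masks", "nifti")
print([p.name for p in paths])
```

A helper like this may be handy for checking, after a run, that every expected file exists for a given `file_id`.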


In [None]:
# ===== CONFIGURATION PARAMETERS =====

# Data Paths
CSV_FOLDER = "/home/pyuser/data/Paradise_CSV/"
CSV_LABELS_FILE = "Labeled_Data_RAW.csv"
CSV_SEPARATOR = ";"

DOWNLOAD_PATH = '/home/pyuser/data/Paradise_DICOMs'
# IMAGES_PATH = '/home/pyuser/data/Paradise_Images'
# MASKS_PATH = '/home/pyuser/data/Paradise_Masks'
IMAGES_PATH = '/home/pyuser/data/Tests_Images'
MASKS_PATH = '/home/pyuser/data/Tests_Masks'

# ===== NEW V2.0 FEATURE: OUTPUT FORMAT SELECTION =====
# Choose output format: 'png' or 'nifti'
OUTPUT_FORMAT = 'nifti'  # Options: 'png', 'nifti'

# Processing Settings
TARGET_SIZE = (518, 518)  # Final image size (width, height)
CROP_MARGIN = 25          # Margin around lung segmentation for cropping (pixels)

# ArchiMed Integration
USE_ARCHIMED = True       # Try ArchiMed first (recommended)
DOWNLOAD_IF_MISSING = True  # Download from ArchiMed if files not found locally

# Segmentation Parameters
MODEL_SENSITIVITY = 0.0001           # Lower = more sensitive (0.0001 to 0.5)
ENABLE_HISTOGRAM_EQUALIZATION = True # Enhance contrast before segmentation
ENABLE_GAUSSIAN_BLUR = True          # Reduce noise before segmentation
USE_MULTIPLE_THRESHOLDS = True       # Try multiple sensitivity levels
AGGRESSIVE_MORPHOLOGY = True         # More aggressive mask cleanup
KEEP_LARGEST_COMPONENT_ONLY = True   # Keep only largest component per lung
ENABLE_DEBUG_OUTPUT = False          # Print segmentation debug info

# Visualization Settings
SAVE_MASKS = True        # Save segmentation masks and overlays
LUNG_FILL_OPACITY = 0.25    # Lung mask fill opacity (0.0 to 1.0)
LUNG_BORDER_OPACITY = 0.50  # Lung mask border opacity (0.0 to 1.0)

print("✅ Configuration loaded successfully!")
print(f"📝 Output format: {OUTPUT_FORMAT.upper()}")
print(f"🎯 Target size: {TARGET_SIZE[0]}x{TARGET_SIZE[1]}")
print(f"🫁 Model sensitivity: {MODEL_SENSITIVITY}")
print(f"💾 Save masks: {SAVE_MASKS}")
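
To make the role of `CROP_MARGIN` concrete, here is a minimal, dependency-free sketch of padding a mask's bounding box by a margin and clamping it to the image bounds. It is an assumption about how such a margin is typically applied, not the code in the `functions` package (which presumably works on NumPy arrays); `crop_box_with_margin` is a hypothetical name.

```python
def crop_box_with_margin(mask, margin):
    """Bounding box (top, bottom, left, right) of nonzero mask pixels, padded by `margin`.

    `mask` is a 2D list of 0/1 values; `bottom` and `right` are exclusive.
    Returns None when the mask is empty.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    height, width = len(mask), len(mask[0])
    # Pad each side by the margin, then clamp to the image bounds.
    top = max(min(rows) - margin, 0)
    bottom = min(max(rows) + 1 + margin, height)
    left = max(min(cols) - margin, 0)
    right = min(max(cols) + 1 + margin, width)
    return top, bottom, left, right


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(crop_box_with_margin(mask, margin=1))  # (1, 5, 1, 5)
```

With a margin larger than the distance to the image edge, the box simply collapses to the full image, so a generous `CROP_MARGIN` is safe on small inputs.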


In [None]:
# ===== IMPORTS AND INITIALIZATION =====

import os
import sys
from typing import Dict, Any, List

# Import our modular functions from the functions package
from functions import *

print("📦 All function modules imported successfully from functions package!")

# Create configuration dictionary for easy passing
config = {
    # Data paths
    'CSV_FOLDER': CSV_FOLDER,
    'CSV_LABELS_FILE': CSV_LABELS_FILE,
    'CSV_SEPARATOR': CSV_SEPARATOR,
    'DOWNLOAD_PATH': DOWNLOAD_PATH,
    'IMAGES_PATH': IMAGES_PATH,
    'MASKS_PATH': MASKS_PATH,
    
    # Processing settings
    'OUTPUT_FORMAT': OUTPUT_FORMAT,
    'TARGET_SIZE': TARGET_SIZE,
    'CROP_MARGIN': CROP_MARGIN,
    
    # ArchiMed integration
    'USE_ARCHIMED': USE_ARCHIMED,
    'DOWNLOAD_IF_MISSING': DOWNLOAD_IF_MISSING,
    
    # Segmentation parameters
    'MODEL_SENSITIVITY': MODEL_SENSITIVITY,
    'ENABLE_HISTOGRAM_EQUALIZATION': ENABLE_HISTOGRAM_EQUALIZATION,
    'ENABLE_GAUSSIAN_BLUR': ENABLE_GAUSSIAN_BLUR,
    'USE_MULTIPLE_THRESHOLDS': USE_MULTIPLE_THRESHOLDS,
    'AGGRESSIVE_MORPHOLOGY': AGGRESSIVE_MORPHOLOGY,
    'KEEP_LARGEST_COMPONENT_ONLY': KEEP_LARGEST_COMPONENT_ONLY,
    'ENABLE_DEBUG_OUTPUT': ENABLE_DEBUG_OUTPUT,
    
    # Visualization settings
    'SAVE_MASKS': SAVE_MASKS,
    'LUNG_FILL_OPACITY': LUNG_FILL_OPACITY,
    'LUNG_BORDER_OPACITY': LUNG_BORDER_OPACITY,
}

# Validate configuration
is_valid, errors = validate_configuration(config)
if not is_valid:
    print("❌ Configuration validation failed:")
    for error in errors:
        print(f"   • {error}")
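    # Stop with a regular exception rather than sys.exit(), which is meant for scripts
    raise ValueError("Invalid configuration: " + "; ".join(map(str, errors)))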

print("✅ Configuration validated successfully!")

# Print configuration summary
print_configuration_summary(config)
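
The real `validate_configuration` lives in the `functions` package and is not shown in this notebook. As a rough idea of what such a check might cover, the sketch below validates a few settings against the ranges stated in the configuration comments. The name `validate_configuration_sketch` and the exact rules are assumptions.

```python
def validate_configuration_sketch(config):
    """Return (is_valid, errors) for a few of the settings defined above."""
    errors = []

    if str(config.get("OUTPUT_FORMAT", "")).lower() not in ("png", "nifti"):
        errors.append("OUTPUT_FORMAT must be 'png' or 'nifti'")

    size = config.get("TARGET_SIZE")
    if not (isinstance(size, (tuple, list)) and len(size) == 2
            and all(isinstance(v, int) and v > 0 for v in size)):
        errors.append("TARGET_SIZE must be two positive integers (width, height)")

    sensitivity = config.get("MODEL_SENSITIVITY")
    if not (isinstance(sensitivity, (int, float)) and 0.0001 <= sensitivity <= 0.5):
        errors.append("MODEL_SENSITIVITY must be between 0.0001 and 0.5")

    for key in ("LUNG_FILL_OPACITY", "LUNG_BORDER_OPACITY"):
        value = config.get(key)
        if not (isinstance(value, (int, float)) and 0.0 <= value <= 1.0):
            errors.append(f"{key} must be between 0.0 and 1.0")

    # An empty error list means the configuration passed every check.
    return not errors, errors


demo_config = {
    "OUTPUT_FORMAT": "nifti",
    "TARGET_SIZE": (518, 518),
    "MODEL_SENSITIVITY": 0.0001,
    "LUNG_FILL_OPACITY": 0.25,
    "LUNG_BORDER_OPACITY": 0.50,
}
print(validate_configuration_sketch(demo_config))
```

Collecting every problem into a list, instead of failing on the first one, lets the user fix all settings in a single pass.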


In [None]:
# ===== SEGMENTATION MODEL INITIALIZATION =====

print("🔄 Initializing lung segmentation model...")

# Initialize the segmentation model (TorchXRayVision or fallback)
segmentation_model, model_type = initialize_segmentation_model()

# Store model type in config for use by other functions
config['model_type'] = model_type

print(f"✅ Segmentation model ready: {model_type}")

if model_type == 'torchxray':
    print("🎯 Using TorchXRayVision for advanced left/right lung detection")
elif model_type == 'fallback':
    print("⚡ Using fallback method (combined lung detection only)")
    print("💡 Install TorchXRayVision for separate left/right lung detection")
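
How `initialize_segmentation_model` picks between the two model types is not shown here. One plausible approach, sketched below under that assumption, is to check whether the `torchxrayvision` package is importable and fall back otherwise. `pick_model_type` is a hypothetical helper; it only reports the choice and loads no model.

```python
import importlib.util


def pick_model_type():
    """Return 'torchxray' when TorchXRayVision is importable, else 'fallback'."""
    if importlib.util.find_spec("torchxrayvision") is not None:
        return "torchxray"
    return "fallback"


print(pick_model_type())
```

Using `find_spec` avoids importing the package (and its heavy dependencies) just to learn whether it is installed.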


In [None]:
# ===== FILE DISCOVERY AND DOWNLOAD =====

print("🔍 Starting file discovery and download process...")
print("=" * 60)

# Use the comprehensive file discovery and download management
dicom_files, discovery_stats = manage_file_discovery(config)

# Print discovery summary
print_file_discovery_summary(
    discovery_stats['local_files_found'],
    discovery_stats['downloaded_files'], 
    discovery_stats['failed_downloads']
)

# Check if we have files to process
if not dicom_files:
    print("❌ No DICOM files found!")
    print("💡 Check your configuration:")
    print(f"   • CSV path: {os.path.join(config['CSV_FOLDER'], config['CSV_LABELS_FILE'])}")
    print(f"   • Download path: {config.get('DOWNLOAD_PATH')}")
    print("   • ArchiMed connection status")
else:
    print(f"\n🎉 Ready to process {len(dicom_files)} DICOM files!")
    
    # Show sample file paths for verification
    if len(dicom_files) <= 5:
        print("📋 Files to process:")
        for dicom_file in dicom_files:
            print(f"   • {os.path.basename(dicom_file)}")
    else:
        print("📋 Sample files to process:")
        for dicom_file in dicom_files[:3]:
            print(f"   • {os.path.basename(dicom_file)}")
        print(f"   • ... and {len(dicom_files) - 3} more files")
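
`manage_file_discovery` also handles the CSV and ArchiMed downloads, none of which is reproduced here. For the local part only, a minimal sketch might walk the download folder and collect DICOM files. The `.dcm` extension and the name `find_local_dicoms` are assumptions, and DICOM files without an extension would be missed.

```python
import os
import tempfile


def find_local_dicoms(root, extension=".dcm"):
    """Recursively collect files under `root` with the given extension, sorted."""
    matches = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively so ".DCM" files are found too.
            if name.lower().endswith(extension):
                matches.append(os.path.join(folder, name))
    return sorted(matches)


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "study_a"))
    for relative in ("study_a/img1.dcm", "img2.DCM", "notes.txt"):
        open(os.path.join(root, relative), "w").close()
    found = find_local_dicoms(root)
    print([os.path.basename(p) for p in found])
```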


In [None]:
# ===== MAIN PROCESSING PIPELINE =====

if dicom_files:
    print("\n🚀 Starting main processing pipeline...")
    print("=" * 60)
    
    # Create output directories
    os.makedirs(config['IMAGES_PATH'], exist_ok=True)
    if config['SAVE_MASKS']:
        os.makedirs(config['MASKS_PATH'], exist_ok=True)
    
    # Initialize results tracking
    results = {
        'total_files_found': len(dicom_files),
        'successfully_processed': 0,
        'segmentation_successes': 0,
        'failed_conversions': 0,
        'skipped_existing': 0,
        'mask_files_created': 0,
        'processed_files': [],
        'errors': [],
        'model_type': model_type
    }
    
    # Create progress tracker
    pbar = create_progress_tracker(len(dicom_files), "Processing DICOM files")
    
    for dicom_path in dicom_files:
        try:
            file_id = os.path.splitext(os.path.basename(dicom_path))[0]
            
            # Generate output path based on format
            output_path = get_output_path(dicom_path, config['IMAGES_PATH'], config['OUTPUT_FORMAT'])
            
            # Skip if output already exists (optional)
            if os.path.exists(output_path):
                results['skipped_existing'] += 1
                pbar.set_postfix({
                    "Processed": results['successfully_processed'],
                    "Skipped": results['skipped_existing']
                })
                pbar.update(1)
                continue
            
            # Step 1: Read and normalize DICOM
            image_array, dicom_data, status = read_dicom_file(dicom_path)
            if image_array is None:
                results['errors'].append(f"{file_id}: {status}")
                results['failed_conversions'] += 1
                pbar.update(1)
                continue
            
            # Normalize to uint8 for processing
            image_array = normalize_image_array(image_array, 'uint8')
            
            # Step 2: Apply segmentation and processing
            processed_image, processing_info = process_image_with_segmentation(image_array, file_id, config)
            
            # Step 3: Save visualization files if segmentation succeeded
            visualization_success = False
            if config['SAVE_MASKS'] and processing_info['segmentation_success']:
                visualization_success = save_visualization_files(
                    processing_info, config['MASKS_PATH'], image_array, config
                )
                if visualization_success:
                    results['mask_files_created'] += 4  # L/R/Combined/Overlay
            
            # Step 4: Convert and save in final format
            conversion_success = convert_dicom_to_format(
                dicom_path, output_path, config['OUTPUT_FORMAT'], 
                config['TARGET_SIZE'], processed_image, processing_info
            )
            
            if conversion_success:
                results['successfully_processed'] += 1
                
                # Track segmentation success
                if processing_info['segmentation_success']:
                    results['segmentation_successes'] += 1
                
                # Record file processing info
                results['processed_files'].append({
                    'file_id': file_id,
                    'success': True,
                    'segmentation_success': processing_info['segmentation_success'],
                    'visualization_success': visualization_success,
                    'output_path': output_path
                })
            else:
                results['failed_conversions'] += 1
                results['errors'].append(f"{file_id}: Conversion failed")
            
            # Update progress bar
            pbar.set_postfix({
                "Processed": results['successfully_processed'],
                "Segmented": results['segmentation_successes']
            })
            
        except Exception as e:
            file_id = os.path.splitext(os.path.basename(dicom_path))[0]
            results['errors'].append(f"{file_id}: {str(e)}")
            results['failed_conversions'] += 1
        
        pbar.update(1)
    
    pbar.close()
    
    # Print comprehensive results summary
    print_processing_summary(results, config)
    
    # Display sample results
    display_sample_results(results, config, num_samples=5)
    
    # Save processing log
    log_path = os.path.join(config['IMAGES_PATH'], 'processing_log.json')
    save_processing_log(results, config, log_path)

else:
    print("⏭️ No files to process - skipping processing pipeline")
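    # Keep the same keys as the processing branch so later cells can rely on them
    results = {
        'total_files_found': 0,
        'successfully_processed': 0,
        'segmentation_successes': 0,
        'failed_conversions': 0,
        'skipped_existing': 0,
        'mask_files_created': 0,
        'processed_files': [],
        'errors': ["No DICOM files found"],
        'model_type': model_type
    }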
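
The counters collected in `results` lend themselves to a few derived rates. The sketch below is a hypothetical helper (`summarize_results` is not part of the `functions` package) and may differ from what `print_processing_summary` reports.

```python
def summarize_results(results):
    """Compute simple rates from the counters kept in the `results` dict."""
    total = results.get("total_files_found", 0)
    processed = results.get("successfully_processed", 0)
    segmented = results.get("segmentation_successes", 0)
    return {
        # Guard against division by zero when nothing was found or processed.
        "processed_rate": processed / total if total else 0.0,
        "segmentation_rate": segmented / processed if processed else 0.0,
        "error_count": len(results.get("errors", [])),
    }


example = {
    "total_files_found": 10,
    "successfully_processed": 8,
    "segmentation_successes": 6,
    "errors": ["a: Conversion failed", "b: Conversion failed"],
}
print(summarize_results(example))
```

Note that files skipped because their output already exists count toward `total_files_found` but not toward `successfully_processed`, so the processed rate drops on re-runs.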


In [None]:
# ===== NIFTI DEBUGGING AND TESTING =====

if config['OUTPUT_FORMAT'].lower() == 'nifti' and dicom_files:
    print("\n🧪 NIFTI Debugging Section")
    print("=" * 50)
    
    # Test with the first available DICOM file
    test_dicom = dicom_files[0]
    print(f"Testing NIFTI conversion with: {os.path.basename(test_dicom)}")
    
    # Run the isolated NIFTI test
    test_result = test_nifti_conversion(test_dicom, config['IMAGES_PATH'])
    
    if test_result:
        print("✅ Basic NIFTI conversion working!")
    else:
        print("❌ Basic NIFTI conversion failed - check dependencies")
        print("💡 Make sure nibabel is installed: pip install nibabel")
    
    print("=" * 50)

# ===== SINGLE FILE TESTING (OPTIONAL) =====
# Uncomment and modify this section to test with a single DICOM file

"""
# Test with a single DICOM file
test_dicom_path = "/path/to/your/test_file.dcm"  # Modify this path

if os.path.exists(test_dicom_path):
    print(f"🧪 Testing with single file: {os.path.basename(test_dicom_path)}")
    
    file_id = os.path.splitext(os.path.basename(test_dicom_path))[0]
    
    # Read DICOM
    image_array, dicom_data, status = read_dicom_file(test_dicom_path)
    
    if image_array is not None:
        print(f"✅ DICOM read successfully - Shape: {image_array.shape}")
        
        # Normalize
        image_array = normalize_image_array(image_array, 'uint8')
        
        # Process with segmentation
        processed_image, processing_info = process_image_with_segmentation(image_array, file_id, config)
        
        print(f"🫁 Segmentation: {'✅ Success' if processing_info['segmentation_success'] else '❌ Failed'}")
        
        # Save test outputs
        test_output_dir = os.path.join(config['IMAGES_PATH'], 'test_output')
        os.makedirs(test_output_dir, exist_ok=True)
        
        # Save processed image
        output_path = get_output_path(test_dicom_path, test_output_dir, config['OUTPUT_FORMAT'], f"{file_id}_test")
        conversion_success = convert_dicom_to_format(
            test_dicom_path, output_path, config['OUTPUT_FORMAT'], 
            config['TARGET_SIZE'], processed_image, processing_info
        )
        
        if conversion_success:
            print(f"✅ Test image saved: {output_path}")
        
        # Save test masks
        if config['SAVE_MASKS'] and processing_info['segmentation_success']:
            test_masks_dir = os.path.join(config['MASKS_PATH'], 'test_output')
            save_visualization_files(processing_info, test_masks_dir, image_array, config)
            print(f"✅ Test masks saved to: {test_masks_dir}")
    
    else:
        print(f"❌ Failed to read DICOM: {status}")
        
else:
    print("ℹ️ Single file testing section - modify test_dicom_path to use")
"""

print("ℹ️ Single file testing section available - uncomment and modify to use")


## 🎉 Pipeline Completed!

### 📁 Output Structure

Your processed files are organized as follows:

```
{IMAGES_PATH}/
├── {file_id}.png/.nii.gz          # Processed images in selected format
└── processing_log.json            # Detailed processing log

{MASKS_PATH}/                       # (if SAVE_MASKS = True)
├── {file_id}_left_lung_mask.png    # Left lung masks
├── {file_id}_right_lung_mask.png   # Right lung masks  
├── {file_id}_combined_mask.png     # Combined lung masks
└── {file_id}_overlay.png           # Color-coded visualizations
```

### 🔧 Key V2.0 Improvements

1. **🗂️ Modular Architecture**: Functions organized in separate files for better maintainability
2. **📝 Format Flexibility**: Easy switching between PNG and NIFTI output formats
3. **🫁 Advanced Segmentation**: Separate left/right lung detection with TorchXRayVision
4. **📊 Enhanced Reporting**: Comprehensive progress tracking and result summaries
5. **🔍 Better Error Handling**: Detailed error reporting and validation
6. **⚡ Optimized Processing**: Streamlined pipeline with better memory management

### 🛠️ Function Files

- **`functions_conversion.py`**: DICOM reading, format conversion, file I/O
- **`functions_pre-processing.py`**: Image enhancement, segmentation, morphological operations  
- **`functions_visualisation.py`**: Overlay creation, progress tracking, result display
- **`functions_archimed.py`**: ArchiMed integration, file discovery, CSV management
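
One practical note on the file names above: a file whose name contains a hyphen, such as `functions_pre-processing.py`, cannot be referenced by a plain `import` statement. How the `functions` package handles this is not shown here. If such a file needs to be loaded directly, one option is `importlib`, as in this sketch. The helper name `load_module_from_path` and the demo file contents are assumptions.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, file_path):
    """Load a Python file as a module, even if its file name is not a valid identifier."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file whose name contains a hyphen.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "functions_pre-processing.py")
    with open(path, "w") as handle:
        handle.write("def double(x):\n    return 2 * x\n")
    preprocessing = load_module_from_path("functions_preprocessing", path)
    print(preprocessing.double(21))  # 42
```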

### 💡 Next Steps

- Check the output directories for your processed images
- Review the `processing_log.json` for detailed processing information
- Use the overlay images to verify segmentation quality
- Adjust configuration parameters as needed for your specific use case
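
For reviewing the log, a short script along these lines may help. The layout of `processing_log.json` is not documented in this notebook, so the sketch assumes the `results` dict is stored either at the top level or under a `"results"` key. `load_log_errors` is a hypothetical helper.

```python
import json
import os
import tempfile


def load_log_errors(log_path):
    """Return the list of error strings stored in a processing log."""
    with open(log_path, "r", encoding="utf-8") as handle:
        log = json.load(handle)
    # Assumption: results sit either at the top level or under a "results" key.
    results = log.get("results", log)
    return results.get("errors", [])


# Demo with a stand-in log so the sketch runs without pipeline output.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "processing_log.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        json.dump({"results": {"errors": ["img_7: Conversion failed"]}}, handle)
    log_errors = load_log_errors(demo_path)
    print(log_errors)
```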
