<h1 align="center">Download ArchiMed Images V1.2 - Enhanced with Lung Segmentation</h1>

## 🚀 **New in V1.2:**
- **Lung Segmentation**: Automatic lung region detection using U-Net
- **Smart Cropping**: Focus on lung regions before resizing
- **Zone-Aware Processing**: Foundation for true anatomical zone prediction
- **Enhanced Quality**: Preserve lung details during preprocessing

Based on research from: [Automatic lung segmentation in chest X-ray images using improved U-Net](https://www.nature.com/articles/s41598-022-12743-y)

## Global Variables

### Project Specific Variables

In [None]:
# CSV Files
CSV_FOLDER = "../../data/Paradise_CSV/"
CSV_LABELS_FILE = "Labeled_Data_RAW_Sample.csv"
CSV_SEPARATOR = ";"  # Specify the CSV separator, e.g., ',' or '\t'
IMPORT_COLUMNS = []  # If empty, import all columns
CHUNK_SIZE = 50000  # Number of rows per chunk

# Download parameters
DOWNLOAD_PATH = '../../data/Paradise_DICOMs'
IMAGES_PATH = '../../data/Paradise_Images'
EXPORT_METADATA = True
ARCHIMED_METADATA_FILE = 'DICOM_Metadata.csv'
CONVERT = True

# Conversion parameters
BATCH_SIZE = 50  # Number of files to process in each batch for progress reporting
BIT_DEPTH = 8  # Bit depth for output images (8, 12, or 16)
CREATE_SUBFOLDERS = False  # If True, create subfolders named after ExamCode for output files
DELETE_DICOM = True  # If True, delete the DICOM file and its containing subfolder after conversion
MONOCHROME = 1  # Monochrome type (1 or 2) to use for converted images

# 🔥 NEW V1.2 FEATURES
# Lung Segmentation Configuration
USE_LUNG_SEGMENTATION = True  # Enable automatic lung segmentation
SEGMENTATION_MODEL = 'unet_efficientnet_b4'  # Segmentation model architecture
LUNG_CROP_PADDING = 20  # Padding around detected lung regions (pixels)
MIN_LUNG_AREA_RATIO = 0.02  # Minimum lung area ratio to total image (filter noise)
SAVE_SEGMENTATION_MASKS = True  # Save segmentation masks for debugging
MASKS_PATH = '../../data/Paradise_Masks'  # Path to save segmentation masks

# Enhanced Resize Parameters
TARGET_SIZE = (518, 518)  # Target size after lung-aware processing
PRESERVE_ASPECT_RATIO = True  # Maintain aspect ratio during final resize

### Colors

In [None]:
# ANSI escape codes for colored output
ANSI = {
    'R' : '\033[91m',  # Red
    'G' : '\033[92m',  # Green
    'B' : '\033[94m',  # Blue
    'Y' : '\033[93m',  # Yellow
    'W' : '\033[0m',  # Reset to the terminal default (ends a colored span)
    'M' : '\033[95m',  # Magenta (for new features)
    'C' : '\033[96m',  # Cyan (for segmentation)
}


## 📦 Enhanced Dependencies

In [None]:
# Core dependencies
import ArchiMedConnector.A3_Connector as A3_Conn
import pandas as pd
import os
import pydicom
import numpy as np
from PIL import Image, ImageDraw
import glob
from tqdm import tqdm
import shutil
import warnings
import matplotlib.pyplot as plt
import seaborn as sns
import cv2

# 🆕 NEW: Deep Learning dependencies for lung segmentation
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
from torchvision import models
import albumentations as A
from albumentations.pytorch import ToTensorV2

print(f"{ANSI['G']}✅ Core dependencies loaded{ANSI['W']}")
print(f"{ANSI['C']}🔬 Lung segmentation dependencies loaded{ANSI['W']}")
print(f"{ANSI['M']}🚀 Enhanced V1.2 ready for zone-aware processing{ANSI['W']}")

# Initialize ArchiMed connector
a3conn = A3_Conn.A3_Connector()


## 🧠 Lung Segmentation Model Implementation

### Enhanced U-Net Architecture with EfficientNet Backbone
Based on the research paper: ["Automatic lung segmentation in chest X-ray images using improved U-Net"](https://www.nature.com/articles/s41598-022-12743-y)

In [None]:
class ResidualBlock(nn.Module):
    """Residual block for improved gradient flow in decoder."""
    
    def __init__(self, in_channels, out_channels):
        super(ResidualBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.leaky_relu = nn.LeakyReLU(0.1, inplace=True)
        
        # Skip connection adjustment
        if in_channels != out_channels:
            self.skip = nn.Conv2d(in_channels, out_channels, 1)
        else:
            self.skip = nn.Identity()
    
    def forward(self, x):
        residual = self.skip(x)
        out = self.leaky_relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += residual
        return self.leaky_relu(out)


class EnhancedUNet(nn.Module):
    """Enhanced U-Net with EfficientNet-B4 encoder for lung segmentation."""
    
    def __init__(self, n_classes=1, pretrained=True):
        super(EnhancedUNet, self).__init__()
        
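        # EfficientNet-B4 encoder (optionally pre-trained on ImageNet).
        # `weights=` replaces the `pretrained=` flag deprecated in torchvision 0.13.
        weights = models.EfficientNet_B4_Weights.IMAGENET1K_V1 if pretrained else None
        efficientnet = models.efficientnet_b4(weights=weights)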
        self.encoder = efficientnet.features
        
        # Decoder with residual blocks and skip connections.
        # Channel counts of torchvision's EfficientNet-B4 `features` stages used here:
        #   features[1] -> 24 (1/2), features[2] -> 32 (1/4), features[3] -> 56 (1/8),
        #   features[5] -> 160 (1/16), features[8] -> 1792 (1/32, bottleneck)
        self.skip_indices = (1, 2, 3, 5)
        self.decoder5 = self._make_decoder_block(1792, 512)       # Bottleneck: 1792
        self.decoder4 = self._make_decoder_block(512 + 160, 256)  # Skip connection: 160
        self.decoder3 = self._make_decoder_block(256 + 56, 128)   # Skip connection: 56
        self.decoder2 = self._make_decoder_block(128 + 32, 64)    # Skip connection: 32
        self.decoder1 = self._make_decoder_block(64 + 24, 32)     # Skip connection: 24

        # Final classification layer
        self.final_conv = nn.Conv2d(32, n_classes, 1)

        # Upsampling
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)

    def _make_decoder_block(self, in_channels, out_channels):
        """Create decoder block with dropout, conv, and residual blocks."""
        return nn.Sequential(
            nn.Dropout2d(0.1),
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(0.1, inplace=True),
            ResidualBlock(out_channels, out_channels),
            ResidualBlock(out_channels, out_channels)
        )

    def forward(self, x):
        # Encoder pass, keeping one feature map per resolution for the skip connections
        skip_connections = []
        for i, layer in enumerate(self.encoder):
            x = layer(x)
            if i in self.skip_indices:
                skip_connections.append(x)

        # Decoder: upsample to each skip's resolution (deepest first), concatenate, refine
        x = self.decoder5(x)
        decoders = (self.decoder4, self.decoder3, self.decoder2, self.decoder1)
        for decoder, skip in zip(decoders, reversed(skip_connections)):
            x = F.interpolate(x, size=skip.shape[-2:], mode='bilinear', align_corners=True)
            x = torch.cat([x, skip], dim=1)
            x = decoder(x)

        # Final upsampling from 1/2 resolution back to the input size
        x = self.upsample(x)
        x = self.final_conv(x)

        return torch.sigmoid(x)


print(f"{ANSI['C']}🏗️ Enhanced U-Net architecture defined{ANSI['W']}")
print(f"{ANSI['B']}   - EfficientNet-B4 encoder with ImageNet pretraining{ANSI['W']}")
print(f"{ANSI['B']}   - Residual blocks in decoder for better gradient flow{ANSI['W']}")
print(f"{ANSI['B']}   - LeakyReLU activation to prevent gradient instability{ANSI['W']}")
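
The decoder's input widths have to match the channel counts that the encoder stages actually emit. As a rough cross-check, the sketch below derives those counts from the EfficientNet-B0 base widths and the B4 width multiplier of 1.4, rounding to multiples of 8 in the way torchvision's model builder does. The helper names (`make_divisible`, `b4_stage_table`) are illustrative and not part of this notebook; the printed indices are meant to line up with positions in `efficientnet.features`.

```python
# Illustrative helpers; base widths and strides follow the EfficientNet-B0 layout.
BASE_WIDTHS = [32, 16, 24, 40, 80, 112, 192, 320]   # stem + 7 MBConv stages
STAGE_STRIDES = [2, 1, 2, 2, 2, 1, 2, 1]
WIDTH_MULT_B4 = 1.4


def make_divisible(value, divisor=8):
    """Round to a multiple of `divisor` without dropping more than 10%."""
    rounded = max(divisor, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def b4_stage_table():
    """Return (index, channels, cumulative_stride) for each entry of `features`."""
    table, stride = [], 1
    for idx, (width, s) in enumerate(zip(BASE_WIDTHS, STAGE_STRIDES)):
        stride *= s
        table.append((idx, make_divisible(width * WIDTH_MULT_B4), stride))
    # The final 1x1 conv head expands the last stage width by 4 at the same stride.
    table.append((len(BASE_WIDTHS), 4 * table[-1][1], stride))
    return table


for idx, channels, stride in b4_stage_table():
    print(f"features[{idx}]: {channels:4d} channels at 1/{stride} resolution")
```

A skip connection can only be concatenated when both its channel count and its resolution match what the decoder block expects, so a feature map taken at 1/16 resolution can be joined only after the 1/32 bottleneck has been upsampled once.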


## 🎯 Lung Segmentation Pipeline

In [None]:
class LungSegmentationPipeline:
    """Complete pipeline for lung segmentation and smart cropping."""
    
    def __init__(self, model_path=None, device='cuda' if torch.cuda.is_available() else 'cpu'):
        self.device = device
        self.model = EnhancedUNet(n_classes=1, pretrained=True)
        
        # Load pre-trained weights if available
        if model_path and os.path.exists(model_path):
            self.model.load_state_dict(torch.load(model_path, map_location=device))
            print(f"{ANSI['G']}✅ Loaded pre-trained segmentation model from {model_path}{ANSI['W']}")
        else:
            print(f"{ANSI['Y']}⚠️ No segmentation weights found. The encoder is ImageNet-pretrained but the decoder is randomly initialized, so predicted masks will not be meaningful.{ANSI['W']}")
            print(f"{ANSI['B']}   Train the model or pass trained weights via model_path{ANSI['W']}")
        
        self.model.to(device)
        self.model.eval()
        
        # Preprocessing transforms
        self.transform = A.Compose([
            A.Resize(256, 256),
            A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ToTensorV2()
        ])
    
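    def preprocess_image(self, image):
        """Preprocess image for segmentation model."""
        # A.Normalize defaults to max_pixel_value=255, so bring 12/16-bit arrays to 8-bit first
        if image.dtype != np.uint8:
            image = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

        if len(image.shape) == 2:  # Grayscale
            image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
        elif len(image.shape) == 3 and image.shape[2] == 1:  # Single channel
            image = np.repeat(image, 3, axis=2)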
        
        # Apply transforms
        transformed = self.transform(image=image)
        tensor = transformed['image'].unsqueeze(0)  # Add batch dimension
        return tensor.to(self.device)
    
    def segment_lungs(self, image):
        """Segment lung regions from chest X-ray image."""
        original_shape = image.shape[:2]
        
        # Preprocess
        input_tensor = self.preprocess_image(image)
        
        # Inference
        with torch.no_grad():
            output = self.model(input_tensor)
            mask = output.squeeze().cpu().numpy()
        
        # Resize mask back to original size
        mask_resized = cv2.resize(mask, (original_shape[1], original_shape[0]))
        
        # Threshold to binary mask
        binary_mask = (mask_resized > 0.5).astype(np.uint8)
        
        return binary_mask, mask_resized
    
    def extract_lung_regions(self, image, mask, padding=20):
        """Extract and crop lung regions with smart bounding box."""
        # Find contours of lung regions
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        
        if not contours:
            print(f"{ANSI['Y']}⚠️ No lung regions detected, returning original image{ANSI['W']}")
            return image, None
        
        # Filter contours by area (remove noise)
        total_area = image.shape[0] * image.shape[1]
        min_area = total_area * MIN_LUNG_AREA_RATIO
        valid_contours = [c for c in contours if cv2.contourArea(c) > min_area]
        
        if not valid_contours:
            print(f"{ANSI['Y']}⚠️ No significant lung regions found, returning original image{ANSI['W']}")
            return image, None
        
        # Get bounding box around all lung regions (plain ints keep slicing and logging clean)
        all_points = np.vstack([c.reshape(-1, 2) for c in valid_contours])
        x_min, y_min = (int(v) for v in np.min(all_points, axis=0))
        x_max, y_max = (int(v) for v in np.max(all_points, axis=0))

        # Add padding; +1 because contour coordinates are inclusive but slice ends are exclusive
        h, w = image.shape[:2]
        x_min = max(0, x_min - padding)
        y_min = max(0, y_min - padding)
        x_max = min(w, x_max + 1 + padding)
        y_max = min(h, y_max + 1 + padding)
        
        # Crop image
        if len(image.shape) == 3:
            cropped_image = image[y_min:y_max, x_min:x_max, :]
        else:
            cropped_image = image[y_min:y_max, x_min:x_max]
        
        crop_info = {
            'bbox': (x_min, y_min, x_max, y_max),
            'original_shape': image.shape[:2],
            'cropped_shape': cropped_image.shape[:2]
        }
        
        return cropped_image, crop_info
    
    def process_image(self, image, save_mask_path=None):
        """Complete pipeline: segment lungs and extract regions."""
        # Segment lungs
        binary_mask, probability_mask = self.segment_lungs(image)
        
        # Save mask if requested
        if save_mask_path:
            mask_image = (probability_mask * 255).astype(np.uint8)
            cv2.imwrite(save_mask_path, mask_image)
        
        # Extract lung regions
        cropped_image, crop_info = self.extract_lung_regions(image, binary_mask, LUNG_CROP_PADDING)
        
        return cropped_image, crop_info, binary_mask, probability_mask


# Initialize segmentation pipeline
if USE_LUNG_SEGMENTATION:
    segmentation_pipeline = LungSegmentationPipeline()
    print(f"{ANSI['C']}🎯 Lung segmentation pipeline initialized{ANSI['W']}")
    
    # Create masks directory
    if SAVE_SEGMENTATION_MASKS:
        os.makedirs(MASKS_PATH, exist_ok=True)
        print(f"{ANSI['B']}📁 Masks directory created: {MASKS_PATH}{ANSI['W']}")
else:
    segmentation_pipeline = None
    print(f"{ANSI['Y']}⚠️ Lung segmentation disabled{ANSI['W']}")
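
The cropping step in `extract_lung_regions` boils down to taking the union bounding box of the mask, growing it by the padding, and clamping it to the image. Below is a dependency-free sketch of that arithmetic, using a nested list as a stand-in for the mask. `padded_bbox` is a hypothetical helper, not a function from this notebook, and it treats the upper bounds as exclusive so they can be used directly in a slice.

```python
def padded_bbox(mask, padding=20):
    """Return (x_min, y_min, x_max, y_max) around nonzero mask pixels, or None.

    Upper bounds are exclusive, so mask[y_min:y_max] covers the whole region.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # nothing segmented: caller should fall back to the full image
    cols = [x for row in mask for x, value in enumerate(row) if value]
    height, width = len(mask), len(mask[0])
    x_min = max(0, min(cols) - padding)
    y_min = max(0, min(rows) - padding)
    x_max = min(width, max(cols) + 1 + padding)
    y_max = min(height, max(rows) + 1 + padding)
    return x_min, y_min, x_max, y_max


# 6x8 toy mask with a 2x3 foreground blob
toy_mask = [[0] * 8 for _ in range(6)]
for y in (2, 3):
    for x in (3, 4, 5):
        toy_mask[y][x] = 1

print(padded_bbox(toy_mask, padding=1))   # small padding stays inside the image
print(padded_bbox(toy_mask, padding=20))  # large padding is clamped to the borders
```

With the default `LUNG_CROP_PADDING = 20`, a mask that already touches the image border simply yields a box clamped to the image, so the crop never indexes outside the array.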


## 🔄 Enhanced DICOM to PNG Conversion with Lung Segmentation

In [None]:
def convert_dicom_to_png_v2(
    import_folder,
    export_folder,
    bit_depth=8,
    create_subfolders=False,
    target_size=TARGET_SIZE,
    preserve_aspect_ratio=PRESERVE_ASPECT_RATIO,
    monochrome=2,
    delete_dicom=True,
    dicom_files=None,
    use_lung_segmentation=USE_LUNG_SEGMENTATION,
    segmentation_pipeline=None,
    save_masks=SAVE_SEGMENTATION_MASKS,
    masks_path=MASKS_PATH
):
    """
    Enhanced DICOM to PNG conversion with lung segmentation and smart cropping.
    
    🆕 V1.2 Features:
    - Automatic lung detection and cropping
    - Preserves lung anatomy details
    - Smart resizing after segmentation
    - Optional mask saving for debugging
    """
    
    # Validate parameters
    if bit_depth not in [8, 12, 16]:
        raise ValueError("bit_depth must be 8, 12, or 16")
    
    if monochrome not in [1, 2]:
        raise ValueError("monochrome must be 1 or 2")
    
    # Create export folder
    os.makedirs(export_folder, exist_ok=True)
    
    # Get DICOM files to process
    if dicom_files is None:
        dicom_files = []
        for ext in ['.dcm', '.DCM']:
            dicom_files.extend(glob.glob(os.path.join(import_folder, '**/*' + ext), recursive=True))
    else:
        dicom_files = [p for p in dicom_files if os.path.isfile(p)]

    # Initialize counters
    successful = 0
    failed = 0
    skipped = 0
    segmentation_stats = {'used': 0, 'failed': 0, 'skipped': 0}
    
    # Suppress warnings
    warnings.filterwarnings("ignore", category=UserWarning, module="pydicom.charset")
    
    # Process each DICOM file
    for dicom_path in tqdm(dicom_files, desc="🔄 Converting DICOM with lung segmentation", total=len(dicom_files)):
        try:
            # Read DICOM
            try:
                ds = pydicom.dcmread(dicom_path)
                pixel_array = ds.pixel_array
            except Exception as e:
                skipped += 1
                continue
            
            # Get metadata
            exam_code = str(getattr(ds, 'StudyDescription', os.path.basename(os.path.dirname(dicom_path))))
            file_id = os.path.splitext(os.path.basename(dicom_path))[0]
            if file_id.endswith('.dcm'):
                file_id = file_id[:-4]
            
            # Handle monochrome conversion
            dicom_monochrome = monochrome
            if hasattr(ds, 'PhotometricInterpretation'):
                if ds.PhotometricInterpretation == 'MONOCHROME1':
                    dicom_monochrome = 1
                elif ds.PhotometricInterpretation == 'MONOCHROME2':
                    dicom_monochrome = 2
            
            # Normalize pixel values
            bits_allocated = getattr(ds, 'BitsAllocated', 14)
            bits_stored = getattr(ds, 'BitsStored', bits_allocated)
            max_pixel_value = pixel_array.max()
            max_possible_value = (2 ** bits_stored) - 1
            output_max_value = (2 ** bit_depth) - 1
            
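            if max_pixel_value > 0:
                pixel_array = ((pixel_array / min(max_pixel_value, max_possible_value)) * output_max_value)
                # Clip so values above the nominal BitsStored range cannot wrap around on the integer cast
                pixel_array = np.clip(pixel_array, 0, output_max_value)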
            
            # Convert to appropriate data type
            if bit_depth <= 8:
                pixel_array = pixel_array.astype(np.uint8)
            else:
                pixel_array = pixel_array.astype(np.uint16)
            
            # Invert if needed
            if dicom_monochrome != monochrome and dicom_monochrome in [1, 2] and monochrome in [1, 2]:
                pixel_array = output_max_value - pixel_array
            
            # 🆕 V1.2: LUNG SEGMENTATION AND SMART CROPPING
            final_image = pixel_array
            crop_info = None
            
            if use_lung_segmentation and segmentation_pipeline is not None:
                try:
                    # Define mask save path
                    mask_save_path = None
                    if save_masks:
                        mask_filename = f"{file_id}_mask.png"
                        mask_save_path = os.path.join(masks_path, mask_filename)
                    
                    # Process with lung segmentation
                    cropped_image, crop_info, binary_mask, prob_mask = segmentation_pipeline.process_image(
                        pixel_array, mask_save_path
                    )
                    
                    if crop_info is not None:
                        final_image = cropped_image
                        segmentation_stats['used'] += 1
                        print(f"{ANSI['C']}🫁 Segmented {file_id}: {crop_info['original_shape']} → {crop_info['cropped_shape']}{ANSI['W']}")
                    else:
                        segmentation_stats['failed'] += 1
                        print(f"{ANSI['Y']}⚠️ Segmentation failed for {file_id}, using original{ANSI['W']}")
                        
                except Exception as e:
                    segmentation_stats['failed'] += 1
                    print(f"{ANSI['R']}❌ Segmentation error for {file_id}: {str(e)}{ANSI['W']}")
            else:
                segmentation_stats['skipped'] += 1
            
            # Convert to PIL Image
            img = Image.fromarray(final_image)
            
            # 🆕 V1.2: SMART RESIZING
            if target_size:
                if preserve_aspect_ratio:
                    # Calculate aspect ratio preserving dimensions
                    original_width, original_height = img.size
                    target_width, target_height = target_size
                    
                    # Calculate scaling factor to fit within target size
                    scale_factor = min(target_width / original_width, target_height / original_height)
                    new_width = int(original_width * scale_factor)
                    new_height = int(original_height * scale_factor)
                    
                    # Resize maintaining aspect ratio
                    img = img.resize((new_width, new_height), Image.LANCZOS)
                    
                    # Create new image with target size and paste resized image centered
                    final_img = Image.new('L' if bit_depth <= 8 else 'I', target_size, 0)
                    paste_x = (target_width - new_width) // 2
                    paste_y = (target_height - new_height) // 2
                    final_img.paste(img, (paste_x, paste_y))
                    img = final_img
                else:
                    # Direct resize to target size
                    img = img.resize(target_size, Image.LANCZOS)
            
            # Determine output path
            base_filename = os.path.splitext(os.path.basename(dicom_path))[0]
            if create_subfolders:
                subfolder_path = os.path.join(export_folder, exam_code)
                os.makedirs(subfolder_path, exist_ok=True)
                output_path = os.path.join(subfolder_path, f"{base_filename}.png")
            else:
                output_path = os.path.join(export_folder, f"{base_filename}.png")
            
            # Save PNG
            img.save(output_path)
            successful += 1
            
            # Delete DICOM if requested
            if delete_dicom:
                os.remove(dicom_path)
                dicom_folder = os.path.dirname(dicom_path)
                if dicom_folder != import_folder:
                    try:
                        if not os.listdir(dicom_folder):
                            shutil.rmtree(dicom_folder)
                    except Exception as e:
                        print(f"{ANSI['Y']}Warning: Could not delete folder {dicom_folder}: {str(e)}{ANSI['W']}")
            
        except Exception as e:
            print(f"{ANSI['R']}Error converting {dicom_path}: {str(e)}{ANSI['W']}")
            failed += 1
    
    # Enhanced summary with segmentation statistics
    summary = {
        "successful": successful,
        "skipped": skipped,
        "failed": failed,
        "total": len(dicom_files),
        "segmentation_stats": segmentation_stats
    }
    
    # Print enhanced summary
    if use_lung_segmentation:
        print(f"\n{ANSI['C']}🫁 Lung Segmentation Summary:{ANSI['W']}")
        print(f"   - Successfully segmented: {segmentation_stats['used']}")
        print(f"   - Segmentation failed: {segmentation_stats['failed']}")
        print(f"   - Segmentation skipped: {segmentation_stats['skipped']}")
    
    return summary


print(f"{ANSI['M']}🚀 Enhanced DICOM conversion function loaded{ANSI['W']}")
print(f"{ANSI['C']}   - Automatic lung segmentation and cropping{ANSI['W']}")
print(f"{ANSI['B']}   - Smart resizing with aspect ratio preservation{ANSI['W']}")
print(f"{ANSI['G']}   - Enhanced quality preservation for lung anatomy{ANSI['W']}")
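
The intensity handling inside `convert_dicom_to_png_v2` can be summarised as: scale the stored values into the output range, clip, then invert when the source and target photometric interpretations differ. Here is a minimal sketch on plain Python lists, assuming unsigned pixel data. `rescale_pixels` is a hypothetical name used only for this illustration.

```python
def rescale_pixels(values, bits_stored, bit_depth=8, invert=False):
    """Map raw pixel values to the 0..(2**bit_depth - 1) range."""
    out_max = (1 << bit_depth) - 1
    peak = max(values)
    if peak <= 0:
        # Blank image: nothing to scale, only the optional inversion applies.
        return [out_max if invert else 0 for _ in values]
    # Scale by the observed peak, capped at what `bits_stored` can represent.
    scale = min(peak, (1 << bits_stored) - 1)
    result = []
    for v in values:
        scaled = int(v / scale * out_max)
        scaled = min(max(scaled, 0), out_max)  # clip out-of-range values
        result.append(out_max - scaled if invert else scaled)
    return result


print(rescale_pixels([0, 2047, 4095], bits_stored=12))
print(rescale_pixels([0, 2047, 4095], bits_stored=12, invert=True))
print(rescale_pixels([0, 5000], bits_stored=12))  # peak above the 12-bit range
```

The clipping step matters for the last call: a value above the nominal 12-bit maximum would otherwise map past 255 and wrap around when stored as an 8-bit integer.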


## Import CSVs to Dataframe

In [None]:
def import_csv_to_dataframe(file_path, separator=';', columns=None, chunk_size=None):
    """
    Import CSV file into a pandas DataFrame.
    
    Args:
        file_path (str): Path to the CSV file
        separator (str): CSV separator character
        columns (list): List of columns to import (if None, import all)
        chunk_size (int): Number of rows to read at a time (if None, read all at once)
        
    Returns:
        pandas.DataFrame: The imported data
    """
    try:
        # Determine which columns to use
        usecols = columns if columns and len(columns) > 0 else None
        
        if chunk_size:
            # Read in chunks and concatenate
            chunks = []
            for chunk in pd.read_csv(file_path, sep=separator, usecols=usecols, chunksize=chunk_size):
                chunks.append(chunk)
            return pd.concat(chunks, ignore_index=True)
        else:
            # Read all at once
            return pd.read_csv(file_path, sep=separator, usecols=usecols)
    except Exception as e:
        print(f"{ANSI['R']}Error importing CSV: {e}{ANSI['W']}")
        return None

# Import labeled data
csv_path = os.path.join(CSV_FOLDER, CSV_LABELS_FILE)
print(f"{ANSI['B']}Importing labeled data from: {csv_path}{ANSI['W']}")

df_labeled_data = import_csv_to_dataframe(
    file_path=csv_path,
    separator=CSV_SEPARATOR,
    columns=IMPORT_COLUMNS,
    chunk_size=CHUNK_SIZE
)

if df_labeled_data is not None:
    print(f"{ANSI['G']}Successfully imported {len(df_labeled_data)} rows of labeled data{ANSI['W']}")
    display(df_labeled_data.head())
else:
    print(f"{ANSI['R']}Failed to import labeled data{ANSI['W']}")


## 🚀 Enhanced Download and Processing Pipeline

In [None]:
def download_archimed_files_v2(dataframe, download_path, file_id_column='FileID', batch_size=20, convert=False):
    """
    🆕 V1.2: Enhanced ArchiMed download with lung segmentation integration.
    
    Args:
        dataframe (pandas.DataFrame): DataFrame containing FileIDs
        download_path (str): Path where to save downloaded files
        file_id_column (str): Name of the column containing FileIDs (default: 'FileID')
        batch_size (int): Number of files to process in each batch for progress reporting
        convert (bool): If True, convert downloaded DICOM files to PNG after each batch (default: False)
        
    Returns:
        pandas.DataFrame: Metadata of converted files. Metadata collection is not
        implemented in this function, so the returned DataFrame is always empty.
    """
    
    # Create download directory
    os.makedirs(download_path, exist_ok=True)
    
    # Get user info for verification
    user_info = a3conn.getUserInfos()
    print(f"{ANSI['G']}🔐 ArchiMed Authentication Info{ANSI['W']}")
    print(f"{ANSI['B']}Username:{ANSI['W']} {user_info.get('userInfos', {}).get('login', 'Unknown')}")
    print(f"{ANSI['B']}User level:{ANSI['W']} {user_info.get('userInfos', {}).get('level', 'Unknown')}")
    
    # Check if the FileID column exists
    if file_id_column not in dataframe.columns:
        print(f"{ANSI['R']}Error: Column '{file_id_column}' not found in dataframe{ANSI['W']}")
        return pd.DataFrame()
    
    # Get unique FileIDs to avoid downloading duplicates
    file_ids = dataframe[file_id_column].unique()
    total_files = len(file_ids)
    
    print(f"\n{ANSI['M']}🚀 Starting enhanced download with lung segmentation{ANSI['W']}")
    print(f"{ANSI['B']}Total files to process: {ANSI['W']}{total_files}")
    print(f"{ANSI['B']}Destination: {ANSI['W']}{download_path}")
    if USE_LUNG_SEGMENTATION:
        print(f"{ANSI['C']}🫁 Lung segmentation: ENABLED{ANSI['W']}")
    else:
        print(f"{ANSI['Y']}⚠️ Lung segmentation: DISABLED{ANSI['W']}")
    print()
    
    failed_files = []
    batch_files = []
    all_metadata = pd.DataFrame()
    downloaded_count = 0
    skipped_count = 0
    
    # Process files in batches
    for i, file_id in enumerate(file_ids):
        if pd.isna(file_id):
            continue
            
        try:
            file_id = int(file_id)
            file_output_path = os.path.join(download_path, f"{file_id}")
            dicom_file_path = os.path.join(file_output_path, f"{file_id}.dcm")
            
            # Check if file already exists
            if os.path.exists(dicom_file_path):
                print(f"{ANSI['Y']}📄 File {ANSI['W']}{file_id}{ANSI['Y']} already exists, skipping download (Progress: {ANSI['W']}{((i+1)/total_files)*100:.1f}%{ANSI['Y']} - {ANSI['W']}{i+1}/{total_files}{ANSI['Y']}){ANSI['W']}")
                batch_files.append(dicom_file_path)
                skipped_count += 1
            else:
                print(f"{ANSI['B']}⬇️ Downloading file {ANSI['W']}{file_id}{ANSI['B']} (Progress: {ANSI['W']}{((i+1)/total_files)*100:.1f}%{ANSI['B']} - {ANSI['W']}{i+1}/{total_files}{ANSI['B']}) from{ANSI['W']} ArchiMed")
                
                # Download the file
                result = a3conn.downloadFile(
                    file_id,
                    asStream=False,
                    destDir=file_output_path,
                    filename=f"{file_id}.dcm",
                    inWorklist=False
                )
                
                downloaded_count += 1
                batch_files.append(dicom_file_path)
            
            # Show progress every batch_size files
            if (i + 1) % batch_size == 0 or (i + 1) == total_files:
                
                # 🆕 V1.2: Enhanced conversion with lung segmentation
                if convert and batch_files:
                    try:
                        print(f"\n{ANSI['M']}🔄 Converting batch of {ANSI['W']}{len(batch_files)}{ANSI['M']} DICOM files with lung segmentation...{ANSI['W']}")
                        summary = convert_dicom_to_png_v2(
                            import_folder=download_path,
                            export_folder=IMAGES_PATH,
                            bit_depth=BIT_DEPTH,
                            create_subfolders=CREATE_SUBFOLDERS,
                            target_size=TARGET_SIZE,
                            preserve_aspect_ratio=PRESERVE_ASPECT_RATIO,
                            delete_dicom=DELETE_DICOM,
                            monochrome=MONOCHROME,
                            dicom_files=batch_files,
                            use_lung_segmentation=USE_LUNG_SEGMENTATION,
                            segmentation_pipeline=segmentation_pipeline,
                            save_masks=SAVE_SEGMENTATION_MASKS,
                            masks_path=MASKS_PATH
                        )
                        print(f"{ANSI['G']}✅ Conversion summary: {summary}{ANSI['W']}")
                    except Exception as e:
                        print(f"{ANSI['R']}❌ Error during conversion: {str(e)}{ANSI['W']}")
                
                batch_files = []
                print(f"{ANSI['Y']}📊 Progress:{ANSI['W']} {i + 1}/{total_files} {ANSI['B']}files processed {ANSI['W']}({ANSI['B']}{((i + 1) / total_files * 100):.1f}%{ANSI['W']})\n")
                
        except Exception as e:
            failed_files.append(file_id)
            print(f"{ANSI['R']}❌ Error downloading file ID {file_id}: {str(e)}{ANSI['W']}")
    
    # Final Summary
    print(f"\n{ANSI['G']}🎉 Download complete!{ANSI['W']}")
    print(f"{ANSI['B']}📊 Summary:{ANSI['W']}")
    print(f"   - Downloaded: {downloaded_count} files")
    print(f"   - Skipped: {skipped_count} files (already existed)")
    print(f"   - Failed: {len(failed_files)} files")
    
    if failed_files:
        print(f"{ANSI['R']}❌ Failed files: {failed_files}{ANSI['W']}")
    
    return all_metadata


# Execute the enhanced download
if df_labeled_data is not None:
    print(f"{ANSI['M']}🚀 Starting enhanced ArchiMed download with lung segmentation...{ANSI['W']}\n")
    metadata_df = download_archimed_files_v2(
        dataframe=df_labeled_data,
        download_path=DOWNLOAD_PATH,
        batch_size=BATCH_SIZE,
        convert=CONVERT
    )
    
    if not metadata_df.empty:
        print(f"{ANSI['G']}✅ Files processed successfully!{ANSI['W']}")
        print(f"{ANSI['B']}📁 Images saved to: {ANSI['W']}{IMAGES_PATH}")
        if USE_LUNG_SEGMENTATION and SAVE_SEGMENTATION_MASKS:
            print(f"{ANSI['C']}🫁 Segmentation masks saved to: {ANSI['W']}{MASKS_PATH}")
    else:
        print(f"{ANSI['Y']}⚠️ Files processed but no metadata was collected{ANSI['W']}")
else:
    print(f"{ANSI['R']}❌ Cannot download files: No labeled data available{ANSI['W']}")
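
The download loop converts files in groups of `batch_size`, with a final partial group at the end. One way to express that grouping so the last batch is always flushed, even when the trailing IDs are missing, is a small generator. This is a sketch with a hypothetical `iter_batches` helper, not the notebook's implementation; `None` stands in for a missing FileID.

```python
def iter_batches(file_ids, batch_size):
    """Yield lists of up to `batch_size` valid IDs, skipping missing entries."""
    batch = []
    for file_id in file_ids:
        if file_id is None:
            continue  # missing ID: nothing to download
        batch.append(int(file_id))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the trailing partial batch


for batch in iter_batches([101, 102, None, 103, 104, 105, None], batch_size=2):
    print(batch)
```

Counting items that were actually collected, rather than the loop index, keeps every batch at the requested size and guarantees that a skipped or failed final entry cannot leave downloaded files unconverted.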


## 📊 V1.2 Summary and Next Steps

### ✅ What's New in V1.2:

1. **🫁 Automatic Lung Segmentation**: 
   - Enhanced U-Net with EfficientNet-B4 encoder
   - Automatic lung region detection
   - Smart cropping to focus on lung anatomy

2. **🎯 Zone-Aware Processing**: 
   - Foundation for true anatomical zone prediction
   - Preserved lung details for better CSI scoring
   - Reduced background noise and artifacts

3. **📐 Smart Resizing**: 
   - Aspect ratio preservation
   - Centered placement within target dimensions
   - Quality preservation during scaling

4. **🔬 Enhanced Quality**: 
   - Lung-focused preprocessing
   - Segmentation mask debugging
   - Improved feature extraction potential
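
The smart resizing described in item 3 amounts to a small piece of geometry: scale the cropped image so it fits inside the target, then centre it on a black canvas. A minimal sketch of that calculation follows. `letterbox_geometry` is a hypothetical helper, and the `max(1, ...)` guard against zero-sized outputs is an addition made here, not something the conversion function does.

```python
def letterbox_geometry(src_size, target_size):
    """Return ((new_w, new_h), (paste_x, paste_y)) for an aspect-preserving fit."""
    src_w, src_h = src_size
    tgt_w, tgt_h = target_size
    # The smaller ratio is the largest scale at which both sides still fit.
    scale = min(tgt_w / src_w, tgt_h / src_h)
    new_w = max(1, int(src_w * scale))
    new_h = max(1, int(src_h * scale))
    # Centre the resized image; the remaining border is left as padding.
    return (new_w, new_h), ((tgt_w - new_w) // 2, (tgt_h - new_h) // 2)


# A tall lung crop and a wide one, both fitted into the 518x518 target
print(letterbox_geometry((1036, 2072), (518, 518)))
print(letterbox_geometry((2072, 1036), (518, 518)))
```

The padding is split evenly on both sides, so the lungs stay centred regardless of whether the crop is taller or wider than the target.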

### 🚀 Integration with CSI-Predictor:

To use these enhanced images with your CSI prediction model:

1. **Update data paths** in your training configuration:
   ```python
   # In config.py or .env
   DATA_DIR = "../../data/Paradise_Images"  # Enhanced lung-segmented images
   ```

2. **Verify image quality** by checking the segmentation masks:
   ```bash
   # List the saved masks (prefix with ! to run from a notebook cell)
   ls ../../data/Paradise_Masks/
   ```

3. **Train your model** with zone-aware features:
   - The lung-cropped images should improve zone-specific learning
   - Better anatomical focus for CSI score prediction

### 📋 Usage Instructions:

1. **Enable/Disable Segmentation**:
   ```python
   USE_LUNG_SEGMENTATION = True   # Enable for zone-aware processing
   SAVE_SEGMENTATION_MASKS = True # Enable for debugging
   ```

2. **Adjust Parameters**:
   ```python
   LUNG_CROP_PADDING = 20        # Padding around lung regions
   MIN_LUNG_AREA_RATIO = 0.02    # Filter noise
   TARGET_SIZE = (518, 518)      # Match your model requirements
   ```

3. **Monitor Processing** (console message colors):
   - Cyan 🫁: Successful segmentation and cropping
   - Yellow ⚠️: Segmentation failed, using original image
   - Red ❌: Segmentation or conversion error

### 🔬 For Advanced Users:

**Train Custom Segmentation Model**:
- Use the provided U-Net architecture
- Train on your specific chest X-ray dataset
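- Save the weights and pass their path as `model_path` when constructing `LungSegmentationPipeline` (the default call above passes none, so no trained weights are loaded)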

**Zone-Specific Modifications**:
- Modify `extract_lung_regions()` to extract individual zones
- Implement zone-specific cropping for 6-zone CSI prediction
- Add anatomical landmarks detection
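
As a starting point for the zone-specific cropping mentioned above, the lung bounding box can be cut into a 2 x 3 grid (two columns, three rows). Equal-height thirds and a midline split are simplifying assumptions made here for illustration; this notebook does not define how zones should be delimited, and `split_into_zones` is a hypothetical helper.

```python
def split_into_zones(bbox, rows=3, cols=2):
    """Split (x_min, y_min, x_max, y_max) into rows*cols boxes in row-major order."""
    x_min, y_min, x_max, y_max = bbox
    # Integer cut points; the last one always equals the box edge, so no pixel is lost.
    xs = [x_min + (x_max - x_min) * c // cols for c in range(cols + 1)]
    ys = [y_min + (y_max - y_min) * r // rows for r in range(rows + 1)]
    return [
        (xs[c], ys[r], xs[c + 1], ys[r + 1])
        for r in range(rows)
        for c in range(cols)
    ]


# Example with a bbox shaped like the one stored in crop_info['bbox']
for zone in split_into_zones((10, 20, 110, 320)):
    print(zone)
```

Each returned box uses the same `(x_min, y_min, x_max, y_max)` convention as `crop_info['bbox']`, so a zone can be cropped with `image[y_min:y_max, x_min:x_max]`. A split driven by the per-lung contours would likely follow the anatomy more closely than this fixed grid.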

---

**Ready for True Zone-Aware CSI Prediction! 🎯**
