# DicomStructureFile Class Demonstration

This notebook demonstrates the functionality of the `DicomStructureFile` class using a test DICOM RT Structure file.

## Overview
The `DicomStructureFile` class provides a convenient interface for:
- Loading DICOM RT Structure files
- Extracting structure information and metadata
- Reading contour sequences and converting them to `ContourPoints` objects
- Converting contour data to pandas DataFrames for analysis

## 1. Import Required Libraries

First, let's import the necessary libraries, including our custom `DicomStructureFile` class.

In [2]:
# Import required libraries
import sys
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Add the src directory to the Python path
sys.path.append('src')

# Import our custom DicomStructureFile class
from dicom import DicomStructureFile

# Import related classes
from types_and_classes import ContourPoints, SliceIndexType

print("Libraries imported successfully!")

ModuleNotFoundError: No module named 'pydicom'
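
The `ModuleNotFoundError` above suggests that `pydicom` was not installed in the kernel that ran this cell, presumably because the `dicom` module imports it. A minimal sketch of a pre-flight check using only the standard library follows; the helper name `missing_modules` is illustrative and not part of the project, and the package list is inferred from this notebook rather than from a requirements file.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Top-level packages this notebook appears to rely on (assumption: pydicom is
# inferred from the error message, openpyxl from the Excel export in section 8).
required = ["pydicom", "pandas", "numpy", "matplotlib", "openpyxl"]
missing = missing_modules(required)
if missing:
    print("Install before running the notebook:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Running this before the import cell makes it clearer which packages need installing, instead of failing on the first missing one.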

## 2. Load the DICOM Structure File

Now let's load the test DICOM RT Structure file using the `DicomStructureFile` class.

In [None]:
# Define the path to the test DICOM file
test_file_name = "RS.GJS_Struct_Tests.MultiVolume_A.dcm"
tests_dir = Path("Tests")

# Load the DICOM structure file using the top_dir and file_name parameters
dicom_file = DicomStructureFile(
    top_dir=tests_dir,
    file_name=test_file_name
)

print(f"Successfully loaded: {dicom_file}")
print(f"File path: {dicom_file.file_path}")
print(f"Is RT Structure file: {dicom_file.is_structure_file()}")
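
How `is_structure_file()` makes its decision is not shown in this notebook. One common way to recognise an RT Structure Set is to check the `Modality` attribute (`RTSTRUCT`) or the SOP Class UID for RT Structure Set Storage. A minimal sketch under that assumption follows; `looks_like_rtstruct` is a hypothetical helper that works on any object exposing those attributes, such as a pydicom dataset.

```python
from types import SimpleNamespace

# SOP Class UID for RT Structure Set Storage in the DICOM standard.
RT_STRUCTURE_SET_STORAGE = "1.2.840.10008.5.1.4.1.1.481.3"


def looks_like_rtstruct(dataset):
    """Return True if Modality or SOPClassUID marks the dataset as RTSTRUCT."""
    modality = str(getattr(dataset, "Modality", "")).upper()
    sop_class = str(getattr(dataset, "SOPClassUID", ""))
    return modality == "RTSTRUCT" or sop_class == RT_STRUCTURE_SET_STORAGE


# Stand-in objects so the example runs without a DICOM file.
rtstruct_like = SimpleNamespace(Modality="RTSTRUCT")
ct_like = SimpleNamespace(Modality="CT", SOPClassUID="1.2.840.10008.5.1.4.1.1.2")
print(looks_like_rtstruct(rtstruct_like), looks_like_rtstruct(ct_like))
```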

## 3. Explore Structure File Properties

Let's examine the basic properties and metadata of the loaded DICOM structure file.

In [None]:
# Get structure file information
structure_info = dicom_file.get_structure_info()

print("=== DICOM Structure File Information ===")
for key, value in structure_info.items():
    print(f"{key}: {value}")

# Access some basic DICOM dataset attributes
dataset = dicom_file.dataset
print("\n=== Additional DICOM Attributes ===")
print(f"Modality: {getattr(dataset, 'Modality', 'Not available')}")
print(f"Study Date: {getattr(dataset, 'StudyDate', 'Not available')}")
print(f"Manufacturer: {getattr(dataset, 'Manufacturer', 'Not available')}")
print(f"Software Version: {getattr(dataset, 'SoftwareVersions', 'Not available')}")

# Check if ROI contour sequence exists
if hasattr(dataset, 'ROIContourSequence'):
    print(f"\nNumber of ROI Contours: {len(dataset.ROIContourSequence)}")
else:
    print("\nNo ROI Contour Sequence found")

## 4. Access Structure Set Information

Let's examine the structure set and ROI information in more detail.

In [None]:
# Examine the Structure Set ROI Sequence
if hasattr(dataset, 'StructureSetROISequence'):
    print("=== Structure Set ROI Information ===")
    roi_info = []
    
    for i, roi in enumerate(dataset.StructureSetROISequence):
        roi_data = {
            'ROI Number': getattr(roi, 'ROINumber', 'N/A'),
            'ROI Name': getattr(roi, 'ROIName', 'N/A'),
            'ROI Generation Algorithm': getattr(roi, 'ROIGenerationAlgorithm', 'N/A'),
            'Referenced Frame of Reference': getattr(roi, 'ReferencedFrameOfReferenceUID', 'N/A')
        }
        roi_info.append(roi_data)
        
        print(f"\nROI {i+1}:")
        for key, value in roi_data.items():
            print(f"  {key}: {value}")

    # Convert to DataFrame for easier viewing
    roi_df = pd.DataFrame(roi_info)
    print("\n=== ROI Summary Table ===")
    print(roi_df.to_string(index=False))
else:
    print("No Structure Set ROI Sequence found")
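
The contour data in the later sections is keyed by ROI number, so a number-to-name lookup built from `StructureSetROISequence` can make tables and plot legends easier to read. A minimal sketch follows; `roi_name_map` is a hypothetical helper, and the stand-in objects only mimic the `ROINumber` and `ROIName` attributes used above.

```python
from types import SimpleNamespace


def roi_name_map(structure_set_roi_sequence):
    """Map each ROI number to its name, skipping items without a number."""
    names = {}
    for roi in structure_set_roi_sequence:
        number = getattr(roi, "ROINumber", None)
        if number is None:
            continue
        names[int(number)] = str(getattr(roi, "ROIName", f"ROI {number}"))
    return names


# Stand-in sequence so the example runs without a DICOM file; with a real
# dataset this would be dataset.StructureSetROISequence.
example_sequence = [
    SimpleNamespace(ROINumber=1, ROIName="BODY"),
    SimpleNamespace(ROINumber=2, ROIName="PTV"),
    SimpleNamespace(ROIName="no number"),
]
print(roi_name_map(example_sequence))
```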

## 5. Extract ROI Contour Data

Now let's use our custom method to extract contour data and convert it to `ContourPoints` objects.

In [None]:
# Extract contour sequences using our custom method
contour_points = dicom_file.read_contour_sequences()

print("=== Contour Extraction Results ===")
print(f"Total ContourPoints objects created: {len(contour_points)}")

# Examine the first few contour points
if contour_points:
    print("\n=== Sample ContourPoints Objects ===")
    for i, cp in enumerate(contour_points[:5]):  # Show first 5
        print(f"\nContourPoint {i+1}:")
        print(f"  ROI Number: {cp.roi}")
        print(f"  Slice Index: {cp.slice_index}")
        print(f"  Number of Points: {len(cp.points) if cp.points else 0}")
        if cp.points and len(cp.points) > 0:
            print(f"  First point: {cp.points[0]}")
            print(f"  Last point: {cp.points[-1]}")
            
    # Show statistics by ROI
    roi_counts = {}
    for cp in contour_points:
        roi_num = cp.roi
        if roi_num not in roi_counts:
            roi_counts[roi_num] = {'slices': 0, 'total_points': 0}
        roi_counts[roi_num]['slices'] += 1
        if cp.points:
            roi_counts[roi_num]['total_points'] += len(cp.points)
    
    print("\n=== Contour Statistics by ROI ===")
    for roi_num, stats in roi_counts.items():
        print(f"ROI {roi_num}: {stats['slices']} slices, {stats['total_points']} total points")
        
else:
    print("No contour points found in the dataset")
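
For context, the DICOM standard stores each contour's `ContourData` as one flat list of coordinates in the order x1, y1, z1, x2, y2, z2, and so on. How `read_contour_sequences()` unpacks this internally is not shown here; the sketch below only illustrates the general regrouping step, using a hypothetical helper `split_contour_data` and made-up coordinates.

```python
def split_contour_data(flat):
    """Group a flat [x1, y1, z1, x2, ...] list into (x, y, z) float tuples."""
    values = [float(v) for v in flat]
    if len(values) % 3 != 0:
        raise ValueError("ContourData length must be a multiple of 3")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


# Made-up coordinates for a triangle on the z = 5.0 plane, given as strings
# to show that numeric-looking values are converted to float.
flat_data = ["0.0", "0.0", "5.0", "10.0", "0.0", "5.0", "5.0", "8.0", "5.0"]
points = split_contour_data(flat_data)
print(points)
```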

## 6. Access and Analyze the Contour DataFrame

Let's examine the automatically generated DataFrame that contains all contour point data.

In [None]:
# Access the DataFrame containing all contour points
contour_df = dicom_file.contour_dataframe

print("=== Contour DataFrame Information ===")
if contour_df is not None and not contour_df.empty:
    print(f"DataFrame shape: {contour_df.shape}")
    print(f"Columns: {list(contour_df.columns)}")
    
    # Show basic statistics
    print("\n=== DataFrame Summary ===")
    print(contour_df.describe())
    
    # Show the first few rows
    print("\n=== First 10 Rows ===")
    print(contour_df.head(10))
    
    # Show unique ROI numbers and slice indices
    print("\n=== Data Distribution ===")
    print(f"Unique ROI numbers: {sorted(contour_df['roi'].unique())}")
    print(f"Slice index range: {contour_df['slice_index'].min()} to {contour_df['slice_index'].max()}")
    print(f"X coordinate range: {contour_df['x'].min():.3f} to {contour_df['x'].max():.3f}")
    print(f"Y coordinate range: {contour_df['y'].min():.3f} to {contour_df['y'].max():.3f}")
    
    # Count points per ROI
    roi_point_counts = contour_df['roi'].value_counts().sort_index()
    print("\n=== Points per ROI ===")
    for roi_num, count in roi_point_counts.items():
        print(f"ROI {roi_num}: {count} points")
        
else:
    print("Contour DataFrame is empty or not available")
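
The columns used above (`roi`, `slice_index`, `x`, `y`) suggest one row per contour point. How the class builds `contour_dataframe` is not shown, so the following is only a sketch of that flattening step, using made-up records and a hypothetical helper `contour_rows`. It assumes each point is a sequence whose first two entries are x and y; passing the resulting list of dictionaries to `pd.DataFrame` would give a table with the same columns.

```python
def contour_rows(contours):
    """Flatten (roi, slice_index, points) records into one dict per point."""
    rows = []
    for roi, slice_index, points in contours:
        for point in points:
            rows.append(
                {"roi": roi, "slice_index": slice_index, "x": point[0], "y": point[1]}
            )
    return rows


# Made-up data: two contours for ROI 1 on different slices.
example = [
    (1, 0.0, [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]),
    (1, 2.5, [(0.0, 0.0), (2.0, 2.0)]),
]
rows = contour_rows(example)
print(len(rows), rows[0])
```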

## 7. Visualize Structure Contours

Let's create visualizations of the structure contours using matplotlib.

In [None]:
# Create visualizations of the contours
if contour_df is not None and not contour_df.empty:
    
    # Plot 1: All contours on a single plot (X-Y view)
    plt.figure(figsize=(12, 5))
    
    # Subplot 1: All ROIs overlaid
    plt.subplot(1, 2, 1)
    colors = plt.cm.tab10(np.linspace(0, 1, len(contour_df['roi'].unique())))
    
    for i, roi_num in enumerate(sorted(contour_df['roi'].unique())):
        roi_data = contour_df[contour_df['roi'] == roi_num]
        plt.scatter(roi_data['x'], roi_data['y'], 
                   c=[colors[i]], alpha=0.6, s=1, label=f'ROI {roi_num}')
    
    plt.xlabel('X Coordinate (mm)')
    plt.ylabel('Y Coordinate (mm)')
    plt.title('All ROI Contours (X-Y View)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.axis('equal')
    
    # Subplot 2: Contours by slice (show a few selected slices)
    plt.subplot(1, 2, 2)
    
    # Get a few representative slices
    unique_slices = sorted(contour_df['slice_index'].unique())
    selected_slices = unique_slices[::max(1, len(unique_slices)//5)][:5]  # Select up to 5 slices
    
    for i, slice_idx in enumerate(selected_slices):
        slice_data = contour_df[contour_df['slice_index'] == slice_idx]
        if not slice_data.empty:
            plt.scatter(slice_data['x'], slice_data['y'], 
                       alpha=0.7, s=2, label=f'Slice {slice_idx}')
    
    plt.xlabel('X Coordinate (mm)')
    plt.ylabel('Y Coordinate (mm)')
    plt.title('Contours by Selected Slices')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.axis('equal')
    
    plt.tight_layout()
    plt.show()
    
    # Plot 2: 3D visualization if we have multiple slices
    if len(contour_df['slice_index'].unique()) > 1:
        fig = plt.figure(figsize=(10, 8))
        ax = fig.add_subplot(111, projection='3d')
        
        for i, roi_num in enumerate(sorted(contour_df['roi'].unique())):
            roi_data = contour_df[contour_df['roi'] == roi_num]
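            # Wrap the RGBA colour in a list so it is read as one colour for all
            # points, matching the 2D scatter call above
            ax.scatter(roi_data['x'], roi_data['y'], roi_data['slice_index'],
                      c=[colors[i]], alpha=0.6, s=1, label=f'ROI {roi_num}')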
        
        ax.set_xlabel('X Coordinate (mm)')
        ax.set_ylabel('Y Coordinate (mm)')
        ax.set_zlabel('Slice Index (mm)')
        ax.set_title('3D View of All ROI Contours')
        ax.legend()
        
        plt.show()
    
else:
    print("No contour data available for visualization")

## 8. Export and Save Structure Data

Finally, let's demonstrate how to export the structure data to different formats.

In [None]:
# Export structure data to different formats
if contour_df is not None and not contour_df.empty:
    
    # 1. Save DataFrame to CSV
    csv_filename = "contour_points_export.csv"
    contour_df.to_csv(csv_filename, index=False)
    print(f"Contour points exported to: {csv_filename}")
    
    # 2. Save DataFrame to Excel with multiple sheets (one per ROI)
    excel_filename = "contour_points_by_roi.xlsx"
    with pd.ExcelWriter(excel_filename, engine='openpyxl') as writer:
        # Save all data to one sheet
        contour_df.to_excel(writer, sheet_name='All_Contours', index=False)
        
        # Save each ROI to a separate sheet
        for roi_num in sorted(contour_df['roi'].unique()):
            roi_data = contour_df[contour_df['roi'] == roi_num]
            sheet_name = f'ROI_{roi_num}'
            roi_data.to_excel(writer, sheet_name=sheet_name, index=False)
    
    print(f"Contour points exported to Excel: {excel_filename}")
    
    # 3. Create a summary report
    summary_data = []
    for roi_num in sorted(contour_df['roi'].unique()):
        roi_data = contour_df[contour_df['roi'] == roi_num]
        summary_data.append({
            'ROI_Number': roi_num,
            'Total_Points': len(roi_data),
            'Unique_Slices': roi_data['slice_index'].nunique(),
            'X_Range': f"{roi_data['x'].min():.2f} to {roi_data['x'].max():.2f}",
            'Y_Range': f"{roi_data['y'].min():.2f} to {roi_data['y'].max():.2f}",
            'Slice_Range': f"{roi_data['slice_index'].min():.1f} to {roi_data['slice_index'].max():.1f}"
        })
    
    summary_df = pd.DataFrame(summary_data)
    summary_filename = "roi_summary_report.csv"
    summary_df.to_csv(summary_filename, index=False)
    print(f"ROI summary report exported to: {summary_filename}")
    
    # 4. Display the summary
    print("\n=== ROI Summary Report ===")
    print(summary_df.to_string(index=False))
    
    # 5. Show file sizes (Path is already imported in the first cell)
    print("\n=== Export File Sizes ===")
    for filename in [csv_filename, excel_filename, summary_filename]:
        export_path = Path(filename)
        if export_path.exists():
            size_kb = export_path.stat().st_size / 1024
            print(f"{filename}: {size_kb:.2f} KB")
    
else:
    print("No contour data available for export")

## Summary

This notebook has demonstrated the main functionality of the `DicomStructureFile` class:

1. **Loading DICOM Files**: Loaded a DICOM RT Structure file using the `top_dir` and `file_name` parameters (a `file_path` parameter is the alternative)
2. **Metadata Extraction**: Retrieved comprehensive structure file information and ROI details
3. **Contour Processing**: Extracted all contour sequences and converted them to `ContourPoints` objects
4. **Data Analysis**: Automatically generated a pandas DataFrame for easy analysis and manipulation
5. **Visualization**: Created 2D and 3D visualizations of the structure contours
6. **Data Export**: Exported the contour data to multiple formats (CSV, Excel) with summary reports

### Key Features Demonstrated:
- Flexible initialization with multiple parameter options
- Defensive checks for missing DICOM attributes and empty contour data
- Automatic DataFrame generation for easy data analysis
- Integration with existing `ContourPoints` and related classes
- Comprehensive metadata extraction
- Multiple export format support

### Next Steps:
- The exported CSV and Excel files can be used for further analysis
- The `DicomStructureFile` class can be integrated into larger workflow pipelines
- The DataFrame format makes it easy to perform statistical analysis and filtering
- The visualization capabilities help with quality assurance and data validation