# Clean Core Images

This notebook demonstrates how to clean and process core images using GeoSuite's imaging module.

## Overview

GeoSuite's imaging module provides tools for processing core photographs:
- Extracting slabbed core images from service company templates
- Cropping and rescaling images
- Extracting depth information from filenames
- Batch processing entire directories

This notebook will show you how to:

1. Load and display core images
2. Crop core images to remove templates/borders, rescaling them for efficient storage
3. Extract depth information from filenames
4. Batch process entire directories of core images

In [None]:
# Import GeoSuite modules
import numpy as np
import matplotlib.pyplot as plt
import os
from pathlib import Path

try:
    from geosuite.imaging import (
        crop_core_image,
        process_core_directory,
        extract_depth_from_filename
    )
    from skimage import io as skio
    IMAGING_AVAILABLE = True
    print("GeoSuite imaging module imported successfully!")
except ImportError as e:
    IMAGING_AVAILABLE = False
    print("Note: the imaging module requires scikit-image. Install with: pip install scikit-image")
    print(f"Error: {e}")
    print("\nCreating synthetic image for demonstration...")
    
    # Create synthetic core image for demonstration
    synthetic_image = np.random.randint(0, 255, (5000, 3000, 3), dtype=np.uint8)
    # Add some structure to simulate a core
    synthetic_image[1000:4000, 1000:2000, :] = np.random.randint(100, 200, (3000, 1000, 3), dtype=np.uint8)
    print(f"Created synthetic image: {synthetic_image.shape}")

## 1. Load and Display Core Image

Load a core image (or use synthetic data for demonstration). In practice, you would load from your image directory.

In [None]:
# Load core image
# In practice: image_path = 'path/to/your/core/image.jpg'
# image = skio.imread(image_path)

# For demonstration, we'll use synthetic data
if IMAGING_AVAILABLE:
    # Create a more realistic synthetic core image with template borders
    # Simulate a service company template with borders and text
    full_image = np.full((5000, 4000, 3), 240, dtype=np.uint8)  # Light gray background
    # Add colored borders (template)
    full_image[0:100, :, :] = [200, 200, 255]  # Top border
    full_image[-100:, :, :] = [200, 200, 255]  # Bottom border
    full_image[:, 0:200, :] = [200, 200, 255]  # Left border
    full_image[:, -200:, :] = [200, 200, 255]  # Right border
    
    # Add core section (the actual core image)
    core_section = np.random.randint(80, 180, (4500, 3000, 3), dtype=np.uint8)
    full_image[200:4700, 500:3500, :] = core_section
    
    print(f"Loaded synthetic core image: {full_image.shape}")
    print(f"Image dimensions: {full_image.shape[0]} × {full_image.shape[1]} pixels")
    
    # Display original image
    fig, ax = plt.subplots(figsize=(12, 8))
    ax.imshow(full_image)
    ax.set_title('Original Core Image with Template Borders')
    ax.axis('off')
    plt.tight_layout()
    plt.show()
else:
    print("Using simple synthetic image (imaging module not available)")
    full_image = synthetic_image


## 2. Crop Core Image

Use `crop_core_image()` to remove template borders and extract just the core section. You need to specify the pixel coordinates of the core region.
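
As a rough illustration of what the four crop coordinates mean, the sketch below reproduces the crop with plain NumPy slicing and then shrinks the result by keeping every third pixel. It is not GeoSuite's implementation: `crop_and_downsample` and its `step` parameter are hypothetical names. Integer decimation only approximates a fractional `scale_factor`: a step of 3 gives roughly 33% rather than 30%, with no interpolation or anti-aliasing.

```python
import numpy as np


def crop_and_downsample(image, top, left, bottom, right, step=1):
    """Crop to rows top:bottom and columns left:right, then keep every `step`-th pixel.

    Hypothetical helper for illustration only; not part of GeoSuite.
    """
    height, width = image.shape[:2]
    if not (0 <= top < bottom <= height and 0 <= left < right <= width):
        raise ValueError("crop box must lie inside the image")
    cropped = image[top:bottom, left:right]  # rows first, then columns
    return cropped[::step, ::step]           # simple decimation, no smoothing


# Synthetic 5000 x 4000 RGB image with the same dimensions as the template in this notebook
demo = np.full((5000, 4000, 3), 240, dtype=np.uint8)
result = crop_and_downsample(demo, top=200, left=500, bottom=4700, right=3500, step=3)
print(result.shape)
```

Note that `top` and `bottom` index rows (the first array axis) while `left` and `right` index columns, which is the most common source of swapped crop boxes.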


In [None]:
if IMAGING_AVAILABLE:
    # Define crop coordinates (in pixels from top-left corner)
    # These values depend on your image template
    top = 200      # Distance from top of image to top of core
    left = 500     # Distance from left edge to left of core
    bottom = 4700  # Distance from top of image to bottom of core
    right = 3500   # Distance from left edge to right of core
    
    # Optionally rescale to reduce file size (0.3 = 30% of original size)
    scale_factor = 0.3
    
    # Crop and rescale the image
    cropped_image = crop_core_image(
        image=full_image,
        top=top,
        left=left,
        bottom=bottom,
        right=right,
        scale_factor=scale_factor
    )
    
    print(f"Cropped image shape: {cropped_image.shape}")
    print(f"Original size: {full_image.shape[0] * full_image.shape[1] / 1e6:.1f} MP")
    print(f"Cropped size: {cropped_image.shape[0] * cropped_image.shape[1] / 1e6:.1f} MP")
    
    # Display cropped image
    fig, axes = plt.subplots(1, 2, figsize=(16, 6))
    
    axes[0].imshow(full_image)
    axes[0].set_title('Original Image with Template')
    axes[0].axis('off')
    
    axes[1].imshow(cropped_image)
    axes[1].set_title('Cropped Core Image')
    axes[1].axis('off')
    
    plt.tight_layout()
    plt.show()
else:
    print("Cropping not available without imaging module")


## 3. Extract Depth from Filename

Use `extract_depth_from_filename()` to extract depth range information from image filenames.
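
Fixed character indices break when filename layouts differ, so a pattern-based parser can be a useful cross-check. The sketch below is a regular-expression alternative, not the index-based approach that `extract_depth_from_filename()` uses. It assumes depths are written as `<whole>_<fraction>` with an optional trailing `m`, separated by `_to_`; `parse_depth_range` and `DEPTH_PATTERN` are hypothetical names.

```python
import re

# Assumed layout: <whole>_<fraction>[m]_to_<whole>_<fraction>[m]
DEPTH_PATTERN = re.compile(r"(\d+)_(\d+)m?_to_(\d+)_(\d+)m?")


def parse_depth_range(filename):
    """Return (top_depth, bottom_depth) as floats parsed from a filename.

    Hypothetical helper for illustration only; not part of GeoSuite.
    """
    match = DEPTH_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no depth range found in {filename!r}")
    top_whole, top_frac, bottom_whole, bottom_frac = match.groups()
    # Rejoin the whole and fractional parts with a decimal point
    return float(f"{top_whole}.{top_frac}"), float(f"{bottom_whole}.{bottom_frac}")


for name in (
    "pharos-1_wl_1m_core_4963_00m_to_4964_00m.jpg",
    "well_123_core_2500_50_to_2501_50.jpg",
):
    print(name, parse_depth_range(name))
```

A filename that does not follow the assumed layout raises `ValueError` instead of returning a silently wrong depth.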


In [None]:
if IMAGING_AVAILABLE:
    # Example filename formats
    example_filenames = [
        'pharos-1_wl_1m_core_4963_00m_to_4964_00m.jpg',
        'well_123_core_2500_50_to_2501_50.jpg'
    ]
    
    print("Extracting depth information from filenames:")
    print("=" * 60)
    
    for filename in example_filenames:
        try:
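            # Index parameters chosen for the Pharos-1 filename layout (first example).
            # The second example uses a different layout, so these indices do not
            # line up with its depth fields; adjust them for each filename format.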
            top_depth, bottom_depth = extract_depth_from_filename(
                filename,
                top_start=21,   # Start index for top depth
                top_end=28,     # End index for top depth
                bottom_start=33, # Start index for bottom depth
                bottom_end=40   # End index for bottom depth
            )
            
            print(f"\nFilename: {filename}")
            print(f"  Top depth: {top_depth}")
            print(f"  Bottom depth: {bottom_depth}")
        except Exception as e:
            print(f"\nError extracting depth from {filename}: {e}")
    
    print("\nNote: Adjust top_start, top_end, bottom_start, bottom_end")
    print("      based on your specific filename format")
else:
    print("Depth extraction not available without imaging module")


## 4. Batch Process Core Image Directory

Use `process_core_directory()` to batch process all images in a directory.
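
Before running a batch job, it can help to do a dry run that lists which files the glob pattern will match and where the outputs would go. The sketch below uses only `pathlib` and does not read or write any images. `plan_batch` and the `_cropped` suffix are hypothetical, and the output names GeoSuite actually produces may differ.

```python
import tempfile
from pathlib import Path


def plan_batch(input_dir, output_dir, pattern="*.jpg", suffix="_cropped"):
    """List (source, target) path pairs for files matching `pattern`.

    Hypothetical dry-run helper for illustration only; not part of GeoSuite.
    """
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    pairs = []
    for source in sorted(input_path.glob(pattern)):
        target = output_path / f"{source.stem}{suffix}{source.suffix}"
        pairs.append((source, target))
    return pairs


# Demonstrate with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    (raw / "well_a_core_2500_50_to_2501_50.jpg").touch()
    (raw / "notes.txt").touch()  # ignored because it does not match the pattern
    planned = plan_batch(raw, Path(tmp) / "processed")
    for source, target in planned:
        print(source.name, "->", target.name)
```

Checking the planned list first makes it easier to catch a wrong `pattern` or input path before any images are processed.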


In [None]:
if IMAGING_AVAILABLE:
    # Example: Batch process a directory of core images
    # In practice, replace these paths with your actual directories
    input_dir = 'data/raw/core_images/'  # Directory containing raw images
    output_dir = 'data/processed/core_images/'  # Directory for processed images
    
    print("Example batch processing command:")
    print("=" * 60)
    print(f"""
    processed_files = process_core_directory(
        input_dir='{input_dir}',
        output_dir='{output_dir}',
        top=200,        # Crop parameters (adjust for your template)
        left=500,
        bottom=4700,
        right=3500,
        scale_factor=0.3,  # Rescale to 30% of original size
        pattern='*.jpg'     # File pattern to match
    )
    
    print(f"Processed {{len(processed_files)}} images")
    """)
    
    print("\nKey parameters:")
    print("  - top, left, bottom, right: Crop coordinates in pixels")
    print("  - scale_factor: Rescaling factor (0.3 = 30% of original)")
    print("  - pattern: Glob pattern for input files (e.g., '*.jpg')")
    print("\nNote: This function will:")
    print("  1. Extract depth from filenames")
    print("  2. Crop each image to remove template")
    print("  3. Rescale for efficient storage")
    print("  4. Save with depth-labeled filenames")
else:
    print("Batch processing not available without imaging module")


## Summary

This notebook demonstrated:

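- Loading and displaying core images
- Cropping and rescaling images with `crop_core_image()`
- Extracting depth ranges from filenames with `extract_depth_from_filename()`
- Batch processing a directory with `process_core_directory()`

The GeoSuite imaging functions run only when the imaging module is available.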

### Next Steps

- Install the imaging dependencies, including scikit-image, for full imaging capabilities: `pip install "geosuite[imaging]"`
- Load your own core images
- Apply cleaning and enhancement algorithms
- Process images for automated analysis