# ear‑vision‑ml Repository Overview

This notebook provides a step‑by‑step walkthrough of the core functionality of the **ear‑vision‑ml** repository with runnable examples. Each section corresponds to a key module in the codebase.


## 1. Introduction

The repository implements a pipeline for training and evaluating ear‑vision models. It covers data loading, model configuration, training callbacks, and model export to several formats. The high‑level architecture is built around a Dependency Injection (DI) container that wires services together.


## 2. Core Configuration (`config_utils.py` & `constants.py`)

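Configuration files are validated against a JSON schema. `config_utils` provides safe accessors for an already loaded YAML/JSON config, and `constants` exposes shared defaults such as `SHUFFLE_BUFFER_SIZE`. Because `config_utils` does not include a loader, the example loads the config with OmegaConf directly.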
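
Schema validation of this kind can be sketched with the standard library alone. The snippet is only an illustration of the idea: `EXAMPLE_SCHEMA`, `get_path`, and `validate` are hypothetical names, the schema format (dotted key path to expected type) is an assumption, and the repository's real JSON schema and validator are not shown in this notebook.

```python
from collections.abc import Mapping

# Hypothetical schema: dotted key path -> expected Python type.
# This is NOT the repository's schema, only a stand-in for illustration.
EXAMPLE_SCHEMA = {
    "data.batch_size": int,
    "model.num_classes": int,
    "task.name": str,
}


def get_path(cfg, dotted):
    """Walk a nested mapping along a dotted path; raise KeyError if absent."""
    node = cfg
    for part in dotted.split("."):
        if not isinstance(node, Mapping) or part not in node:
            raise KeyError(dotted)
        node = node[part]
    return node


def validate(cfg, schema):
    """Return a list of human-readable errors (an empty list means valid)."""
    errors = []
    for path, expected in schema.items():
        try:
            value = get_path(cfg, path)
        except KeyError:
            errors.append(f"missing key: {path}")
            continue
        if not isinstance(value, expected):
            errors.append(
                f"{path}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


good_cfg = {
    "data": {"batch_size": 32},
    "model": {"num_classes": 3},
    "task": {"name": "classification"},
}
print(validate(good_cfg, EXAMPLE_SCHEMA))
```

Collecting all errors in a list, rather than raising on the first one, lets a user fix a config file in a single pass.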


In [None]:
from src.core import config_utils, constants
from omegaconf import OmegaConf

# Load a sample configuration (the repository ships a default example)
# We use OmegaConf directly as config_utils provides safe accessors but not a loader
try:
    sample_cfg = OmegaConf.load('configs/sample_config.yaml')
    print('Loaded config keys:', list(sample_cfg.keys()))
except FileNotFoundError:
    print('Sample config not found, creating a dummy one.')
    sample_cfg = OmegaConf.create({'data': {'batch_size': 32}})
    print('Created dummy config keys:', list(sample_cfg.keys()))

print('Shuffle buffer size from constants:', constants.SHUFFLE_BUFFER_SIZE)


## 3. Dependency Injection (`di.py`)

The DI container registers services (e.g., data loaders, model factories) and resolves them on demand.
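
The register-then-resolve pattern can be sketched in a few lines. `MiniContainer` is a hypothetical stand-in written for this walkthrough, not the repository's `Container`; its `register` method and the lazily cached factories are assumptions about how such a container might behave.

```python
class MiniContainer:
    """Hypothetical minimal DI container (not the repository's Container)."""

    def __init__(self):
        self._factories = {}  # service name -> factory callable
        self._instances = {}  # service name -> cached instance

    def register(self, name, factory):
        """Register a factory; it receives the container when first resolved."""
        self._factories[name] = factory
        self._instances.pop(name, None)  # drop any stale cached instance

    def resolve(self, name):
        """Build the service on first use, then return the cached instance."""
        if name not in self._factories:
            raise KeyError(f"no service registered under {name!r}")
        if name not in self._instances:
            self._instances[name] = self._factories[name](self)
        return self._instances[name]


mini = MiniContainer()
mini.register("batch_size", lambda c: 32)
# A factory can resolve its own dependencies through the container.
mini.register(
    "dataset_loader",
    lambda c: {"kind": "synthetic", "batch_size": c.resolve("batch_size")},
)
mini_loader = mini.resolve("dataset_loader")
print(mini_loader)
```

Passing the container into each factory is what lets services depend on one another without importing each other directly.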


In [None]:
from src.core.di import Container

container = Container()
# Resolve the dataset loader service
# Note: in a real application, services are registered first. Here we only check whether resolution succeeds or raises an error.
try:
    loader = container.resolve('dataset_loader')
    print('Resolved service type:', type(loader))
except Exception as e:
    print(f'Service resolution failed as expected (no config loaded): {e}')


## 4. Data Loading (`dataset_loader.py`)

The repository provides flexible dataset loading strategies (e.g., TFRecord, synthetic). Below we load a tiny synthetic dataset and visualise a sample image.
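
To make the idea of a synthetic dataset concrete, here is a standard-library sketch that yields batches of random images and labels. It is an illustration only: `synthetic_batches` is a hypothetical helper, it produces nested Python lists rather than the tensors that `SyntheticDataLoader` presumably returns, and the single-channel image layout is an assumption.

```python
import random


def synthetic_batches(num_batches, batch_size, image_size, num_classes, seed=0):
    """Yield (images, labels) batches of random grayscale images.

    images: list of `batch_size` images, each a height x width nested list
            of floats in [0, 1).
    labels: list of `batch_size` ints in [0, num_classes).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    height, width = image_size
    for _ in range(num_batches):
        images = [
            [[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(batch_size)
        ]
        labels = [rng.randrange(num_classes) for _ in range(batch_size)]
        yield images, labels


for demo_images, demo_labels in synthetic_batches(
    1, batch_size=2, image_size=(64, 64), num_classes=3
):
    print(len(demo_images), len(demo_images[0]), len(demo_images[0][0]), demo_labels)
```

Seeding the generator keeps smoke tests reproducible, which is the main reason to use synthetic data in the first place.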


In [None]:
from src.core.data.dataset_loader import SyntheticDataLoader
from omegaconf import OmegaConf
try:
    import matplotlib.pyplot as plt
except ImportError:
    plt = None

# Create a config for the synthetic loader
cfg = OmegaConf.create({
    'data': {
        'dataset': {
            'image_size': [64, 64],
            'num_classes': 3,
            'batch_size': 2
        }
    },
    'model': {
        'num_classes': 3
    },
    'task': {
        'name': 'classification'
    }
})

loader = SyntheticDataLoader()
ds = loader.load_train(cfg)

# Take one batch
for images, labels in ds.take(1):
    print('Dataset batch shape:', images.shape, 'Labels batch shape:', labels.shape)
    if plt:
        plt.imshow(images[0].numpy().astype('float32'), cmap='gray')
        plt.title(f'Label: {labels[0].numpy()}')  # .numpy() shows the value, not the tensor repr
        plt.axis('off')
        plt.show()


## 5. Training Callbacks (`callbacks.py`)

Callbacks hook into the training loop to log metrics, save checkpoints, or adjust learning rates. The example instantiates the `WarmUpLearningRate` callback and calls its `on_epoch_begin` hook once, outside a real training run.
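
The `WarmUpLearningRate` callback in this section takes `target_lr` and `warmup_epochs`. One common way to use such parameters is a linear ramp, sketched here as a plain function. Whether the repository's callback ramps linearly is an assumption, and `warmup_lr` is a hypothetical helper, not part of the codebase.

```python
def warmup_lr(epoch, target_lr, warmup_epochs):
    """Linear warm-up (assumed schedule, not confirmed by the repository).

    The rate rises in equal steps over the first `warmup_epochs` epochs and
    stays at `target_lr` afterwards. `epoch` is zero-based.
    """
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return target_lr
    return target_lr * (epoch + 1) / warmup_epochs


for demo_epoch in range(5):
    print(demo_epoch, warmup_lr(demo_epoch, target_lr=0.001, warmup_epochs=3))
```

Starting at `target_lr / warmup_epochs` instead of zero means the very first epoch still makes progress.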


In [None]:
from src.core.training.callbacks import WarmUpLearningRate

# Initialize the callback
callback = WarmUpLearningRate(target_lr=0.001, warmup_epochs=3)
print('Initialized WarmUpLearningRate callback')

# Simulate epoch begin to see if it runs
try:
    callback.on_epoch_begin(0)
    print('Callback on_epoch_begin executed successfully')
except Exception as e:
    print(f'Callback execution failed (expected if model not attached): {e}')


## 6. Exporters (`exporter.py`)

Models can be exported to formats such as CoreML and ONNX. The snippet below builds a small Keras model and exports it to CoreML; if TensorFlow is not installed, it skips the export and prints a message instead.
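
Skipping work when an optional dependency is missing can also be done without importing the package at all. The sketch uses `importlib.util.find_spec` from the standard library; `is_installed` and `run_if_available` are hypothetical helper names introduced for illustration, not functions from the repository.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def run_if_available(module_name, action):
    """Call `action()` only when `module_name` is installed; otherwise skip."""
    if not is_installed(module_name):
        print(f"{module_name} not installed; skipping.")
        return None
    return action()


# `json` ships with Python, so this action runs.
demo_result = run_if_available("json", lambda: "ran")
print(demo_result)
```

Checking availability first avoids paying the import cost of a heavy package just to find out whether an example can run.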


In [None]:
try:
    import tensorflow as tf
except ImportError:
    tf = None
from src.core.export.exporter import CoreMLExporter
from omegaconf import OmegaConf
from pathlib import Path
import shutil

if tf:
    # Build a tiny model
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(8, 3, activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    
    # Prepare config for export
    export_cfg = OmegaConf.create({
        'task': {'name': 'classification'},
        'data': {
            'dataset': {
                'image_size': [64, 64],
                'class_names': ['class_0', 'class_1']
            }
        },
        'model': {'num_classes': 2},
        'export': {
            'export': {
                'coreml': {
                    'enabled': True,
                    'quantize': False
                }
            }
        }
    })
    
    exporter = CoreMLExporter()
    output_dir = Path('exported_models')
    
    # Clean up previous run
    if output_dir.exists():
        shutil.rmtree(output_dir)
        
    coreml_path = exporter.export(model, output_dir=output_dir, cfg=export_cfg)
    if coreml_path:
        print('CoreML model saved to', coreml_path)
    else:
        print('CoreML export skipped or failed (check logs).')
else:
    print('TensorFlow not installed; skipping export example.')


## 7. Utilities & Logging (`logging_utils.py`)

`logging_utils` provides a consistent logging configuration: `get_logger` returns a named logger that modules and notebooks can share.
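
A helper of this kind is often a thin wrapper over the standard `logging` module. The sketch shows one possible shape; `make_logger` and the format string are assumptions made for illustration and may differ from what `get_logger` actually does.

```python
import logging
import sys

# Assumed format; the repository's real format is not shown in this notebook.
_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def make_logger(name, level=logging.INFO):
    """Return a named logger with a single stdout handler attached."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers when a cell is re-run
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(_FORMAT))
        logger.addHandler(handler)
    logger.propagate = False  # keep messages from also reaching the root logger
    return logger


demo_log = make_logger("demo")
demo_log.info("Logging sketch works")
```

The handler check matters in notebooks, where re-running a cell would otherwise print every message once per run.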


In [None]:
from src.core.logging_utils import get_logger
logger = get_logger(__name__)
logger.info('Logging from the notebook works!')


## 8. Running Scripts Overview

Key entry‑point scripts include `run_otoscopic_baselines.sh` (full training pipeline) and `scripts/generate_fixtures.py` (data generation). They glue together the components demonstrated above.


## 9. Testing Overview

The repository ships unit and integration tests under the `tests/` directory. Run them with:
```bash
pytest -vv
```


## 10. Conclusion & Further Resources

You now have a high‑level understanding of the main modules and how to use them. For deeper dives, explore the `src/core/` package, the `scripts/` folder, and the extensive test suite.
