from IPython.display import display, Math, Latex

# Multi-view 3D Ultrasound Fusion for Complete Fetal Head Reconstruction

* Ultrasound images suffer from artefacts which limit their diagnostic value, notably acoustic shadow.

* Shadows are dependent on probe orientation, with each view giving a distinct, partial view of the anatomy.

* We fuse the partially imaged fetal head anatomy, acquired from many views (hundreds), into a single coherent compounding of the full anatomy.

<img width="100%" src="head-fusion-illustration.png">

# Example of 3D ultrasound stream input data

<br>
<video width="80%" src="pose-uncorrected.mp4" controls autoplay loop autopause mute></video>


# Image artefacts confound interpretation

* namely - shadow, speckle, anisotropic resolution

<img width="80%" src="artifacts.svg"> 

[//]: # (also refractive distortion, attenuation)
[//]: # (shadows are typically caused by refraction or by strong reflectors: differences in acoustic impedance between tissues, such as at bone interfaces, create shadows that obscure the anatomy behind.)


# View dependent modality
<br>
<div class="row">
    <div class="column" style="width:60%;object-fit:cover;">
<ul><li>no single 3D representation of the anatomy unlike CT / MRI </li>
<li>each anatomical structure is best visualised from a particular probe orientation, typically where tissue interfaces are perpendicular to the US beam.</li></ul>
    <img width="95%" src="views-of-fetal-head.svg"> 
    </div>
    <div class="column" style="width:40%;object-fit:cover;">
        <ul><li>resolution varies across the volume</li>
            <li>beam can be focussed</li>
        </ul>
        <img width="95%" src="US-resolutin.jpg"> 
    </div>
</div>


[//]: # (TODO add face view, side view etc)



    

# Current screening practice
 
<div class="bgimgclass" style="height:100%">
    <div class="row">
        <div class="column" style="width:50%;object-fit:cover;">
<br>            
<ul>
<li> predominantly 2D
     <ul><li> faster acquisition, higher resolution</li> 
</ul>
</li>
<li> navigate to standard anatomical plane, take biometric measurement </li>
<li> takes training and expertise to navigate and recognise anomalies </li>
<li> limits assessment of 3D shape / structure </li>
</ul>
<br>
        <img width="80%" src="bpd.jpg"> 
        </div>
        <div class="column" style="width:50%">
            <img width="80%" src="screening.png"> 
            <img width="80%" src="Biparietal_diameter_by_gestational_age.png"> 
        </div>
    </div>
</div>


# Image fusion to the rescue!

* Automatic alignment and fusion of images could
    - remove view dependency
        - i.e. provide a single 3D representation of the anatomy
    - improve image quality
    - make images easier to interpret
    - remove the need to navigate to standard planes
        - reduce skill/training needed
    - improve diagnosis
    - enable image analysis of 3D anatomy

* Complementary to 2D ultrasound exam

# What is image fusion?

* Combine all the information from multiple images to produce a single more informative image.

* Surpass limits of our sensor

* Fused image may be
    * more accurate
    * more detailed
    * larger field of view
    * better optimised for human perception

# Multi-focus image fusion

* Preserve high frequency detail from each image

<img width="100%" src="Sample_of_Multi-Focus_image_fusion.png">



# Multi-exposure image fusion
* increase dynamic range

<img width="50%" src="HDRI-Example.jpg">
<img width="60%" src="Old_saint_pauls_1.jpg">



# Stitching / mosaicing - extending FOV
<img width="100%" src="Fenway_Park.jpg">



# Multi-modality fusion

* Combine complementary information
    * Low resolution PET scan showing metabolic information
    * High resolution MR showing anatomy
    
<img width="100%" src="pet-mri.jpg">

# Goals for fetal head ultrasound fusion

* Eliminate speckle
    * increase signal-to-noise ratio
* Eliminate shadows
* Preserve features at highest resolution
* Increase field of view to whole head at later gestation
* View independent

# Laplacian pyramid - for multi-scale fusion

* Bandpass filter - localise features in space and in frequency
* Reversible process
* Super fast to compute on GPU (10 ms, with PyTorch)
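
The decomposition and its exact inversion can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a 2D image whose sides are divisible by $2^{3}$, a 2x2 box filter for downsampling and nearest-neighbour upsampling. The filter choices and the helper names (`downsample`, `upsample`, `build_pyramid`, `reconstruct`) are assumptions made for the example, not the PyTorch implementation behind the timing quoted above.

```python
import numpy as np

def downsample(img):
    """Halve each axis by averaging 2x2 blocks (box low-pass + decimation)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double each axis by nearest-neighbour replication."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels=3):
    """Return (list of band-pass images, finest first; low-pass residual)."""
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # Band-pass detail: what is lost by going one level coarser.
        bands.append(current - upsample(smaller))
        current = smaller
    return bands, current

def reconstruct(bands, residual):
    """Invert build_pyramid by adding the detail back, coarsest level first."""
    current = residual
    for band in reversed(bands):
        current = band + upsample(current)
    return current

rng = np.random.default_rng(0)
image = rng.random((32, 32))
bands, residual = build_pyramid(image, levels=3)
restored = reconstruct(bands, residual)
```

Because each band stores exactly what the next coarser level cannot represent, `restored` matches `image` up to floating-point rounding, whichever low-pass filter is used, as long as the same `upsample` is applied in both directions.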

<img width="80%" src="laplacian_pyramid.svg">

# Fusing $n$ image pyramids

* independent fusion of features at each scale - via weighted average

$P^{j}=(L_{0}^{j},\,L_{1}^{j}\,,L_{2}^{j}\,,I_{2}^{j}\,)$ - pyramid for the $j$th image.

$\bar{P}=(\bar{L_{0}},\,\bar{L_{1}}\,,\bar{L_{2}}\,,\bar{I_{2}}\,)$ - average pyramid

$\bar{L_{i}}=\sum_{j=1}^{n}W_{i}^{j}\circ L_{i}^{j}$. 

$\bar{I_{2}}=\sum_{j=1}^{n}W_{2}^{j}\circ I_{2}^{j}$.


$W_{i}^{j}$ - voxel weight map

$\circ$ - element-wise multiplication
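
As a rough sketch of the weighted average above, the function below fuses one pyramid level from $n$ images. It assumes the weight maps are normalised to sum to one at every voxel, which the formulas leave implicit; `fuse_level` and the toy arrays are hypothetical names and data used only for illustration.

```python
import numpy as np

def fuse_level(images, weights, eps=1e-12):
    """Per-voxel weighted average of n same-shaped arrays.

    Assumption: weights are normalised to sum to one at each voxel.
    Voxels where every weight is zero are fused to zero.
    """
    images = np.stack(images)      # shape (n, ...)
    weights = np.stack(weights)    # shape (n, ...)
    norm = weights.sum(axis=0, keepdims=True)
    weights = weights / np.maximum(norm, eps)
    return (weights * images).sum(axis=0)

# Two toy 2x2 "levels" with hand-picked weight maps.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])
w_a = np.array([[1.0, 0.0], [1.0, 1.0]])
w_b = np.array([[0.0, 1.0], [1.0, 3.0]])

fused = fuse_level([a, b], [w_a, w_b])
```

The same function would be applied independently to each band-pass level and to the low-pass residual, after which the fused pyramid is collapsed back into a single image.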

[//]: # (TODO show example for 2 images)

# Saliency-based weighting

* A simple average of the aligned images results in a suboptimal fusion in which the most detailed image features are degraded.

* Select and average only the best (salient) image features from a subset of images in order to maximise information content.

* The difference (Laplacian) images provide an edge-detection map for free!

* Upweight strong (salient) edges

$W_{i}^{j}(x)=|L_{i}^{j}(x)|\;\forall x$.
<br>
$x$ - voxel index


[//]: # (TODO show saliency image)


# Saliency fusion is easily corrupted
<br>
<img width="70%" src="misaligned-registrations-without-proposed.svg">


# Fudge compromise - tune saliency weighting with $\alpha$ 

Exponentiating weight maps allows control of saliency weighting.

$W(x)^{\alpha}\;\forall x$
<br>

* $\uparrow\alpha$
    * sharper fusion but more artefacts
    * reduced SNR (a subset of images with strong edges will dominate).

* $\alpha = 0$
    * reduces to mean of all images
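
The effect of $\alpha$ can be seen on a toy 1D band-pass signal. This sketch assumes the saliency weights $|L|^{\alpha}$ are normalised per voxel before averaging (an assumption, since the slides do not state the normalisation); `saliency_fusion` and the two example signals are hypothetical.

```python
import numpy as np

def saliency_fusion(bands, alpha=1.0, eps=1e-12):
    """Fuse band-pass images with weights |L|**alpha, normalised per voxel."""
    bands = np.stack(bands)
    weights = np.abs(bands) ** alpha   # alpha = 0 gives uniform weights
    weights = weights / np.maximum(weights.sum(axis=0, keepdims=True), eps)
    return (weights * bands).sum(axis=0)

# The same edge seen sharply in one view and weakly in another.
sharp = np.array([0.0, 4.0, -4.0, 0.5])
blurred = np.array([0.0, 1.0, -1.0, 0.5])

mean_fused = saliency_fusion([sharp, blurred], alpha=0.0)     # plain mean
salient_fused = saliency_fusion([sharp, blurred], alpha=1.0)  # favours the sharp view
peaky_fused = saliency_fusion([sharp, blurred], alpha=8.0)    # close to picking the maximum
```

At the edge voxel the fused value rises from 2.5 (plain mean) to 3.4 at $\alpha=1$ and approaches 4 for large $\alpha$, which is the sharper-but-less-averaged behaviour described above.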

# Feature consistency - a better approach

* Replace fixed alpha with a voxel-wise feature consistency map.
    * automatically control influence of saliency
* Compare each image to a fusion of all other images
    * compute local normalised cross-correlation (LNCC)
    * downweight dissimilar regions


$C_{i}^{j}=\textrm{LNCC}(I_{i}^{j},\,\hat{I}_{i}^{j})$

$\hat{I}_{0}^{j}$ - fusion of all images except $j$


$\hat{I}_{1}^{j}$, $\hat{I}_{2}^{j}$ are downsampled fusions


$\begin{array}{c}
\bar{L_{i}}=\sum_{j=1}^{n}\textrm{pow}(W_{i}^{j},\alpha\,C_{i}^{j}\circ\bar{C}_{i})\circ L_{i}^{j},\\
\bar{I_{2}}=\sum_{j=1}^{n}\textrm{pow}(W_{2}^{j},\alpha\,C_{2}^{j}\circ\bar{C}_{2})\circ I_{2}^{j}.
\end{array}$

$\bar{C}_{i}=\frac{1}{n}\sum_{j=1}^{n}C_{i}^{j}$ - mean consistency map - allows greater smoothing to be applied in regions that are inconsistent across all images, such as outside the head.
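
A minimal 2D sketch of the consistency maps, under several assumptions that are not taken from the slides: a 5x5 box window for the LNCC, a plain mean of the other images as the leave-one-out fusion $\hat{I}^{j}$, and negative correlations clipped to zero so that the exponent stays non-negative. All function names and the synthetic test pattern are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, using reflected borders."""
    size = 2 * radius + 1
    padded = np.pad(img, radius, mode="reflect")
    return sliding_window_view(padded, (size, size)).mean(axis=(-2, -1))

def lncc(a, b, radius=2, eps=1e-8):
    """Local normalised cross-correlation, clipped to [0, 1] (assumption)."""
    mean_a, mean_b = local_mean(a, radius), local_mean(b, radius)
    var_a = local_mean(a * a, radius) - mean_a ** 2
    var_b = local_mean(b * b, radius) - mean_b ** 2
    cov = local_mean(a * b, radius) - mean_a * mean_b
    corr = cov / np.sqrt(np.maximum(var_a * var_b, eps))
    return np.clip(corr, 0.0, 1.0)

def consistency_maps(images):
    """C^j = LNCC(I^j, mean of all images except j), stacked over j."""
    images = np.stack(images)
    total = images.sum(axis=0)
    n = len(images)
    return np.stack([lncc(img, (total - img) / (n - 1)) for img in images])

# Four well-aligned noisy views and one view shifted by about half a period.
y, x = np.mgrid[0:32, 0:32]
base = np.sin(0.8 * x) * np.cos(0.8 * y)
rng = np.random.default_rng(0)
aligned = [base + rng.normal(scale=0.05, size=base.shape) for _ in range(4)]
misaligned = np.roll(base, 4, axis=1)

C = consistency_maps(aligned + [misaligned])  # shape (5, 32, 32)
C_mean = C.mean(axis=0)                       # mean consistency map
alpha = 2.0
exponent = alpha * C * C_mean                 # per-voxel exponent for pow(W, .)
```

In this synthetic case the aligned views score close to one and the shifted view close to zero, so its saliency weights are flattened towards a plain average rather than being allowed to dominate.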



# Feature consistency fusion is robust

<img width="75%" src="misaligned-registrations-consistency.svg">


# Image alignment is critical to image fusion quality!

* Poorly aligned images introduce blurriness and may even distort the correct geometric relationship between anatomical structures.



# Previous methods - 1D chain of temporal neighbours 

* Slow, smooth sweeping motions to acquire data.
* Temporal neighbours assumed to be well initialised for registration
    * with sufficient overlap
* Scene is assumed to be static   
* Estimate correspondences between frames using similarity-based registration

<img width="75%" src="s-acquisition.png"> 

# Previous methods - fusion approaches
1. Align on the fly 
    * register each new frame to latest fused representation

2. Refine iteratively offline
    * estimate local transformations between temporal neighbours
    * optimise global transformations to fusion space, by minimising local transformation errors
    * infer new neighbours in fusion space
    
<img width="85%" src="topology-refinement.svg"> 

# Challenges for aligning 4D fetal US streams 

* Fetal movement, lost probe contact, fast probe movement / angulation
    * temporally adjacent frames are often not well initialised for registration
    
* Heterogeneous appearance - dependent on view
    - voxel-based similarity measures are ill-suited
        - even robust alignment techniques, e.g. block matching, will fail
    - could transform to a view-independent domain, e.g. extract surface models of the head (Khanal, 2018).
 
* Previous methods are not robust enough for fetal imaging
    * a single failed registration can cause misalignment of all subsequent frames!
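
The fragility of a registration chain can be illustrated with a toy example. The sketch below composes frame-to-frame rigid transforms into global poses and injects one bad pairwise estimate, then compares this with a case where each frame's pose is estimated independently and only one estimate is wrong. The 5 degree steps, the 40 degree error and all helper names are invented for illustration.

```python
import numpy as np

def rotation_z(degrees):
    """4x4 homogeneous rotation about the z axis."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def rotation_error_deg(A, B):
    """Angle of the relative rotation between two rigid transforms."""
    R = A[:3, :3].T @ B[:3, :3]
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.rad2deg(np.arccos(cos_angle))

def chain(steps):
    """Accumulate frame-to-frame transforms into global poses."""
    poses, pose = [], np.eye(4)
    for step in steps:
        pose = pose @ step
        poses.append(pose)
    return poses

# Ground truth: each frame rotates 5 degrees relative to the previous one.
true_steps = [rotation_z(5.0) for _ in range(6)]
true_poses = chain(true_steps)

# Chained estimate: the third pairwise registration fails (45 instead of 5).
estimated_steps = list(true_steps)
estimated_steps[2] = rotation_z(45.0)
chained_poses = chain(estimated_steps)

# Independent estimate: only the third frame's pose is wrong.
direct_poses = list(true_poses)
direct_poses[2] = true_poses[2] @ rotation_z(40.0)

errors_chain = [rotation_error_deg(t, e) for t, e in zip(true_poses, chained_poses)]
errors_direct = [rotation_error_deg(t, e) for t, e in zip(true_poses, direct_poses)]
```

In the chained case the 40 degree error persists in every frame after the failure, whereas with independent per-frame estimates it stays confined to the one affected frame.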

# New approach - direct pose estimation

* Directly estimate transformation of anatomy to canonical pose / fusion space.

* Register each frame independently - break dependence on fragile chain of pairwise registrations

* Allows alignment of many more images than previous state-of-the-art (hundreds or thousands)

# Direct pose estimation

Original stream
<video width="80%" src="pose-uncorrected.mp4" controls autoplay loop autopause mute></video>

Pose corrected stream
<video width="80%" src="stream-pose-correction.mp4" controls autoplay loop autopause mute></video>



# LSTM spatial transformer network

* Advantages of iterative approach
    * better performance with much reduced network capacity
        * less overfitting
    * exploit long term rewards, develop a robust strategy
        * take the easiest route, not the shortest
        * Rubik's cube analogy
    * allows finer alignment
    
<img width="150%" src="pose-correction.svg">



# Based on previous work for US-MR fetal brain registration

Wright et al., LSTM Spatial Co-transformer Networks for Registration of 3D Fetal US and MR Brain Images (MICCAI 2018)

<div class="bgimgclass" style="height:100%">
    <div class="row">
        <div class="column" style="width:50%;object-fit:cover;">
            <video width="100%" src="registration.mp4" controls autoplay loop autopause mute></video>
        </div>
        <div class="column" style="width:50%">
            <ul>
                <li> Supervised - training data aligned to spatio-temporal atlas template </li>
                <li> Robust - can register image pairs with any initial orientation, and large translations (up to 50 mm) </li> 
                <li> Accuracy is the same regardless of initialisation </li> 
                <li> Mean translation error ~ 0.6 mm </li> 
                <li> Mean rotation error ~ 2.9 degrees </li> 
           </ul> 
        </div>
    </div>
</div>

# Complete pipeline for fetal head reconstruction

* Last step - can refine alignment (& fusion) by iteratively registering images to current fused representation using block matching.

<img width="75%" src="overview.svg">


# Comparison of fused images

<img width="75%" src="compoundings.svg">

# Applications - Faster screening for facial abnormality

* 1 minute to acquire 240 images for our method vs 10 minutes for an expert sonographer to find a good view of the face.

* Face is often obscured by the limbs or maternal tissue, thus acquiring a satisfactory view may not be possible, especially when the fetus is moving frequently.

* The pose-corrected, fused volume is easily masked, e.g. via atlas-based segmentation.

<img width="75%" src="head-shot-clinic.png">


# Faster screening for facial abnormality

<img width="100%" src="head-renderings.svg">

# Easier interpretation of skull development

* Premature fusion of cranial sutures results in abnormal cranial shape and increased risk for cognitive disabilities.

* Navigating to the optimal view to visualise a single suture is difficult and time-consuming, especially when the fetus is moving.

* Our method allows easy interpretation of head shape and sutures through manipulation of a 3D model.

<img width="100%" src="sutures2.svg">


# Low cost alternative to MRI for neurodevelopmental assessment

* Multi-view fusion has fewer artefacts compared with single-view images, which allows easier application of image analysis methods for registration and segmentation, etc.
* Could lead to low cost automated 3D biometrics in the future.
<img width="80%" src="mr-vs-us.svg">


# Problems / Future work 

## Sonographic feedback

* Deployment requires real-time feedback of views acquired
    - sonographers are scanning blind
    - guarantee full coverage 
    - speed up acquisition
    

* Currently exploring options such as real-time volume rendering

<video width="80%" src="on-the-fly-compounding.mp4" controls autoplay loop autopause mute></video>