## A (very) brief introduction to Color Doppler 

Color doppler is a medical imaging modality for blood flow imaging.  
It (suprasingly) bases on the Doppler effect.  
Doppler effect is a shift of a received signal frequency, when the wave source is moving relatively to the receiver.  
In medicine 'sources' (i.e. moving tissue, specifically blood) do not emits acoustic waves, but are 'iluminated' by a ultrasound pulses produced by a probe.  
It can be shown [Evans2000] that the shift in the received frequency is given by following equation:

$
f_d = f_t - f_r = \frac{2f_tv\cos{\alpha}}{c}
$

where
* $f_d$ - frequency shift, or doppler frequency,
* $f_t$ - transmitted frequency, 
* $f_r$ - received frequency, 
* $v$ - speed of the blood,
* $\alpha$ - the angle between the ultrasound beam and the direction of motion of the blood,
* $c$ - speed of sound in the medium. It is usually assumed that in soft tissue $c = 1540 [m/s]$.



<!-- When the medical probe transmits the ultrasound pulse, and it is scattered on moving blood (i.e. on blood cells), the received echoes changes in phase.    -->
In classical approach the probe transmits a series of (quite long) ultrasound pulses, receiving echoes after each transmit.  
The series consists of $N$ transmit/receive (TR) events, and the higher $N$, the higher sensitivity, but lower doppler framerate.  
Typically $N$ could be in the range of $8-16$ for classical methods and $32-256$ for synthetic aperture methods, however there are no strict rules.  
The TR events in the series are repeated with constant Pulse Repetition Frequency (PRF).  
Thus the time between TR events - Pulse Repetition Interval (PRI) is equal $\frac{1}{PRF}$.  
The received signals are IQ demodulated. The $f_d$ can be estimated from IQ signal by means of autocorrelation esitmator:  

$
\overline{f_d} = \frac{1}{2\pi{}PRI} 
    \tan^{-1}{\left\{ 
        \frac{\sum^{N}_{i=1}{Q(i)I(i-1) - I(i)Q(i-1)}}
             {\sum^{N}_{i=1}{I(i)I(i-1) + Q(i)Q(i-1)}}
    \right\}}
$



<!-- , and next the phase is estimated for each sample.  
Then, for each sample the phase changes $\Delta{\theta}$ from TR to TR are calculated. Sometimes it is refered as phase changes in 'slow time'.  
The doppler frequency (averaged over time) can be calculated from the formula

$
\overline{f_d} = \frac{1}{N-1} \sum_{n=1}^{N-1} \frac{\Delta{\theta}_{n}}{PRI}
$

 -->
Then, we can use the following formula to estimate (average) speed of the blood flow.  

$
v_s = \frac{\overline{f_d}}{f_t} \frac{c}{2\cos{\alpha}}
$





## The complete processing for Color Doppler using CUDA

In [1]:
# import numpy as np
import cupy as cp

Kernel definitions

In [2]:
source = r'''
extern "C" __global__
void test_sum(const float* x1, 
          const float* x2, 
          float* y) 
{
     int i = blockDim.x * blockIdx.x + threadIdx.x;
     y[i] = x1[i] + x2[i];
}
'''
test_kernel = cp.RawKernel(source, 'test_sum')


In [21]:
source = r"""
extern "C" __global__ 
void dopplerColor(float* color, 
                  float* power, 
                  float2 const* iqImg, 
                  int const nZPix, 
                  int const nXPix, 
                  int const nRep)
{
    int z = blockIdx.x * blockDim.x + threadIdx.x;
    int x = blockIdx.y * blockDim.y + threadIdx.y;
    
    float2 iqPixCurr, iqPixPrev;
    float auxPower;
    float2 auxColor = {0.f, 0.f};
    
    if (z>=nZPix || x>=nXPix) {
        return;
    }
    
    /* Color & Power estimation */
    iqPixCurr = iqImg[z + x * nZPix];
    auxPower = iqPixCurr.x * iqPixCurr.x + iqPixCurr.y * iqPixCurr.y;
    for (int iRep=1; iRep<nRep; iRep++) {
        iqPixPrev = iqPixCurr;
        iqPixCurr = iqImg[z + x * nZPix + iRep * nZPix * nXPix];
        
        auxPower += iqPixCurr.x * iqPixCurr.x + iqPixCurr.y * iqPixCurr.y;
        auxColor.x += iqPixCurr.x * iqPixPrev.x + iqPixCurr.y * iqPixPrev.y;
        auxColor.y += iqPixCurr.y * iqPixPrev.x - iqPixCurr.x * iqPixPrev.y;
    }
    //color[z + x*nZPix] = atan2f(auxColor.y, auxColor.x);
    color[z + x*nZPix] = 10;
    power[z + x*nZPix] = auxPower / nRep;
}

"""
cd_kernel = cp.RawKernel(source, 'dopplerColor')


In [None]:
source = r"""
#include <cupy/complex.cuh>
extern "C" __global__ 
void dopplerColor(float *color, 
                  float *power, 
                  const complex<float> *iqImg, 
                  int const nZPix, 
                  int const nXPix, 
                  int const nRep)
{
    int z = blockIdx.x * blockDim.x + threadIdx.x;
    int x = blockIdx.y * blockDim.y + threadIdx.y;
    
    float2 iqPixCurr, iqPixPrev;
    float auxPower;
    float2 auxColor = {0.f, 0.f};
    
    if (z>=nZPix || x>=nXPix) {
        return;
    }
    
    /* Color & Power estimation */
    iqPixCurr = iqImg[z + x * nZPix];
    auxPower = iqPixCurr.x * iqPixCurr.x + iqPixCurr.y * iqPixCurr.y;
    for (int iRep=1; iRep<nRep; iRep++) {
        iqPixPrev = iqPixCurr;
        iqPixCurr = iqImg[z + x * nZPix + iRep * nZPix * nXPix];
        
        auxPower += iqPixCurr.x * iqPixCurr.x + iqPixCurr.y * iqPixCurr.y;
        auxColor.x += iqPixCurr.x * iqPixPrev.x + iqPixCurr.y * iqPixPrev.y;
        auxColor.y += iqPixCurr.y * iqPixPrev.x - iqPixCurr.x * iqPixPrev.y;
    }
    //color[z + x*nZPix] = atan2f(auxColor.y, auxColor.x);
    color[z + x*nZPix] = 10;
    power[z + x*nZPix] = auxPower / nRep;
}

"""
cd_kernel = cp.RawKernel(source, 'dopplerColor')

In [22]:
#  create some test data
nx = 8
nz = 8
n_frames = 8

# iqdata = cp.ones((nz, nx, n_frames)).astype(complex)
# iqdata = cp.arange(nx * nz * n_frames).astype(complex).reshape(nz, nx, n_frames)
iqdata = cp.arange(nx * nz * n_frames).reshape(nz, nx, n_frames)
# iqdata = iqdata + 1j*iqdata.T
color =  cp.zeros((nz, nx))
power =  cp.zeros((nz, nx))


x1 = cp.arange(25, dtype=cp.float32).reshape(5, 5)
x2 = cp.arange(25, dtype=cp.float32).reshape(5, 5)
y = cp.zeros((5, 5), dtype=cp.float32)
# print(y)
# %time test_kernel((5,5), (5,), (x1, x2, y))  # grid, block and arguments
# print(y)

grid = (nx, nz)
block = (n_frames, )

print(iqdata[0:8, 0, 0])
print(iqdata[0:8, 0, 1])
print(iqdata[0:8, 0, 2])
# print(color[0:8, 0:8, 0])
%time cd_kernel(grid, block, (color, power, iqdata, nz, nx, n_frames))
print(color)


[  0  64 128 192 256 320 384 448]
[  1  65 129 193 257 321 385 449]
[  2  66 130 194 258 322 386 450]


CompileException: /tmp/tmpmi6i5wjm/b8ad4434f51357edfa2b1f0f3f5d7aa3_2.cubin.cu(23): error: no operator "=" matches these operands
            operand types are: float2 = const thrust::complex<float>

/tmp/tmpmi6i5wjm/b8ad4434f51357edfa2b1f0f3f5d7aa3_2.cubin.cu(27): error: no operator "=" matches these operands
            operand types are: float2 = const thrust::complex<float>

2 errors detected in the compilation of "/tmp/tmpmi6i5wjm/b8ad4434f51357edfa2b1f0f3f5d7aa3_2.cubin.cu".


[[0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]]


## A very brief introduction to Vector Doppler 

## The complete processing for Vector Doppler using CUDA

## Doing B-mode and Color/Vector processing in the same time - a single GPU (multiple streams) example

### References

[Evans2000] Evans, David H., and W. Norman McDicken. Doppler ultrasound: physics, instrumentation and signal processing. Wiley-Blackwell, 2000.