# &#x1F4DD; REPORT

# Homework &#x0031;&#xFE0F;&#x20E3; 

Homework policy: the homework is individual. Students are encouraged to discuss with fellow students to try to find the main structure of the solution for a problem, especially if they are totally stuck at the beginning of the problem. However, they should work out the details themselves and write down in their own words only what they understand themselves. For every answer you provide, try to give it in its simplest form, while answering correctly. Results that are available in the course notes can be used and referenced and do not need to be rederived.

You can answer in French or in English. Do not forget to answer all subquestions. Typeset answers (Word, LaTeX, ...) are appreciated; legible scanned handwritten notes are also accepted.

#### ___Spatial Processing: Linear Interference Cancellation___
---

### **&#x2488;** Adaptation of the Spatial ICMF via LMS

<img src=images/ICMF_via_LMS.png width='' height='' > </img>

Consider adaptation of the spatial Interference Canceling Matched Filter (ICMF) depicted in the figure above. The received signal $\mathbf{y}[k]$ contains $m$ subchannels and the interference canceling filter $\mathbf{f}$ is represented as a row vector. For a generic value of $\mathbf{f}$, we get the error signal
$$
\begin{equation}
\epsilon[k](\mathbf{f}) = d[k] - \mathbf{f} \; \mathbf{x}[k] . \qquad
\end{equation}
$$

For the adaptation of $\mathbf{f}$, the error signal is $\epsilon[k]$ which also provides an estimate of the transmitted symbol sequence $a[k]$ (up to a scale factor $\|\mathbf{h}\|^2$), the desired response signal is $d[k]$, which is the spatial matched filter output $\mathbf{y}_1[k]$, and the input signal is $\mathbf{x}[k]$, which is also the output $\mathbf{y}_2[k]$ of the orthogonal complement filter $\mathbf{h}^{\perp H}$. The transmitted symbol sequence $a[k]$ and the additive noise sequence $\mathbf{v}[k]$ are both considered to be temporally white and mutually independent, whereas the noise is spatially colored with covariance matrix $R_\mathbf{VV}$ (the noise could contain interference). The transmitted power is $\sigma_a^2$. We assume in a first instance that $\hat{\mathbf{h}}[k] = \mathbf{h}$ (and hence $\hat{\mathbf{h}}^\perp[k] = \mathbf{h}^\perp$).

#### **&#x1F516;** **(&#x5F;)** ___Diagram Interpretation___

---

To express all the signal functions present in the diagram, let's analyze the flow and components step by step.

1. **Input Signal**: $a[k]$ is the transmitted symbol sequence.
2. **Channel**: $ \mathbf{y}[k] = \mathbf{h}\, a[k] + \mathbf{v}[k] $. The processing is purely spatial, so the scalar symbol $a[k]$ multiplies the $m \times 1$ channel vector $\mathbf{h}$ (there is no temporal convolution) and the noise $\mathbf{v}[k]$ is added.
3. **Splitting $ \mathbf{y}[k] $ and Applying Filters**: $ \mathbf{y}[k] $ feeds two branches, the spatial matched filter $ \mathbf{h}^H $ and the orthogonal complement filter $ \mathbf{h}^{\perp H} $, which satisfies $ \mathbf{h}^{\perp H} \mathbf{h} = 0 $. With $\hat{\mathbf{h}}[k] = \mathbf{h}$ both filters are time-invariant.
4. **Desired Signal**: $ d[k] = y_1[k] = \mathbf{h}^H \mathbf{y}[k] $ is the output of the `matched filter`.
5. **Reference Signal**: $ \mathbf{x}[k] = \mathbf{y}_2[k] = \mathbf{h}^{\perp H} \mathbf{y}[k] = \mathbf{h}^{\perp H} \mathbf{v}[k] $. Because $ \mathbf{h}^{\perp H} \mathbf{h} = 0 $, this branch contains noise and interference only.
6. **Adaptive Filter Output**: $ \mathbf{f}\, \mathbf{x}[k] $ is an estimate of the noise and interference contained in $ d[k] $.
7. **Error Signal**: $ \epsilon[k] = d[k] - \mathbf{f}\, \mathbf{x}[k] = \|\mathbf{h}\|^2 a[k] + (\mathbf{h}^H - \mathbf{f}\, \mathbf{h}^{\perp H})\, \mathbf{v}[k] $. The error signal is the ICMF output and provides the symbol estimate $ \hat{a}[k] $ up to the scale factor $ \|\mathbf{h}\|^2 $. $\boxed{ \color{green} See \text{ << Playing with the Error Signal >> } below }$

This setup is typical of adaptive interference cancellation: the adaptive filter $ \mathbf{f} $ is adjusted to minimize the power of the error $ \epsilon[k] $. Since the reference branch carries no signal, minimizing $ \sigma_\epsilon^2 $ can only reduce the noise and interference in $ d[k] $ and leaves the signal term $ \|\mathbf{h}\|^2 a[k] $ untouched.

### Playing with the Error Signal

The formulation $ \epsilon[k] = d[k] - \mathbf{f}\, \mathbf{x}[k] $ is the one used to adapt the filter $\mathbf{f}$: the filter output $ \mathbf{f}\, \mathbf{x}[k] $ estimates the interference in $ d[k] $, and subtracting it improves the quality of the desired signal.

Meanwhile the diagram labels the same error signal as $ \hat{a}[k] $. There is no contradiction: the error signal of the adaptation is also the useful output of the receiver, since $ \epsilon[k] / \|\mathbf{h}\|^2 $ equals $ a[k] $ plus residual noise.

### Interference Cancellation && Interference Cancelling

The terms interference cancellation and interference cancelling are used interchangeably in this report; the second is simply the form that appears in the name ICMF. What matters is that the error signal plays two roles:
- **$ \epsilon[k] = d[k] - \mathbf{f}\, \mathbf{x}[k] $ as adaptation error**: this is the quantity whose power the Least Mean Squares (LMS) algorithm reduces.
- **$\color{salmon} \epsilon[k] = \|\mathbf{h}\|^2\, \hat{a}[k] $ as receiver output**: in `interference` or noise `cancellation contexts` the error signal is the cleaned-up signal that is passed on to the detector.

In the subsequent sections, the `Error Signal` $\epsilon$ appears in several forms:
- $\epsilon[k](\mathbf{f}) = d[k] - \mathbf{f} \, \mathbf{x}[k]$ for a generic value of $\mathbf{f}$
- $e[k] = \epsilon[k](\mathbf{f}^o) = d[k] - \mathbf{f}^o \, \mathbf{x}[k]$ for the optimal error signal. Note the $e$, not $\epsilon$
- $\epsilon[k] = \epsilon[k](\mathbf{f}[k-1])$ for the a priori error signal of the LMS adaptation
- $\epsilon[k](\mathbf{f}[k])$ for the a posteriori error signal, evaluated after the update
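The signal flow of the diagram can be sketched numerically. The snippet below is a minimal illustration for $m = 2$ subchannels; the channel and noise values are arbitrary assumptions, and the construction of `h_perp` is one convenient choice that satisfies $\mathbf{h}^{\perp H}\mathbf{h} = 0$ for $m = 2$ (so the reference signal is a scalar here). It checks that the reference branch does not depend on the transmitted symbol.

```python
def inner(u, v):
    """Hermitian inner product u^H v for lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

# Illustrative values only (not taken from the assignment).
h = [1 + 1j, 2 - 1j]
h_perp = [-h[1].conjugate(), h[0].conjugate()]  # h_perp^H h = 0 for m = 2

def branches(a, v):
    """Return (d, x): matched-filter output and reference-branch output."""
    y = [hi * a + vi for hi, vi in zip(h, v)]   # y[k] = h a[k] + v[k]
    d = inner(h, y)                             # d[k] = h^H y[k]
    x = inner(h_perp, y)                        # x[k] = h_perp^H y[k]
    return d, x

v = [0.3 - 0.2j, -0.1 + 0.4j]                   # one noise snapshot
d_plus, x_plus = branches(1 + 0j, v)
d_minus, x_minus = branches(-1 + 0j, v)

print("x unchanged by the symbol:", abs(x_plus - x_minus) < 1e-12)
print("d changes by 2*||h||^2   :", d_plus - d_minus)
```

Flipping the symbol changes $d[k]$ by $2\|\mathbf{h}\|^2$ but leaves $\mathbf{x}[k]$ as it was, which is why subtracting $\mathbf{f}\,\mathbf{x}[k]$ cannot remove any signal.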

#### **&#x1F516;** **(&#x61;)** ___LMMSE design___

---

Express the LMMSE filter $\mathbf{f}^o$, that minimizes $\sigma_{\epsilon}^2$, in terms of $\mathbf{h}$, $\mathbf{h}^{\perp}$ and $R_\mathbf{VV}$.
Let $e[k] = \epsilon[k](\mathbf{f}^o)$ be the optimal error signal. Derive an expression for $e[k]$ in terms of the quantities in the figure.

Introduce the matrix square root $R_{\mathbf{VV}} = R_{\mathbf{VV}}^{1/2}R_{\mathbf{VV}}^{H/2}$ and the transformed quantities
$\mathbf{h}' = R_{\mathbf{VV}}^{H/2}\mathbf{h}$ and $\mathbf{h}^{\perp '} = R_{\mathbf{VV}}^{H/2}\mathbf{h}^{\perp}$. Note that if $R_{\mathbf{VV}}$ is not a multiple of identity, then $\mathbf{h}'$ and $\mathbf{h}^{\perp '}$ are no longer orthogonal. Also introduce $\mathbf{v}'[k] = R_{\mathbf{VV}}^{-1/2}\mathbf{v}[k]$ for which $R_{\mathbf{V'V'}} = I_m$. Find now a simplified expression for $e[k]$ and show that the corresponding MMSE is
$$
\begin{equation}
\sigma_e^2 = \sigma_a^2 \| \mathbf{h} \|^4 + \| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2
\end{equation}
$$
where $P_{\mathbf{g}} = \mathbf{g}(\mathbf{g}^H \mathbf{g})^{-1}\mathbf{g}^H , P_{\mathbf{g}}^{\perp} = I - P_{\mathbf{g}}$ are the projection matrices on the column space of $\mathbf{g}$ and its orthogonal complement respectively. When $R_{\mathbf{VV}} = \sigma_v^2 I_m$, what does $\| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2$ simplify to?

**&#x1F4DD;** Express the LMMSE filter $\mathbf{f}^o$, that minimizes $\sigma_{\epsilon}^2$, in terms of $\mathbf{h}$, $\mathbf{h}^{\perp}$ and $R_\mathbf{VV}$.

---

To derive the Linear Minimum Mean Square Error (LMMSE) filter in a condensed form that efficiently incorporates the spatial properties of channel vectors and the noise covariance matrix, let's follow a straightforward approach:

***Overview***

The LMMSE filter aims to minimize the expected square error between the desired response and the output of a filter applied to an input signal. This filter accounts for the signal and noise characteristics to optimize signal estimation.

In mathematical terms, the LMMSE filter is obtained by solving the following optimization problem: $ \mathbf{f}^o = \underset{\mathbf{f}}{\text{arg min}} \; \mathrm{E}\left[ \left| d[k] - \mathbf{f} \mathbf{x}[k] \right|^2 \right] $, where $\mathrm{E}[\cdot]$ denotes the expectation operator. The squared magnitude is required because the signals are complex.

***Derivation Steps***

- ##### **Step 1: Define Variables**
  
- **Received Signal $ \mathbf{y}[k] $**: Composed of the transmitted signal affected by the channel $ \mathbf{h} $ and noise $ \mathbf{v}[k] $.
  $
  \mathbf{y}[k] = \mathbf{h} a[k] + \mathbf{v}[k]
  $

- **Desired Response $ d[k] $**: The output from the matched filter targeting the component in the direction of $ \mathbf{h} $.
  $
  d[k] = \mathbf{h}^H \mathbf{y}[k]
  $

- **Input Signal $ \mathbf{x}[k] $**: Extracts the component orthogonal to $ \mathbf{h} $.
  $
  \mathbf{x}[k] = \mathbf{h}^{\perp H} \mathbf{y}[k]
  $

- ##### **Step 2: Calculate Covariance**

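- **Auto-Covariance $ R_{\mathbf{xx}} $** and **Cross-Covariance $ R_{d\mathbf{x}} $** based on $ \mathbf{x}[k] $ and $ d[k] $. Since $ \mathbf{h}^{\perp H} \mathbf{h} = 0 $, the input reduces to $ \mathbf{x}[k] = \mathbf{h}^{\perp H} \mathbf{v}[k] $, and the signal term $ \sigma_a^2 \mathbf{h}\mathbf{h}^H $ of $ R_{\mathbf{yy}} $ drops out of both moments:
  $
  R_{\mathbf{xx}} = \mathrm{E}\, \mathbf{x}[k]\mathbf{x}^H[k] = \mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp, \quad R_{d\mathbf{x}} = \mathrm{E}\, d[k]\mathbf{x}^H[k] = \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}^\perp
  $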

- ##### **Step 3: LMMSE Filter Formula**
  
- The filter that minimizes the mean squared error:
  $
  \mathbf{f}^o = R_{d\mathbf{x}} R_{\mathbf{xx}}^{-1}
  $
  Substituting the derived covariance expressions, we obtain:

  $ {\color{salmon} \framebox[1][10]{ Solution: } }$ 

  $
  \boxed{
  \mathbf{f}^o = (\mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}^\perp) (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1}
  }
  $
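The boxed filter can be checked numerically. The sketch below is only an illustration for $m = 2$, where $\mathbf{h}^\perp$ is a single column and $\mathbf{f}^o$ is a scalar; the values of `h`, `R_vv` and `sigma_a2` are arbitrary assumptions. It evaluates the MSE $\sigma_d^2 - 2\,\mathrm{Re}(\mathbf{f}^* R_{d\mathbf{x}}) + |\mathbf{f}|^2 R_{\mathbf{xx}}$ and confirms that perturbing $\mathbf{f}^o$ increases it.

```python
def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

# Illustrative values only (not taken from the assignment).
h = [1 + 1j, 2 - 1j]
h_perp = [-h[1].conjugate(), h[0].conjugate()]      # h_perp^H h = 0
R_vv = [[2.0 + 0j, 0.5 + 0.5j],
        [0.5 - 0.5j, 1.0 + 0j]]                     # Hermitian, positive definite
sigma_a2 = 1.0

R_xx = inner(h_perp, matvec(R_vv, h_perp)).real     # h_perp^H R_vv h_perp
R_dx = inner(h, matvec(R_vv, h_perp))               # h^H R_vv h_perp
f_opt = R_dx / R_xx                                 # f^o = R_dx R_xx^{-1}

# Power of d[k]: signal part plus matched-filter noise.
sigma_d2 = sigma_a2 * inner(h, h).real ** 2 + inner(h, matvec(R_vv, h)).real

def mse(f):
    """E|d - f x|^2 for a scalar filter f (case m = 2)."""
    return sigma_d2 - 2 * (f.conjugate() * R_dx).real + abs(f) ** 2 * R_xx

print("f_opt      =", f_opt)
print("MSE(f_opt) =", mse(f_opt))
print("MSE(0)     =", mse(0j))
```

With `f = 0` the circuit reduces to the plain matched filter, so the gap between `mse(0j)` and `mse(f_opt)` is the noise power removed by the canceller.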


**&#x1F4DD;** Derive an expression for $e[k]$ in terms of the quantities in the figure.

---

The optimal error follows directly from $ e[k] = d[k] - \mathbf{f}^o \mathbf{x}[k] $. It is convenient to write it both in the original quantities of the figure and in the whitened quantities, in which the noise vector $ \mathbf{v}'[k] $ has covariance matrix $ I_m $.

- #### Transformations Introduced:

1. **Noise Covariance Square Root**:
   $
   R_{\mathbf{VV}} = R_{\mathbf{VV}}^{1/2}R_{\mathbf{VV}}^{H/2}
   $
2. **Transformed Channel Vector and Orthogonal Complement**:
   $
   \mathbf{h}' = R_{\mathbf{VV}}^{H/2}\mathbf{h}, \quad \mathbf{h}^{\perp '} = R_{\mathbf{VV}}^{H/2}\mathbf{h}^{\perp}
   $
3. **Transformed Noise Vector**:
   $
   \mathbf{v}'[k] = R_{\mathbf{VV}}^{-1/2}\mathbf{v}[k]
   $
   where $ R_{\mathbf{V'V'}} = R_{\mathbf{VV}}^{-1/2} R_{\mathbf{VV}} R_{\mathbf{VV}}^{-H/2} = I_m $.

#### Step 1: Signal Components in the Transformed Quantities

Only the noise terms are affected by the transformation:
- **Received Signal** $ \mathbf{y}[k] $:
  $
  \mathbf{y}[k] = \mathbf{h} a[k] + \mathbf{v}[k] = \mathbf{h} a[k] + R_{\mathbf{VV}}^{1/2} \mathbf{v}'[k]
  $

- **Desired Signal** $ d[k] $:
  $
  d[k] = \mathbf{h}^H \mathbf{y}[k] = \|\mathbf{h}\|^2 a[k] + \mathbf{h}^H R_{\mathbf{VV}}^{1/2} \mathbf{v}'[k] = \|\mathbf{h}\|^2 a[k] + \mathbf{h}'^H \mathbf{v}'[k]
  $

- **Input Signal** $ \mathbf{x}[k] $:
  $
  \mathbf{x}[k] = \mathbf{h}^{\perp H} \mathbf{y}[k] = \mathbf{h}^{\perp H} \mathbf{v}[k] = \mathbf{h}^{\perp 'H} \mathbf{v}'[k]
  $
  The signal term vanishes because $ \mathbf{h}^{\perp H} \mathbf{h} = 0 $. Note that $ \mathbf{h}'^H \mathbf{h}^{\perp '} = \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}^{\perp} $ is nonzero in general, so $ \mathbf{h}' $ and $ \mathbf{h}^{\perp '} $ are not orthogonal unless $ R_{\mathbf{VV}} $ is a multiple of identity.

#### Step 2: Expression for $ e[k] $ in the Transformed Quantities

The filter $ \mathbf{f}^o $ is unchanged; it is only rewritten with $ \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}^\perp = \mathbf{h}'^H \mathbf{h}^{\perp '} $ and $ \mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp = \mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '} $:
$ \mathbf{f}^o = (\mathbf{h}'^H \mathbf{h}^{\perp '}) (\mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '})^{-1} $

Then:
$ e[k] = d[k] - \mathbf{f}^o \mathbf{x}[k] = \|\mathbf{h}\|^2 a[k] + \mathbf{h}'^H \mathbf{v}'[k] - \mathbf{h}'^H \mathbf{h}^{\perp '} (\mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '})^{-1} \mathbf{h}^{\perp 'H} \mathbf{v}'[k] $

#### Step 3: Rewriting $ e[k] $ in Original Terms

To rewrite $ e[k] $ using the original $ \mathbf{h} $ and $ \mathbf{v} $, the transformed variables are expressed back in terms of the originals:

- ##### a). **Transform Back the Noise and Channel Vectors**
- **Signal Term**:
  $
  \mathbf{h}^H \mathbf{h} \, a[k] = \|\mathbf{h}\|^2 a[k]
  $
  The signal term is not touched by the canceller because $ \mathbf{x}[k] $ contains no signal. Note that $ \mathbf{h}'^H \mathbf{h}' = \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h} $, which differs from $ \|\mathbf{h}\|^2 $ in general.

- **Noise Term**: substituting $ \mathbf{v}'[k] = R_{\mathbf{VV}}^{-1/2} \mathbf{v}[k] $, $ \mathbf{h}'^H = \mathbf{h}^H R_{\mathbf{VV}}^{1/2} $ and $ \mathbf{h}^{\perp 'H} R_{\mathbf{VV}}^{-1/2} = \mathbf{h}^{\perp H} $ gives
  $
  \mathbf{h}^H \Big[ \mathit{I_m} - R_{\mathbf{VV}} \mathbf{h}^{\perp} (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1} \mathbf{h}^{\perp H} \Big] \mathbf{v}[k]
  $
  The bracketed matrix is an oblique projector: it annihilates the range of $ R_{\mathbf{VV}} \mathbf{h}^{\perp} $ and its output is orthogonal to the columns of $ \mathbf{h}^{\perp} $.

- ##### b). **Combine Back into the Error Equation**

  Collecting the signal and noise terms:

  $ {\color{salmon}  \framebox[1][10]{ Solution: } }$

  $\boxed{
  e[k] = \mathbf{h}^H (\mathbf{h} \, a[k] + \Big[ \mathit{I_m} - R_{\mathbf{VV}}\mathbf{h}^{\perp} (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1} \mathbf{h}^{\perp H} \Big] \mathbf{v}[k])
  }
  $

  This is the optimal error signal $ e[k] = \epsilon[k](\mathbf{f}^o) $, not the a priori error of the adaptation. It keeps the full signal term and removes from the noise in $ d[k] $ everything that can be linearly predicted from the reference signal $ \mathbf{x}[k] $, which is how spatially colored noise and interference get reduced.

**&#x1F4DD;** Finding a simplified expression for $e[k]$ and showing the corresponding MMSE


---

To transition from the complex expression for $ e[k] $ involving the projection matrix on $ \mathbf{h}^\perp $ in the original space to a simplified expression involving transformations and projection matrices in the transformed space, we need to map each term appropriately and understand how the transformations and projections relate. Let’s break down this transition step-by-step:

___Starting Point___

The initial formula:
$ 
e[k] = \mathbf{h}^H (\mathbf{h} \, a[k] + \Big[ \mathit{I_m} - R_{\mathbf{VV}}\mathbf{h}^{\perp} (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1} \mathbf{h}^{\perp H} \Big] \mathbf{v}[k])
$

This formula accounts for:
1. **Signal Component:** $ \mathbf{h}^H \mathbf{h} a[k] $
2. **Noise Component:** Modified by a projection that reduces noise components along the space spanned by $ \mathbf{h}^\perp $, adjusted by $ R_{\mathbf{VV}} $.

___Transforming the Terms___

1. **Signal Component**

- The signal component $\mathbf{h}^H \mathbf{h} \, a[k]$ is straightforward. It simplifies to $\|\mathbf{h}\|^2 a[k]$ since $\mathbf{h}^H \mathbf{h}$ is the power of $\mathbf{h}$, or the square of its norm.

2. **Noise Component**

- The noise term of the starting formula is
  $
  \mathbf{h}^H \Big[ \mathit{I_m} - R_{\mathbf{VV}}\mathbf{h}^{\perp} (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1} \mathbf{h}^{\perp H} \Big] \mathbf{v}[k] 
  $
  The bracketed matrix is an oblique projector: it annihilates the range of $ R_{\mathbf{VV}} \mathbf{h}^\perp $ and its output is orthogonal to the columns of $ \mathbf{h}^\perp $.

To connect this with the transformed version $ \mathbf{h}' $ and $ \mathbf{v}' $, recall:
- $ \mathbf{h}' = R_{\mathbf{VV}}^{H/2} \mathbf{h} $
- $ \mathbf{v}'[k] = R_{\mathbf{VV}}^{-1/2} \mathbf{v}[k] $

Thus, rewriting the projection matrix for the transformed variables:
- We use $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} = I - \mathbf{h}^{\perp '} (\mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '})^{-1} \mathbf{h}^{\perp 'H} $ where $ \mathbf{h}^{\perp '} = R_{\mathbf{VV}}^{H/2} \mathbf{h}^\perp $.
- Applying this to $ \mathbf{v}'[k] $, and using $ \mathbf{h}'^H = \mathbf{h}^H R_{\mathbf{VV}}^{1/2} $, $ \mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '} = \mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp $ and $ \mathbf{h}^{\perp 'H} R_{\mathbf{VV}}^{-1/2} = \mathbf{h}^{\perp H} $, gives:
  $
  \mathbf{h}'^H \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{v}'[k] = \mathbf{h}'^H \Big[I - \mathbf{h}^{\perp '} (\mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '})^{-1} \mathbf{h}^{\perp 'H}\Big] R_{\mathbf{VV}}^{-1/2} \mathbf{v}[k] = \mathbf{h}^H \Big[ \mathit{I_m} - R_{\mathbf{VV}}\mathbf{h}^{\perp} (\mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^\perp)^{-1} \mathbf{h}^{\perp H} \Big] \mathbf{v}[k]
  $
  which is exactly the noise term above.

___Conclusion___

This translates the projection and transformation of the noise vector into the following expression:

$ {\color{salmon}  \framebox[1][10]{ Solution: } }$

$\boxed{
e[k] = \|\mathbf{h}\|^2 a[k] + \mathbf{h}^{'H} \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{v}'[k]
}$

This form shows that, after whitening, the optimal canceller removes from $ \mathbf{h}' $ its component in the column space of $ \mathbf{h}^{\perp '} $. Only the whitened noise seen through $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' $ remains in $ e[k] $, while the signal term $ \|\mathbf{h}\|^2 a[k] $ is preserved.
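The MMSE quoted in the problem statement follows from the boxed expression in a few lines. The sketch below uses only what is stated above: $a[k]$ and $\mathbf{v}'[k]$ are zero-mean and independent, $R_{\mathbf{V'V'}} = I_m$, and an orthogonal projection matrix is Hermitian and idempotent.

$$
\begin{aligned}
\sigma_e^2 = \mathrm{E}\,|e[k]|^2
&= \|\mathbf{h}\|^4 \, \mathrm{E}\,|a[k]|^2 + \mathbf{h}'^H \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \, \mathrm{E}\big[\mathbf{v}'[k]\,\mathbf{v}'^H[k]\big] \, \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp H} \mathbf{h}' \\
&= \sigma_a^2 \|\mathbf{h}\|^4 + \mathbf{h}'^H \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \\
&= \sigma_a^2 \|\mathbf{h}\|^4 + \| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2
\end{aligned}
$$

The cross terms between $a[k]$ and $\mathbf{v}'[k]$ vanish by independence, which is why the signal and noise powers simply add.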

**&#x1F4DD;** When $R_{\mathbf{VV}} = \sigma_v^2 I_m$, what does $\| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2$ simplify to?

---

When $R_{\mathbf{VV}} = \sigma_v^2 I_m$ the whitening reduces to a plain scaling, so the expression $\| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2$ can be evaluated in closed form:

___Background___

With $ R_{\mathbf{VV}}^{H/2} = \sigma_v I_m $, the transformations and projection matrices become:
- **Channel Transformation**: $ \mathbf{h}' = \sigma_v \mathbf{h} $ and $ \mathbf{h}^{\perp '} = \sigma_v \mathbf{h}^{\perp} $
- **Projection Matrices**: For a vector $ \mathbf{g} $,
  - $ \mathit{P}_{\mathbf{g}} = \mathbf{g} (\mathbf{g}^H \mathbf{g})^{-1} \mathbf{g}^H $
  - $ \mathit{P}_{\mathbf{g}}^{\perp} = I - \mathit{P}_{\mathbf{g}} $

___Key Calculation___

- The scaling preserves orthogonality: $ \mathbf{h}^{\perp 'H} \mathbf{h}' = \sigma_v^2 \, \mathbf{h}^{\perp H} \mathbf{h} = 0 $.
- Hence $ \mathbf{h}' $ has no component in the column space of $ \mathbf{h}^{\perp '} $:
  $ \mathit{P}_{\mathbf{h}^{\perp '}} \mathbf{h}' = \mathbf{h}^{\perp '} (\mathbf{h}^{\perp 'H} \mathbf{h}^{\perp '})^{-1} \mathbf{h}^{\perp 'H} \mathbf{h}' = 0 $
  and therefore $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' = \mathbf{h}' - \mathit{P}_{\mathbf{h}^{\perp '}} \mathbf{h}' = \mathbf{h}' $.

___Simplification___

The projection $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} $ removes the part of its argument that lies in the column space of $ \mathbf{h}^{\perp '} $. Since $ \mathbf{h}' = \sigma_v \mathbf{h} $ is orthogonal to that space, nothing is removed:

- The expression $ \| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2 $ simplifies to:
  $ \| \sigma_v \mathbf{h} \|^2 $
  because $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' $ is simply $ \mathbf{h}' $ itself:


  $ {\color{salmon}  \framebox[1][10]{ Solution: } }$


  $ \boxed{
  \| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2 = \| \sigma_v \mathbf{h} \|^2 = \sigma_v^2 \| \mathbf{h} \|^2   
  }$

___Conclusion___

When $ R_{\mathbf{VV}} = \sigma_v^2 I_m $, the transformed channel $ \mathbf{h}' $ stays orthogonal to $ \mathbf{h}^{\perp '} $, so the projection $ \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} $ leaves it untouched and $ \| \mathit{P}_{\mathbf{h}^{\perp '}}^{\perp} \mathbf{h}' \|^2 = \sigma_v^2 \| \mathbf{h} \|^2 $, the noise power at the matched filter output. With spatially white noise the reference branch carries nothing that is correlated with the noise in $ d[k] $, so $ \mathbf{f}^o = \mathbf{h}^H (\sigma_v^2 I_m) \mathbf{h}^\perp (\cdot)^{-1} = 0 $ and the ICMF reduces to the plain matched filter.
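The white-noise result can be compared numerically with a colored-noise case. The sketch below is an illustration for $m = 2$ only; the values of `h`, `sigma_v2` and `R_col` are arbitrary assumptions, and the Cholesky factor is used as one valid choice of the square root $R_{\mathbf{VV}}^{1/2}$.

```python
import math

def inner(u, v):
    """Hermitian inner product u^H v for lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def cholesky_2x2(R):
    """Lower-triangular L with R = L L^H for a 2x2 Hermitian positive definite R."""
    l11 = math.sqrt(R[0][0].real)
    l21 = R[1][0] / l11
    l22 = math.sqrt(R[1][1].real - abs(l21) ** 2)
    return [[l11, 0.0], [l21, l22]]

def apply_LH(L, v):
    """Compute L^H v, i.e. R^{H/2} v when R^{1/2} = L."""
    return [L[0][0].conjugate() * v[0] + L[1][0].conjugate() * v[1],
            L[1][1].conjugate() * v[1]]

def residual_noise_power(h, h_perp, R):
    """|| P^perp_{h_perp'} h' ||^2 with h' = R^{H/2} h and h_perp' = R^{H/2} h_perp."""
    L = cholesky_2x2(R)
    hp = apply_LH(L, h)                       # h'
    g = apply_LH(L, h_perp)                   # h_perp'
    coeff = inner(g, hp) / inner(g, g)        # (g^H g)^{-1} g^H h'
    resid = [a - coeff * b for a, b in zip(hp, g)]
    return inner(resid, resid).real

# Illustrative values only (not taken from the assignment).
h = [1 + 1j, 2 - 1j]
h_perp = [-h[1].conjugate(), h[0].conjugate()]
sigma_v2 = 0.5
R_white = [[sigma_v2 + 0j, 0j], [0j, sigma_v2 + 0j]]
R_col = [[2.0 + 0j, 0.5 + 0.5j], [0.5 - 0.5j, 1.0 + 0j]]

white = residual_noise_power(h, h_perp, R_white)
colored = residual_noise_power(h, h_perp, R_col)

print("white  :", white, " vs sigma_v^2 ||h||^2 =", sigma_v2 * inner(h, h).real)
print("colored:", colored)
```

In the white case the two printed numbers agree; in the colored case the residual is smaller than the matched-filter noise power $\mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}$ because part of the noise is cancelled.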

#### **&#x1F516;** **(&#x62;)** ___LMS adaptation of $f$___

---

The &#x1F449; LMS algorithm consists of applying one iteration, per sampling period, of the steepest- descent strategy to the instantaneous error criterion $|\epsilon[k](\mathbf{f})|^2 = \epsilon^*[k](\mathbf{f}) \, \epsilon[k](\mathbf{f})$. In the complex signals case, this becomes
$$
\begin{equation}
\mathbf{f}[k] = \left.\mathbf{f}[k - 1] - \mu \frac{\partial|\epsilon[k](\mathbf{f})|^2}{\partial{\mathbf{f^*}}}\right|_{\mathbf{f}=\mathbf{f}[k-1]}
\end{equation}
$$
Work out the gradient term in this LMS update. We shall simplify the notation for the a priori error signal as $\epsilon[k] = \epsilon[k](\mathbf{f}[k - 1])$.
To check that the gradient has indeed to be taken w.r.t. $\mathbf{f}^*$ (and not $\mathbf{f}$), express the a posteriori error signal $\epsilon[k](\mathbf{f}[k])$ as a function of the a priori error signal and observe that the update leads to a smaller error signal for a proper choice of stepsize $\mu$.

**&#x1F4DD;** Derive the gradient term in the LMS (Least Mean Squares) algorithm.

---

To derive the Least Mean Squares (LMS) algorithm with respect to complex signals, particularly addressing the computation of the gradient term required for the update and the significance of using the conjugate transpose, here is a compact review:

___Context and Equation___

The LMS algorithm updates the filter coefficients by applying a steepest-descent method to minimize the instantaneous error criterion, which for complex signals involves the complex conjugate: $ |\epsilon[k](\mathbf{f})|^2 = \epsilon^*[k](\mathbf{f}) \epsilon[k](\mathbf{f}) $ where the error $ \epsilon[k] $ at iteration $ k $ is defined as:

  $  {\color{salmon} \framebox[1][10]{ a priori error signal : } }$


$\boxed{
\epsilon[k] = d[k] - \mathbf{f}[k-1] \mathbf{x}[k] 
}$

Here, $ d[k] $ is the desired signal, $ \mathbf{x}[k] $ is the input, and $ \mathbf{f}[k-1] $ are the filter coefficients from the previous iteration.

___Gradient Computation___

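To update the coefficients, the gradient of the squared error magnitude with respect to the conjugate of the filter coefficients ($\mathbf{f}^*$) must be computed. In this calculus $\mathbf{f}$ and $\mathbf{f}^*$ are treated as independent variables, and for a real-valued cost the derivative with respect to $\mathbf{f}^*$ gives the direction of steepest ascent. Only the factor $ \epsilon^*[k](\mathbf{f}) = d^*[k] - \mathbf{f}^* \mathbf{x}^*[k] $ depends on $\mathbf{f}^*$; the factor $ \epsilon[k](\mathbf{f}) $ depends on $\mathbf{f}$ alone.

The gradient of $ |\epsilon[k]|^2 $ with respect to $ \mathbf{f}^* $ is: $ \frac{\partial |\epsilon[k]|^2}{\partial \mathbf{f}^*} = \frac{\partial}{\partial \mathbf{f}^*} \left((d[k] - \mathbf{f} \, \mathbf{x}[k])^*(d[k] - \mathbf{f} \, \mathbf{x}[k])\right) $

Differentiating the first factor and evaluating at $ \mathbf{f} = \mathbf{f}[k-1] $ yields a row vector, like $\mathbf{f}$ itself: $ \left.\frac{\partial |\epsilon[k]|^2}{\partial \mathbf{f}^*}\right|_{\mathbf{f}=\mathbf{f}[k-1]} = \epsilon[k] \, \frac{\partial \epsilon^*[k]}{\partial \mathbf{f}^*} = -\epsilon[k] \, \mathbf{x}^H[k] $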

___Update Rule___

Incorporating the derived gradient into the update equation provides the rule for adjusting the filter coefficients:

  $   {\color{salmon} \framebox[1][10]{ update equation solution: } }$

$\boxed{
\mathbf{f}[k] = \mathbf{f}[k-1] + \mu \epsilon[k] \mathbf{x}^H[k] 
}$

This update moves the filter coefficients along the negative gradient of the instantaneous squared error $|\epsilon[k]|^2$, which on average reduces the mean squared error. The term $\mathbf{x}^H[k]$ is the conjugate transpose of the input vector; it is a row vector, so its dimensions match those of $\mathbf{f}$.

___Validation of the Update___

To validate that this update leads to a reduction in the error for an appropriate choice of step size ($\mu$), 
- consider the a posteriori error signal after applying the update: $ \epsilon[k](\mathbf{f}[k]) = d[k] - \mathbf{f}[k] \mathbf{x}[k] $
- Substituting the update equation: $ \epsilon[k](\mathbf{f}[k]) = d[k] - (\mathbf{f}[k-1] + \mu \epsilon[k] \mathbf{x}^H[k]) \mathbf{x}[k] $

   $ {\color{salmon} \framebox[1][10]{ a posteriori error signal solution: } }$

   $\boxed{
   \epsilon[k](\mathbf{f}[k]) = \epsilon[k] - \mu \epsilon[k] \|\mathbf{x}[k]\|^2 = ( 1 - \mu \|\mathbf{x}[k]\|^2 ) \epsilon[k]
   }$

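   The factor $1 - \mu \|\mathbf{x}[k]\|^2$ is real, and $|1 - \mu \|\mathbf{x}[k]\|^2| < 1$ exactly when $0 < \mu < 2 / \|\mathbf{x}[k]\|^2$. For any stepsize in that range the a posteriori error is smaller in magnitude than the a priori error, $|\epsilon[k](\mathbf{f}[k])| < |\epsilon[k]|$. Had the gradient been taken w.r.t. $\mathbf{f}$ instead, the correction would be $\mu \, \epsilon^*[k] \, \mathbf{x}^T[k]$ and the a posteriori error would become $\epsilon[k] - \mu \, \epsilon^*[k] \, \mathbf{x}^T[k] \mathbf{x}[k]$. Since $\mathbf{x}^T[k] \mathbf{x}[k]$ is complex in general, that error is not guaranteed to shrink, which confirms that the gradient must be taken w.r.t. $\mathbf{f}^*$.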
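The a posteriori relation can be verified with a single update step. The sketch below is a minimal illustration: the filter, input, desired sample and stepsize are arbitrary assumptions, and `wrong_step` is a hypothetical helper that applies the gradient taken w.r.t. $\mathbf{f}$ instead of $\mathbf{f}^*$ for comparison.

```python
def lms_step(f, x, d, mu):
    """One complex LMS update for a row-vector filter f and input vector x."""
    eps = d - sum(fi * xi for fi, xi in zip(f, x))                    # a priori error
    f_new = [fi + mu * eps * xi.conjugate() for fi, xi in zip(f, x)]  # f + mu * eps * x^H
    eps_post = d - sum(fi * xi for fi, xi in zip(f_new, x))           # a posteriori error
    return f_new, eps, eps_post

def wrong_step(f, x, d, mu):
    """Same step, but with the gradient taken w.r.t. f (for comparison only)."""
    eps = d - sum(fi * xi for fi, xi in zip(f, x))
    f_new = [fi + mu * eps.conjugate() * xi for fi, xi in zip(f, x)]  # f + mu * eps^* x^T
    return d - sum(fi * xi for fi, xi in zip(f_new, x))

# Illustrative values only.
f = [0.2 - 0.1j, -0.3 + 0.5j]
x = [1.0 + 2.0j, -0.5 + 0.5j]
d = 0.7 - 1.1j
mu = 0.05

f_new, eps, eps_post = lms_step(f, x, d, mu)
eps_wrong = wrong_step(f, x, d, mu)
x_norm2 = sum(abs(xi) ** 2 for xi in x)

print("|eps| a priori          :", abs(eps))
print("|eps| a posteriori      :", abs(eps_post))
print("(1 - mu ||x||^2) |eps|  :", (1 - mu * x_norm2) * abs(eps))
print("|eps| with wrong update :", abs(eps_wrong))
```

For these values the correct update shrinks the error by the factor $1 - \mu\|\mathbf{x}[k]\|^2$, whereas the update based on the gradient w.r.t. $\mathbf{f}$ makes it larger.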

#### **&#x1F516;** **(&#x63;)** ___Steady-state analysis of LMS adaptation of $\mathbf{f}$___

---

Let $\mathbf{\tilde{f}}[k] = \mathbf{f}^o - \mathbf{f}[k]$ be the filter error. Note that due to the presumed temporal whiteness of
$a[k]$ and $\mathbf{v}[k]$, also $d[k]$ and $\mathbf{x}[k]$ are temporally white. Hence $\mathbf{\tilde{f}}[k-1]$ and $e[k]$ are independent
(strictly speaking only uncorrelated).

Let $R_{\mathbf{\tilde{f}}\mathbf{\tilde{f}}}[k] = \mathrm{E} \, \mathbf{\tilde{f}}^H[k] \, \mathbf{\tilde{f}}[k]$. We can write for the a priori error signal and MSE

$$
\begin{equation}
\epsilon[k] = e[k] + \mathbf{\tilde{f}}[k-1] \mathbf{x}[k] \implies \underbrace{\sigma_{\epsilon[k]}^2}_\text{MSE} = \underbrace{\sigma_e^2}_\text{MMSE} + \underbrace{tr\{R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] \, R_{\mathbf{XX}} \}}_\text{EMSE}
\end{equation}
$$

The LMS update (3) for $\mathbf{f}$ leads to the following recursion for $\mathbf{\tilde{f}}[k]$:

$$
\begin{equation}
\mathbf{\tilde{f}}[k] = \mathbf{\tilde{f}}[k - 1] ( I - \mu \, \mathbf{x}[k] \, \mathbf{x}^H[k]) - \mu \, e[k] \, \mathbf{x}^H[k]
\end{equation}
$$

which, using the averaging analysis for small stepsize, can be approximated by

$$
\begin{equation}
\mathbf{\tilde{f}}[k] = \mathbf{\tilde{f}}[k - 1] ( I - \mu \, R_{\mathbf{XX}}) - \mu \, e[k] \, \mathbf{x}^H[k]
\end{equation}
$$

From (6), obtain the time evolution for the filter error correlation matrix $R_{\mathbf{\tilde{f}\tilde{f}}}[k]$. Assume
$\mu$ small so that $I - \mu R_{\mathbf{xx}}$ is stable and neglect second-order terms in $\mu$. In steady-state $\sigma_{\epsilon[\infty]}^2 = \sigma_{\epsilon}^2$ and $R_{\mathbf{\tilde{f}\tilde{f}}}[\infty] = R_{\mathbf{\tilde{f}\tilde{f}}}$. Show now from (6) that we obtain for the steady-state Excess MSE

$$
\begin{equation}
EMSE = \frac{\mu}{2} \, \sigma_e^2 \, tr\{R_{\mathbf{XX}} \}
\end{equation}
$$

Note that $R_{\mathbf{XX}} = \mathbf{h}^{\perp H} R_{\mathbf{VV}} \mathbf{h}^{\perp}$ is not Toeplitz and its diagonal elements are not all equal. Using the simplified expression derived in `(a)` for $e[k]$, show the corresponding expression for the a priori error signal $\epsilon[k]$.

The signal part in $\epsilon[k]$ is the term containing $a[k]$ and all the rest is noise. Using the expression for MMSE in `(2)`, derive an expression for the SNR in $\epsilon[k]$. Note that the noise term contains a signal part which limits the SNR attainable by the adaptive system. This is due to the fact that the signal part in the error signal $e[k]$ acts like noise for the adaptation of the filter $\mathbf{f}[k]$. This problem is generic for any adaptation algorithm and not just specific for LMS.


**&#x1F4DD;** obtain the time evolution for the filter error correlation matrix $R_{\mathbf{\tilde{f}\tilde{f}}}[k]$.

---

To derive the time evolution of the filter error correlation matrix $ R_{\mathbf{\tilde{f}\tilde{f}}}[k] $ from the update equation of the filter error $ \mathbf{\tilde{f}}[k] $, we'll start by plugging the recursive definition into the expression for the correlation matrix, and then applying expectations to simplify terms.

- ##### Step 1: Update Equation for $ \mathbf{\tilde{f}}[k] $
Given:
$ 
\mathbf{\tilde{f}}[k] = \mathbf{\tilde{f}}[k - 1] (I - \mu R_{\mathbf{XX}}) - \mu e[k] \mathbf{x}^H[k]
$

- ##### Step 2: Defining $ R_{\mathbf{\tilde{f}\tilde{f}}}[k] $
The filter error $ \mathbf{\tilde{f}}[k] $ is a row vector, so its correlation matrix is defined, as in the problem statement, by:
$ 
R_{\mathbf{\tilde{f}\tilde{f}}}[k] = \mathrm{E}[\mathbf{\tilde{f}}^H[k] \, \mathbf{\tilde{f}}[k]]
$

- ##### Step 3: Substitute $ \mathbf{\tilde{f}}[k] $ into the Correlation Matrix
Substituting the update equation into the correlation matrix definition:
$ 
R_{\mathbf{\tilde{f}\tilde{f}}}[k] = \mathrm{E}\left[\left(\mathbf{\tilde{f}}[k - 1] (I - \mu R_{\mathbf{XX}}) - \mu e[k] \mathbf{x}^H[k]\right)^H \left(\mathbf{\tilde{f}}[k - 1] (I - \mu R_{\mathbf{XX}}) - \mu e[k] \mathbf{x}^H[k]\right)\right]
$

- ##### Step 4: Expand the Expectation

Expanding the expression within the expectation, with $ (\mu \, e[k] \, \mathbf{x}^H[k])^H = \mu \, \mathbf{x}[k] \, e^*[k] $:

$ 
\begin{aligned}
R_{\mathbf{\tilde{f}\tilde{f}}}[k] = \;
    & \mathrm{E}\left[(I - \mu R_{\mathbf{XX}})^H \, \mathbf{\tilde{f}}^H[k - 1] \, \mathbf{\tilde{f}}[k - 1] \, (I - \mu R_{\mathbf{XX}})\right] 
    \\
    & - \mathrm{E}\left[\mu \, (I - \mu R_{\mathbf{XX}})^H \, \mathbf{\tilde{f}}^H[k - 1] \, e[k] \, \mathbf{x}^H[k]\right] 
    \\
    & - \mathrm{E}\left[\mu \, \mathbf{x}[k] \, e^*[k] \, \mathbf{\tilde{f}}[k - 1] \, (I - \mu R_{\mathbf{XX}})\right] 
    \\
    & + \mathrm{E}\left[\mu^2 \, |e[k]|^2 \, \mathbf{x}[k] \, \mathbf{x}^H[k]\right]
\end{aligned}
$

- ##### Step 5: Simplify Using Independence and Neglecting Higher Order Terms
The filter error $ \mathbf{\tilde{f}}[k - 1] $ is independent of $ e[k] $ and $ \mathbf{x}[k] $, and $ \mathrm{E}[e[k] \, \mathbf{x}^H[k]] = 0 $ by the orthogonality principle, so the two cross terms vanish. The last term is approximated by $ \mu^2 \sigma_e^2 R_{\mathbf{XX}} $, which is exact when $ e[k] $ and $ \mathbf{x}[k] $ are independent (for instance with Gaussian noise). In steady state $ R_{\mathbf{\tilde{f}\tilde{f}}} $ is itself of order $ \mu $, so $ \mu^2 R_{\mathbf{XX}} R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] R_{\mathbf{XX}} $ is of third order and is neglected, whereas $ \mu^2 \sigma_e^2 R_{\mathbf{XX}} $ must be kept because it balances the two terms that are linear in $ \mu $. Using $ R_{\mathbf{XX}}^H = R_{\mathbf{XX}} $,

  $  {\color{salmon}  \framebox[1][10]{ the equation simplifies to: } } $


$ 
R_{\mathbf{\tilde{f}\tilde{f}}}[k] = R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] - \mu R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] R_{\mathbf{XX}} - \mu R_{\mathbf{XX}} R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] + \mu^2 \sigma_e^2 R_{\mathbf{XX}}
$

Neglected term: $+ \mu^2 R_{\mathbf{XX}} R_{\mathbf{\tilde{f}\tilde{f}}}[k - 1] R_{\mathbf{XX}}$, which comes from expanding the first expectation.

### Conclusion
This equation describes how the error correlation matrix evolves over time under the influence of the input correlation matrix $ R_{\mathbf{XX}} $ and the learning rate $ \mu $. It provides insight into the stability and convergence characteristics of the adaptive filter, highlighting the impact of step size and input properties on the filter's learning dynamics.
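The steady-state Excess MSE requested in the problem statement follows from this recursion in two steps. The worked equations below are a sketch that assumes the recursion has converged, so that $R_{\mathbf{\tilde{f}\tilde{f}}}[k] = R_{\mathbf{\tilde{f}\tilde{f}}}[k-1] = R_{\mathbf{\tilde{f}\tilde{f}}}$.

$$
\begin{aligned}
0 &= - \mu \, R_{\mathbf{\tilde{f}\tilde{f}}} R_{\mathbf{XX}} - \mu \, R_{\mathbf{XX}} R_{\mathbf{\tilde{f}\tilde{f}}} + \mu^2 \sigma_e^2 R_{\mathbf{XX}}
\;\;\Longrightarrow\;\;
R_{\mathbf{\tilde{f}\tilde{f}}} R_{\mathbf{XX}} + R_{\mathbf{XX}} R_{\mathbf{\tilde{f}\tilde{f}}} = \mu \, \sigma_e^2 R_{\mathbf{XX}} \\
2 \, tr\{R_{\mathbf{\tilde{f}\tilde{f}}} R_{\mathbf{XX}}\} &= \mu \, \sigma_e^2 \, tr\{R_{\mathbf{XX}}\}
\;\;\Longrightarrow\;\;
EMSE = tr\{R_{\mathbf{\tilde{f}\tilde{f}}} R_{\mathbf{XX}}\} = \frac{\mu}{2} \, \sigma_e^2 \, tr\{R_{\mathbf{XX}}\}
\end{aligned}
$$

The second line takes the trace of the first and uses $tr\{AB\} = tr\{BA\}$, so both products on the left contribute the same amount.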

##### Definitions:

- **Updated Equation**: Focuses on how the error evolves and reacts to ongoing learning and adaptation, used during the active phase of filter training and initial deployment.
- **Steady-State Equation**: Provides insights into the performance and behavior of the filter after it has adapted sufficiently to the statistics of the inputs and noise, important for evaluating final filter settings and design parameters.
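The link between the update equation and the steady-state value can be illustrated by iterating the recursion until it settles. The sketch below treats the scalar case ($m = 2$, so $R_{\mathbf{XX}}$ and $R_{\mathbf{\tilde{f}\tilde{f}}}$ are numbers); the values of `mu`, `sigma_e2` and `r_xx` are arbitrary assumptions and the function name is hypothetical.

```python
def steady_state_emse(mu, sigma_e2, r_xx, n_iter=5000):
    """Iterate r[k] = r[k-1] - 2 mu r_xx r[k-1] + mu^2 sigma_e2 r_xx (scalar case)
    from r[0] = 0 and return the resulting EMSE, r * r_xx."""
    r = 0.0
    for _ in range(n_iter):
        r = r - 2 * mu * r_xx * r + mu ** 2 * sigma_e2 * r_xx
    return r * r_xx

# Illustrative values only.
mu, sigma_e2, r_xx = 0.01, 58.1875, 8.0

emse = steady_state_emse(mu, sigma_e2, r_xx)
theory = 0.5 * mu * sigma_e2 * r_xx      # (mu / 2) * sigma_e^2 * tr{R_xx}

print("EMSE from recursion:", emse)
print("EMSE from formula  :", theory)
print("EMSE / MMSE        :", emse / sigma_e2)
```

The ratio EMSE / MMSE equals $\frac{\mu}{2} \, tr\{R_{\mathbf{XX}}\}$, so halving the stepsize halves the excess error at the cost of slower convergence.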



#### **&#x1F516;** **(&#x64;)** ___Steady-state analysis of signal compensated LMS adaptation of $\mathbf{f}$___

---

Consider now compensating the signal part in the desired response signal $d[k]$ for the LMS adaptation. So we shall take as desired response $d[k] = \mathbf{h}^H (\mathbf{y}[k] - \mathbf{h} \, a[k]) = \mathbf{h}^H \, \mathbf{v}[k]$. The goal of the receiver is to detect the $a[k]$ which are hence unknown. The way this signal compensation can be implemented then is by either limiting the update for $\mathbf{f}$ to time instants at which the $a[k]$ are training symbols (used for estimating $\mathbf{h}$ also, see further) or by using the detected $a[k]$ (decision-directed (DD) strategy). In the DD strategy, the symbol $a[k]$ gets detected from $\mathbf{h}^H \, \mathbf{y}[k] - \mathbf{f}[k - 1] \, \mathbf{x}[k]$ (or delay needs to be introduced for the updating of $\mathbf{f}$ if also channel decoding gets exploited to get more reliable $a[k]$). Making abstraction of these details, consider hence $d[k] = \mathbf{h}^H \, \mathbf{v}[k]$.

Does the signal compensation influence the optimal filter setting $\mathbf{f}^o$? What do the optimal error signal $e[k]$ and associated MMSE $\sigma_e^2$ become?

The signal compensation only gets done for the adaptation of $\mathbf{f}$. The thus adapted $\mathbf{f}$ then gets used in the original ICMF circuit. So, at the output of the ICMF, with the adapted $\mathbf{f}$, what does the SNR become? With the signal compensation, the SNR degradation due to the adaptation of $\mathbf{f}$ can be made arbitrarily small.

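**&#x1F4DD;** Does the signal compensation influence the optimal filter setting $\mathbf{f}^o$?

No. The removed signal part $\|\mathbf{h}\|^2 a[k]$ is uncorrelated with $\mathbf{x}[k] = \mathbf{h}^{\perp H} \mathbf{v}[k]$, so $R_{d\mathbf{X}} = \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h}^{\perp}$ and $R_{\mathbf{XX}}$ are unchanged, and so is $\mathbf{f}^o = R_{d\mathbf{X}} R_{\mathbf{XX}}^{-1}$. The optimal error signal loses its signal part, $e[k] = (\mathbf{h}^H - \mathbf{f}^o \mathbf{h}^{\perp H}) \mathbf{v}[k]$, and the MMSE reduces to $\sigma_e^2 = \| P_{\mathbf{h}^{\perp '} }^{\perp} \mathbf{h}' \|^2$, without the $\| \mathbf{h} \|^4 \sigma_a^2$ term.

The SNR at the ICMF output then combines the signal power, the residual noise power, and the excess MSE caused by the LMS adaptation. Each component of the expression is identified below.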

### SNR Definition
The SNR is the ratio of the power of the useful signal to the power of the noise (which may include interference). The SNR expression has three parts:

1. **Signal Power ($ \| \mathbf{h} \|^4 \sigma_a^2 $)**:
   - $ \sigma_a^2 $ represents the power of the transmitted signal symbols $ a[k] $.
   - $ \| \mathbf{h} \|^4 $ arises because the signal part at the matched filter output is $ \mathbf{h}^H \mathbf{h} \, a[k] = \| \mathbf{h} \|^2 a[k] $, whose power is $ \| \mathbf{h} \|^4 \sigma_a^2 $.

2. **Noise Power ($ \| P_{\mathbf{h}^{\perp '} }^{\perp} \mathbf{h}' \|^2 $)**:
   - $ \mathbf{h}' = R_{\mathbf{VV}}^{H/2} \mathbf{h} $ and $ \mathbf{h}^{\perp '} = R_{\mathbf{VV}}^{H/2} \mathbf{h}^{\perp} $ are the channel and its orthogonal complement weighted by a square root of the noise covariance, $ R_{\mathbf{VV}} = R_{\mathbf{VV}}^{1/2} R_{\mathbf{VV}}^{H/2} $, so that $ \| \mathbf{h}' \|^2 = \mathbf{h}^H R_{\mathbf{VV}} \mathbf{h} $ is the noise power at the matched filter output.
   - $ P_{\mathbf{h}^{\perp '} }^{\perp} = I - P_{\mathbf{h}^{\perp '}} $ projects onto the orthogonal complement of the column space of $ \mathbf{h}^{\perp '} $. It removes from $ \mathbf{h}' $ the part of the noise that can be predicted from $ \mathbf{x}[k] $ and canceled by $ \mathbf{f}^o $.
   - The norm squared ($ \| \cdot \|^2 $) is therefore the residual noise power after optimal interference cancellation, i.e. the MMSE $ \sigma_e^2 $ of the signal-compensated problem.

3. **Filter Adjustment Factor ($ \frac{\mu}{2} tr \{ R_{\mathbf{XX}} \} $)**:
   - $ \mu $ is the adaptation step size in the LMS algorithm, affecting how quickly the filter adapts to changes.
   - $ tr \{ R_{\mathbf{XX}} \} $ is the trace of the input signal autocorrelation matrix, representing the total power in the input signal across all its dimensions or channels.
   - This is the LMS misadjustment: in steady state the excess MSE is approximately $ \frac{\mu}{2} tr \{ R_{\mathbf{XX}} \} \, \sigma_e^2 $ for small $ \mu $, so a larger stepsize or a more powerful input increases the effective noise power and reduces the SNR.

### Putting It All Together
The SNR formula can now be written as:
$
\text{SNR} = \frac{\| \mathbf{h} \|^4 \sigma_a^2}{\| P_{\mathbf{h}^{\perp '} }^{\perp} \mathbf{h}' \|^2 ( 1 + \frac{\mu}{2} tr \{ R_{\mathbf{XX}} \})}
$
This equation essentially compares the modified signal power to the modified noise power, with adjustments for how the filter handles the noise (influenced by $ \mu $ and the signal properties).

### Interpretation
- **Signal Component**: $ \| \mathbf{h} \|^4 \sigma_a^2 $ indicates a strong dependence of the SNR on the channel properties and transmitted signal power.
- **Noise Component**: $ \| P_{\mathbf{h}^{\perp '} }^{\perp} \mathbf{h}' \|^2 $ modified by $ 1 + \frac{\mu}{2} tr \{ R_{\mathbf{XX}} \} $ reflects the interaction between the filter dynamics and the signal characteristics, noting that aggressive adaptation or higher input power can degrade SNR by effectively increasing the noise power.

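Because the excess MSE is proportional to the MMSE, which after signal compensation contains only the residual noise power $ \| P_{\mathbf{h}^{\perp '} }^{\perp} \mathbf{h}' \|^2 $, the SNR loss factor $ 1 + \frac{\mu}{2} tr \{ R_{\mathbf{XX}} \} $ tends to 1 as $ \mu \to 0 $. The SNR degradation due to the adaptation of $ \mathbf{f} $ can thus be made arbitrarily small, at the cost of slower adaptation.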
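The SNR expression above can be evaluated numerically. The following is a minimal NumPy sketch, not part of the original solution: the helper names (`orth_complement`, `residual_noise_power`, `snr_compensated`), the choice of $\mathbf{h}^{\perp}$ as an orthonormal basis obtained by SVD, and the example values of $\mathbf{h}$ and $R_{\mathbf{VV}}$ are all assumptions made for illustration.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h (one possible h_perp)."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def residual_noise_power(h, R_vv, h_perp):
    """MMSE of estimating h^H v from x = h_perp^H v, i.e. ||P_perp h'||^2."""
    r_dx = h.conj() @ R_vv @ h_perp            # R_dX, row vector of length m-1
    R_xx = h_perp.conj().T @ R_vv @ h_perp     # R_XX
    f_opt = np.linalg.solve(R_xx.T, r_dx)      # f^o = R_dX R_XX^{-1}
    return float(np.real(h.conj() @ R_vv @ h - f_opt @ r_dx.conj()))

def snr_compensated(h, R_vv, sigma_a2, mu):
    """Output SNR with signal-compensated LMS adaptation of f (small-mu approximation)."""
    h_perp = orth_complement(h)
    R_xx = h_perp.conj().T @ R_vv @ h_perp
    excess = 1.0 + 0.5 * mu * np.real(np.trace(R_xx))
    noise = residual_noise_power(h, R_vv, h_perp) * excess
    return float(np.linalg.norm(h) ** 4 * sigma_a2 / noise)

# Hypothetical example values (m = 3 subchannels, colored noise).
h = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_vv = B @ B.conj().T + 0.5 * np.eye(3)
for mu in (0.0, 0.01, 0.1):
    print(mu, snr_compensated(h, R_vv, 1.0, mu))
```

The printed SNR decreases as `mu` grows, and `mu = 0.0` gives the SNR of the ideal (non-adaptive) ICMF.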

#### **&#x1F516;** **(&#x65;)** ___LMS adaptation of $\mathbf{h}$___

---

The transmitted symbols are in fact partitioned into known training symbols and actual data symbols. The training symbols get inserted periodically. They get used to adapt the channel estimate. From now on we shall denote the true value of the channel as $\mathbf{h}^o$ (assumed time-invariant). For the adaptation of the channel estimate, consider the error signal $\mathbf{w}[k](\mathbf{h}) = \mathbf{y}[k] − \mathbf{h} \, a[k]$. The optimal value for the error signal $\mathbf{w}[k](\mathbf{h}^o)$ has already been specified in the problem formulation. What is it?

The LMS algorithm performs one iteration of the steepest-descent strategy per training symbol to the instantaneous error criterion $\|\mathbf{w}[k](\mathbf{h})\|^2 = \mathbf{w}^H[k](\mathbf{h}) \, \mathbf{w}[k](\mathbf{h})$. Derive the LMS algorithm that updates the channel estimate $\mathbf{h}[k]$ (which could have been denoted also as $\mathbf{\hat{h}}[k]$, but let’s keep $\mathbf{h}[k]$). Denote the a priori error signal as $\mathbf{w}[k]$ and the stepsize as $\nu$. Note that the time index now is no longer the true time index but a counter for the training symbols only, since adaptation occurs only when a symbol is a training symbol.
Develop the recursion for the channel estimation error $\mathbf{\tilde{h}}[k] = \mathbf{h}^o - \mathbf{h}[k]$. Find the steady-state value for $R_\mathbf{\tilde{h}\tilde{h}}$.

**&#x1F4DD;** What is it? (Optimal Value)



___Definition of the Error Signal___
- The error signal for channel estimation, when using the channel estimate $\mathbf{h}$, is defined as:
  $
  \mathbf{w}[k](\mathbf{h}) = \mathbf{y}[k] - \mathbf{h} a[k]
  $
- When using the optimal or true channel estimate $\mathbf{h}^o$, this becomes:
  $
  \mathbf{w}[k](\mathbf{h}^o) = \mathbf{y}[k] - \mathbf{h}^o a[k]
  $

___Substituting the Received Signal___
- Plug the expression for $\mathbf{y}[k]$ into the error signal calculation:
  $
  \mathbf{w}[k](\mathbf{h}^o) = (\mathbf{h}^o a[k] + \mathbf{v}[k]) - \mathbf{h}^o a[k]
  $

  The equation simplifies to:

  $
  \mathbf{w}[k](\mathbf{h}^o) = \mathbf{v}[k]
  $
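The LMS update and the steady-state question posed above can be illustrated with a short NumPy sketch. It is an illustration under stated assumptions, not the derivation requested by the problem: steepest descent on $\|\mathbf{w}[k](\mathbf{h})\|^2$ is taken to give $\mathbf{h}[k] = \mathbf{h}[k-1] + \nu \, \mathbf{w}[k] \, a^*[k]$ (constant factors absorbed in $\nu$), the training symbols are assumed to have constant modulus $|a[k]|^2 = \sigma_a^2$, and the function names are hypothetical. Under these assumptions the error recursion is $\mathbf{\tilde{h}}[k] = (1 - \nu |a[k]|^2) \, \mathbf{\tilde{h}}[k-1] - \nu \, \mathbf{v}[k] \, a^*[k]$, whose steady-state covariance is $\frac{\nu}{2 - \nu \sigma_a^2} R_{\mathbf{VV}} \approx \frac{\nu}{2} R_{\mathbf{VV}}$ for small $\nu$.

```python
import numpy as np

def lms_channel_step(h_est, y, a, nu):
    """One LMS update of the channel estimate from a training symbol a."""
    w = y - h_est * a                          # a priori error w[k]
    return h_est + nu * w * np.conj(a), w      # h[k] = h[k-1] + nu w[k] a*[k]

def steady_state_error_cov(R_vv, nu, sigma_a2=1.0):
    """Fixed point of R = (1 - nu*sigma_a2)^2 R + nu^2 sigma_a2 R_vv (constant-modulus symbols)."""
    return nu * R_vv / (2.0 - nu * sigma_a2)

# Noise-free demo: the error shrinks by (1 - nu) per unit-modulus training symbol.
h_true = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
h_est = np.zeros(3, dtype=complex)
nu = 0.1
for k in range(50):
    a = 1.0 if k % 2 == 0 else -1.0
    h_est, _ = lms_channel_step(h_est, h_true * a, a, nu)
print(np.linalg.norm(h_true - h_est))
```

With noise present, the estimate no longer converges exactly; the residual error covariance is then the value returned by `steady_state_error_cov`.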


#### **&#x1F516;** **(&#x66;)** ___Effect of channel adaptation on LMMSE ICMF operation with long-term IC estimation___

---

Consider now the use of the adapted $\mathbf{h}[k]$ in the ICMF: for any data symbol $a[k]$, the $\mathbf{h}$ that will be used is the one adapted with LMS at the latest training symbol before the current data symbol. The main effect is that the channel estimation error $\mathbf{\tilde{h}}$ will lead to signal leakage in the output $\mathbf{x}[k]$ of the blocking filter $\mathbf{h}^{\perp H}$. The effect of the error $\mathbf{\tilde{h}}$ on $\mathbf{h}^\perp$ will depend on the choice of $\mathbf{h}^\perp$. Assuming the error to be small, we can perform a first-order analysis of the form

$$
\begin{equation}
0 = \mathbf{h}^{\perp H} \, \mathbf{h} = (\mathbf{h}^{o \perp} - \mathbf{\tilde{h}}^{\perp})^H (\mathbf{h}^o - \mathbf{\tilde{h}}) \approx - \mathbf{h}^{o \perp H}\mathbf{\tilde{h}} - \mathbf{\tilde{h}}^{\perp H} \, \mathbf{h}^o \implies \mathbf{\tilde{h}}^{\perp H} \, \mathbf{h}^o \approx - \mathbf{h}^{o \perp H}\mathbf{\tilde{h}}
\end{equation}
$$

where $\mathbf{\tilde{h}}^{\perp}$ is not the orthogonal complement of $\mathbf{\tilde{h}}$  but the error on $\mathbf{h}^{\perp}$.

Describe the signals $d[k]$ and $\mathbf{x}[k]$ for the ICMF operation in terms of $\mathbf{h}^o$, $\mathbf{\tilde{h}}$ and their orthogonal complement versions, and $a[k]$ and $\mathbf{v}[k]$, neglecting products of noise terms, and using `(8)`.

Find $R_{d\mathbf{X}}$ and $R_\mathbf{XX}$. Find the LMMSE filter $\mathbf{f}$ in terms of the unperturbed version $\mathbf{f}^o$.

Express the corresponding error signal $e[k]$, and MMSE in terms of $\mathbf{h}^{'} = R_\mathbf{VV}^{H/2} \mathbf{h}^o , \mathbf{h}^{\perp '} =
R_\mathbf{VV}^{H/2} \mathbf{h}^{o \perp}$. Give the increase in MSE due to the channel estimation error. How much is this increase when $R_\mathbf{VV} = \sigma_v^2I_m$?

The answer follows the order of the questions: signal descriptions, correlations, LMMSE filter, error signal, and MMSE.

### Signals $ d[k] $ and $ \mathbf{x}[k] $

**Desired Signal $ d[k] $**:

Given:
$ d[k] = \mathbf{h}^H \mathbf{y}[k] = ({\mathbf{h}^{o}}^H - \mathbf{\tilde{h}}^H)(\mathbf{h}^o \, a[k] + \mathbf{v}[k]) $

Neglecting products of noise terms, this can be approximated as:
$ d[k] \approx \|\mathbf{h}^o \|^2 \, a[k] + {\mathbf{h}^o}^H \, \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o \, a[k] $

- $\mathbf{h}^o$: Original (unperturbed) channel.
- $\mathbf{\tilde{h}}$: Channel estimation error.
- $a[k]$: Data symbol.
- $\mathbf{v}[k]$: Noise.

**Reference Signal $ \mathbf{x}[k] $**:

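With $ \mathbf{h}^{\perp} = \mathbf{h}^{o\perp} - \mathbf{\tilde{h}}^{\perp} $, $ {\mathbf{h}^{o\perp}}^H \mathbf{h}^o = 0 $, and neglecting the product ${\mathbf{\tilde{h}}^\perp}^H \mathbf{v}[k]$ of noise terms:
$ \mathbf{x}[k] = {{\mathbf{h}^\perp}}^H \mathbf{y}[k] \approx {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] - {\mathbf{\tilde{h}}^\perp}^H \mathbf{h}^o \, a[k] $

Using the first-order relation `(8)`, $ {\mathbf{\tilde{h}}^\perp}^H \mathbf{h}^o \approx - {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} $, this becomes: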
$ \mathbf{x}[k] \approx {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] + {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} \, a[k] $

- $\mathbf{h}^{o\perp}$: Orthogonal complement of the original channel.
- $\mathbf{\tilde{h}}^\perp$: Error on $\mathbf{h}^{\perp}$.

### Correlation Matrices $ R_{d\mathbf{X}} $ and $ R_\mathbf{XX} $

**Cross-correlation $ R_{d\mathbf{X}} $**:

Given the signals $ d[k] $ and $ \mathbf{x}[k] $, we have:
$ R_{d\mathbf{X}} = E[d[k] \mathbf{x}^H[k]] $

Substituting $ d[k] $ and $ \mathbf{x}[k] $:
$ d[k] \approx \|\mathbf{h}^o \|^2 a[k] + {\mathbf{h}^o}^H \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] $
$ \mathbf{x}[k] \approx {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] + {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} a[k] $

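The long-term average also runs over $ \mathbf{\tilde{h}} $, which is zero-mean, so the first-order term $ \sigma_a^2 \|\mathbf{h}^o\|^2 E[\mathbf{\tilde{h}}^H] \mathbf{h}^{o\perp} $ vanishes. Neglecting the remaining second-order term in $ \mathbf{\tilde{h}} $, only the noise term survives: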
$ R_{d\mathbf{X}} \approx \mathbf{h}^{oH} E[\mathbf{v}[k] \mathbf{v}^H[k]] \mathbf{h}^{o\perp} $
$ R_{d\mathbf{X}} = \mathbf{h}^{oH} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o\perp} $

**Autocorrelation $ R_{\mathbf{XX}} $**:

Given the signal $ \mathbf{x}[k] $:
$ R_{\mathbf{XX}} = E[\mathbf{x}[k] \mathbf{x}^H[k]] $

Substituting $ \mathbf{x}[k] $:
$ \mathbf{x}[k] \approx {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] + {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} a[k] $

We consider the autocorrelation and cross terms:
$ R_{\mathbf{XX}} \approx {\mathbf{h}^{o\perp}}^H E[\mathbf{v}[k] \mathbf{v}^H[k]] \mathbf{h}^{o\perp} + {\mathbf{h}^{o\perp}}^H E[\mathbf{\tilde{h}} a[k] a^H[k] \mathbf{\tilde{h}}^H] \mathbf{h}^{o\perp} $

Given $ E[a[k] a^H[k]] = \sigma_a^2 $:
$ R_{\mathbf{XX}} \approx {\mathbf{h}^{o\perp}}^H \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o\perp} + \sigma_a^2 {\mathbf{h}^{o\perp}}^H E[\mathbf{\tilde{h}} \mathbf{\tilde{h}}^H] \mathbf{h}^{o\perp} $

Since $ E[\mathbf{\tilde{h}} \mathbf{\tilde{h}}^H] = R_{\mathbf{\tilde{h}\tilde{h}}} $:
$ R_{\mathbf{XX}} = \mathbf{h}^{o\perp H} \left( R_{\mathbf{VV}} + \sigma_a^2 R_{\mathbf{\tilde{h}\tilde{h}}} \right) \mathbf{h}^{o\perp} $

Taking for the LMS channel estimate of `(e)` the small-stepsize steady-state covariance $ R_{\mathbf{\tilde{h}\tilde{h}}} \approx \frac{\nu}{2} R_{\mathbf{VV}} $, where $\nu$ is the LMS stepsize:
$ R_{\mathbf{XX}} = \mathbf{h}^{o\perp H} \left( R_{\mathbf{VV}} + \sigma_a^2 \frac{\nu}{2} R_{\mathbf{VV}} \right) \mathbf{h}^{o\perp} $
$ R_{\mathbf{XX}} = (1 + \frac{\nu}{2} \sigma_a^2) \mathbf{h}^{o\perp H} R_{\mathbf{VV}} \mathbf{h}^{o\perp} $

### LMMSE Filter $ \mathbf{f} $

The LMMSE filter is given by:
$ \mathbf{f} = R_{d\mathbf{X}} R_{\mathbf{XX}}^{-1} $

Substituting the expressions for $ R_{d\mathbf{X}} $ and $ R_{\mathbf{XX}} $:
$ \mathbf{f} = \mathbf{h}^{oH} R_{\mathbf{VV}} \mathbf{h}^{o\perp} \left( (1 + \frac{\nu}{2} \sigma_a^2) \mathbf{h}^{o\perp H} R_{\mathbf{VV}} \mathbf{h}^{o\perp} \right)^{-1} $
$ \mathbf{f} = \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} \left( \mathbf{h}^{oH} R_{\mathbf{VV}} \mathbf{h}^{o\perp} \right) \left( \mathbf{h}^{o\perp H} R_{\mathbf{VV}} \mathbf{h}^{o\perp} \right)^{-1} $
$ \mathbf{f} = \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} \mathbf{f}^o $

where $ \mathbf{f}^o $ is the unperturbed optimal filter.
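The scaling $\mathbf{f} = \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} \mathbf{f}^o$ can be checked numerically. The sketch below is an illustration only: it assumes $R_{\mathbf{\tilde{h}\tilde{h}}} = \frac{\nu}{2} R_{\mathbf{VV}}$, builds $\mathbf{h}^{o\perp}$ as an orthonormal basis via SVD (one possible choice), and uses hypothetical example values and helper names.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def lmmse_row_filter(r_dx, R_xx):
    """Row filter f = R_dX R_XX^{-1}."""
    return np.linalg.solve(R_xx.T, r_dx)

# Hypothetical example values.
h_o = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_vv = B @ B.conj().T + 0.5 * np.eye(3)
nu, sigma_a2 = 0.05, 1.0

h_perp = orth_complement(h_o)
R_err = 0.5 * nu * R_vv                                    # assumed R_hh (tilde)
r_dx = h_o.conj() @ R_vv @ h_perp                          # long-term R_dX
f_o = lmmse_row_filter(r_dx, h_perp.conj().T @ R_vv @ h_perp)
f = lmmse_row_filter(r_dx, h_perp.conj().T @ (R_vv + sigma_a2 * R_err) @ h_perp)
alpha = 1.0 / (1.0 + 0.5 * nu * sigma_a2)
print(np.allclose(f, alpha * f_o))
```

The check holds for any choice of $\mathbf{h}^{o\perp}$ because the perturbation of $R_{\mathbf{XX}}$ is proportional to $R_{\mathbf{XX}}$ itself under the assumed error covariance.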

### Error Signal $ e[k] $

Given the definitions and the filter response:
$ e[k] = d[k] - \mathbf{f} \, \mathbf{x}[k] $ (recall that $\mathbf{f}$ is a row vector)

Substitute the approximations:
$ d[k] \approx \|\mathbf{h}^o\|^2 a[k] + {\mathbf{h}^o}^H \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] $
$ \mathbf{x}[k] \approx {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] + {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} a[k] $

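Then:
$ e[k] \approx \|\mathbf{h}^o\|^2 a[k] + {\mathbf{h}^o}^H \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] - \mathbf{f} \left( {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] + {\mathbf{h}^{o\perp}}^H \mathbf{\tilde{h}} a[k] \right) $

Substitute $\mathbf{f} = \alpha \, \mathbf{f}^o$ with $\alpha = \frac{1}{1 + \frac{\nu}{2} \sigma_a^2}$, and write $\mathbf{v}[k] = R_{\mathbf{VV}}^{1/2} \mathbf{v}^{\prime}[k]$ with $\mathbf{v}^{\prime}[k]$ spatially white, so that ${\mathbf{h}^o}^H \mathbf{v}[k] = \mathbf{h}^{\prime H} \mathbf{v}^{\prime}[k]$ and $\mathbf{f}^o {\mathbf{h}^{o\perp}}^H \mathbf{v}[k] = \mathbf{h}^{\prime H} P_{\mathbf{h}^{\perp \prime}} \mathbf{v}^{\prime}[k]$. Dropping the signal-leakage terms proportional to $\mathbf{\tilde{h}} \, a[k]$:
$ e[k] \approx \|\mathbf{h}^o \|^2 a[k] + \mathbf{h}^{\prime H} \left( I - \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} P_{\mathbf{h}^{\perp \prime}} \right) \mathbf{v}^{\prime}[k] $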

### MMSE Calculation

The MMSE is given by:
$ \text{MMSE} = E[|e[k]|^2] $

Substituting the error signal:
$ \text{MMSE} = \|\mathbf{h}^o \|^4 \sigma_a^2 + \mathbf{h}^{\prime H} \left( I - \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} P_{\mathbf{h}^{\perp \prime}} \right)^2 \mathbf{h}^\prime $

### Increase in MSE Due to Channel Estimation Error

The increase in MSE can be computed as:
$ \Delta \text{MSE} = \text{MMSE} - \text{MMSE}_{\text{ideal}} $

Where:
$ \text{MMSE}_{\text{ideal}} = \|\mathbf{h}^o \|^4 \sigma_a^2 + \mathbf{h}^{\prime H} (I - P_{\mathbf{h}^{\perp \prime}})^2 \mathbf{h}^\prime $

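Thus, with $ \alpha = \frac{1}{1 + \frac{\nu}{2} \sigma_a^2} $ and $ P_{\mathbf{h}^{\perp \prime}}^2 = P_{\mathbf{h}^{\perp \prime}} $, so that $ (I - \alpha P_{\mathbf{h}^{\perp \prime}})^2 - (I - P_{\mathbf{h}^{\perp \prime}})^2 = (1 - \alpha)^2 P_{\mathbf{h}^{\perp \prime}} $:
$ \Delta \text{MSE} = (1 - \alpha)^2 \mathbf{h}^{\prime H} P_{\mathbf{h}^{\perp \prime}} \mathbf{h}^\prime $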

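For $ R_{\mathbf{VV}} = \sigma_v^2 I_m $, we have $ \mathbf{h}^\prime = \sigma_v \mathbf{h}^o $ and $ \mathbf{h}^{\perp \prime} = \sigma_v \mathbf{h}^{o\perp} $, so:
$ P_{\mathbf{h}^{\perp \prime}} \mathbf{h}^\prime = \sigma_v P_{\mathbf{h}^{o\perp}} \mathbf{h}^o = 0 $

Therefore, the increase in MSE due to the scaling of $\mathbf{f}$ is zero under this condition: for spatially white noise $\mathbf{f}^o = 0$, so scaling it has no effect.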
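The expression $\Delta \text{MSE} = (1 - \alpha)^2 \mathbf{h}^{\prime H} P_{\mathbf{h}^{\perp \prime}} \mathbf{h}^\prime$ and its white-noise special case can be evaluated with a few lines of NumPy. This is a sketch under assumptions: it uses the identity $\mathbf{h}^{\prime H} P_{\mathbf{h}^{\perp \prime}} \mathbf{h}^\prime = R_{d\mathbf{X}} R_{\mathbf{XX}}^{-1} R_{d\mathbf{X}}^H$ (unperturbed correlations) to avoid forming the matrix square root, and the function name and example values are hypothetical.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def mse_increase(h_o, R_vv, nu, sigma_a2):
    """(1 - alpha)^2 h'^H P h', with h'^H P h' computed as R_dX R_XX^{-1} R_dX^H."""
    h_perp = orth_complement(h_o)
    r_dx = h_o.conj() @ R_vv @ h_perp
    R_xx = h_perp.conj().T @ R_vv @ h_perp
    quad = np.real(np.linalg.solve(R_xx.T, r_dx) @ r_dx.conj())
    alpha = 1.0 / (1.0 + 0.5 * nu * sigma_a2)
    return float((1.0 - alpha) ** 2 * quad)

# Hypothetical example values.
h_o = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_colored = B @ B.conj().T + 0.5 * np.eye(3)
R_white = 2.0 * np.eye(3)

print("colored noise:", mse_increase(h_o, R_colored, 0.05, 1.0))
print("white noise:  ", mse_increase(h_o, R_white, 0.05, 1.0))
```

For white noise the result is zero up to rounding, since $R_{d\mathbf{X}} = \sigma_v^2 \mathbf{h}^{oH} \mathbf{h}^{o\perp} = 0$.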

#### **&#x1F516;** **(&#x67;)** ___Effect of channel adaptation on LMMSE ICMF operation with short-term IC estimation___

---

In `(f)`, we considered the effect of channel estimation error on the operation of the ICMF when the Interference Canceling (IC) filter $\mathbf{f}$ is adapted with long-term statistics $R_{d \mathbf{X}}$, $R_\mathbf{XX}$. In that case, statistical averaging occurs not only over the noise and the transmitted data but also over the channel estimation error, since many instances of this error will be involved and get averaged out. Another possible configuration is short-time averaging for $\mathbf{f}$, involving essentially the data between two training symbols, so that the channel estimation error remains constant in such a period. This short-term averaging only averages over noise and transmitted data. The signal leakage in the output $\mathbf{x}[k]$ of the blocking filter $\mathbf{h}^{\perp H}$ will now lead to correlation between the signal parts of $d[k]$ and $\mathbf{x}[k]$.

Take again the signal descriptions for $d[k]$ and $\mathbf{x}[k]$ from `(f)`, up to first order in $\mathbf{\tilde{h}}$. Find $R_{d \mathbf{X}}$ and $R_\mathbf{XX}$ using averaging over noise and symbols only, up to first order in $\mathbf{\tilde{h}}$. Find the LMMSE filter $\mathbf{f}$ up to first order in $\mathbf{\tilde{h}}$, in terms of the unperturbed version $\mathbf{f}^o$. Note that the perturbation in $\mathbf{f}$ due to $\mathbf{\tilde{h}}$ is proportional to signal power $\sigma_a^2\|\mathbf{h}^o\|^2$.

Express the corresponding error signal $e[k] = d[k] − \mathbf{f x}[k]$ in terms of the unperturbed $e^o[k]$ and first-order perturbation terms in $\mathbf{\tilde{h}}$. Note that the perturbation terms are mutually uncorrelated. Why?

Compute the corresponding MMSE, $E \| e[k] \|^2$, by now also averaging over $\mathbf{\tilde{h}}$, to get a simplified average expression, assuming the LMS adaptation for $\mathbf{h}[k]$ as in `(e)`.

The signal part in $e[k]$ that the receiver will assume on the basis of its knowledge of $\mathbf{h} = \mathbf{h}^o − \mathbf{\tilde{h}}$ is $\| \mathbf{h}^o − \mathbf{\tilde{h}} \|^2 a[k]$ while hence $e[k] − \| \mathbf{h}^o − \mathbf{\tilde{h}} \|^2 a[k]$ is considered noise. Compute the resulting SNR with numerator and denominator averaged over $\mathbf{\tilde{h}}$ and computed up to first order in $\nu$. The channel estimation error leads to signal leakage in the bottom branch of the ICMF, which leads to some _signal cancellation_ and ensuing loss in SNR. In the normal Generalized Sidelobe Canceler (GSC), of which the ICMF is a special instance, any error in the blocking filter ($\mathbf{h}^{\perp}$ in the ICMF case) leads to signal cancellation that becomes total when the received signal SNR increases (hence the SNR at the output of the GSC goes to zero then). In our analysis the ICMF output SNR remains bounded away from zero due to the fact that as the received SNR increases, the channel estimation error decreases also. Note the similarity with the loss in SNR in `(c)` due to IC filter adaptation by LMS without signal compensation.

To address the problem at hand, let's delve into the specifics of the given scenario and solve each part step-by-step, considering the short-term averaging effects, and the corresponding signal representations.

### Signal Descriptions for $d[k]$ and $\mathbf{x}[k]$

From the given information, the signal descriptions are:
$ d[k] \approx \|\mathbf{h}^o\|^2 \, a[k] + \mathbf{h}^{o H} \, \mathbf{v}[k] - \mathbf{\tilde{h}}^H \, \mathbf{h}^o \, a[k] $
$ \mathbf{x}[k] \approx \mathbf{h}^{o \perp H} \, \mathbf{v}[k] + \mathbf{h}^{o \perp H} \, \mathbf{\tilde{h}} \, a[k] $

Here:
- $ \mathbf{h}^o $ is the true channel vector.
- $ \mathbf{\tilde{h}} $ is the channel estimation error.
- $ a[k] $ is the data symbol.
- $ \mathbf{v}[k] $ is the noise.

### Finding $ R_{d \mathbf{X}} $ and $ R_{\mathbf{XX}} $

#### $ R_{d \mathbf{X}} $

We compute the cross-correlation between $ d[k] $ and $ \mathbf{x}[k] $:

$ R_{d \mathbf{X}} = E[d[k] \mathbf{x}^H[k]] $

Substituting $ d[k] $ and $ \mathbf{x}[k] $:

$ d[k] \approx \|\mathbf{h}^o\|^2 a[k] + \mathbf{h}^{o H} \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] $
$ \mathbf{x}[k] \approx \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] $

Since we are averaging over noise $ \mathbf{v}[k] $ and data $ a[k] $:

$ R_{d \mathbf{X}} = E\left[ \left( \|\mathbf{h}^o\|^2 a[k] + \mathbf{h}^{o H} \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] \right) \left( \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] \right)^H \right] $

Expanding, with $ a[k] $ and $ \mathbf{v}[k] $ zero-mean and independent, $ E[a[k] a^H[k]] = \sigma_a^2 $ and $ E[\mathbf{v}[k] \mathbf{v}^H[k]] = \mathbf{R_{\mathbf{VV}}} $, while $ \mathbf{\tilde{h}} $ is held fixed:

$ R_{d \mathbf{X}} = \mathbf{h}^{o H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} + \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} - \sigma_a^2 (\mathbf{\tilde{h}}^H \mathbf{h}^o) \, \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} $

The last term is of second order in $ \mathbf{\tilde{h}} $ and is dropped, so up to first order:

$ R_{d \mathbf{X}} = \mathbf{h}^{o H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} + \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} $

#### $ R_{\mathbf{XX}} $

We compute the autocorrelation of $ \mathbf{x}[k] $:

$ R_{\mathbf{XX}} = E[\mathbf{x}[k] \mathbf{x}^H[k]] $

Substituting $ \mathbf{x}[k] $:

$ \mathbf{x}[k] \approx \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] $

Expanding:

$ R_{\mathbf{XX}} = E\left[ \left( \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] \right) \left( \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] \right)^H \right] $

Since $ E[a[k] a^H[k]] = \sigma_a^2 $ and $ E[\mathbf{v}[k] \mathbf{v}^H[k]] = \mathbf{R_{\mathbf{VV}}} $, and since $ \mathbf{\tilde{h}} $ is not averaged in the short-term configuration:

$ R_{\mathbf{XX}} = \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} + \sigma_a^2 \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} $

The second term is of second order in $ \mathbf{\tilde{h}} $, so up to first order:

$ R_{\mathbf{XX}} \approx \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} $

### LMMSE Filter $\mathbf{f}$

The LMMSE filter is given by:
$ \mathbf{f} = R_{d \mathbf{X}} R_{\mathbf{XX}}^{-1} $

Keeping only terms up to first order in $ \mathbf{\tilde{h}} $ (the signal-leakage contribution $ \sigma_a^2 \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} $ to $ R_{\mathbf{XX}} $ is of second order):

$ R_{d \mathbf{X}} = \mathbf{h}^{o H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} + \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} $
$ R_{\mathbf{XX}} \approx \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} $

Since $ R_{\mathbf{XX}} $ has no first-order perturbation, neither has its inverse, and only the perturbation of $ R_{d \mathbf{X}} $ contributes:

$ \mathbf{f} \approx \mathbf{f}^o + \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} \left( \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} \right)^{-1} $

The perturbation is proportional to the signal power $ \sigma_a^2 \|\mathbf{h}^o\|^2 $, as announced in the problem statement.
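The first-order character of this filter expression can be checked numerically. The sketch below is illustrative only: it takes the first-order signal model for $d[k]$ and $\mathbf{x}[k]$ as exact, computes the short-term LMMSE filter from the resulting $R_{d\mathbf{X}}$ and $R_{\mathbf{XX}}$ without truncation, and compares it with the first-order expression. The function name, the SVD-based choice of $\mathbf{h}^{o\perp}$, and the example values are assumptions.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def filter_mismatch(h_o, R_vv, h_err, sigma_a2=1.0):
    """Norm of (untruncated short-term LMMSE filter) - (first-order expression)."""
    h_perp = orth_complement(h_o)
    A = h_perp.conj().T @ R_vv @ h_perp                  # unperturbed R_XX
    g = np.linalg.norm(h_o) ** 2 - np.vdot(h_err, h_o)   # signal gain in d[k]
    c = h_perp.conj().T @ h_err                          # signal leakage in x[k]
    r_dx0 = h_o.conj() @ R_vv @ h_perp                   # unperturbed R_dX
    r_dx = r_dx0 + sigma_a2 * g * c.conj()
    R_xx = A + sigma_a2 * np.outer(c, c.conj())
    f_exact = np.linalg.solve(R_xx.T, r_dx)
    f_o = np.linalg.solve(A.T, r_dx0)
    f_first = f_o + sigma_a2 * np.linalg.norm(h_o) ** 2 * np.linalg.solve(A.T, c.conj())
    return float(np.linalg.norm(f_exact - f_first))

# Hypothetical example values; the error direction is arbitrary.
h_o = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_vv = B @ B.conj().T + 0.5 * np.eye(3)
direction = np.array([0.3 - 0.1j, -0.2j, 0.4])
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, filter_mismatch(h_o, R_vv, eps * direction))
```

The mismatch shrinks roughly quadratically with the size of the channel error, which is the behavior expected when only second-order terms have been neglected.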

### Error Signal $ e[k] $

The error signal is:
$ e[k] = d[k] - \mathbf{f} \, \mathbf{x}[k] $

Substituting $ d[k] $ and $ \mathbf{x}[k] $:

$ e[k] = \|\mathbf{h}^o\|^2 a[k] + \mathbf{h}^{o H} \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] - \mathbf{f} \left( \mathbf{h}^{o \perp H} \mathbf{v}[k] + \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] \right) $

Substituting $\mathbf{f} = \mathbf{f}^o + \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} \left( \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} \right)^{-1}$:

$ e[k] = \|\mathbf{h}^o\|^2 a[k] + ( \mathbf{h}^{o H} - \mathbf{f}^o \mathbf{h}^{o \perp H} ) \mathbf{v}[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] - \mathbf{f}^o \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} a[k] $

The perturbation term due to the second part of $\mathbf{f}$:
$ - \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} \left( \mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp} \right)^{-1} \mathbf{h}^{o \perp H} \mathbf{v}[k] $

Thus, with the unperturbed error $ e^o[k] = \|\mathbf{h}^o \|^2 \, a[k] + ( \mathbf{h}^{o H} - \mathbf{f}^o \, \mathbf{h}^{o \perp H} ) \mathbf{v}[k] $, the complete error signal up to first order in $ \mathbf{\tilde{h}} $ is:
$ e[k] = e^o[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o \, a[k] - \mathbf{f}^o \, \mathbf{h}^{o \perp H} \mathbf{\tilde{h}} \, a[k] - \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \, \mathbf{h}^{o \perp} \, ( \mathbf{h}^{o \perp H} \, \mathbf{R_{\mathbf{VV}}} \, \mathbf{h}^{o \perp})^{-1} \, \mathbf{h}^{o \perp H} \, \mathbf{v}[k] $

### MMSE Calculation

To find the MMSE, we compute the expectation of the squared error:

$ \sigma_e^2 = E[|e[k]|^2] $

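The perturbation terms are mutually uncorrelated. The third term is driven by $ \mathbf{v}[k] $ while the first two are driven by $ a[k] $, which is independent of $ \mathbf{v}[k] $. The first two terms are linear in $ \mathbf{\tilde{h}}^H $ and in $ \mathbf{\tilde{h}} $ respectively, so their cross-correlation involves the pseudo-covariance $ E[\mathbf{\tilde{h}} \mathbf{\tilde{h}}^T] $, which vanishes when the noise, and hence $ \mathbf{\tilde{h}} $, is circular (assumed here). Averaging also over $ \mathbf{\tilde{h}} $ with $ R_{\mathbf{\tilde{h}\tilde{h}}} \approx \frac{\nu}{2} R_{\mathbf{VV}} $, the first two terms give: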

$ \sigma_e^2 = \sigma_{e^o}^2 + \frac{\nu}{2} \sigma_a^2 \, \mathbf{h}^{o H} R_{\mathbf{VV}} \, \mathbf{h}^o + \frac{\nu}{2} \sigma_a^2 \mathbf{f}^o \, \mathbf{h}^{o \perp H} \, R_{\mathbf{VV}} \, \mathbf{h}^{o \perp} \, \mathbf{f}^{o H} $

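The third perturbation term has power, averaging over $ \mathbf{v}[k] $ and then over $ \mathbf{\tilde{h}} $:
$ \sigma_a^4 \|\mathbf{h}^o\|^4 \, \mathrm{tr} \left\{ \left( \mathbf{h}^{o \perp H} \, R_{\mathbf{VV}} \, \mathbf{h}^{o \perp} \right)^{-1} \mathbf{h}^{o \perp H} R_{\mathbf{VV}} \mathbf{h}^{o \perp} \left( \mathbf{h}^{o \perp H} \, R_{\mathbf{VV}} \, \mathbf{h}^{o \perp} \right)^{-1} \mathbf{h}^{o \perp H} R_{\mathbf{\tilde{h}\tilde{h}}} \mathbf{h}^{o\perp} \right\} = \frac{\nu}{2} \sigma_a^4 \|\mathbf{h}^o\|^4 \left( m - 1 \right) $

since with $ R_{\mathbf{\tilde{h}\tilde{h}}} \approx \frac{\nu}{2} R_{\mathbf{VV}} $ the trace reduces to $ \frac{\nu}{2} \mathrm{tr} \{ I_{m-1} \} $. Collecting all terms: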

$ \sigma_e^2 = \sigma_{e^o}^2 + \frac{\nu}{2} \sigma_a^2 \mathbf{h}^{o H} R_{\mathbf{VV}} \mathbf{h}^o + \frac{\nu}{2} \sigma_a^2 \mathbf{f}^o \mathbf{h}^{o \perp H} R_{\mathbf{VV}} \mathbf{h}^{o \perp} \mathbf{f}^{o H} + \frac{\nu}{2} \sigma_a^4 \|\mathbf{h}^o\|^4 \left( m - 1 \right) $
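The three excess-MSE terms can be evaluated numerically as a sanity check. The following NumPy sketch is an illustration under the assumption $R_{\mathbf{\tilde{h}\tilde{h}}} = \frac{\nu}{2} R_{\mathbf{VV}}$; the function name, the SVD-based $\mathbf{h}^{o\perp}$, and the example values are hypothetical. The third term is computed from the trace expression, so that it can be compared with the closed form $\frac{\nu}{2} \sigma_a^4 \|\mathbf{h}^o\|^4 (m - 1)$.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def excess_mse_terms(h_o, R_vv, nu, sigma_a2):
    """Powers of the three first-order perturbation terms of e[k], averaged over the channel error."""
    h_perp = orth_complement(h_o)
    A = h_perp.conj().T @ R_vv @ h_perp
    f_o = np.linalg.solve(A.T, h_o.conj() @ R_vv @ h_perp)
    R_err = 0.5 * nu * R_vv                               # assumed channel error covariance
    t1 = sigma_a2 * np.real(h_o.conj() @ R_err @ h_o)
    t2 = sigma_a2 * np.real(f_o @ h_perp.conj().T @ R_err @ h_perp @ f_o.conj())
    t3 = sigma_a2 ** 2 * np.linalg.norm(h_o) ** 4 * np.real(
        np.trace(np.linalg.solve(A, h_perp.conj().T @ R_err @ h_perp)))
    return float(t1), float(t2), float(t3)

# Hypothetical example values.
h_o = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_vv = B @ B.conj().T + 0.5 * np.eye(3)
print(excess_mse_terms(h_o, R_vv, nu=0.05, sigma_a2=1.0))
```

Only the third term is independent of the noise level, which is why it dominates at high received SNR.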

### Signal-to-Noise Ratio (SNR)

The signal part assumed by the receiver:

$ \|\mathbf{h}^o - \mathbf{\tilde{h}}\|^2 a[k] \approx \|\mathbf{h}^o\|^2 a[k] - \mathbf{h}^{o H} \mathbf{\tilde{h}} a[k] - \mathbf{\tilde{h}}^H \mathbf{h}^o a[k] $

The noise part:

$ (\mathbf{h}^{o H} - \mathbf{f}^o \mathbf{h}^{o \perp H}) (\mathbf{v}[k] + \mathbf{\tilde{h}} a[k]) - \sigma_a^2 \|\mathbf{h}^o\|^2 \mathbf{\tilde{h}}^H \mathbf{h}^{o \perp} (\mathbf{h}^{o \perp H} \mathbf{R_{\mathbf{VV}}} \mathbf{h}^{o \perp})^{-1} \mathbf{h}^{o \perp H} \mathbf{v}[k] $

Thus, the ICMF output SNR is:

$ \text{SNR}^{\text{out}} = \frac{\|\mathbf{h}^o \|^4 \sigma_a^2 + \nu \sigma_a^2 \| \mathbf{h}^\prime \|^2}{ \| P_{\mathbf{h}^{\perp \prime} }^{\perp} \mathbf{h}' \|^2 (1 + \frac{\nu}{2} \sigma_a^2) + \|\mathbf{h}^o\|^4 \sigma_a^2 \frac{\nu}{2} \sigma_a^2 (m - 1)} $

When $ \text{SNR}^{\text{Rx}} \to \infty $:

$ \text{SNR}^{\text{out}} \approx \frac{1}{\frac{\nu}{2} \sigma_a^2 (m - 1)} $
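The output SNR expression and its high-SNR limit can be evaluated numerically. The sketch below is a direct transcription of the two formulas above into NumPy, with $\| \mathbf{h}^\prime \|^2 = \mathbf{h}^{oH} R_{\mathbf{VV}} \mathbf{h}^o$ and the residual noise power computed as $\mathbf{h}^{oH} R_{\mathbf{VV}} \mathbf{h}^o - R_{d\mathbf{X}} R_{\mathbf{XX}}^{-1} R_{d\mathbf{X}}^H$; the function name and example values are assumptions.

```python
import numpy as np

def orth_complement(h):
    """Orthonormal basis (m x (m-1)) of the subspace orthogonal to h."""
    U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
    return U[:, 1:]

def snr_out(h_o, R_vv, nu, sigma_a2):
    """ICMF output SNR with short-term IC estimation and LMS channel adaptation."""
    m = h_o.size
    h_perp = orth_complement(h_o)
    r_dx = h_o.conj() @ R_vv @ h_perp
    A = h_perp.conj().T @ R_vv @ h_perp
    h_prime_sq = np.real(h_o.conj() @ R_vv @ h_o)                    # ||h'||^2
    resid = h_prime_sq - np.real(np.linalg.solve(A.T, r_dx) @ r_dx.conj())
    h4 = np.linalg.norm(h_o) ** 4
    num = h4 * sigma_a2 + nu * sigma_a2 * h_prime_sq
    den = resid * (1.0 + 0.5 * nu * sigma_a2) + h4 * sigma_a2 * 0.5 * nu * sigma_a2 * (m - 1)
    return float(num / den)

# Hypothetical example values; the noise covariance is scaled down to raise the received SNR.
h_o = np.array([1.0 + 1.0j, 0.5 - 0.2j, -0.3j])
B = np.array([[1.0, 0.2j, 0.0], [0.1, 1.0, 0.3], [0.0, -0.2j, 1.0]], dtype=complex)
R_vv = B @ B.conj().T + 0.5 * np.eye(3)
nu, sigma_a2 = 0.05, 1.0
limit = 1.0 / (0.5 * nu * sigma_a2 * (h_o.size - 1))
for scale in (1.0, 1e-3, 1e-9):
    print(scale, snr_out(h_o, scale * R_vv, nu, sigma_a2), limit)
```

As the noise covariance is scaled towards zero with $\nu$ held fixed, the computed output SNR approaches the stated limit instead of growing without bound.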

### Explanation

The analysis shows that the effect of the channel estimation error on the ICMF leads to signal leakage in the output, impacting the correlation between $ d[k] $ and $ \mathbf{x}[k] $. The error terms contribute to the MMSE and degrade the performance, yet the bounded nature of the output SNR is maintained due to the decreasing channel estimation error with increasing received SNR. This effect is similar to the loss in SNR observed with LMS adaptation without signal compensation.