<img src="../figuras/logos/logo_usc.jpg" align=right width='80px'/>
<br>


<table width="100%">
<td style="font-size:40px;font-style:italic;text-align:right;background-color:rgba(0, 220, 170,0.7)">
Quantum Information Theory
</td></table>



$ \newcommand{\bra}[1]{\langle #1|} $
$ \newcommand{\ket}[1]{|#1\rangle} $
$ \newcommand{\braket}[2]{\langle #1|#2\rangle} $
$ \newcommand{\ketbra}[2]{| #1\rangle \langle #2|} $
$ \newcommand{\tr}{{\rm Tr}\,} $
$ \newcommand{\Tr}{{\rm Tr}\,} $
$ \newcommand{\i}{{\color{blue} i}} $ 
$ \newcommand{\Hil}{{\cal H}} $
$ \newcommand{\V}{{\cal V}} $
$ \newcommand{\Lin}{\hbox{Lin}}$
$ \newcommand{\Xn}{X^{\! n}}$
$ \newcommand{\xn}{{\bf x}}$
$ \newcommand{\bxn}{\bar{\bf x}}$

<a id='top'></a>

 
 - [Quantum Information](#quant_info)  
     - [Von Neumann Entropy](#vonNeu)
     



<a id='quant_info'></a>

# Elements of Quantum Information
[<<<](#top)



Suppose a classical random source generates letters from an alphabet $X = \{x_a, p_a\}$ with probability $p_a = p(x_a)$.


The *prior uncertainty* (information) is given by the Shannon entropy:

$$
H(X) = - \sum_{a=1}^r p_a \log p_a
$$
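
As a quick numerical illustration, the Shannon entropy can be evaluated in a few lines of NumPy. This is only a minimal sketch: the function name `shannon_entropy` and the choice of base-2 logarithms (bits) are assumptions made here, not part of the original notes.

```python
import numpy as np

def shannon_entropy(probs, base=2):
    """H = -sum_a p_a log p_a, with the convention 0 log 0 = 0."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]                      # drop zero-probability letters
    return float(-np.sum(p * np.log(p)) / np.log(base))

# A two-letter alphabet with probabilities 1/4 and 3/4
H_example = shannon_entropy([0.25, 0.75])
print(H_example)
```

A uniform distribution over $r$ letters gives the maximum value $\log r$; for instance `shannon_entropy([0.5, 0.5])` is one bit.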


To transmit a message over a *quantum channel*, we encode each letter in a quantum state, $x_a \to \ket{\psi_a}$, prepared by a corresponding device, and send the states successively.

In this way, we have created a *quantum signal source*.


From the receiver's point of view, this is an incoherent statistical mixture of states $X = \{\ket{\psi_a}, p_a\}$, which are received with probability $p_a = p(\ket{\psi_a})$.

To decode the message, the receiver must guess which states compose the received state by performing measurements.


The <b>density operator</b> is the mathematical object that characterizes the statistical mixture that is received  
<br>
<br>
$$
\rho = \sum_{a} p_a \ket{\psi_a}\bra{\psi_a}
$$


Note that:

- We have not required $\ket{\psi_a}$ to be a set of orthogonal vectors. In general, they will not be.
<br>
<br>
- The number of vectors and letters $a = 1, 2, \dots$ may be greater or smaller than the dimension of the Hilbert space of the quantum system.


Since $\rho$ is Hermitian, we can always write it in its *spectral representation*:

$$
\rho = \sum_{i=1}^N \lambda_i \ket{\lambda_i}\bra{\lambda_i}
$$

where $\lambda_i$ are the eigenvalues and $\ket{\lambda_i}$ are the corresponding eigenvectors, which form an orthonormal basis: $\braket{\lambda_i}{\lambda_j} = \delta_{ij}$.


This representation refers to a *hypothetical device* associated with projective measurements $\{P_i = \ketbra{\lambda_i}{\lambda_i}\}$.

We refer to the quantum ensemble $C = \{\ket{\lambda_i}, \lambda_i\}$ as the *canonical ensemble*.
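
The construction of $\rho$ from an ensemble and its spectral representation can be sketched numerically. The helper `density_operator` and the three-state qubit ensemble below are hypothetical choices made for illustration; the ensemble deliberately has more letters than the dimension of the Hilbert space and uses non-orthogonal states.

```python
import numpy as np

def density_operator(states, probs):
    """rho = sum_a p_a |psi_a><psi_a| for normalized state vectors."""
    dim = len(states[0])
    rho = np.zeros((dim, dim), dtype=complex)
    for psi, p in zip(states, probs):
        psi = np.asarray(psi, dtype=complex)
        rho += p * np.outer(psi, psi.conj())
    return rho

# Three non-orthogonal qubit states (illustrative choice)
ket0 = np.array([1, 0])
ket1 = np.array([0, 1])
ket_plus = (ket0 + ket1) / np.sqrt(2)
rho = density_operator([ket0, ket1, ket_plus], [0.5, 0.25, 0.25])

# Canonical ensemble: eigenvalues and orthonormal eigenvectors (columns of vecs)
lam, vecs = np.linalg.eigh(rho)

# Rebuild rho from its spectral representation sum_i lam_i |lam_i><lam_i|
rho_rebuilt = (vecs * lam) @ vecs.conj().T
```

The eigenvalues `lam` are non-negative and sum to one, so they can be read as the probabilities of the canonical ensemble.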


<a id='vonNeu'></a>

## Von Neumann Entropy


The described procedure presents us with two ensembles:

- the *original* one, associated with the preparation: $X = \{x_a, p_a\} \to \{\ket{\psi_a}, p_a\}$
<br>
<br>
- the *canonical* one, associated with the diagonalization of $\rho$: $C = \{\ket{\lambda_i}, \lambda_i\}$


Each ensemble has an associated Shannon entropy:

\begin{eqnarray}
H(X) &=& -\sum_a p_a \log p_a \\
H(C) &=& -\sum_{i=1}^N \lambda_i \log \lambda_i
\end{eqnarray}


The key feature of the second expression is that, because $\rho$ is diagonal in its eigenbasis, it can be written as a trace:  
<br>
<br>

$$
H(C) = -\Tr (\rho \log \rho)
$$
<br>

The advantage of writing it this way is that it is *basis-independent*.



<div class="alert alert-block alert-info", text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Definition:</b> <i>von Neumann entropy</i>  
<br>    
$$
S(\rho) = -\Tr (\rho\log \rho)
$$  
</p>
</div>


Note that, written in this way, $S(\rho)$:

- does not refer to *any specific* basis of states.  
<br>
<br>
- is uniquely determined for each state $\rho$  
<br>
<br>
- also does not depend on the preparation procedure


In short: *we can assign a von Neumann entropy to any density operator* $\rho$
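
One possible way to evaluate $S(\rho)$ numerically is through the eigenvalues of $\rho$. The sketch below assumes that `rho` is a Hermitian NumPy array with unit trace; the function name, the base-2 logarithm and the cutoff `1e-12` used to implement $0\log 0 = 0$ are choices made here for illustration.

```python
import numpy as np

def von_neumann_entropy(rho, base=2):
    """S(rho) = -Tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)          # real eigenvalues of a Hermitian matrix
    evals = evals[evals > 1e-12]             # convention: 0 log 0 = 0
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

# A pure state |+><+| and the maximally mixed state in dimension 4
ket_plus = np.array([1, 1]) / np.sqrt(2)
rho_pure = np.outer(ket_plus, ket_plus.conj())
rho_mixed = np.eye(4) / 4

S_pure = von_neumann_entropy(rho_pure)
S_mixed = von_neumann_entropy(rho_mixed)
```

With these inputs the pure state has zero entropy (up to rounding) and the maximally mixed state in dimension 4 has entropy $\log_2 4 = 2$ bits.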


<div class="alert alert-block alert-success">
<b>Exercise:</b>  
Write a function `S_entropy(rho)` that returns the von Neumann entropy associated with a state $\rho$,  
expressed as a matrix in the canonical basis.
</div>



### Properties of the von Neumann Entropy


- **Bounds**:  
<br>Let $N$ be the <u>dimension of $\Hil$</u>. The von Neumann entropy is bounded by  
<br>
<br>$$0 \leq S(\rho) \leq \log N$$



- In a <i>pure state</i>, the von Neumann entropy is zero  
<br>
<br>
$$
S(\rho) = 0 ~~~\Longleftrightarrow ~~~\rho^2 = \rho
$$
<br>


- In a <i>maximally mixed state</i>, the entropy of the state is maximal  
<br>
<br>
$$
S(\rho) = \log N ~~~\Longleftrightarrow ~~~\rho = \frac{1}{N} I
$$


- **Concavity**:  
<br>

$S(\rho)$ is a concave function of its argument $\rho$. For any straight line interpolating between $\rho_1$ and $\rho_2$:  
<br>
<br>
$$
S\left(\rule{0mm}{4mm}\lambda \rho_1 + (1-\lambda) \rho_2 \right) \geq \lambda S(\rho_1) + (1-\lambda) S(\rho_2)
$$
<br>
where $\lambda \in (0,1)$


<div class="alert alert-block alert-danger">
<b>Note:</b>  
The concavity of $S$ generalizes to convex combinations of several states. Let $\rho = \sum_{i=1}^r p_i \rho_i$, where $p_i \geq 0$ and $~\sum_{i=1}^r p_i = 1$  
<br>    
<br>    
$$
S(\rho) \geq \sum_i p_i S(\rho_i)
$$
</div>


The proof will be given later by means of the subadditivity property.



- **Invariance**:  
<br> The von Neumann entropy is invariant under *unitary transformations*:

$$ 
S(\rho) = S(U^\dagger \rho U)
$$

In particular, this implies that the von Neumann entropy of an isolated system is constant in time  

$$
S(\rho(t)) = S(U(t)\rho(0) U(t)^\dagger) = S(\rho(0))
$$
That is,

$$
\frac{dS(t)}{dt} = 0
$$
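
The invariance under unitary transformations can be checked numerically. In this sketch the random state and the random unitary (obtained from a QR decomposition) are illustrative assumptions, as are all function and variable names.

```python
import numpy as np

def von_neumann_entropy(rho, base=2):
    """S(rho) = -Tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

rng = np.random.default_rng(0)
N = 4

# Random density matrix: A A^dagger is positive semidefinite; then normalize the trace
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
rho = A @ A.conj().T
rho /= np.trace(rho).real

# Random unitary from the QR decomposition of a complex Gaussian matrix
U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
rho_rotated = U.conj().T @ rho @ U

S_before = von_neumann_entropy(rho)
S_after = von_neumann_entropy(rho_rotated)
```

The two values agree up to rounding because a unitary transformation changes the eigenvectors of $\rho$ but not its eigenvalues.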


<a id='entrop_rel'></a>
## Relative Entropy

We define the relative entropy by formal analogy with the classical case. Let $\rho$ and $\sigma$ be two quantum states. The relative entropy plays the role of a distance between them (although it is not symmetric) and vanishes when they are equal:

$$
S(\rho \| \sigma) = \Tr \rho(\log\rho - \log \sigma)
$$
<br>

Gibbs' inequality for $H(X\|Y)$ has a parallel result for $S(\rho \| \sigma)$.


<div class="alert alert-block alert-info", text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem</b> <i>(Klein's inequality)</i>  
<br>
Relative entropy is non-negative:  
$$
S(\rho \| \sigma) \geq 0
$$  
<br>
and it vanishes if and only if $\rho = \sigma$.
</p>
</div>
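
Klein's inequality can be probed numerically on random states. The sketch below is restricted to full-rank states so that both logarithms are well defined; the helper names and the mixing with the identity are assumptions made for this illustration.

```python
import numpy as np

def random_full_rank_state(dim, rng):
    """Random density matrix with strictly positive eigenvalues."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = A @ A.conj().T
    rho /= np.trace(rho).real
    # Mix with the identity so that every eigenvalue is at least 0.1/dim
    return 0.9 * rho + 0.1 * np.eye(dim) / dim

def log2_hermitian(rho):
    """Base-2 matrix logarithm of a positive-definite Hermitian matrix."""
    evals, vecs = np.linalg.eigh(rho)
    return (vecs * np.log2(evals)) @ vecs.conj().T

def relative_entropy(rho, sigma):
    """S(rho || sigma) = Tr[rho (log rho - log sigma)], in bits."""
    diff = log2_hermitian(rho) - log2_hermitian(sigma)
    return float(np.trace(rho @ diff).real)

rng = np.random.default_rng(1)
rho = random_full_rank_state(3, rng)
sigma = random_full_rank_state(3, rng)

S_rho_sigma = relative_entropy(rho, sigma)
S_sigma_rho = relative_entropy(sigma, rho)
S_rho_rho = relative_entropy(rho, rho)
```

Both `S_rho_sigma` and `S_sigma_rho` are positive but in general different, which is why the relative entropy is not a distance in the strict sense.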


<a id='ent_for_med'></a>
## Preparation and Measurement Entropies


### Preparation Entropy


There are infinitely many ensembles  
$X = \{\ket{\psi_a}, p_a\},\quad \tilde X = \{\ket{\tilde \psi_i}, \tilde p_i\}, \dots$  
that are described by the same density operator:

$$
\rho ~=~ \sum_{a=1}^r p_a \ket{\psi_a}\bra{\psi_a} ~=~ \sum_{i=1}^s \tilde p_i \ket{\tilde\psi_i}\bra{\tilde\psi_i} ~=~ \dots
$$


- Each ensemble has an associated Shannon entropy $H(X), H(\tilde X), \dots$ which may differ.  
<br>

- However, the von Neumann entropy $S(\rho)$ is the same for all of them because it depends only on $\rho$.



<div class="alert alert-block alert-info", text-align:center>
<p style="text-align: left ;color: navy;">  
    <b>Definition:</b> <i>(Preparation entropy)</i>  
<br>    
For each ensemble $X = \{\ket{\psi_a}, p_a\}$ that prepares a state  
$\rho = \sum_a p_a\ket{\psi_a}\bra{\psi_a}$,  
we define the <i>preparation entropy</i> as the difference  
<br>  
<br>    
$$\Delta(X,\rho) = H(X) - S(\rho)$$
</p>
</div>


<div class="alert alert-block alert-info", text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem:</b> The preparation entropy is non-negative, $\Delta(X,\rho) \geq 0$, that is:
<br>
<br>
\begin{eqnarray}
S(\rho ) ~~\leq ~~ H(X)
\\
\end{eqnarray}
<br>
The inequality is saturated for a preparation $X$ in which the states $\{\ket{\psi_a}\}$ are orthogonal.
</p>
</div>


- The proof is lengthy and will not be presented here.  
The result is plausible because $H(X) \leq \log(r)$, where $r$ is the number of *letters* in the ensemble $\ket{\psi_a}, \, a = 1,\dots,r$, which is unbounded, while $S(\rho)\leq \log N$ is bounded by the dimension of the Hilbert space $\Hil$.  
<br>
<br>

- On the other hand, if we require the states to be orthogonal, then $r \leq N$. We can complete them to form a basis $\{\ket{\psi_a}\}, \, a = 1,\dots,N$. Since $S(\rho)$ is invariant under unitary transformations, it equals the expression written in the eigenbasis $\{\ket{\lambda_a}\}$, that is, $H$.



- If the source states $X = \{\ket{\psi_a}, p_a\}$ are not orthogonal, then $S < H$. Non-orthogonal states cannot be perfectly distinguished, so there is no observable that allows for full recovery of the information encoded in the classical message.  
<br>
<br>
$\Rightarrow \rho$ transmits less information through the quantum channel than the amount contained in the original classical message.


<div class="alert alert-block alert-warning">
<b>Example 1:</b> Orthogonal states.  
<br>
<br>
Suppose Alice has a random source of orthogonal states  
<br>
<br>    
$$X = \{ \ket{\psi_i}, p_i\} = \{ (\ket{0}, p_0= 1/4), (\ket{1}, p_1 = 3/4)\}$$  
<br>
Bob describes the system using the density matrix  
$$
\rho = p_0\ket{0}\bra{0} + p_1\ket{1}\bra{1} = \begin{bmatrix} p_0 & 0 \\ 0 & p_1 \end{bmatrix}
$$  
<br>
</div>


<div class="alert alert-block alert-warning">
and the associated von Neumann entropy will be  
<br>
\begin{eqnarray}
S(\rho) &=&  -\Tr \rho\log \rho = - \Tr \left( \begin{bmatrix} p_0 & 0 \\ 0 & p_1 \end{bmatrix}  \begin{bmatrix} \log p_0 & 0 \\ 0 & \log p_1 \end{bmatrix} \right) \nonumber\\
&=& \rule{0mm}{5mm}
-p_0\log p_0 - p_1 \log p_1 = H(p_0,p_1) \nonumber
\end{eqnarray}
<br>
Therefore, for orthogonal states, the Von Neumann and Shannon entropies are equal  
<br>    
<br>    
\begin{eqnarray}
S(\rho)\rule{0mm}{8mm}&=& -\frac{1}{4}\log \frac{1}{4} - \frac{3}{4} \log \frac{3}{4}    = 0.81 \, \hbox{bits} = 
  H(X)   \, . \nonumber
\end{eqnarray}
<br>
</div>


<div class="alert alert-block alert-warning">
<b>Example 2:</b> non-orthogonal states  
<br>
<br>    
Now consider another source from Alice that produces non-orthogonal states with the same probabilities $p_i$ as before  
<br>
<br>    
$$\{ \ket{\psi_i}, p_i\} = \left\{\rule{0mm}{4mm} (\ket{0}, p_0= 1/4)\, , \, (\ket{+},p_+ = 3/4)\right\}$$  
<br>    
Bob now writes the density matrix  
<br>    
<br> 
$$
\rho = \frac{1}{4}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + \frac{3}{8} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \frac{1}{8} \begin{bmatrix} 5 & 3 \\ 3 & 3 \end{bmatrix} \,
$$        
<br> 
By diagonalizing, we obtain the eigenvalues  
$$\lambda_i = \frac{1}{2} \pm \frac{1}{4} \sqrt{\frac{5}{2}}$$   
<br>     
Now we compute the von Neumann entropy  
<br>    
$$
S(\rho) = -\sum_i \lambda_i \log \lambda_i \, =\,  0.484\, \hbox{bits} \, <\,   0.81   \, \hbox{bits} ~=~   H(X)
$$
</div>
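
The numbers in Example 2 can be reproduced with a short NumPy sketch. The variable names are chosen here for illustration; the entropy is computed in bits.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
p = np.array([1/4, 3/4])

# rho = 1/4 |0><0| + 3/4 |+><+|
rho = p[0] * np.outer(ket0, ket0) + p[1] * np.outer(ket_plus, ket_plus)

lam = np.linalg.eigvalsh(rho)                   # eigenvalues in ascending order
S_rho = float(-np.sum(lam * np.log2(lam)))      # von Neumann entropy
H_X = float(-np.sum(p * np.log2(p)))            # Shannon entropy of the preparation

print(lam, S_rho, H_X)
```

The eigenvalues agree with $\frac{1}{2} \pm \frac{1}{4}\sqrt{5/2}$, and the resulting entropy, approximately $0.484$ bits, is below $H(X) \approx 0.811$ bits.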


<div class="alert alert-block alert-danger">
<b>Note:</b>  
<br><br>
The fact that $S$ is smaller than $H$ also suggests that a quantum alphabet $X = \{\ket{\psi_a}, p_a\}$ could admit an encoding with fewer resources than a classical one, that is, greater compression.  
<br><br>
The proof of this fact is given by Schumacher’s theorem.
</div>


### Measurement Entropy


Bob receives a system in a state $\rho$ and applies a projective measurement $\{E_m = P_m\}$, where  
<br>
<br>

$$
P_l^2 = P_l~,~~ P_m P_n = P_m\delta_{mn}~,~~ \sum_m P_m = I
$$

If <u>the outcome is not recorded</u>, the *non-selective measurement* has the following effect on the state:
<br>
<br>

$$
\rho ~~ \stackrel{\{P_m\}}{\longrightarrow} ~~ \rho' = \sum_m P_m \rho P_m
$$


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b> Theorem:</b> 
<br>    
In a non-selective projective measurement, the entropy does not decrease, 
$
S(\rho') \geq S(\rho)
$.    
The inequality is saturated when the basis of the projective measurement diagonalizes $\rho = \sum_m\lambda_m P_m $.
<br>
</div>




<details>
<summary><p > >> <i>Proof</i> </p></summary>
    
We want to prove that 
$$
0 ~\leq -S(\rho) + S(\rho') = -S(\rho) -\tr (\rho'\log \rho')
$$
    
We know a very similar inequality, Klein's inequality, 
    
$$
0~\leq ~S(\rho\|\rho') ~=~ -S(\rho) -\tr (\rho\log \rho')
$$
    
It would be sufficient to prove that the second terms are equal: $\tr (\rho\log \rho') = \tr (\rho'\log \rho')$

$$
\tr (\rho'\log \rho') = \tr\left[ \sum_l  P_l \rho P_l \log \rho' \right] 
$$
Let's examine: 
\begin{eqnarray}
P_l \rho' &=& P_l\sum_m P_m \rho P_m = \sum_m P_l\delta_{lm}\rho P_m = P_l\rho P_l \nonumber \\
\rho'P_l  &=& \sum_m P_m \rho P_mP_l = \sum_m P_m\rho P_l \delta_{lm} = P_l\rho P_l\nonumber  \\
\end{eqnarray}

From this we deduce that $~\rho' P_l = P_l \rho' ~\Rightarrow ~\log\rho' P_l = P_l \log \rho'~$,
and therefore
    
$$
\tr(\rho'\log \rho') =  \tr\left[\sum_l\left( P_l \rho \log \rho' P_l\right)\right]= \tr \left[\left(\sum_l P_l^2\right) \rho \log \rho'\right] = \tr(\rho\log \rho')
$$

and thus we arrive at the desired result.
</details>
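
The effect of a non-selective measurement on the entropy can be illustrated numerically. This is a sketch under stated assumptions: the helper `nonselective_measurement` takes rank-one projectors built from the columns of an orthonormal basis, and the dimension 3 and the random state are arbitrary illustrative choices.

```python
import numpy as np

def von_neumann_entropy(rho, base=2):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

def nonselective_measurement(rho, basis):
    """rho' = sum_m P_m rho P_m with P_m = |b_m><b_m|, b_m the columns of `basis`."""
    rho_out = np.zeros_like(rho, dtype=complex)
    for m in range(basis.shape[1]):
        b = basis[:, m]
        P = np.outer(b, b.conj())
        rho_out += P @ rho @ P
    return rho_out

rng = np.random.default_rng(2)
N = 3
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
rho = A @ A.conj().T
rho /= np.trace(rho).real

# Measurement in the computational basis: removes the off-diagonal elements
rho_comp = nonselective_measurement(rho, np.eye(N))

# Measurement in the eigenbasis of rho: leaves the state unchanged
_, eigvecs = np.linalg.eigh(rho)
rho_eig = nonselective_measurement(rho, eigvecs)

dS_comp = von_neumann_entropy(rho_comp) - von_neumann_entropy(rho)
dS_eig = von_neumann_entropy(rho_eig) - von_neumann_entropy(rho)
```

Here `dS_comp` is non-negative, while `dS_eig` vanishes up to rounding, matching the saturation condition of the theorem.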


<div class="alert alert-block alert-success">
<b>Exercise:</b>  
<br> Work in a Hilbert space $\Hil$ of dimension 6. Randomly define an ensemble $\{\ket{\psi_a},q_a\},\, a = 0,...,r-1$.  
Perform a non-selective projective measurement in the computational basis $\ket{i}$. Obtain the entropy variation due to the measurement.  

Repeat the process, performing the measurement in the eigenbasis $\ket{\lambda_i}$ of $\rho$.  
</div>


## Entropy of Statistical Mixtures


The idea is to compare the entropies of a series of states $\rho_i\, i=1,2,...r$ with that of a statistical mixture of mixed states

$$
\rho = \sum_{i=1}^r p_i \rho_i
$$
with $p_i \geq 0, \, \sum_{i=1}^r p_i = 1$.


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b> Theorem:</b> 
<br>    
Let $\rho = \sum_{i=1}^r p_i\rho_i$ be a statistical mixture of states $\rho_i$ with probabilities $p_i\geq 0,\, \sum_{i=1}^r p_i = 1$. The following inequalities hold:
<br> 
<br>    
$$\fbox{$
~\sum_{i=1}^r p_i S(\rho_i) ~~\leq~  S(\rho) ~\leq~ \sum_{i=1}^r p_i S(\rho_i) + H(\{p_i\}) ~
$}
$$   
<br>
<br>
The inequality on the right is saturated when the $\rho_i$ have support on mutually orthogonal subspaces.    
<br>
</p>
</div>


<details>
<summary><p > >> <i>Proof</i> </p></summary>

The inequality on the left has already been mentioned; it is the property of <b>concavity</b> of the entropy.
Its proof is simple using the subadditivity of entropy (the right-hand side of the triangle inequality), which will be proven later.

To prove the inequality on the right, we will start by considering the case where $\rho_i = \ketbra{\psi_i}{\psi_i}$ are pure states, **not necessarily orthogonal**.

We introduce an auxiliary system $B$ with dimension $d_B \geq r$ and orthonormal basis $\{\ket{i}\}$ and define $\rho_{AB} = \ketbra{AB}{AB}$ in terms of the entangled state

$$
\ket{AB} = \sum_{i=1}^r \sqrt{p_i} \ket{\psi_i}\ket{i}
$$

Since $\rho_{AB}$ is pure, its reduced states $\rho_A$ and $\rho_B$ have the same nonzero eigenvalues and, therefore, their entropies are equal:

$$
S(B) = S(A) = S\big(\sum_{i} p_i\ketbra{\psi_i}{\psi_i}\big) = S(\rho)
$$

Next, we perform a projective, non-selective measurement on $B$ using the projectors $P_i^B = \ketbra{i}{i}$

$$
\rho_B'= \sum_i P_i^B\rho_B P_i^B = \sum_i p_i \ketbra{i}{i}
$$

We have proven that a non-selective measurement cannot decrease the entropy, i.e.,

$$
S(B)= S(\rho) \leq S(\rho_B') = - \sum_i p_i\log p_i = H(\{p_i\})
$$

Therefore, when $\rho_i$ are pure states, we have:

$$
S (\rho) \leq H(\{p_i\}) + \sum_i p_i S(\rho_i)
$$

where we added the last term, which is zero. The inequality is saturated if the $\ket{\psi_i}$ are orthogonal.
<br>
<br>

Now we can address the general case in which the $\rho_i$ are mixed states. The spectral decomposition of each $\rho_i$ is

$$
\rho_i = \sum_{j=1}^N \pi^i_j \ketbra{e^i_j}{e^i_j}
$$

where the $r$ bases $\{\ket{e^i_j}\, , j=1,...,N\}$ are, in principle, different. We can now write

$$
\rho ~=~ \sum_{i=1}^r \sum_{j=1}^N p_i \pi^i_j \ketbra{e^i_j}{e^i_j}  ~= ~\sum_{i,j} q_{ij}\rho_{ij}
$$

where we have considered $\rho$ as a mixture of pure states $\rho_{ij} = \ketbra{e^i_j}{e^i_j}$ with probabilities $q_{ij} = p_i \pi^i_j$. We can apply the result found for pure states to this case:
<br>

$$
S(\rho) \leq H(\{q_{ij}\}) = -\sum_{ij} p_i\pi^i_j \log(p_i\pi^i_j) = -\sum_{ij} p_i\pi^i_j(\log p_i + \log \pi^i_j)
= -\sum_i p_i \log p_i -\sum_i p_i\big( \sum_j \pi^i_j \log \pi^i_j \big)
$$

where we used that $\sum_j \pi^i_j = \tr(\rho_i) = 1$. We now recognize in the last two terms the functions $H(\{p_i\})$ and $S(\rho_i)$, that is:

$$
S(\rho) \leq H(\{p_i\}) + \sum_i p_i S(\rho_i)
$$
 
</details>


The inequality on the left is the concavity property: the entropy of a mixture is at least the average entropy of its parts.

The difference is an important quantity that *should be maximized*, as it increases the amount of information in the system.


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b> Definition:</b>  <i>(Holevo information)</i>
<br>    
The <i>Holevo information</i> of a state $\rho = \sum_i p_i \rho_i$ is defined as the entropy increase associated with the statistical mixture 
<br>
<br>    
$$
\chi = S(\rho) - \sum_i p_i S(\rho_i)    
$$
</div>



From the previous theorem, subtracting the quantity $\sum_{i=1}^r p_i S(\rho_i)$ from all terms yields the following bounds on the Holevo information:

<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b> Corollary:</b> 
In a statistical mixture $\rho = \sum_{i=1}^r p_i \rho_i$, the Holevo information is bounded as follows:
<br>
<br>    
$$ 
0 \leq \chi(\rho) \leq H(\{p_i\}) 
$$
<br>    
</div>
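
The bounds on the Holevo information can be checked on a small example. The following sketch uses hypothetical helper names and arbitrary dimensions and probabilities; it also includes a case with orthogonal supports, where the upper bound is expected to be saturated.

```python
import numpy as np

def shannon_entropy(probs):
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def random_state(dim, rng):
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def holevo_information(probs, states):
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i), in bits."""
    rho = sum(p * r for p, r in zip(probs, states))
    average = sum(p * von_neumann_entropy(r) for p, r in zip(probs, states))
    return von_neumann_entropy(rho) - average

rng = np.random.default_rng(3)
probs = [0.5, 0.3, 0.2]
states = [random_state(4, rng) for _ in probs]
chi = holevo_information(probs, states)
H_p = shannon_entropy(probs)

# States with mutually orthogonal supports saturate the upper bound
chi_orth = holevo_information([0.25, 0.75], [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])])
```

For the random mixture, `chi` lies between $0$ and `H_p`; for the orthogonal case, `chi_orth` equals $H(1/4, 3/4)$.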


<div class="alert alert-block alert-success">
<b>Exercise:</b> 
<br> Work in a Hilbert space $\Hil$ of dimension 6. Randomly define three ensembles $\{\ket{\psi_a},q_a\}_I$ with $a = 0,...,r_a-1$ and  $I=0,1,2$. With 3 random probabilities $\{p_i\},\, i=0,1,2$, consider the statistical mixture $\rho = \sum_i p_i \rho_i$. Compute the Holevo information and verify the bounds.
</div>  


<a id="ent_comp"></a>
<table width="100%">
    <td style="font-size:25px;font-family:Helvetica;text-align:left;background-color:rgba(0,0,900, 0.3);">
<b>Quantum Entropies of Composite Systems</b>
</table>



After the information contained in a state, the next important quantity we wish to understand is the *degree of correlation* between two systems $A$ and $B$.


Consider the isolated bipartite system $AB$, with Hilbert space $\Hil_A \otimes \Hil_B$, as a whole.

Recall that all the information accessible to an observer who can measure on $AB$ (or only on $A$, or only on $B$) is contained in $\rho$ (respectively, in $\rho_A$ or $\rho_B$).

In particular, the von Neumann entropy $S(\rho)$ is the Shannon entropy of the outcomes of a projective measurement in the eigenbasis of $\rho$.



## Entanglement Entropy
<a id='ent_entrelaz'></a>


It is natural to expect that the degree of entanglement between $A$ and $B$ is reflected in the partial states $\rho_A$ and $\rho_B$.


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Definition:</b> (<i>entanglement entropy</i>)
<br>
The <b>entanglement entropy</b> is
the Von Neumann entropy of a subsystem obtained by taking the partial trace over its complement:
<br>
<br>
$$
S(\rho_A) = -\Tr \rho_A \log \rho_A~~~~~~\text{with} ~~~~~\rho_A = \Tr_B \rho
$$
<br>   
$$
S(\rho_B) = -\Tr \rho_B \log \rho_B~~~~~~\text{with} ~~~~~\rho_B = \Tr_A \rho
$$  
<br>    
</div>    


**Notation**: Unless otherwise stated, when dealing with composite systems, we will denote

$$ 
S(AB) = S(\rho) ~~~~~~~~~ S(A) = S(\rho_A)  ~~~~~~~~~~ S(B) = S(\rho_B)
$$

where it is understood that these refer to the systems obtained via partial traces.


## Entropy of an Uncorrelated State


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b> Theorem</b>
<br>
For an uncorrelated state, the entropy is additive:
$$
S(\rho) = S (\rho_A\otimes \rho_B) = S(\rho_A) + S(\rho_B)
$$
</p>
</div>



<details>
<summary><p style="text-align: right ; color:black"> >> Proof </p></summary>
Working with the spectral decompositions of $\rho_A$ and $\rho_B$, we find that

\begin{eqnarray}
\log (\rho_A\otimes \rho_B) &=& \big( \sum_{i,a} \log(\lambda_i \mu_a) \ketbra{\lambda_i \mu_a}{\lambda_i\mu_a}\big) \\
&=& \big( \sum_{i,a} (\log\lambda_i+ \log \mu_a)  \ketbra{\lambda_i \mu_a}{\lambda_i\mu_a}\big) \\
&=& \sum_{i} \log \lambda_i \ketbra{\lambda_i}{\lambda_i}\otimes \sum_a \ketbra{\mu_a}{\mu_a} + 
\sum_{i}  \ketbra{\lambda_i}{\lambda_i}\otimes \sum_a \log \mu_a \ketbra{\mu_a}{\mu_a} \\
&=& \log\rho_A\otimes I + I\otimes \log \rho_B
\end{eqnarray}  
Then
\begin{eqnarray}
S(\rho_A\otimes \rho_B) &=& -\tr_{AB} \left[(\rho_A\otimes \rho_B)\log (\rho_A\otimes\rho_B)\right] \\
 &=& -\tr_{AB} \left[(\rho_A\otimes \rho_B)(\log\rho_A\otimes I + I\otimes \log \rho_B)\right] \\
&=&- \tr_{AB}\left[\rho_A\log\rho_A\otimes \rho_B + \rho_A \otimes \rho_B \log\rho_B\right]\\
&=& -\tr(\rho_A\log\rho_A)\, \tr \rho_B - \tr\rho_A \, \tr(\rho_B \log\rho_B)\\
&=& S(A) + S(B)
\end{eqnarray}
</details>
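
The additivity of the entropy for product states is easy to verify with `np.kron`. The two single-qubit states below are arbitrary illustrative choices, and the function name is an assumption of this sketch.

```python
import numpy as np

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

rho_A = np.diag([0.25, 0.75])
rho_B = np.array([[0.5, 0.25],
                  [0.25, 0.5]])          # eigenvalues 0.25 and 0.75

rho_AB = np.kron(rho_A, rho_B)           # uncorrelated state rho_A (x) rho_B

S_A = von_neumann_entropy(rho_A)
S_B = von_neumann_entropy(rho_B)
S_AB = von_neumann_entropy(rho_AB)
```

The eigenvalues of `rho_AB` are the products $\lambda_i \mu_a$, which is why `S_AB` equals `S_A + S_B` up to rounding.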


This notion of non-correlation is analogous to that which exists in classical probability.


Classically, there are several entropies that play a central role: conditional entropy $H(X|Y)$, relative entropy $H(X\|Y)$, and mutual information $I(X,Y)$.
All three admit interpretations in terms of uncertainties and expectations.

We can define *formally analogous* quantities in the quantum context, even though the probabilistic interpretation is not as evident—or may even be unknown.

We have already defined [relative entropy](#entrop_rel), along with its important property of positivity.



<a id ='infor_mutua'></a>
## Mutual Information



The definition of mutual information is the same, replacing Shannon entropy with von Neumann entropy  
<br>
<br>

$$
I(A,B) = S(A) + S(B) - S(AB) 
$$
<br>


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem</b> <i>(Positivity of Mutual Information)</i> 
<br>
$$
I(A,B) \geq 0
$$
<br>    
and the inequality is saturated, $I(A,B) = 0$, if and only if $\rho = \rho_A\otimes \rho_B$ is factorizable.
</p>
</div>


<details>
<summary><p style="text-align: right ; color:black"> >> Proof </p></summary>
<br>
<br>    
Consider the relative entropy associated with the states $\rho  = \rho_{AB}$ and  $\sigma = \rho_A\otimes \rho_B$.
<br>
Then, by Klein's inequality,
<br>    
<br>
\begin{eqnarray}
0\leq S(\rho\|\sigma) &=& \tr\left[\rho_{AB}\left(\log \rho_{AB} - \log (\rho_A\otimes \rho_B) \right)\right]\nonumber\\  \rule{0mm}{8mm}
&=& \tr \left[\rho_{AB}\left(\log \rho_{AB} - \log \rho_A\otimes I - I \otimes \log\rho_B \right)\right] \nonumber
\\ \rule{0mm}{8mm}
&=& -S(AB) - \tr_{A}(\rho_A \log\rho_A) - \tr_B(\rho_B\log \rho_B) \nonumber \\ \rule{0mm}{8mm}
&=& S(A) + S(B) - S(AB) = I(A,B)
\nonumber
\end{eqnarray}
<br>
Klein's inequality is saturated if and only if $\rho = \sigma$, that is, $\rho_{AB} = \rho_A\otimes \rho_B$.
<br>    
</details>
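
The mutual information requires the reduced states, which can be obtained with a partial trace. The sketch below implements the partial trace with `reshape` and `np.einsum`, assuming that $A$ is the first tensor factor (the ordering used by `np.kron`); all names are illustrative.

```python
import numpy as np

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def partial_traces(rho_AB, dA, dB):
    """Return (rho_A, rho_B) for a state on H_A (x) H_B, with A the first factor."""
    r = rho_AB.reshape(dA, dB, dA, dB)
    rho_A = np.einsum('ijkj->ik', r)     # trace over B
    rho_B = np.einsum('ijik->jk', r)     # trace over A
    return rho_A, rho_B

def mutual_information(rho_AB, dA, dB):
    rho_A, rho_B = partial_traces(rho_AB, dA, dB)
    return (von_neumann_entropy(rho_A) + von_neumann_entropy(rho_B)
            - von_neumann_entropy(rho_AB))

# Bell state (|00> + |11>)/sqrt(2) and a product state
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
rho_prod = np.kron(np.diag([0.25, 0.75]), np.diag([0.5, 0.5]))

I_bell = mutual_information(rho_bell, 2, 2)
I_prod = mutual_information(rho_prod, 2, 2)
```

For the Bell state, $S(A) = S(B) = 1$ bit and $S(AB) = 0$, so `I_bell` is 2 bits; for the product state `I_prod` vanishes.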

## Conditional Entropy

Finally, we will copy the definition of conditional entropy, even though a notion of quantum conditional probability does not exist

$$
S(A|B) = S(AB) - S(B)
$$


Here we find a genuine feature:
unlike the classical case, $S(A|B)$ *can be negative*. 

- For example, if $AB$ is in a pure entangled state, then $S(AB) = 0$ while $S(B) > 0$, so that $S(A|B) = -S(B) < 0$
<br>

- However, it cannot be *too* negative


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem:</b>  
<br>
$$
S(A|B) \geq - \hbox{min}(S_A, S_B)
$$  
</p>
</div>
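
A negative conditional entropy can be exhibited explicitly. The sketch below compares a Bell state with a classically correlated state; the helper names and the two example states are assumptions made for illustration, and $B$ is taken as the second tensor factor.

```python
import numpy as np

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def conditional_entropy(rho_AB, dA, dB):
    """S(A|B) = S(AB) - S(B), with B the second tensor factor."""
    r = rho_AB.reshape(dA, dB, dA, dB)
    rho_B = np.einsum('ijik->jk', r)     # trace over A
    return von_neumann_entropy(rho_AB) - von_neumann_entropy(rho_B)

# Entangled pure state (|00> + |11>)/sqrt(2)
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())

# Classically correlated state (|00><00| + |11><11|)/2
rho_classical = np.diag([0.5, 0.0, 0.0, 0.5])

S_cond_bell = conditional_entropy(rho_bell, 2, 2)
S_cond_classical = conditional_entropy(rho_classical, 2, 2)
```

The Bell state gives $S(A|B) = -1$ bit, which saturates the bound $-\min(S_A, S_B)$, whereas the classically correlated state gives $S(A|B) = 0$.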


<details>
    <summary><p style="text-align: right; color:black"> >> <i>Proof</i> </p></summary>
<br>
On one hand, from the fact that $S(AB)\geq 0$ it follows that
$$
S(A|B) \geq - S(B)
$$ 
<br>    
On the other hand, let us couple $AB$ to a third system $C$ and consider a pure state $ABC$ (i.e., $\rho_{ABC} = \ket{\psi_{ABC}}\bra{\psi_{ABC}}$).
    
Then we know that $S(AB) = S(C)$ and also $S(B) = S(AC)$. We find that
<br>    
$$
S(A|B) = S(AB)- S(B) = S(C) - S(AC) \geq -S(A)
$$
<br>    
where we have used the positivity of the mutual information between $A$ and $C$.  

Thus the result follows.
</details>


Conditional entropy plays a fundamental role in the possibility of establishing *teleportation protocols*.


<a id='desig_triang'></a>
### Araki–Lieb Triangle Inequality


We can now state certain relationships between the entropy of a system $AB$ and that of its parts $A$ and $B$.


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem</b> 
<br>   
The von Neumann entropies of a composite system $AB$ and its parts $A,B$ satisfy, for any state $\rho$, the following inequality:
<br>
<br>
$$
|S(A) - S(B)| ~~ \leq ~~ S(AB) ~~\leq ~~S(A) + S(B)
$$   
</p>
</div>

- The inequality on the left is known as the *Araki–Lieb inequality*.  
<br>

- The inequality on the right is known as the *subadditivity* of entropy.


<details>
    <summary><p style="text-align:right ; color:black"> >> <i>Proof</i> </p></summary>
<br>
The inequality on the right is called <i>subadditivity</i>; it is equivalent to the positivity of the mutual information, $I(A,B) \geq 0$, which follows from the positivity of relative entropy.
<br>
<br>
To prove the inequality on the left (Araki–Lieb), we recall the lower bound for conditional entropy:
<br>
<br>
\begin{eqnarray}
 S(A|B) = S(AB) - S(B) &\geq & - \hbox{min}(S_A,S_B) \geq -S_A \\
 S(B|A) = S(AB) - S(A) &\geq& - \hbox{min}(S_A,S_B) \geq -S_B \rule{0mm}{6mm}
\end{eqnarray}  
<br>
From this it follows that
$$
S(AB) \geq  |S(A)- S(B)|
$$
</details>
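
Both sides of the triangle inequality can be tested on a random mixed state. This sketch assumes a $2 \times 3$ bipartite system and a randomly generated density matrix; these choices, and the helper names, are purely illustrative.

```python
import numpy as np

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

rng = np.random.default_rng(4)
dA, dB = 2, 3

# Random mixed state on H_A (x) H_B
M = rng.normal(size=(dA * dB, dA * dB)) + 1j * rng.normal(size=(dA * dB, dA * dB))
rho_AB = M @ M.conj().T
rho_AB /= np.trace(rho_AB).real

# Reduced states via partial traces (A is the first tensor factor)
r = rho_AB.reshape(dA, dB, dA, dB)
rho_A = np.einsum('ijkj->ik', r)
rho_B = np.einsum('ijik->jk', r)

S_A = von_neumann_entropy(rho_A)
S_B = von_neumann_entropy(rho_B)
S_AB = von_neumann_entropy(rho_AB)

araki_lieb_holds = abs(S_A - S_B) <= S_AB + 1e-12
subadditivity_holds = S_AB <= S_A + S_B + 1e-12
```

Both flags evaluate to `True` for any valid state; a single random sample is of course only a sanity check, not a proof.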


The subadditivity property allows for a simple proof of the *concavity* of the von Neumann entropy.

<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
    <b>Corollary:</b> The von Neumann entropy is a <b>concave</b> function of its argument
<br>
<br>    
$$
\sum_i p_i S(\rho_i) \leq S\left(\sum_i p_i \rho_i\right)
$$
</p>
</div>    

<details>
<summary><p style="text-align:right ; color:black"> >> <i>Proof</i> </p></summary>    

Let $\{\lambda_{i,a}\}, a=1,...,d_A$ be the eigenvalues of $\rho_i$, then
$$
S(\rho_i ) = -\sum_{a}\lambda_{i,a}\log\lambda_{i,a}
$$

Consider an auxiliary system $B$ with orthonormal basis  $\{\ket{i}\}$ and density matrix
$$
\rho_B = \sum_i p_i \ket{i}\bra{i}
$$
and define the following joint $AB$ state
$$
\rho_{AB} = \sum_{i} p_i \rho_i \otimes \ket{i}\bra{i}
$$
If $\rho_i$ has eigenvectors $\ket{e_{i,a}}$ with eigenvalues $\lambda_{i,a}$ such that $\sum_a \lambda_{i,a} = 1$, then the eigenvalues of $\rho_{AB}$ are $\{ p_i \lambda_{i,a} \}$.

\begin{eqnarray}
S_{AB} &=& -\sum_{i,a} p_i \lambda_{i,a} \log( p_i \lambda_{i,a})  \\
&=& -\sum_i p_i \log p_i \sum_a \lambda_{i,a} - \sum_i p_i \sum_a \lambda_{i,a} \log \lambda_{i,a} \nonumber\\
&=& -\sum_i p_i \log p_i + \sum_i p_i S(\rho_i) \nonumber\\
&=& S_B + \sum_i p_i S(\rho_i) 
\end{eqnarray}

Taking the partial traces we find
\begin{eqnarray}
\rho_A &=& \tr_B \rho_{AB} =  \sum_{i} p_i \rho_i \nonumber\\
\rho_B &=& \tr_A \rho_{AB} = \sum_i p_i \ket{i}\bra{i} \nonumber
\end{eqnarray}

Now, using the subadditivity property
$$
S_{AB} \leq S_A + S_B
$$
we obtain 
$$
S_B + \sum_i p_i S(\rho_i) \leq S_A + S_B
$$
and canceling out $S_B$ we get
$$
\sum_i p_i S(\rho_i) \leq S_A = S\left(\sum_i p_i \rho_i\right)
$$
</details>


Let's look at cases that saturate these inequalities.


### Case 1: Saturation of Subadditivity: Factorizable State

Suppose the system $AB$ is in a *factorizable* composite state. Then the *mutual information* is zero and, consequently, the *subadditivity* is saturated:

$$
\rho_{AB} = \rho_A \otimes \rho_B ~~~~\Leftrightarrow ~~~~ S(AB) = S(A) + S(B)
$$
<br>


### Case 2: Saturation of Araki–Lieb: Pure or Entangled State


We begin by writing the pure state

$$
\rho_{AB} = \ket{\psi_{AB}} \bra{\psi_{AB}}
$$


In a pure state of a bipartite system $AB$, entanglement introduces quantum correlations between the two systems

$$
\ket{\psi_{AB}} = \sum_{i,j} c_{ij} \ket{i}_A \otimes \ket{j}_B
$$


One way to determine if entanglement exists is to use the Schmidt decomposition.

$$
\ket{\psi_{AB}} = \sum_{a=1}^r \sqrt{p_a}\,  \ket{\psi^a_A} \otimes \ket{\psi^a_B}
$$

The state is entangled if and only if the Schmidt number $r$ is greater than one.


Now we want to be more precise and propose a way to quantify the extent of such correlations. This is what the *Entanglement Entropy* measures.

Indeed,

$$
S(\rho_A) = S(\rho_B) = - \sum_{a=1}^r p_a \log p_a
$$

Therefore, the *entanglement entropy* quantifies the *entanglement* present in $\ket{\psi_{AB}}$: it vanishes for a product state ($r = 1$) and is maximal when all the Schmidt coefficients are equal.


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
<b>Theorem:</b> Let ${AB}$ be a composite system in a pure state with $ S(AB) = 0$. The entanglement entropy of its constituent subsystems $A$ and $B$ is  
<br>
<br>    
$\to ~$ equal for both subsystems $S(A) = S(B)$
<br>
<br>    
$\to ~$ a measure of the entanglement of the pure state $\ket{\psi_{AB}}$
</p>
</div>    

<details>
<summary><p style="text-align: right ; color:black"> >> Proof </p></summary>
<br>
The density matrix $\rho_{AB} = \ket{\psi_{AB}}\bra{\psi_{AB}}$ has zero entropy $S(\rho_{AB})=0$. This is not the case for the density matrices of the subsystems $A$ and $B$.
Writing $\ket{\psi_{AB}}$ in the Schmidt basis, we can compute
<br>
<br>
$$
\rho_{A} = \Tr_B \rho_{AB} = \sum_{a} p_a \ket{\psi^a_A}\bra{\psi^a_A}
~~~~~~~,~~~~~~~~
\rho_{B} = \Tr_A \rho_{AB} = \sum_{a} p_a \ket{\psi^a_B}\bra{\psi^a_B}
$$
Since the $\ket{\psi^a}$ are orthonormal, for the entropies of the subsystems we find
$$
S(A) = S(B) = - \sum_{a=1}^r p_a \log p_a
$$
<br>
These entropies quantify the degree of entanglement of $\ket{\psi_{AB}}$. 
    
Indeed, if $p_1= 1$ and $p_{a>1} = 0$, so that there is no entanglement, then the entropies satisfy $S(A) = S(B) = 0$. 
    
On the other hand, if the reduced states are maximally mixed, $S(A) = S(B) = \log N$, then the entanglement is also maximal, with $p_a = \frac{1}{N}$ for $a = 1,..., N$.
<br>
<br>
</details>


We see that, in this case, the Araki–Lieb inequality is saturated:

$$
|S(A) - S(B)| = 0 = S(AB)
$$
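
The Schmidt coefficients of a pure bipartite state can be obtained from the singular value decomposition of its coefficient matrix $c_{ij}$, whose singular values are $\sqrt{p_a}$. The following sketch relies on that observation; the function names and the two example states are assumptions made here.

```python
import numpy as np

def schmidt_probabilities(psi, dA, dB):
    """Schmidt probabilities p_a of a pure state on H_A (x) H_B."""
    c = np.asarray(psi, dtype=complex).reshape(dA, dB)   # coefficient matrix c_ij
    s = np.linalg.svd(c, compute_uv=False)               # singular values sqrt(p_a)
    return s**2

def entanglement_entropy(psi, dA, dB):
    p = schmidt_probabilities(psi, dA, dB)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

# Entangled state (|00> + |11>)/sqrt(2) and product state |0>|+>
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
product = np.kron([1, 0], np.array([1, 1]) / np.sqrt(2))

rank_bell = int(np.sum(schmidt_probabilities(bell, 2, 2) > 1e-12))
rank_product = int(np.sum(schmidt_probabilities(product, 2, 2) > 1e-12))
S_bell = entanglement_entropy(bell, 2, 2)
S_product = entanglement_entropy(product, 2, 2)
```

The Bell state has Schmidt number 2 and one bit of entanglement entropy, while the product state has Schmidt number 1 and zero entanglement entropy.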


<a id="codif_optim"></a>
<table width="100%">
    <td style="font-size:25px;font-family:Helvetica;text-align:left;background-color:rgba(0,0,900, 0.3);">
<b>Quantum Coding</b>
</table>


<div class="alert alert-block alert-info",text-align:center>
<p style="text-align: left ;color: navy;">  
    <b>Theorem:</b> <i>Schumacher's theorem</i>
    <br>
Given a message whose letters are pure states drawn independently from the quantum alphabet $X=  \{ \ket{\psi_i}, p_i\}$,  there exists 
    an <i>optimal coding</i>, faithful in the limit of long messages, that makes an average use of $S(\rho)$ <i>qubits per letter</i>, where $\rho = \sum_i p_i \ketbra{\psi_i}{\psi_i}$.
</p>
</div>

