Based on [Noisy circuits](https://pennylane.ai/qml/demos/tutorial_noisy_circuits/#noisy-circuits)

Noise is any unwanted transformation that corrupts the intended output of a quantum computation. It can be separated into two categories:
* Coherent noise is described by unitary operations that maintain the purity of the output quantum state. A common source is systematic error from imperfectly calibrated devices that do not apply exactly the desired gates, e.g., applying a rotation by an angle $ϕ+ϵ$ instead of $ϕ$.

* Incoherent noise is more problematic: it originates from a quantum computer becoming entangled with the environment, resulting in mixed states — probability distributions over different pure states. Incoherent noise thus leads to outputs that are always random, regardless of what basis we measure in.

Mixed states are described by density matrices. They provide a more general method of describing quantum states that elegantly encodes a distribution over pure states in a single mathematical object. Mixed states are the most general description of a quantum state, of which pure states are a special case.
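
To make the distinction concrete, here is a minimal NumPy sketch (not part of the original tutorial) that builds one pure and one mixed single-qubit density matrix and compares their purity $\text{Tr}(\rho^2)$. The helper name `purity` is our own choice for illustration.

```python
import numpy as np

def purity(rho):
    """Return Tr(rho^2): 1 for a pure state, below 1 for a mixed state."""
    return float(np.real(np.trace(rho @ rho)))

# Pure state |+> = (|0> + |1>)/sqrt(2), written as the density matrix |+><+|
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho_pure = np.outer(plus, plus.conj())

# Equal classical mixture of |0> and |1> (maximally mixed single-qubit state)
rho_mixed = 0.5 * np.eye(2)

print(purity(rho_pure))   # close to 1
print(purity(rho_mixed))  # close to 0.5
```

Both matrices have the same diagonal, so they give identical statistics in the computational basis. Only the off-diagonal terms distinguish the pure superposition from the mixture.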

The purpose of PennyLane’s `default.mixed` device is to provide native support for mixed states and for simulating noisy computations. Let’s use `default.mixed` to simulate a simple circuit for preparing the Bell state $|\psi\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. We ask the QNode to return the expectation value of $Z_0 \otimes Z_1$.

In [2]:
import pennylane as qml

# Density-matrix simulator with two qubits
dev = qml.device("default.mixed", wires=2)

@qml.qnode(dev)
def circuit():
    # Prepare the Bell state (|00> + |11>)/sqrt(2)
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

print(f"QNode output = {circuit():.4f}")

QNode output = 1.0000


With a small modification of the circuit we can also ask for the density matrix. In this case, the density matrix is equal to $|\psi\rangle\langle\psi|$, where $|\psi\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$.

In [3]:
import numpy as np

@qml.qnode(dev)
def density_matrix_circuit():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    # On default.mixed, qml.state() returns the density matrix
    return qml.state()

matrix = density_matrix_circuit()
print(f"Output density matrix is = \n{np.real(matrix)}")

Output density matrix is = 
[[0.5 0.  0.  0.5]
 [0.  0.  0.  0. ]
 [0.  0.  0.  0. ]
 [0.5 0.  0.  0.5]]
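
As a cross-check that does not rely on PennyLane, the same matrix can be built by hand as an outer product. This is a small NumPy sketch with variable names of our own choosing. It also evaluates $\text{Tr}[\rho\,(Z \otimes Z)]$, which should agree with the QNode output of 1.

```python
import numpy as np

# Bell state (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_bell = np.outer(psi, psi.conj())

# Z on each qubit, combined with a Kronecker product
Z = np.diag([1.0, -1.0])
ZZ = np.kron(Z, Z)

# Expectation value of an observable O in state rho is Tr[rho O]
expval = float(np.real(np.trace(rho_bell @ ZZ)))

print(rho_bell)
print(f"<Z0 Z1> = {expval:.4f}")
```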


Incoherent noise is modelled by quantum channels. Mathematically, a quantum channel is a linear, completely positive, and trace-preserving ([CPTP](https://www.quantiki.org/wiki/channel-cp-map)) map. A convenient strategy for representing quantum channels is to employ [Kraus operators](https://en.wikipedia.org/wiki/Quantum_operation#Kraus_operators) $\{K_i\}$ satisfying the condition $\sum_i K_{i}^{\dagger} K_i = I$. For
an initial state $\rho$, the output state after the action of a channel
$\Phi$ is:

$$\Phi(\rho) = \sum_i K_i \rho K_{i}^{\dagger}.$$

Just like pure states are special cases of mixed states, unitary
transformations are special cases of quantum channels. Unitary
transformations are represented by a single Kraus operator, the unitary
$U$, and they transform a state as $U\rho U^\dagger$.

More generally, the action of a quantum channel can be interpreted as
applying a transformation corresponding to the Kraus operator $K_i$ with
some associated probability. More precisely, the channel applies the
transformation $\frac{1}{p_i}K_i\rho K_i^\dagger$ with probability
$p_i = \text{Tr}[K_i \rho K_{i}^{\dagger}]$. Quantum channels therefore
represent a probability distribution over different possible
transformations on a quantum state.
For example, consider the bit flip channel. It describes a
transformation that flips the state of a qubit (applies an X gate) with
probability $p$ and leaves it unchanged with probability $1-p$. Its
Kraus operators are

$$\begin{aligned}
K_0 &= \sqrt{1-p}\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}, \\
K_1 &= \sqrt{p}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}.
\end{aligned}$$
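
The following NumPy sketch illustrates these definitions for the bit flip channel. It is independent of PennyLane, and the helper names `bit_flip_kraus` and `apply_channel` are hypothetical. It checks the completeness condition, computes the branch probabilities $p_i = \text{Tr}[K_i \rho K_i^\dagger]$, and applies the channel to $|0\rangle\langle 0|$.

```python
import numpy as np

def bit_flip_kraus(p):
    """Kraus operators of the single-qubit bit flip channel."""
    identity = np.eye(2)
    pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])
    return [np.sqrt(1 - p) * identity, np.sqrt(p) * pauli_x]

def apply_channel(kraus_ops, rho):
    """Return sum_i K_i rho K_i^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

p = 0.2
kraus_ops = bit_flip_kraus(p)

# Completeness: sum_i K_i^dagger K_i should equal the identity
completeness = sum(K.conj().T @ K for K in kraus_ops)

# Start in |0><0| and compute each branch probability Tr[K_i rho K_i^dagger]
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
branch_probs = [float(np.real(np.trace(K @ rho0 @ K.conj().T))) for K in kraus_ops]

rho_out = apply_channel(kraus_ops, rho0)
print(completeness)  # identity matrix
print(branch_probs)  # approximately [0.8, 0.2]
print(rho_out)       # approximately diag(0.8, 0.2)
```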

This channel can be implemented in PennyLane using the `qml.BitFlip`
operation.

Let's see what happens when we simulate this type of noise acting on
both qubits in the circuit. We'll evaluate the QNode for different bit
flip probabilities.


In [7]:
@qml.qnode(dev)
def bitflip_circuit(p):
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.BitFlip(p, wires=0)
    qml.BitFlip(p, wires=1)
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

ps = [0.001, 0.01, 0.1, 0.2]
for p in ps:
    print(f"QNode output for bit flip probability {p} is {bitflip_circuit(p):.4f}")

QNode output for bit flip probability 0.001 is 0.9960
QNode output for bit flip probability 0.01 is 0.9604
QNode output for bit flip probability 0.1 is 0.6400
QNode output for bit flip probability 0.2 is 0.3600
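
These values can be sanity-checked with a short calculation that is not part of the original tutorial. $Z_0 \otimes Z_1$ changes sign only when exactly one of the two qubits is flipped, which happens with probability $2p(1-p)$, so we expect $\langle Z_0 Z_1\rangle = 1 - 4p(1-p) = (1-2p)^2$. The function name `bitflip_zz` below is our own.

```python
def bitflip_zz(p):
    """Closed-form <Z0 Z1> for a Bell state with bit flip noise p on each qubit."""
    # Exactly one flip (probability 2p(1-p)) turns the +1 outcome into -1;
    # zero or two flips leave Z0 Z1 at +1.
    return 1 - 4 * p * (1 - p)  # equal to (1 - 2p)**2

for p in [0.001, 0.01, 0.1, 0.2]:
    print(f"p = {p}: {bitflip_zz(p):.4f}")
```

The printed values should match the QNode outputs listed above to four decimal places.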


The circuit behaves quite differently in the presence of noise! This will be familiar to anyone who has run an algorithm on quantum hardware. It also highlights why error mitigation and error correction are so important. We can use PennyLane to look under the hood and see the output state of the circuit for the largest noise parameter.

In [8]:
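# p still holds the last value from the loop above; dev.state is the state left by the most recent execution
print(f"Output state for bit flip probability {p} is \n{np.real(dev.state)}")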

Output state for bit flip probability 0.2 is 
[[0.34 0.   0.   0.34]
 [0.   0.16 0.16 0.  ]
 [0.   0.16 0.16 0.  ]
 [0.34 0.   0.   0.34]]
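
The same matrix can be reproduced by hand, which shows how single-qubit Kraus operators extend to two qubits. The sketch below is plain NumPy with hypothetical helper names. It assumes the noise acts independently on each qubit, so the joint Kraus operators are the Kronecker products $K_i \otimes K_j$.

```python
import numpy as np

def bit_flip_kraus(p):
    """Kraus operators of the single-qubit bit flip channel."""
    pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])
    return [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * pauli_x]

def apply_on_both_qubits(kraus_ops, rho):
    """Apply the same single-qubit channel independently to qubit 0 and qubit 1."""
    out = np.zeros_like(rho)
    for K0 in kraus_ops:
        for K1 in kraus_ops:
            K = np.kron(K0, K1)  # joint Kraus operator K0 (x) K1
            out = out + K @ rho @ K.conj().T
    return out

# Noiseless Bell state (|00> + |11>)/sqrt(2) as a density matrix
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_bell = np.outer(psi, psi)

rho_noisy = apply_on_both_qubits(bit_flip_kraus(0.2), rho_bell)
print(np.round(rho_noisy, 2))
```

The entries 0.34 and 0.16 are $\tfrac{1}{2}[(1-p)^2 + p^2]$ and $\tfrac{1}{2}\cdot 2p(1-p)$ for $p = 0.2$: the weights of zero-or-two flips and of exactly one flip.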


Let's take a look at another example. The depolarizing channel is a
generalization of the bit flip and phase flip channels, where each of
the three possible Pauli errors can be applied to a single qubit. Its
Kraus operators are given by

$$\begin{aligned}
K_0 &= \sqrt{1-p}\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}, \\
K_1 &= \sqrt{p/3}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \\
K_2 &= \sqrt{p/3}\begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \\
K_3 &= \sqrt{p/3}\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.
\end{aligned}$$

A circuit modelling the effect of depolarizing noise in preparing a Bell
state is implemented below.


In [9]:
@qml.qnode(dev)
def depolarizing_circuit(p):
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.DepolarizingChannel(p, wires=0)
    qml.DepolarizingChannel(p, wires=1)
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

In [10]:
ps = [0.001, 0.01, 0.1, 0.2]
for p in ps:
    print(f"QNode output for depolarizing probability {p} is {depolarizing_circuit(p):.4f}")

QNode output for depolarizing probability 0.001 is 0.9973
QNode output for depolarizing probability 0.01 is 0.9735
QNode output for depolarizing probability 0.1 is 0.7511
QNode output for depolarizing probability 0.2 is 0.5378
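
As an independent check (a NumPy sketch with hypothetical helper names, not PennyLane's implementation), we can build the depolarizing Kraus operators directly and compare the simulated expectation value with a closed form. The $X$ and $Y$ errors each flip the sign of $Z$ on a qubit, so each qubit contributes a factor $1 - 4p/3$ and we expect $\langle Z_0 Z_1\rangle = (1 - 4p/3)^2$.

```python
import numpy as np

def depolarizing_kraus(p):
    """Kraus operators of the single-qubit depolarizing channel."""
    I = np.eye(2, dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    return [np.sqrt(1 - p) * I] + [np.sqrt(p / 3) * P for P in (X, Y, Z)]

def noisy_bell_zz(p):
    """<Z0 Z1> for a Bell state after depolarizing noise on each qubit."""
    psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    rho = np.outer(psi, psi.conj())
    kraus_ops = depolarizing_kraus(p)
    # Independent noise on each qubit: joint Kraus operators are K0 (x) K1
    rho_out = sum(
        np.kron(K0, K1) @ rho @ np.kron(K0, K1).conj().T
        for K0 in kraus_ops
        for K1 in kraus_ops
    )
    zz = np.kron(np.diag([1, -1]), np.diag([1, -1]))
    return float(np.real(np.trace(rho_out @ zz)))

for p in [0.001, 0.01, 0.1, 0.2]:
    closed_form = (1 - 4 * p / 3) ** 2
    print(f"p = {p}: simulated {noisy_bell_zz(p):.4f}, closed form {closed_form:.4f}")
```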


As before, the output deviates from the desired value as the amount of noise increases. Modelling the noise that occurs in real experiments requires careful consideration. PennyLane offers the flexibility to experiment with different combinations of noisy channels, either to mimic the performance of quantum algorithms when deployed on real devices or to explore the effect of more general quantum transformations.

## Channel gradients

The ability to compute gradients of any operation is an essential
ingredient of
quantum differentiable programming. In PennyLane, it is possible to compute gradients of noisy
channels and optimize them inside variational circuits. PennyLane
supports analytical gradients for channels whose Kraus operators are
proportional to unitary matrices. In other cases, gradients are
evaluated using finite differences.
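
To give a feel for what a finite-difference gradient is, here is a generic central-difference sketch in plain Python. It is an illustration only and makes no claim about how PennyLane implements its gradients. As a test function it uses $(1-2p)^2$, which we assume as the closed form of the bit flip expectation value from the earlier example. It reproduces the outputs printed there.

```python
def bitflip_zz(p):
    """Assumed closed form of <Z0 Z1> for bit flip noise p on both qubits."""
    return (1 - 2 * p) ** 2

def central_difference(f, x, h=1e-6):
    """Approximate df/dx at x from two nearby function evaluations."""
    return (f(x + h) - f(x - h)) / (2 * h)

p = 0.1
numeric = central_difference(bitflip_zz, p)
analytic = -4 * (1 - 2 * p)  # derivative of (1 - 2p)^2 with respect to p

print(f"finite difference: {numeric:.6f}")
print(f"analytic:          {analytic:.6f}")
```

A finite-difference estimate needs only function evaluations, which is why it remains available when no analytic rule applies. The price is sensitivity to the step size `h`.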

To illustrate this property, we'll consider an elementary example. We
aim to learn the noise parameters of a circuit in order to reproduce an
observed expectation value. Suppose that we run the circuit to
prepare a Bell state on a hardware device and observe that the
expectation value of $Z_0\otimes Z_1$ is not equal to 1 (as would occur
with an ideal device), but instead has the value 0.7781. In the
experiment, it is known that the major source of noise is amplitude
damping, for example as a result of photon loss. Amplitude damping
projects a state to $|0\rangle$ with probability $p$ and otherwise
leaves it unchanged. It is described by the Kraus operators

$$\begin{aligned}
K_0 = \begin{pmatrix}1 & 0\\ 0 & \sqrt{1-p}\end{pmatrix}, \quad
K_1 = \begin{pmatrix}0 & \sqrt{p}\\ 0 & 0\end{pmatrix}.
\end{aligned}$$
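
A quick NumPy sketch (with helper names of our own) confirms that these operators satisfy the completeness condition and shows their effect on the two basis states: $|1\rangle$ decays to $|0\rangle$ with probability $p$, while $|0\rangle$ is untouched.

```python
import numpy as np

def amplitude_damping_kraus(p):
    """Kraus operators of the single-qubit amplitude damping channel."""
    K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - p)]])
    K1 = np.array([[0.0, np.sqrt(p)], [0.0, 0.0]])
    return [K0, K1]

def apply_channel(kraus_ops, rho):
    """Return sum_i K_i rho K_i^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

p = 0.3
kraus_ops = amplitude_damping_kraus(p)

# Completeness: K0^dagger K0 + K1^dagger K1 should be the identity
completeness = sum(K.conj().T @ K for K in kraus_ops)

rho_excited_out = apply_channel(kraus_ops, np.diag([0.0, 1.0]))  # input |1><1|
rho_ground_out = apply_channel(kraus_ops, np.diag([1.0, 0.0]))   # input |0><0|

print(completeness)     # identity matrix
print(rho_excited_out)  # approximately diag(0.3, 0.7)
print(rho_ground_out)   # diag(1, 0)
```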

What damping parameter ($p$) explains the experimental outcome? We can
answer this question by optimizing the channel parameters to reproduce
the experimental observation! 💪 Since the parameter $p$ is a
probability, we use a sigmoid function to ensure that the trainable
parameters give rise to a valid channel parameter, i.e., a number
between 0 and 1.
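
The reparametrization can be illustrated in isolation with the standard library. The `logit` helper below is our own addition for illustration. It is the inverse of the sigmoid and converts a probability back into an unconstrained parameter.

```python
import math

def sigmoid(x):
    """Map any real number to the open interval (0, 1)."""
    return 1 / (1 + math.exp(-x))

def logit(p):
    """Inverse of the sigmoid: recover x from a probability p in (0, 1)."""
    return math.log(p / (1 - p))

for x in [-5.0, 0.0, 0.01, 5.0]:
    p = sigmoid(x)
    print(f"x = {x:5.2f} -> p = {p:.4f}, logit(p) = {logit(p):.2f}")
```

Because the optimizer updates $x$ rather than $p$, no step can push the channel parameter outside the valid range.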


In [4]:
from jax import numpy as np  # note: rebinds `np` to jax.numpy from here on
import jax
import jaxopt

jax.config.update("jax_platform_name", "cpu")
jax.config.update("jax_enable_x64", True)  # use 64-bit floats

ev = 0.7781  # observed expectation value

def sigmoid(x):
    return 1/(1+np.exp(-x))

@qml.qnode(dev)
def damping_circuit(x):
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.AmplitudeDamping(sigmoid(x), wires=0)  # p = sigmoid(x)
    qml.AmplitudeDamping(sigmoid(x), wires=1)
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

We optimize the circuit with respect to a simple cost function that attains its minimum when the output of the QNode is equal to the experimental value:

In [16]:
def cost(x, target):
    return (damping_circuit(x) - target)**2

All that remains is to optimize the parameter. We use a straightforward gradient descent method.

In [22]:
steps = 200
x = np.array(0.01)
target_ev = 0.7781

opt = jaxopt.GradientDescent(cost, stepsize=0.7, acceleration=False)
params = x
opt_state = opt.init_state(params)
for i in range(steps):
    # Extra keyword arguments to update() are forwarded to cost
    params, opt_state = opt.update(params, opt_state, target=target_ev)
    if (i + 1) % 10 == 0:
        print("Cost after step {:5d}: {: .7f}".format(i + 1, cost(params, target_ev)))

print("Optimized x: {}".format(params))
print(f"Optimized noise parameter p = {sigmoid(params.take(0)):.4f}")
print(f"QNode output after optimization = {damping_circuit(params):.4f}")
print(f"Experimental expectation value = {target_ev}")


Cost after step    10:  0.0772951
Cost after step    20:  0.0770553
Cost after step    30:  0.0755595
Cost after step    40:  0.0673987
Cost after step    50:  0.0420718
Cost after step    60:  0.0164136
Cost after step    70:  0.0057149
Cost after step    80:  0.0021155
Cost after step    90:  0.0008378
Cost after step   100:  0.0003482
Cost after step   110:  0.0001494
Cost after step   120:  0.0000655
Cost after step   130:  0.0000291
Cost after step   140:  0.0000131
Cost after step   150:  0.0000059
Cost after step   160:  0.0000027
Cost after step   170:  0.0000012
Cost after step   180:  0.0000006
Cost after step   190:  0.0000003
Cost after step   200:  0.0000001
Optimized x: 1.9247392364041398
Optimized noise parameter p = 0.8727
QNode output after optimization = 0.7778
Experimental expectation value = 0.7781
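
For this simple circuit the result can be cross-checked analytically. The following derivation is ours, not the tutorial's. Half of the Bell state's population sits in $|11\rangle$, and each of those qubits decays independently with probability $p$, which gives $\langle Z_0 Z_1\rangle = \tfrac{1}{2}\left[1 + (1-2p)^2\right]$. Under that assumption, the sketch below solves for $p$ directly.

```python
import math

def damping_zz(p):
    """Assumed closed form of <Z0 Z1> for amplitude damping p on both qubits."""
    return 0.5 * (1 + (1 - 2 * p) ** 2)

target_ev = 0.7781

# Solve 0.5 * (1 + (1 - 2p)^2) = target_ev for p; the square has two roots
root = math.sqrt(2 * target_ev - 1)
p_low, p_high = (1 - root) / 2, (1 + root) / 2

print(f"p_low  = {p_low:.4f}, <Z0 Z1> = {damping_zz(p_low):.4f}")
print(f"p_high = {p_high:.4f}, <Z0 Z1> = {damping_zz(p_high):.4f}")
print(f"<Z0 Z1> at the optimized p = 0.8727: {damping_zz(0.8727):.4f}")
```

The expectation value is symmetric about $p = 0.5$, so two damping parameters explain the same observation. The optimization above starts at $x = 0.01$, i.e. just above $p = 0.5$, and converges towards the larger root. Distinguishing the two would require additional information, such as a different observable.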
