In [53]:
import numpy as np
import matplotlib.pyplot as plt
from helpers import plot_hist

def percentage_differnce(a, b):
    """Return the element-wise percentage difference of b relative to a."""
    return ((a-b)/a)*100

Transfer function vs Folding - simple 2 bin example
----------------------------------------------------

We will work through a simple 2-bin example and check whether a transfer function or full folding gives the more "*correct*" answer.

##### Question: 
**System 1** has a `reco_1`, a `truth_1`, a migration matrix `M_1` and a transfer function `TF_1`.

A second system, **system 2**, likewise has a `reco_2`, a `truth_2`, a migration matrix `M_2` and a transfer function `TF_2`.

We want to know whether we can use `M_1` and `TF_1` from **system 1** to get the correct `reco_2` from `truth_2`.
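
For reference, the two methods being compared can be written out as follows. This is read off the code cells in this notebook; the bin indices $i$, $j$ and the hat notation for an estimate are introduced here for illustration only.

```latex
\begin{aligned}
\mathrm{reco}_{1,i} &= \sum_j (M_1)_{ij}\,\mathrm{truth}_{1,j}, \qquad
(TF_1)_i = \frac{\mathrm{reco}_{1,i}}{\mathrm{truth}_{1,i}} \\
\text{folding:}\quad \widehat{\mathrm{reco}}_{2,i} &= \sum_j (M_1)_{ij}\,\mathrm{truth}_{2,j} \\
\text{transfer function:}\quad \widehat{\mathrm{reco}}_{2,i} &= (TF_1)_i\,\mathrm{truth}_{2,i}
\end{aligned}
```

Folding mixes the bins of `truth_2` through `M_1`, whereas the transfer function rescales each bin of `truth_2` independently.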

First we will look at two simple cases: `M_1=M_2` and `M_1!=M_2`


`M_1 = M_2`
----------------------------------------------------

In [72]:
truth_1 = np.array([50, 20])
M_1 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))

Our truth_1 = [50 20] and our reco_1 = [ 44.  26.]
The migration matrix M_1 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_1 = [ 0.88  1.3 ]


In [73]:
truth_2 = np.array([45, 25])
M_2 = M_1
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))

Our truth_2 = [45 25] and our reco_2 = [ 41.  29.]
The migration matrix M_2 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_2 = [ 0.91111111  1.16      ]


##### If we now apply `M_1` and `TF_1` to `truth_2`, which method is closest to the actual `reco_2`?

In [74]:
reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print(reco_2_transfer)
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))
reco_2_folded_80 = reco_2_folded
reco_2_transfer_80  = reco_2_transfer

[ 39.6  32.5]
Difference between folded and actual reco_2 = [ 0.  0.] %
Difference between transfer function and actual reco_2 = [  3.41463415 -12.06896552] %


With both migration matrices equal, the folding method gives exact closure, as expected. The transfer function, however, gives quite a different result. This suggests that if the migration matrices of the two systems are the same, the folding method will be more accurate.
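
One way to see where the transfer function error comes from is that `TF_1` is computed from `truth_1`, so it carries the shape of that spectrum. The sketch below checks this with a hypothetical spectrum, `truth_same_shape`, which is not part of the original example and is simply `truth_1` scaled by 2. For that spectrum the transfer function should close, while for `truth_2` it should not.

```python
import numpy as np

# System 1, as defined in the cells above.
truth_1 = np.array([50, 20])
M_1 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
TF_1 = np.matmul(M_1, truth_1) / truth_1

# Hypothetical spectrum with the same shape as truth_1 (assumption for illustration).
truth_same_shape = 2 * truth_1
# The truth_2 used above, which has a different shape from truth_1.
truth_2 = np.array([45, 25])

for label, truth in [("same shape", truth_same_shape),
                     ("different shape", truth_2)]:
    folded = np.matmul(M_1, truth)   # full folding with M_1
    transfer = truth * TF_1          # bin-by-bin rescaling with TF_1
    print("%s: folded = %s, transfer = %s" % (label, folded, transfer))
```

The two estimates agree only when the new truth spectrum is proportional to `truth_1`.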

We now look at how this changes with the diagonality of the migration matrix.

##### 70% diagonal

In [75]:
# system_1
M_1 = np.array([[0.7, 0.3],
                [0.3, 0.7]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))
# system_2
truth_2 = np.array([45, 25])
M_2 = M_1
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))
# compare
reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print(reco_2_transfer)
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))
reco_2_folded_70 = reco_2_folded
reco_2_transfer_70  = reco_2_transfer

Our truth_1 = [50 20] and our reco_1 = [ 41.  29.]
The migration matrix M_1 = 
[[ 0.7  0.3]
 [ 0.3  0.7]]
The transfer function TF_1 = [ 0.82  1.45]
Our truth_2 = [45 25] and our reco_2 = [ 39.  31.]
The migration matrix M_2 = 
[[ 0.7  0.3]
 [ 0.3  0.7]]
The transfer function TF_2 = [ 0.86666667  1.24      ]
[ 36.9   36.25]
Difference between folded and actual reco_2 = [ 0.  0.] %
Difference between transfer function and actual reco_2 = [  5.38461538 -16.93548387] %


##### 90% diagonal

In [76]:
# system_1
M_1 = np.array([[0.9, 0.1],
                [0.1, 0.9]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))
# system_2
truth_2 = np.array([45, 25])
M_2 = M_1
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))
# compare
reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print(reco_2_transfer)
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))
reco_2_folded_90 = reco_2_folded
reco_2_transfer_90  = reco_2_transfer

Our truth_1 = [50 20] and our reco_1 = [ 47.  23.]
The migration matrix M_1 = 
[[ 0.9  0.1]
 [ 0.1  0.9]]
The transfer function TF_1 = [ 0.94  1.15]
Our truth_2 = [45 25] and our reco_2 = [ 43.  27.]
The migration matrix M_2 = 
[[ 0.9  0.1]
 [ 0.1  0.9]]
The transfer function TF_2 = [ 0.95555556  1.08      ]
[ 42.3   28.75]
Difference between folded and actual reco_2 = [ 0.  0.] %
Difference between transfer function and actual reco_2 = [ 1.62790698 -6.48148148] %


As expected, changing the diagonality of the migration matrix makes no difference to the folding approach - it always closes. 

For the transfer function, we find that the more diagonal the migration matrix, the smaller the error. This is summarised in the table below:

| % diagonality | Bin 1 % difference | Bin 2 % difference |
| ------------: | -----------------: | -----------------: |
| 70%           | 5.4                | -16.9              |
| 80%           | 3.4                | -12.1              |
| 90%           | 1.6                | -6.5               |
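
The three cells above differ only in the diagonal value, so the table can be reproduced with a short scan. This is a minimal sketch: the helper names `symmetric_migration` and `transfer_percent_difference` are made up for illustration, and the 100% row is an extra check that is not in the original cells.

```python
import numpy as np

def symmetric_migration(d):
    """Build a 2x2 migration matrix with diagonal d and off-diagonal 1 - d."""
    return np.array([[d, 1.0 - d],
                     [1.0 - d, d]])

def transfer_percent_difference(d, truth_1, truth_2):
    """Percentage difference between the actual reco_2 and the TF_1 estimate,
    for the case M_1 = M_2 with diagonal value d."""
    M = symmetric_migration(d)
    TF_1 = np.matmul(M, truth_1) / truth_1
    reco_2 = np.matmul(M, truth_2)
    reco_2_transfer = truth_2 * TF_1
    return (reco_2 - reco_2_transfer) / reco_2 * 100

truth_1 = np.array([50, 20])
truth_2 = np.array([45, 25])

for d in (0.7, 0.8, 0.9, 1.0):
    diff = transfer_percent_difference(d, truth_1, truth_2)
    print("%3.0f%% diagonal: bin 1 = %6.2f %%, bin 2 = %6.2f %%"
          % (d * 100, diff[0], diff[1]))
```

With a fully diagonal matrix there is no migration between bins, so `TF_1` is 1 in every bin and the transfer function closes as well.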


`M_1 != M_2`
----------------------------------------------------
Now we will look at the case where the two migration matrices are not equal. We keep **system 1** the same and change **system 2**.

In [77]:
truth_1 = np.array([50, 20])
M_1 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))


truth_2 = np.array([45, 25])
M_2 = np.array([[0.9, 0.1],
                [0.1, 0.9]])
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))

Our truth_1 = [50 20] and our reco_1 = [ 44.  26.]
The migration matrix M_1 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_1 = [ 0.88  1.3 ]
Our truth_2 = [45 25] and our reco_2 = [ 43.  27.]
The migration matrix M_2 = 
[[ 0.9  0.1]
 [ 0.1  0.9]]
The transfer function TF_2 = [ 0.95555556  1.08      ]


##### If we now apply `M_1` and `TF_1` to `truth_2`, which method is closest to the actual `reco_2`?

In [78]:
reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))

Difference between folded and actual reco_2 = [ 4.65116279 -7.40740741] %
Difference between transfer function and actual reco_2 = [  7.90697674 -20.37037037] %


Both the folded result and the transfer function result now differ from the actual `reco_2`. In this particular case, where `M_1` is less diagonal than `M_2`, the folding method is more accurate. In the opposite situation, where `M_1` is more diagonal than `M_2`:

In [79]:
truth_1 = np.array([50, 20])
M_1 = np.array([[0.9, 0.1],
                [0.1, 0.9]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))


truth_2 = np.array([45, 25])
M_2 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))

reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))

Our truth_1 = [50 20] and our reco_1 = [ 47.  23.]
The migration matrix M_1 = 
[[ 0.9  0.1]
 [ 0.1  0.9]]
The transfer function TF_1 = [ 0.94  1.15]
Our truth_2 = [45 25] and our reco_2 = [ 41.  29.]
The migration matrix M_2 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_2 = [ 0.91111111  1.16      ]
Difference between folded and actual reco_2 = [-4.87804878  6.89655172] %
Difference between transfer function and actual reco_2 = [-3.17073171  0.86206897] %


We find that the opposite is true: the transfer function is more accurate. Here, however, the two migration matrices are significantly different, whereas in our situation they are likely to be much closer. Trying with more similar migration matrices:

In [81]:
truth_1 = np.array([50, 20])
M_1 = np.array([[0.82, 0.18],
                [0.18, 0.82]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))


truth_2 = np.array([45, 25])
M_2 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))

reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))

Our truth_1 = [50 20] and our reco_1 = [ 44.6  25.4]
The migration matrix M_1 = 
[[ 0.82  0.18]
 [ 0.18  0.82]]
The transfer function TF_1 = [ 0.892  1.27 ]
Our truth_2 = [45 25] and our reco_2 = [ 41.  29.]
The migration matrix M_2 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_2 = [ 0.91111111  1.16      ]
Difference between folded and actual reco_2 = [-0.97560976  1.37931034] %
Difference between transfer function and actual reco_2 = [ 2.09756098 -9.48275862] %


In this case, where the migration matrices are much closer, the folding method is more accurate. The same case with `M_1` and `M_2` swapped:

In [82]:
truth_1 = np.array([50, 20])
M_1 = np.array([[0.8, 0.2],
                [0.2, 0.8]])
reco_1 = np.matmul(M_1, truth_1)
TF_1 = reco_1/truth_1
print("Our truth_1 = %s and our reco_1 = %s" % (str(truth_1), str(reco_1)))
print("The migration matrix M_1 = ")
print(np.matrix(M_1))
print("The transfer function TF_1 = %s" % str(TF_1))


truth_2 = np.array([45, 25])
M_2 = np.array([[0.82, 0.18],
                [0.18, 0.82]])
reco_2 = np.matmul(M_2, truth_2)
TF_2 = reco_2/truth_2
print("Our truth_2 = %s and our reco_2 = %s" % (str(truth_2), str(reco_2)))
print("The migration matrix M_2 = ")
print(np.matrix(M_2))
print("The transfer function TF_2 = %s" % str(TF_2))

reco_2_folded = np.matmul(M_1, truth_2)
reco_2_transfer = truth_2 * TF_1
print("Difference between folded and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_folded)))
print("Difference between transfer function and actual reco_2 = %s %%" % str(percentage_differnce(reco_2, reco_2_transfer)))

Our truth_1 = [50 20] and our reco_1 = [ 44.  26.]
The migration matrix M_1 = 
[[ 0.8  0.2]
 [ 0.2  0.8]]
The transfer function TF_1 = [ 0.88  1.3 ]
Our truth_2 = [45 25] and our reco_2 = [ 41.4  28.6]
The migration matrix M_2 = 
[[ 0.82  0.18]
 [ 0.18  0.82]]
The transfer function TF_2 = [ 0.92   1.144]
Difference between folded and actual reco_2 = [ 0.96618357 -1.3986014 ] %
Difference between transfer function and actual reco_2 = [  4.34782609 -13.63636364] %


Again the folding method is more accurate. It seems that the closer the two migration matrices are, the better the folding method performs compared to the transfer function. This makes sense, as we have already shown that when the migration matrices are the same, the folding closes whereas the transfer function has an error.
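
The four `M_1 != M_2` comparisons above repeat the same steps, so they can be wrapped in one function and run side by side. This is a minimal sketch: `compare_methods` and `symmetric_migration` are hypothetical helper names, not part of the original notebook, and the inputs are the same `truth_1`, `truth_2` and diagonal values used above.

```python
import numpy as np

def symmetric_migration(d):
    """Build a 2x2 migration matrix with diagonal d and off-diagonal 1 - d."""
    return np.array([[d, 1.0 - d],
                     [1.0 - d, d]])

def compare_methods(M_1, M_2, truth_1, truth_2):
    """Return the percentage differences (folded, transfer) of the two
    estimates built from system 1, relative to the actual reco_2."""
    TF_1 = np.matmul(M_1, truth_1) / truth_1
    reco_2 = np.matmul(M_2, truth_2)          # actual reco_2
    reco_2_folded = np.matmul(M_1, truth_2)   # estimate from folding with M_1
    reco_2_transfer = truth_2 * TF_1          # estimate from TF_1
    folded_diff = (reco_2 - reco_2_folded) / reco_2 * 100
    transfer_diff = (reco_2 - reco_2_transfer) / reco_2 * 100
    return folded_diff, transfer_diff

truth_1 = np.array([50, 20])
truth_2 = np.array([45, 25])

# (diagonal of M_1, diagonal of M_2) for the four cases studied above.
cases = [(0.8, 0.9), (0.9, 0.8), (0.82, 0.8), (0.8, 0.82)]
for d_1, d_2 in cases:
    folded_diff, transfer_diff = compare_methods(
        symmetric_migration(d_1), symmetric_migration(d_2), truth_1, truth_2)
    print("M_1 diag = %.2f, M_2 diag = %.2f: folded = %s %%, transfer = %s %%"
          % (d_1, d_2, folded_diff, transfer_diff))
```

Extending `cases` with further pairs of diagonal values is a quick way to probe where the transfer function starts to outperform folding for these particular truth spectra.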