**Dimensionality Reduction with SVD**
1. Decompose the $m \times n$ matrix
2. Choose the number of singular values to keep
3. Reconstruct with different numbers of singular values


<p>
$Data_{m\times n} = U_{m \times m}\Sigma_{m \times n}V^T_{n \times n}$    
<p>  is approximated, using 3 latent factors, by <p>
$Data_{m\times n} \approx U_{m \times 3}\Sigma_{3 \times 3}V^T_{3 \times n}$ 
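
To make the shapes in the two formulas concrete, here is a minimal sketch. It assumes a random $7 \times 5$ placeholder matrix `A` (not the rating data used below) and an arbitrary seed. It builds the full $m \times n$ matrix $\Sigma$ from the vector of singular values that `np.linalg.svd` returns, and then forms the rank-3 approximation.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, only for reproducibility
m, n, k = 7, 5, 3
A = rng.random((m, n))              # placeholder matrix, not the rating data

# Full SVD: U is (m, m), s holds min(m, n) singular values, VT is (n, n)
U, s, VT = np.linalg.svd(A)

# Rebuild the (m, n) Sigma matrix from the vector of singular values
Sigma_full = np.zeros((m, n))
Sigma_full[:len(s), :len(s)] = np.diag(s)
A_full = U @ Sigma_full @ VT        # exact up to rounding

# Truncated version with k latent factors
A_k = U[:, :k] @ np.diag(s[:k]) @ VT[:k, :]

print(U.shape, Sigma_full.shape, VT.shape)
print(np.allclose(A_full, A))
print(A_k.shape, np.linalg.matrix_rank(A_k))
```

The truncated product has the same shape as `A` but only rank `k`, which is why the second formula is an approximation rather than an equality.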



In [24]:
import numpy as np
from numpy import asmatrix as mat  # the np.mat alias was removed in NumPy 2.0

In [25]:
# User-item matrix
# Rows = users, columns = items
data = np.array([[1, 1, 1, 0, 0],
                 [2, 2, 2, 0, 0],
                 [1, 1, 1, 0, 0],
                 [5, 5, 5, 0, 0],
                 [1, 1, 0, 2, 2],
                 [0, 0, 0, 3, 3],
                 [0, 0, 0, 1, 1]])

# First user
print("Ratings of first User: {}".format(data[0,:]))
# First item
print("Ratings for first Item: {}".format(data[:,0]))

Ratings of first User: [1 1 1 0 0]
Ratings for first Item: [1 2 1 5 1 0 0]


**Decomposition**

In [26]:
from numpy import linalg
np.set_printoptions(precision=1, suppress=True)

U,Sigma,VT=linalg.svd(data)


**Resulting matrices**

In [27]:
print(U)

[[-0.2 -0.   0.   0.9 -0.3 -0.1  0.1]
 [-0.4 -0.   0.  -0.1  0.  -0.9 -0. ]
 [-0.2 -0.   0.  -0.4 -0.8  0.1  0.5]
 [-0.9 -0.1  0.1 -0.   0.2  0.4 -0.1]
 [-0.1  0.5 -0.8 -0.  -0.  -0.   0. ]
 [-0.   0.8  0.5 -0.  -0.2  0.  -0.3]
 [-0.   0.3  0.2  0.   0.5 -0.   0.8]]


In [28]:
# singular values
print(Sigma)
# => keep only the first three

[9.7 5.3 0.7 0.  0. ]


In [29]:
print(VT)

[[-0.6 -0.6 -0.6 -0.  -0. ]
 [ 0.   0.  -0.1  0.7  0.7]
 [-0.4 -0.4  0.8  0.1  0.1]
 [-0.7  0.7 -0.   0.   0. ]
 [ 0.  -0.   0.  -0.7  0.7]]


**Number of singular values** <P>
Goal: retain 90% of the energy

In [30]:
# Square the singular values: the energy is the sum of the squared singular values
Sigma_sq = Sigma**2
# Total energy
total_energy = np.sum(Sigma_sq)
print("Total energy: {:5.2f}".format(total_energy))
# 90% of the energy
energy_90 = total_energy * 0.9
print("90% energy: {:5.2f}".format(energy_90))
# Energy in the first singular value
energy_first_one = np.sum(Sigma_sq[:1])
print("Energy in first SV: {:5.2f}".format(energy_first_one))
# Energy in the first two
energy_first_two = np.sum(Sigma_sq[:2])
print("Energy in first two SV: {:5.2f}".format(energy_first_two))
# First three
energy_first_three = np.sum(Sigma_sq[:3])
print("Energy in first three SV: {:5.2f}".format(energy_first_three))

Total energy: 123.00
90% energy: 110.70
Energy in first SV: 94.51
Energy in first two SV: 122.53
Energy in first three SV: 123.00
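
The manual comparison above can also be automated. The following is a minimal sketch, not part of the original notebook: the helper name `rank_for_energy` is hypothetical, and it assumes that "energy" means the sum of the squared singular values. It returns the smallest number of singular values whose cumulative energy share reaches a given threshold.

```python
import numpy as np

def rank_for_energy(singular_values, threshold=0.9):
    """Hypothetical helper: smallest k whose first k squared singular
    values reach `threshold` of the total energy."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # Index of the first cumulative share >= threshold, converted to a count
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against rounding when threshold is 1.0
    return min(k, len(energy))

data = np.array([[1, 1, 1, 0, 0],
                 [2, 2, 2, 0, 0],
                 [1, 1, 1, 0, 0],
                 [5, 5, 5, 0, 0],
                 [1, 1, 0, 2, 2],
                 [0, 0, 0, 3, 3],
                 [0, 0, 0, 1, 1]])

s = np.linalg.svd(data, compute_uv=False)  # singular values only
print(rank_for_energy(s, 0.9))  # 2
```

For this matrix the first singular value alone stays below 90% of the energy, while the first two exceed it, so the criterion is already met with two dimensions.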


**Reconstruction** using **three** dimensions

In [31]:
# Build the diagonal matrix
# (NumPy returns only the values on the diagonal)
Sig3 = mat([[Sigma[0], 0, 0],
            [0, Sigma[1], 0],
            [0, 0, Sigma[2]]])
U[:,:3]

array([[-0.2, -0. ,  0. ],
       [-0.4, -0. ,  0. ],
       [-0.2, -0. ,  0. ],
       [-0.9, -0.1,  0.1],
       [-0.1,  0.5, -0.8],
       [-0. ,  0.8,  0.5],
       [-0. ,  0.3,  0.2]])

In [32]:
# Reconstruct by matrix multiplication
approx = U[:,:3] * Sig3 * VT[:3,:]
#np.int_(approx)
approx

matrix([[ 1.,  1.,  1., -0., -0.],
        [ 2.,  2.,  2.,  0.,  0.],
        [ 1.,  1.,  1.,  0.,  0.],
        [ 5.,  5.,  5.,  0., -0.],
        [ 1.,  1., -0.,  2.,  2.],
        [ 0.,  0., -0.,  3.,  3.],
        [ 0.,  0., -0.,  1.,  1.]])

In [33]:
data

array([[1, 1, 1, 0, 0],
       [2, 2, 2, 0, 0],
       [1, 1, 1, 0, 0],
       [5, 5, 5, 0, 0],
       [1, 1, 0, 2, 2],
       [0, 0, 0, 3, 3],
       [0, 0, 0, 1, 1]])
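
Comparing the two printouts by eye works for a small matrix; a numerical check is more robust. The sketch below is an alternative formulation, not the notebook's own code: it uses plain arrays with `np.diag` and the `@` operator instead of `np.matrix`, and compares the rank-3 reconstruction with the original using `np.allclose`.

```python
import numpy as np

data = np.array([[1, 1, 1, 0, 0],
                 [2, 2, 2, 0, 0],
                 [1, 1, 1, 0, 0],
                 [5, 5, 5, 0, 0],
                 [1, 1, 0, 2, 2],
                 [0, 0, 0, 3, 3],
                 [0, 0, 0, 1, 1]])

U, s, VT = np.linalg.svd(data)

k = 3
# np.diag turns the first k singular values into a (k, k) diagonal array
approx = U[:, :k] @ np.diag(s[:k]) @ VT[:k, :]

print(type(approx).__name__)       # ndarray
print(np.allclose(approx, data))   # True
```

The reconstruction is exact up to rounding because the matrix has only three linearly independent columns, so the remaining two singular values are numerically zero.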

**Reconstruction** using **two** dimensions

In [42]:
# Only two dimensions
Sig2 = mat([[Sigma[0], 0],
            [0, Sigma[1]]])
approx = U[:,:2] * Sig2 * VT[:2,:]
#approx
print(U[:,:2])
print(Sig2)
print(VT[:2,:])

[[-0.2 -0. ]
 [-0.4 -0. ]
 [-0.2 -0. ]
 [-0.9 -0.1]
 [-0.1  0.5]
 [-0.   0.8]
 [-0.   0.3]]
[[9.7 0. ]
 [0.  5.3]]
[[-0.6 -0.6 -0.6 -0.  -0. ]
 [ 0.   0.  -0.1  0.7  0.7]]


In [35]:
# Transform the items into the lower-dimensional space
data.T * (U[:,:3] * Sig3.I)

matrix([[-0.6,  0. , -0.4],
        [-0.6,  0. , -0.4],
        [-0.6, -0.1,  0.8],
        [-0. ,  0.7,  0.1],
        [-0. ,  0.7,  0.1]])
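
The result of this projection matches the first three rows of `VT`, transposed. That follows from $Data^T U_k \Sigma_k^{-1} = V_k$ when all $k$ retained singular values are non-zero. A minimal sketch that checks this numerically, again using plain arrays rather than `np.matrix` (a choice made here, not in the notebook):

```python
import numpy as np

data = np.array([[1, 1, 1, 0, 0],
                 [2, 2, 2, 0, 0],
                 [1, 1, 1, 0, 0],
                 [5, 5, 5, 0, 0],
                 [1, 1, 0, 2, 2],
                 [0, 0, 0, 3, 3],
                 [0, 0, 0, 1, 1]])

U, s, VT = np.linalg.svd(data)

k = 3
# Inverse of a diagonal matrix: take the reciprocal of each diagonal entry
Sigma_k_inv = np.diag(1.0 / s[:k])

# Each row is one item expressed in the k latent dimensions
items_k = data.T @ U[:, :k] @ Sigma_k_inv

print(items_k.shape)                        # (5, 3)
print(np.allclose(items_k, VT[:k, :].T))    # True
```

Items 1 and 2, and likewise items 4 and 5, end up at the same coordinates because their rating columns are identical.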

**Reconstruction** using **one** dimension

In [39]:
# One dimension
Sig1 = mat([Sigma[0]])
approx = U[:,:1] * Sig1 * VT[:1,:]
approx

matrix([[1. , 1. , 1. , 0.1, 0.1],
        [2. , 2. , 2. , 0.1, 0.1],
        [1. , 1. , 1. , 0.1, 0.1],
        [5. , 5. , 4.9, 0.3, 0.3],
        [0.8, 0.8, 0.7, 0. , 0. ],
        [0.1, 0.1, 0.1, 0. , 0. ],
        [0. , 0. , 0. , 0. , 0. ]])

In [41]:
# Original
data

array([[1, 1, 1, 0, 0],
       [2, 2, 2, 0, 0],
       [1, 1, 1, 0, 0],
       [5, 5, 5, 0, 0],
       [1, 1, 0, 2, 2],
       [0, 0, 0, 3, 3],
       [0, 0, 0, 1, 1]])
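
To quantify how much each truncation loses, the reconstruction error can be computed for one, two, and three dimensions. The sketch below is an addition, not part of the original notebook. It relies on a standard property of the SVD: the Frobenius norm of the error of the rank-$k$ approximation equals the square root of the sum of the discarded squared singular values.

```python
import numpy as np

data = np.array([[1, 1, 1, 0, 0],
                 [2, 2, 2, 0, 0],
                 [1, 1, 1, 0, 0],
                 [5, 5, 5, 0, 0],
                 [1, 1, 0, 2, 2],
                 [0, 0, 0, 3, 3],
                 [0, 0, 0, 1, 1]])

U, s, VT = np.linalg.svd(data)

errors = {}
for k in (1, 2, 3):
    approx_k = U[:, :k] @ np.diag(s[:k]) @ VT[:k, :]
    # np.linalg.norm defaults to the Frobenius norm for 2-D input
    frob_error = np.linalg.norm(data - approx_k)
    # Square root of the energy in the singular values that were dropped
    discarded = np.sqrt(np.sum(s[k:] ** 2))
    errors[k] = (frob_error, discarded)
    print("k={}: error={:.2f}, from discarded SVs={:.2f}".format(
        k, frob_error, discarded))
```

The two numbers in each row should agree, and the error should shrink as `k` grows, reaching roughly zero at `k=3` for this matrix.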