In [20]:
import numpy as np
import pandas as pd

## Vectors

In [2]:
# a vector is represented as a 1-D NumPy array
s = np.array([33, 65, 50, 45])
s

array([33, 65, 50, 45])

In [3]:
# the third coordinate
s[2]

50

In [5]:
# the length of a vector
len(s)

4

In [10]:
print("ndim:", s.ndim)    # number of array dimensions (axes); a vector has exactly 1
print("shape:", s.shape)  # size along each axis; a vector has only a length, hence (4,)

ndim: 1
shape: (4,)


In [11]:
apartment = np.array([59.50, 31.40, 19, 22, 60550, 2])

In [12]:
# let's calculate the % of living area in the apartment
share_living_space = apartment[1]/apartment[0]

In [13]:
apartment = np.delete(apartment, [0, 1])
apartment = np.append(apartment, share_living_space)

In [14]:
share_living_space

0.5277310924369748

In [16]:
len(apartment)

5

In [17]:
t = np.array([12, 14, 17, 19, 24, 28, 31, 31, 27, 22, 17, 13])

In [21]:
# the mean temperature in June
t[5]

28

In [22]:
# the index (month, counting from 0) of the lowest mean temperature in Rome
t.argmin()

0

In [23]:
# the mean temperature in February 
t[1]

14

In [24]:
# the index (month, counting from 0) of the highest mean temperature in Rome
t.argmax()

6
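Note that `argmin()` and `argmax()` return positions, not temperatures. A minimal sketch of turning those positions into a month and a value; the `months` list is an assumption added for illustration and is not part of the original data:

```python
import numpy as np

# mean monthly temperatures in Rome, January to December (same values as above)
t = np.array([12, 14, 17, 19, 24, 28, 31, 31, 27, 22, 17, 13])
# hypothetical month labels, added only for this illustration
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

coldest = t.argmin()  # index of the smallest value
warmest = t.argmax()  # index of the FIRST largest value (July and August tie at 31)

print(months[coldest], t[coldest])
print(months[warmest], t[warmest])
```

When several entries share the extreme value, only the first index is reported, which is why August does not show up here.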

### The vector operations

In [39]:
salary_eur_mother_in_law = np.array([2, 3, 2.5])
salary_rub_husband = np.array([120, 150, 90])
salary_rub_wife = np.array([130, 130, 130])
exch_rate_eur_rub = 72

In [41]:
salary_rub_mother_in_law = salary_eur_mother_in_law * exch_rate_eur_rub
# salary_rub_mother_in_law and salary_eur_mother_in_law are collinear vectors (one is a scalar multiple of the other)
salary_rub_mother_in_law

array([144., 216., 180.])

In [42]:
salary_rub_husband + salary_rub_wife + salary_rub_mother_in_law

array([394., 496., 400.])

### Linear Combinations of Vectors

In general, a linear combination is a way of combining things (variables, vectors, etc.) using scalar multiplication and addition:

> (scalar)(vector 1) + (scalar)(vector 2) + (scalar)(vector 3)

For example, **the vector b** below can be written as a linear combination of the three vectors v1, v2 and v3:

> b = [3, 6, 9]  
> v1 = [1, 2, 3]  
> v2 = [3, 5, 1]  
> v3 = [8, 0, 0]  

> [3, 6, 9] = 3 * [1, 2, 3] + 0 * [3, 5, 1] + 0 * [8, 0, 0]

Or, using the names given to each vector:

> b = 3 * v1 + 0 * v2 + 0 * v3

Now that we have seen an example and the general idea, let's finish with the formal definition of a linear combination of vectors. Let $v_1, v_2, \dots, v_n$ be **vectors** in $\mathbb{R}^m$ and $c_1, c_2, \dots, c_n$ be **scalars**. Then the vector $b = c_1 v_1 + c_2 v_2 + \dots + c_n v_n$ is called **a linear combination of $v_1, v_2, \dots, v_n$**. The scalars $c_1, c_2, \dots, c_n$ are commonly called the **"weights"**.
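The definition can be checked numerically on the example vectors. The sketch below is only an illustration: stacking the vectors as columns of a matrix `V` (a name introduced here) turns the linear combination into a matrix-vector product, and, because this particular `V` happens to be square and invertible, the weights can also be recovered from `b`.

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([3, 5, 1])
v3 = np.array([8, 0, 0])
b = np.array([3, 6, 9])

# Stack the vectors as columns, so that V @ c == c1*v1 + c2*v2 + c3*v3
V = np.column_stack([v1, v2, v3])

# Forward direction: weights -> linear combination
c = np.array([3, 0, 0])
combo = V @ c

# Backward direction: recover the weights from b
# (works here only because V is square and invertible)
weights = np.linalg.solve(V, b)

print(combo)
print(weights)
```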

In [44]:
2*np.array([4,5]) - 3*np.array([2,1]) + 5*np.array([1,0])

array([7, 7])

In [48]:
# Young entrepreneur Vovochka builds water-fueled rockets and sells them. Over 4 weeks he built 3, 4, 5 and 9
# rockets and sold 1, 5, 3 and 6 rockets, respectively. He spends 200 rubles to build one rocket and sells them
# for 400 rubles each (the values used in the code below). Find Vovochka's profit vector for these 4 weeks.

-200*np.array([3, 4, 5, 9]) + 400*np.array([1, 5, 3, 6])

array([-200, 1200,  200,  600])

### A scalar product of two vectors

In [3]:
monthly_rent = np.array([65, 70, 120, 30])
makler_fee = np.array([0.4, 0.4, 0.2, 0.8])

In [4]:
np.dot(monthly_rent, makler_fee)

102.0

In [14]:
# OR
monthly_rent@makler_fee

102.0

If the scalar product of two nonzero vectors is equal to 0, the vectors are orthogonal: the angle between them is 90 degrees.

In [7]:
a = np.array([2, 3])
b = np.array([-9, 6])
np.dot(a, b)

0
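The link between the scalar product and the angle can be made explicit with the standard formula cos(angle) = (a · b) / (|a| |b|). The helper `angle_deg` below is a hypothetical name used for illustration; it is not a NumPy function.

```python
import numpy as np

def angle_deg(u, v):
    """Angle between two nonzero vectors in degrees (illustrative helper)."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # clip guards against values like 1.0000000000000002 caused by rounding
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

a = np.array([2, 3])
b = np.array([-9, 6])
print(angle_deg(a, b))  # orthogonal vectors: approximately 90 degrees

print(angle_deg(np.array([1, 0]), np.array([1, 1])))  # approximately 45 degrees
```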

In [8]:
a = np.array([4, 5, -1])
b = np.array([2, 0, 1])
np.dot(a, b)

7

### Vector length

The length of a vector is the square root of the sum of the squares of its components. In two dimensions, with horizontal component a and vertical component b, this is sqrt(a² + b²). If one of the two components is zero, the formula is not needed: the length is just the absolute value of the nonzero component.

In [15]:
a = np.array([4, 6, -1])
# manual computation; (-1)**2 == 1**2, and passing a list makes np.sqrt return a 1-element array
np.sqrt([4**2 + 6**2 + 1**2])

array([7.28010989])

In [16]:
# OR
np.linalg.norm(a)

7.280109889280518

Normalizing a vector means obtaining a vector with the same direction as the original but with norm 1.

The norm of a vector is computed as the square root of the sum of the squares of its components. Its physical meaning: the norm shows how "large" the vector is.

The length of a vector (in the physical sense) is the same as the norm, applied to vectors in real space.

Normalization is performed by dividing the vector by its norm.

In [18]:
a_norm = a / np.linalg.norm(a)
print(f"a_norm: {a_norm}")
print(f"length of a_norm: {np.linalg.norm(a_norm)}")

a_norm: [ 0.54944226  0.82416338 -0.13736056]
length of a_norm: 1.0
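Dividing by the norm fails for the zero vector, whose norm is 0. A minimal sketch of a guarded version follows; `normalize` is a hypothetical helper written for this note, not something NumPy provides.

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length (illustrative helper, not a NumPy function)."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("the zero vector has no direction and cannot be normalized")
    return v / norm

unit = normalize([4, 6, -1])
print(unit)
print(np.linalg.norm(unit))  # approximately 1
```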


In [11]:
a = np.array([4, 2, -1])
b = np.array([2, 0, 1])
np.dot(a, b)

7

In [13]:
a = np.array([4, 5, -1])
b = np.array([0, 0, 0])
np.dot(a, b)
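# the dot product with the zero vector is always 0; by convention the zero vector is orthogonal to every vector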

0

### Task 1

In [23]:
hut_paradise_df = pd.DataFrame({'1.Rent': [65, 70, 120, 35, 40, 50, 100, 90, 85], 
                                '2.Area': [50, 52, 80, 33, 33, 44, 80, 65, 65], 
                                '3.Rooms':[3, 2, 1, 1, 1, 2, 4, 3, 2],
                                '4.Floor':[5, 12, 10, 3, 6, 13, 8, 21, 5], 
                                '5.Demo two weeks':[8, 4, 5, 10, 20, 12, 5, 1, 10], 
                                '6.Liv.Area': [37, 40, 65, 20, 16, 35, 60, 50, 40]})

In [24]:
hut_paradise_df

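|   | 1.Rent | 2.Area | 3.Rooms | 4.Floor | 5.Demo two weeks | 6.Liv.Area |
|---|---|---|---|---|---|---|
| 0 | 65 | 50 | 3 | 5 | 8 | 37 |
| 1 | 70 | 52 | 2 | 12 | 4 | 40 |
| 2 | 120 | 80 | 1 | 10 | 5 | 65 |
| 3 | 35 | 33 | 1 | 3 | 10 | 20 |
| 4 | 40 | 33 | 1 | 6 | 20 | 16 |
| 5 | 50 | 44 | 2 | 13 | 12 | 35 |
| 6 | 100 | 80 | 4 | 8 | 5 | 60 |
| 7 | 90 | 65 | 3 | 21 | 1 | 50 |
| 8 | 85 | 65 | 2 | 5 | 10 | 40 |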


In [27]:
# vector of apartment 5
hut_paradise_df.iloc[4, :].values

array([40, 33,  1,  6, 20, 16], dtype=int64)

In [29]:
# vector of floors
hut_paradise_df['4.Floor'].values

array([ 5, 12, 10,  3,  6, 13,  8, 21,  5], dtype=int64)

In [31]:
# floor of the apartment 3
hut_paradise_df.loc[2, '4.Floor']

10

In [32]:
# the number of apartments
len(hut_paradise_df)

9

In [35]:
# vector of non-living area
(hut_paradise_df['2.Area'] - hut_paradise_df['6.Liv.Area']).values

array([13, 12, 15, 13, 17,  9, 20, 15, 25], dtype=int64)

In [38]:
# Rent is measured in thousands of rubles. Convert the rent of each apartment to thousands of hryvnias,
# given the exchange rate 10 rubles = 4 hryvnias:
(hut_paradise_df['1.Rent'] * 0.4).values

array([26., 28., 48., 14., 16., 20., 40., 36., 34.])

In [40]:
"""
Suppose one viewing takes 10 minutes in the first apartment, 20 minutes in the second, half an hour in the third,
15 minutes in the fourth, 5 minutes in the fifth, 40 minutes in the sixth, 20 minutes in the seventh, 8 minutes in the eighth,
and 20 minutes in the ninth. Find the total duration of viewings, in minutes, across all apartments over 2 weeks:
"""
demo_length_min = np.array([10, 20, 30, 15, 5, 40, 20, 8, 20])
np.dot(hut_paradise_df['5.Demo two weeks'], demo_length_min)

1348

In [42]:
# OR
hut_paradise_df['5.Demo two weeks']@demo_length_min

1348

### Task 2

In [43]:
u=np.array([3,0,1,1,1])
v=np.array([0,1,0,2,-2])
w=np.array([1,-4,-1,0,-2])

In [46]:
# a linear combination of v and w with the coefficients 2 and -3, respectively
v*2 + w*(-3)

array([-3, 14,  3,  4,  2])

In [48]:
# this linear combination is orthogonal to the vector u (their scalar product is 0)
(v*2 + w*(-3))@u

0

In [50]:
# normalize each vector to unit length, then inspect individual coordinates
u_norm = u / np.linalg.norm(u)
v_norm = v / np.linalg.norm(v)
w_norm = w / np.linalg.norm(w)

In [53]:
round(u_norm[2], 3), round(v_norm[3], 3), round(w_norm[0], 3)

(0.289, 0.667, 0.213)

# Matrix

Unfortunately, no one can be told what the Matrix is.
You have to see it for yourself.

Morpheus

![types of matrices](https://ars.els-cdn.com/content/image/3-s2.0-B9780080448947013415-gr1.gif)

In [54]:
np.array([[1, 1], [2, 3], [4, 5]])

array([[1, 1],
       [2, 3],
       [4, 5]])

In [58]:
data = np.array(range(1, 13))
data

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [61]:
data = data.reshape((4, 3))

In [62]:
data

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [64]:
df = pd.DataFrame(data)
A = df.values
A

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [65]:
A.shape

(4, 3)

In [69]:
# a 2-D row vector of shape (1, 3); reshaping it to (3, 1) below gives a column vector
x = np.array([[1, 2, 3]])
x

array([[1, 2, 3]])

In [70]:
x.reshape((3,1))

array([[1],
       [2],
       [3]])

In [71]:
x = np.array([1, 2, 3], ndmin=2)
x

array([[1, 2, 3]])

In [72]:
# zero (null) matrix
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [75]:
# identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [79]:
# scalar matrix (a scalar multiple of the identity matrix)
np.eye(3) * 5

array([[5., 0., 0.],
       [0., 5., 0.],
       [0., 0., 5.]])

In [76]:
# matrix of ones
np.ones((3, 6))

array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])

In [78]:
# diagonal matrix
diagonal = [1, 3, 5]
np.diag(diagonal)

array([[1, 0, 0],
       [0, 3, 0],
       [0, 0, 5]])

In [80]:
import numpy as np

husband_income = np.array([100,220,140])
wife_income = np.array([150,200,130])
mother_in_law_income = np.array([90,80,100])

husband_consumption = np.array([50,50,60])
wife_consumption = np.array([100,80,140])
mother_in_law_consumption = np.array([100,20,140])

In [85]:
inc = np.array([husband_income, wife_income, mother_in_law_income])
inc.diagonal()

array([100, 200, 100])

In [90]:
cons = np.array([husband_consumption, wife_consumption, mother_in_law_consumption])
cons

array([[ 50,  50,  60],
       [100,  80, 140],
       [100,  20, 140]])

In [103]:
inc_after_taxes = inc.T * 0.87
inc_after_taxes[0, :]

array([ 87. , 130.5,  78.3])

In [104]:
P = inc_after_taxes - cons.T
P[2, :]

array([ 61.8, -26.9, -53. ])

### Matrix multiplication

In [114]:
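# np.tensordot defaults to axes=2, which needs inputs with at least 2 dimensions, so 1-D vectors raise an error
# (.T is a no-op on a 1-D array; axes=0 would give the outer-product matrix instead)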
a = np.array([1,3])
b = np.array([-3,1])
np.tensordot(a.T, b)

IndexError: tuple index out of range
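The error above comes from the default `axes=2`, which tries to sum over two axes that a 1-D array does not have. If the goal is the tensor (outer) product of two vectors, which is always a matrix, a sketch like the following should work:

```python
import numpy as np

a = np.array([1, 3])
b = np.array([-3, 1])

# axes=0 means "sum over no axes", i.e. the tensor (outer) product
outer_td = np.tensordot(a, b, axes=0)
# the same matrix through the dedicated function
outer_np = np.outer(a, b)

print(outer_td)
print(np.array_equal(outer_td, outer_np))
```

Entry `[i, j]` of the result is `a[i] * b[j]`, so two vectors of length 2 give a 2x2 matrix.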

![image](https://i.gyazo.com/a6f33f59e82b11b3b7c1d789c81de169.png)

In [118]:
a = np.array([[2, 0], [0, 3]])
b = np.array([[1, 1], [4, 5]])
np.dot(a, b)

array([[ 2,  2],
       [12, 15]])

In [119]:
np.dot(b, a)

array([[ 2,  3],
       [ 8, 15]])

In [176]:
a = np.array([1,1])
b = np.array([2,-1])
c = np.array([1,2])
test = np.array([a, b, c])
np.dot(test, test.T)

array([[2, 1, 3],
       [1, 5, 0],
       [3, 0, 5]])

In [123]:
A = np.array([[5,-1,3,1,2], [-2,8,5,-1,1]])
x = np.array([1,2,3,4,5])

In [126]:
A @ x

array([26, 30])

In [127]:
x @ A

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 5)
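The failure is a shape mismatch: `A` has shape (2, 5) and `x` has shape (5,), so `A @ x` contracts the two axes of length 5, while `x @ A` would need the first axis of `A` to have length 5. A small sketch of the rule, using the transpose to make the left-hand product valid:

```python
import numpy as np

A = np.array([[5, -1, 3, 1, 2], [-2, 8, 5, -1, 1]])  # shape (2, 5)
x = np.array([1, 2, 3, 4, 5])                         # shape (5,)

left = A @ x     # (2, 5) @ (5,)   -> (2,)
right = x @ A.T  # (5,)   @ (5, 2) -> (2,)

print(A.shape, x.shape)
print(left, right)
```

For a 1-D `x`, both products give the same numbers; only the order of the shapes changes.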

In [129]:
A=np.array( [ [1,9,8,5] , [3,6,3,2] , [3,3,3,3], [0,2,5,9], [4,4,1,2] ] )
B=np.array( [ [1,-1,0,1,1] , [-2,0,2,-1,1] ] )

In [130]:
A@B

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 4)

In [133]:
(B@A)[0, 3]

14

In [134]:
x = np.array([1,2,1,0,4])
y = np.array([2,1,-1,1,0])
z = np.array([-1,1,-1,0,0])

In [150]:
# Gram matrix
com = np.array([x,y,z], ndmin=2)
com@com.T

array([[22,  3,  0],
       [ 3,  7,  0],
       [ 0,  0,  3]])
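Each entry of the Gram matrix is the scalar product of a pair of vectors, so the diagonal holds the squared lengths and a zero off the diagonal marks an orthogonal pair. The sketch below rebuilds the same matrix entry by entry to make that visible; the names `vectors` and `gram_loop` are introduced only for this illustration.

```python
import numpy as np

x = np.array([1, 2, 1, 0, 4])
y = np.array([2, 1, -1, 1, 0])
z = np.array([-1, 1, -1, 0, 0])

vectors = np.array([x, y, z])  # one vector per row
gram = vectors @ vectors.T     # matrix form

# the same matrix, built from pairwise scalar products
gram_loop = np.array([[np.dot(u, v) for v in vectors] for u in vectors])

print(np.array_equal(gram, gram_loop))
print(np.array_equal(gram, gram.T))  # a Gram matrix is symmetric
print(gram.diagonal())               # squared lengths of x, y and z
```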

In [151]:
count_df = pd.DataFrame({'Женские стрижки': [10, 2, 12, 4, 6, 10, 22, 7], 
                                'Мужские стрижки': [5, 21, 12, 8, 25, 3, 1, 0], 
                                'Окрашивания':[12, 3, 0, 18, 27, 2, 4, 31],
                              'Укладка':[15, 25, 30, 14, 25, 17, 25, 31],
                                'Уход':[10, 6, 4, 5, 18, 12, 20, 28]
                                }, 
                               index=['Аня', 'Борис', 'Вика', 'Галя', 'Дима', 'Егор', 'Женя','Юра'])
price_df = pd.DataFrame({'Женские стрижки': [2, 1.8, 2, 1.8, 2.5, 5, 1.1, 4.5], 
                                'Мужские стрижки': [1.5, 2.5, 2, 1.2, 3.5, 5, 1, 4], 
                                'Окрашивания':[1, 1, 0, 2.8, 2, 3, 1.5, 2.5],
                              'Укладка':[0.8, 1, 0.5, 0.8, 1, 2, 0.5, 1],
                                'Уход':[1, 1, 2, 2, 1.5, 2.5, 1.7, 2] 
                                }, 
                               index=['Аня', 'Борис', 'Вика', 'Галя', 'Дима', 'Егор', 'Женя','Юра'])

In [152]:
count_df

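|   | Женские стрижки | Мужские стрижки | Окрашивания | Укладка | Уход |
|---|---|---|---|---|---|
| Аня | 10 | 5 | 12 | 15 | 10 |
| Борис | 2 | 21 | 3 | 25 | 6 |
| Вика | 12 | 12 | 0 | 30 | 4 |
| Галя | 4 | 8 | 18 | 14 | 5 |
| Дима | 6 | 25 | 27 | 25 | 18 |
| Егор | 10 | 3 | 2 | 17 | 12 |
| Женя | 22 | 1 | 4 | 25 | 20 |
| Юра | 7 | 0 | 31 | 31 | 28 |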


In [153]:
price_df

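|   | Женские стрижки | Мужские стрижки | Окрашивания | Укладка | Уход |
|---|---|---|---|---|---|
| Аня | 2.0 | 1.5 | 1.0 | 0.8 | 1.0 |
| Борис | 1.8 | 2.5 | 1.0 | 1.0 | 1.0 |
| Вика | 2.0 | 2.0 | 0.0 | 0.5 | 2.0 |
| Галя | 1.8 | 1.2 | 2.8 | 0.8 | 2.0 |
| Дима | 2.5 | 3.5 | 2.0 | 1.0 | 1.5 |
| Егор | 5.0 | 5.0 | 3.0 | 2.0 | 2.5 |
| Женя | 1.1 | 1.0 | 1.5 | 0.5 | 1.7 |
| Юра | 4.5 | 4.0 | 2.5 | 1.0 | 2.0 |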


In [158]:
count_df.loc['Борис',:].values * price_df.loc['Борис',:].values

array([ 3.6, 52.5,  3. , 25. ,  6. ])

In [172]:
# Find the salon's profit vector by stylist, given that stylists pay the salon a fixed commission on each service.
com = np.array([0.2, 0.2, 0.3, 0.1, 0.1])
((count_df.values * price_df.values) * com).sum(axis=1)

array([11.3 , 15.22, 11.9 , 20.6 , 41.9 , 21.2 , 11.49, 38.25])

In [173]:
(count_df.values * price_df.values) @ com

array([11.3 , 15.22, 11.9 , 20.6 , 41.9 , 21.2 , 11.49, 38.25])

In [175]:
# Find the stylists' profit vector.
(count_df * price_df).values @ (1-com)

array([ 50.2 ,  74.88,  59.1 ,  67.8 , 166.6 , 113.8 ,  66.21, 157.75])

In [179]:
a = np.array([2, 0, 0])
b = np.array([0, 1, 0])
c = np.array([0, 0, 4])
test = np.array([a, b, c])
test

array([[2, 0, 0],
       [0, 1, 0],
       [0, 0, 4]])

In [180]:
np.linalg.det(test)

7.999999999999998

In [183]:
test_inv = np.linalg.inv(test)
np.linalg.det(test_inv)

0.12500000000000003
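The two determinants above are reciprocals of each other (8 and 1/8), which follows from det(A) · det(A⁻¹) = det(I) = 1. A quick numerical check, using `np.isclose` because the floating-point results are only approximately 8 and 0.125:

```python
import numpy as np

test = np.diag([2, 1, 4])  # the same diagonal matrix as above
det = np.linalg.det(test)
det_inv = np.linalg.det(np.linalg.inv(test))

print(np.isclose(det * det_inv, 1.0))
# for a diagonal matrix the determinant is the product of the diagonal entries
print(np.isclose(det, 2 * 1 * 4))
```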

In [185]:
# matrix rank: the number of linearly independent columns (equivalently, rows)
np.linalg.matrix_rank(test)

3

In [187]:
a = np.array([4,0])
b = np.array([1,1])
c = np.array([6,4])
test = np.array([a, b, c])
np.linalg.matrix_rank(test)

2
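The rank is 2 rather than 3 because three vectors in a 2-dimensional space cannot all be independent: one of them must be a linear combination of the others. As an illustration (the variable `weights` is introduced here), the sketch solves for the weights that express `c` through `a` and `b`:

```python
import numpy as np

a = np.array([4, 0])
b = np.array([1, 1])
c = np.array([6, 4])

# Solve weights[0]*a + weights[1]*b = c, with a and b as columns of a 2x2 system
weights = np.linalg.solve(np.column_stack([a, b]), c)

print(weights)
print(np.allclose(weights[0] * a + weights[1] * b, c))
```

Here `c = 0.5 * a + 4 * b`, so the third row adds no new direction.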

In [198]:
list(range(1, 9)) + [10]

[1, 2, 3, 4, 5, 6, 7, 8, 10]

In [199]:
a1 = np.array(list(range(1, 9)) + [10]).reshape((3,3))
a1

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8, 10]])

In [202]:
print(f"The rank of the matrix is {np.linalg.matrix_rank(a1)}, which equals the number of rows and columns.\nTherefore, the matrix is invertible.")

The rank of the matrix is 3, which equals the number of rows and columns.
Therefore, the matrix is invertible.


![1](https://i.pinimg.com/originals/c0/ad/cd/c0adcd88a08b12b2861f9b44712d67ef.gif)

In [203]:
# let's check it
np.linalg.inv(a1)

array([[-0.66666667, -1.33333333,  1.        ],
       [-0.66666667,  3.66666667, -2.        ],
       [ 1.        , -2.        ,  1.        ]])

In [224]:
# check that A @ A^(-1) gives the identity matrix (rounded to hide floating-point noise)
np.dot(a1, np.linalg.inv(a1)).round(0)

array([[ 1., -0., -0.],
       [ 0.,  1., -0.],
       [ 0.,  0.,  1.]])
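Rounding to whole numbers works for a quick look, but it can also hide real errors smaller than 0.5. A tolerance-based comparison is a common alternative; the sketch below assumes the default tolerances of `np.allclose` are acceptable for this matrix.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
a1_inv = np.linalg.inv(a1)

# compare against the identity matrix within floating-point tolerance
print(np.allclose(a1 @ a1_inv, np.eye(3)))
print(np.allclose(a1_inv @ a1, np.eye(3)))  # the inverse works from both sides
```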

In [208]:
# let's look at another example
a2 = np.array(list(range(1, 7)) + [1, 2, 3]).reshape((3,3))
a2

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3]])

In [211]:
print(f"The rank of the matrix is {np.linalg.matrix_rank(a2)}, which is NOT equal to the number of rows and columns.\nTherefore, maths tells us that the matrix is not invertible.")

The rank of the matrix is 2, which is NOT equal to the number of rows and columns.
Therefore, maths tells us that the matrix is not invertible.


In [213]:
# let's check it
np.linalg.inv(a2) 
# NumPy reports that the matrix is singular (not invertible), so the inverse cannot be computed

LinAlgError: Singular matrix

In [217]:
# let's look at the third example
a3 = np.array(range(1, 10)).reshape((3,3))
a3

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [218]:
print(f"The rank of the matrix is {np.linalg.matrix_rank(a3)}, which is NOT equal to the number of rows and columns.\nTherefore, maths tells us that the matrix is not invertible.")

The rank of the matrix is 2, which is NOT equal to the number of rows and columns.
Therefore, maths tells us that the matrix is not invertible.


In [220]:
# let's check it
np.linalg.inv(a3) 
# no error is raised this time, but the huge entries (~1e15) show that the result is numerically meaningless

array([[ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15],
       [-6.30503948e+15,  1.26100790e+16, -6.30503948e+15],
       [ 3.15251974e+15, -6.30503948e+15,  3.15251974e+15]])

In [225]:
# let's check whether A @ A^(-1) gives the identity matrix
np.dot(a3, np.linalg.inv(a3)).round(0)
# the check fails: the product is not the identity, so the computed "inverse" is not valid

array([[ 0.,  1., -0.],
       [ 0.,  2., -1.],
       [ 0.,  3.,  2.]])

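The conclusion is that `a3` really is singular (its rank is 2 and its exact determinant is 0), but `np.linalg.inv` works in floating-point arithmetic, where accumulated rounding error makes the matrix look barely invertible. No error is raised, and the result is garbage. We can confirm this by looking at the determinants: `a1` has a clearly nonzero determinant, `a2` gives exactly 0, and `a3` gives a value on the order of 1e-16, which is zero up to rounding error.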

In [226]:
np.linalg.det(a1)

-3.000000000000001

In [227]:
np.linalg.det(a2)

0.0

In [228]:
np.linalg.det(a3)

-9.51619735392994e-16
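A determinant close to zero is hard to judge on its own, because its size also depends on the scale of the matrix. Two checks that may be more robust are the condition number (`np.linalg.cond`) and a rank test before inverting. The helper `safe_inv` below is a hypothetical wrapper written for this note, not part of NumPy.

```python
import numpy as np

def safe_inv(m):
    """Invert m only if it is square and has full rank (illustrative helper)."""
    m = np.asarray(m, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise np.linalg.LinAlgError("matrix is not square")
    if np.linalg.matrix_rank(m) < m.shape[0]:
        raise np.linalg.LinAlgError("matrix is singular to working precision")
    return np.linalg.inv(m)

a1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
a3 = np.arange(1, 10).reshape((3, 3))

print(np.linalg.cond(a1))  # moderate: the inverse can be trusted
print(np.linalg.cond(a3))  # enormous: the matrix is numerically singular

try:
    safe_inv(a3)
except np.linalg.LinAlgError as err:
    print("refused to invert:", err)
```

The rank test relies on the default tolerance of `np.linalg.matrix_rank`, which is the same call that reported rank 2 for `a3` above.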

In [229]:
a=np.array( [[ 8 , 6 ,11],[ 7 , 5 , 9],[ 6 ,10,  6]])

In [231]:
np.linalg.inv(a)[1, 0]

0.375

In [232]:
v1 = np.array([9, 10, 7, 7, 9])
v2 = np.array([2, 0, 5, 1, 4])
v3 = np.array([4, 0, 0, 4, 1])
v4 = np.array([3, -4, 3, -1, -4])

In [235]:
vs = np.array([v1, v2, v3, v4])
np.linalg.matrix_rank(vs)

4

In [240]:
# Gram matrix
gram = (vs@vs.T)
gram[0, 3]

-35

In [241]:
np.linalg.det(gram)

3716647.9999999995

In [248]:
gram_inv = np.linalg.inv(gram)
round(gram_inv[2, 0], 3)

-0.026

In [246]:
(gram@gram_inv).round(0)

array([[ 1.,  0.,  0., -0.],
       [-0.,  1.,  0., -0.],
       [-0., -0.,  1.,  0.],
       [ 0.,  0., -0.,  1.]])