
### NumPy
>Library website: [https://numpy.org/](https://numpy.org/)  
>Documentation: [https://numpy.org/doc/](https://numpy.org/doc/)  
>
>The fundamental library for numerical computing in Python.
>
>To install NumPy, run the command below:
```
pip install numpy
```

### Table of contents:
1. [Basics](#a1)
2. [Data types](#a2)
3. [Creating arrays](#a3)
4. [Basic array operations](#a4)
5. [Generating pseudorandom numbers](#a5)
6. [Basic functions](#a6)
7. [Indexing, slicing](#a7)
8. [Iterating over arrays](#a8)
9. [Reshaping arrays](#a9)
10. [Linear algebra](#a10)


### <a name='a1'></a> Basics

In [0]:
import numpy as np
np.__version__

'1.17.4'

In [0]:
print(dir(np))



In [0]:
help(np.array)

Help on built-in function array in module numpy:

array(...)
    array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)
    
    Create an array.
    
    Parameters
    ----------
    object : array_like
        An array, any object exposing the array interface, an object whose
        __array__ method returns an array, or any (nested) sequence.
    dtype : data-type, optional
        The desired data-type for the array.  If not given, then the type will
        be determined as the minimum type required to hold the objects in the
        sequence.  This argument can only be used to 'upcast' the array.  For
        downcasting, use the .astype(t) method.
    copy : bool, optional
        If true (default), then the object is copied.  Otherwise, a copy will
        only be made if __array__ returns a copy, if obj is a nested sequence,
        or if a copy is needed to satisfy any of the other requirements
        (`dtype`, `order`, etc.).
    order : {'K', 'A', 'C', 'F'}

The core object of NumPy is the multidimensional array, whose defining feature is homogeneity: all of its elements share a single data type.  
In NumPy, dimensions are called axes.  
The class of these objects is `ndarray`.
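
Homogeneity can be observed directly: when the input mixes types, NumPy picks one common type for every element. The sketch below is only an illustration; the variable names `mixed_numbers` and `mixed_text` are made up for this example.

```python
import numpy as np

# Mixed ints and floats: every element is stored as float64.
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)    # float64

# Mixing a number with a string: every element becomes a Unicode string.
mixed_text = np.array([1, "a"])
print(mixed_text.dtype.kind)  # U
```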

In [0]:
x = np.array([1, 3])
x

array([1, 3])

In [0]:
print(x)

[1 3]


In [0]:
type(x)

numpy.ndarray

Basic attributes of an `ndarray` object

In [0]:
A = np.array([-1, 0, 2])
A

array([-1,  0,  2])

In [0]:
# number of dimensions (axes)
A.ndim

1

In [0]:
# shape of the array (length along each axis)
A.shape

(3,)

In [0]:
len(A.shape) == A.ndim

True

In [0]:
# total number of elements in the array
A.size

3

In [0]:
# data type of the array elements
A.dtype

dtype('int64')

1D Array

In [0]:
np.array([0, 0])

array([0, 0])

In [0]:
np.array([1, 2, 1, 3, 5])

array([1, 2, 1, 3, 5])

2D Array

In [0]:
np.array(
    [[1, 2],
     [3, 1]]
)

array([[1, 2],
       [3, 1]])

In [0]:
np.array(
    [[1],
     [4]]
)

array([[1],
       [4]])

In [0]:
np.array(
    [[3, 2, 1],
     [6, 2, 1]]
)

array([[3, 2, 1],
       [6, 2, 1]])

In [0]:
np.array(
    [[3, 2, 1],
     [5, 2, 1],
     [8, 5, 2]]
)

array([[3, 2, 1],
       [5, 2, 1],
       [8, 5, 2]])

3D Array

In [0]:
np.array(
    [[[1, 2],
      [5, 2]],
     
     [[3, 2],
      [7, 1]]]
)

array([[[1, 2],
        [5, 2]],

       [[3, 2],
        [7, 1]]])

In [0]:
np.array(
    [[[1, 2]],
     
     [[4, 1]],
     
     [[3, 1]]]
)

array([[[1, 2]],

       [[4, 1]],

       [[3, 1]]])

In [0]:
x = np.array(
    [[[4, 3, 2],
      [3, 1, 1]],
     
     [[3, 2, 1],
      [5, 2, 1]],
     
     [[1, 2, 1],
      [3, 2, 1]]]
)
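
The attributes introduced earlier apply to three-dimensional arrays as well. As a small sketch, the same nested list is rebuilt under the illustrative name `x3d` and inspected:

```python
import numpy as np

x3d = np.array(
    [[[4, 3, 2],
      [3, 1, 1]],

     [[3, 2, 1],
      [5, 2, 1]],

     [[1, 2, 1],
      [3, 2, 1]]]
)

print(x3d.ndim)   # 3
print(x3d.shape)  # (3, 2, 3)
print(x3d.size)   # 18, the product of the entries of shape
```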

### <a name='a2'></a> Data types

In [0]:
A = np.array([1, 2, 3])
A.dtype

dtype('int64')

In [0]:
A = np.array([1.2, 4.2, 1.0])
A.dtype

dtype('float64')

In [0]:
A = np.array([1, 2, 4], dtype='float')
A.dtype

dtype('float64')

In [0]:
A = np.array([1, 2, 4], dtype='float32')
A.dtype

dtype('float32')

In [0]:
A = np.array([1, 2, 4], dtype='complex')
A.dtype

dtype('complex128')

In [0]:
A = np.array([3.2, 4.8], dtype='int')
A.dtype

dtype('int64')

In [0]:
A = np.array([True, False, False])
A.dtype

dtype('bool')

In [0]:
# int8 range: -128 to 127
A = np.array([23, 24, 500], dtype=np.int8)
A.dtype

dtype('int8')

In [0]:
A

array([ 23,  24, -12], dtype=int8)

In [0]:
# uint8 range: 0 to 255
A = np.array([23, 24, 256], dtype=np.uint8)
A.dtype

dtype('uint8')

In [0]:
A

array([23, 24,  0], dtype=uint8)
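
The wrap-around shown above reflects the NumPy version used in this notebook (1.17.4). As far as I know, NumPy 2.x instead raises `OverflowError` when an out-of-range Python integer is passed together with a narrow `dtype`. A sketch that should behave the same across versions builds the array with the default integer type first and casts explicitly; `np.iinfo` reports the valid range. The names `values`, `as_int8` and `fits` are illustrative.

```python
import numpy as np

values = np.array([23, 24, 500])   # default integer dtype, wide enough for 500

as_int8 = values.astype(np.int8)   # explicit cast: 500 wraps around
print(as_int8)                     # [ 23  24 -12]

# np.iinfo reports the representable range of an integer dtype.
info = np.iinfo(np.int8)
print(info.min, info.max)          # -128 127

# Checking the range up front avoids silent wrap-around.
fits = (values >= info.min) & (values <= info.max)
print(fits)                        # [ True  True False]
```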

### <a name='a3'></a> Creating arrays

In [0]:
np.zeros(shape=(4, 10))

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [0]:
np.zeros(shape=(4, 10), dtype='int')

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [0]:
np.ones(shape=(5, 5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [0]:
np.ones(shape=(10, 1), dtype='int')

array([[1],
       [1],
       [1],
       [1],
       [1],
       [1],
       [1],
       [1],
       [1],
       [1]])

In [0]:
np.full(shape=(3, 3), fill_value=4, dtype='int')

array([[4, 4, 4],
       [4, 4, 4],
       [4, 4, 4]])

In [0]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [0]:
np.arange(start=5, stop=10)

array([5, 6, 7, 8, 9])

In [0]:
np.arange(start=10, stop=100, step=10)

array([10, 20, 30, 40, 50, 60, 70, 80, 90])

In [0]:
np.arange(start=0, stop=1, step=0.1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

In [0]:
np.arange(start=10, stop=0, step=-1)

array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1])

In [0]:
np.linspace(start=0, stop=1, num=11)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [0]:
np.linspace(start=0, stop=1, num=30)

array([0.        , 0.03448276, 0.06896552, 0.10344828, 0.13793103,
       0.17241379, 0.20689655, 0.24137931, 0.27586207, 0.31034483,
       0.34482759, 0.37931034, 0.4137931 , 0.44827586, 0.48275862,
       0.51724138, 0.55172414, 0.5862069 , 0.62068966, 0.65517241,
       0.68965517, 0.72413793, 0.75862069, 0.79310345, 0.82758621,
       0.86206897, 0.89655172, 0.93103448, 0.96551724, 1.        ])
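
`np.arange` takes a step and excludes `stop`, while `np.linspace` takes a number of points and includes `stop` by default. The sketch below shows how the two line up when `endpoint=False` is passed; for non-integer steps `np.linspace` is usually the safer choice, because the number of elements does not depend on floating-point rounding.

```python
import numpy as np

by_step = np.arange(start=0, stop=1, step=0.1)                   # stop excluded
by_count = np.linspace(start=0, stop=1, num=10, endpoint=False)  # stop excluded too

print(len(by_step), len(by_count))     # 10 10
print(np.allclose(by_step, by_count))  # True
```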

In [0]:
A = np.arange(15)
A

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [0]:
A.reshape((3, 5))

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [0]:
A.reshape((5, 3))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [0]:
A.reshape((5, 2))

ValueError: cannot reshape array of size 15 into shape (5,2)

In [0]:
A.reshape((5, -1))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [0]:
A = np.arange(10000).reshape(100, -1)
A

array([[   0,    1,    2, ...,   97,   98,   99],
       [ 100,  101,  102, ...,  197,  198,  199],
       [ 200,  201,  202, ...,  297,  298,  299],
       ...,
       [9700, 9701, 9702, ..., 9797, 9798, 9799],
       [9800, 9801, 9802, ..., 9897, 9898, 9899],
       [9900, 9901, 9902, ..., 9997, 9998, 9999]])

### Missing data

In [0]:
A = np.array([1, 3, np.nan])
A

array([ 1.,  3., nan])

In [0]:
bool(np.nan)

True
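
Because `np.nan` is a float, an array that contains it becomes `float64`, and `bool(np.nan)` is `True` like any other non-zero float. Missing values are therefore detected with `np.isnan`, not with a truth test or `==`. A minimal sketch (the array name `data` is illustrative):

```python
import numpy as np

data = np.array([1, 3, np.nan])

print(np.nan == np.nan)   # False: NaN is not equal to anything, itself included
print(np.isnan(data))     # [False False  True]
print(np.mean(data))      # nan: ordinary reductions propagate NaN
print(np.nanmean(data))   # 2.0: the NaN-aware variant skips missing entries
```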

### <a name='a4'></a> Basic array operations

In [0]:
A = np.array([2, 1, 4, -2])
B = np.array([3, 2, 1, 0])
A + B

array([ 5,  3,  5, -2])

In [0]:
A - B

array([-1, -1,  3, -2])

In [0]:
A + 3

array([5, 4, 7, 1])

In [0]:
A / 4.0

array([ 0.5 ,  0.25,  1.  , -0.5 ])

In [0]:
A + 3 * B

array([11,  7,  7, -2])

In [0]:
np.add(A, B)

array([ 5,  3,  5, -2])

In [0]:
np.subtract(A, B)

array([-1, -1,  3, -2])

In [0]:
np.multiply(A, B)

array([6, 2, 4, 0])

In [0]:
np.divide(A, B)

array([0.66666667, 0.5       , 4.        ,       -inf])
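
The `-inf` in the last position comes from dividing `-2` by `0`; NumPy emits a `RuntimeWarning` and continues. Two common ways to handle this are sketched below. The fill value `0.0` in the second option is an arbitrary choice for this example, not something the notebook prescribes.

```python
import numpy as np

A = np.array([2, 1, 4, -2])
B = np.array([3, 2, 1, 0])

# Option 1: silence the warning locally and keep the IEEE result (-inf).
with np.errstate(divide="ignore"):
    raw = np.divide(A, B)

# Option 2: divide only where B is non-zero; the remaining positions keep
# the value already stored in `out` (0.0 here, an arbitrary placeholder).
safe = np.divide(A, B, out=np.zeros(A.shape, dtype=float), where=B != 0)
print(safe)
```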

Multiplication

In [0]:
X = np.array(
    [[1, 3],
     [-2, 0]]
)
Y = np.array(
    [[6, 0],
     [-1, 2]]
)

In [0]:
# element-wise multiplication
X * Y

array([[6, 0],
       [2, 0]])

In [0]:
# matrix multiplication
X @ Y

array([[  3,   6],
       [-12,   0]])

In [0]:
X.dot(Y)

array([[  3,   6],
       [-12,   0]])

In [0]:
X = np.array(
    [[1, 3],
     [2, 1],
     [3, 1]]
)
W = np.array(
    [1, 2]
)

In [0]:
W

array([1, 2])

In [0]:
print(X.shape)
print(W.shape)

(3, 2)
(2,)


In [0]:
X.dot(W)

array([7, 4, 5])
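
When a `(3, 2)` matrix is multiplied by a one-dimensional vector of shape `(2,)`, the result is one-dimensional with shape `(3,)`. The sketch below checks that `@` and `.dot` agree here and shows how a column vector (the illustrative name `W_col`) keeps the result two-dimensional.

```python
import numpy as np

X = np.array([[1, 3], [2, 1], [3, 1]])   # shape (3, 2)
W = np.array([1, 2])                     # shape (2,)

print((X @ W).shape)                     # (3,)
print(np.array_equal(X @ W, X.dot(W)))   # True

# Reshaping W into a column vector keeps the result two-dimensional.
W_col = W.reshape(-1, 1)                 # shape (2, 1)
print((X @ W_col).shape)                 # (3, 1)
```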

### <a name='a5'></a> Generating pseudorandom numbers

In [0]:
np.random.seed(1)

In [0]:
# standard normal distribution
np.random.randn()

-1.0729686221561705

In [0]:
np.random.randn(10)

array([ 0.86540763, -2.3015387 ,  1.74481176, -0.7612069 ,  0.3190391 ,
       -0.24937038,  1.46210794, -2.06014071, -0.3224172 , -0.38405435])

In [0]:
np.random.randn(10, 4)

array([[-0.20889423,  0.58662319,  0.83898341,  0.93110208],
       [ 0.28558733,  0.88514116, -0.75439794,  1.25286816],
       [ 0.51292982, -0.29809284,  0.48851815, -0.07557171],
       [ 1.13162939,  1.51981682,  2.18557541, -1.39649634],
       [-1.44411381, -0.50446586,  0.16003707,  0.87616892],
       [ 0.31563495, -2.02220122, -0.30620401,  0.82797464],
       [ 0.23009474,  0.76201118, -0.22232814, -0.20075807],
       [ 0.18656139,  0.41005165,  0.19829972,  0.11900865],
       [-0.67066229,  0.37756379,  0.12182127,  1.12948391],
       [ 1.19891788,  0.18515642, -0.37528495, -0.63873041]])

In [0]:
# uniform distribution on the interval [0, 1)
np.random.rand()

0.9695957483196745

In [0]:
np.random.rand(10)

array([0.56103022, 0.01864729, 0.80063267, 0.23297427, 0.8071052 ,
       0.38786064, 0.86354185, 0.74712164, 0.55624023, 0.13645523])

In [0]:
np.random.rand(10, 3)

array([[0.05991769, 0.12134346, 0.04455188],
       [0.10749413, 0.22570934, 0.71298898],
       [0.55971698, 0.01255598, 0.07197428],
       [0.96727633, 0.56810046, 0.20329323],
       [0.25232574, 0.74382585, 0.19542948],
       [0.58135893, 0.97001999, 0.8468288 ],
       [0.23984776, 0.49376971, 0.61995572],
       [0.8289809 , 0.15679139, 0.0185762 ],
       [0.07002214, 0.48634511, 0.60632946],
       [0.56885144, 0.31736241, 0.98861615]])

In [0]:
# returns a pseudorandom integer from 0 to 9 (the upper bound, 10, is exclusive)
np.random.randint(10)

8

In [0]:
np.random.randint(low=10, high=101)

40

In [0]:
np.random.randint(low=10, high=101, size=10)

array([27, 78, 74, 70, 88, 27, 49, 45, 91, 38])

In [0]:
np.random.choice([32, 12, 54])

54

In [0]:
np.random.choice(['python', 'java', 'php'])

'php'

In [0]:
data = np.arange(10)
data

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [0]:
np.random.shuffle(data)
data

array([5, 3, 0, 4, 2, 1, 6, 7, 8, 9])
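
The functions above use NumPy's global random state. Newer NumPy releases (1.17 and later) also provide the `Generator` interface via `np.random.default_rng`, which keeps the seed local to one object. The sketch below mirrors the calls from this section; the variable names are illustrative, and the numbers it produces differ from the legacy functions even with the same seed.

```python
import numpy as np

rng = np.random.default_rng(seed=1)             # independent, seeded generator

normal_sample = rng.standard_normal(10)         # counterpart of np.random.randn(10)
uniform_sample = rng.random((10, 3))            # counterpart of np.random.rand(10, 3)
ints = rng.integers(low=10, high=101, size=10)  # upper bound is exclusive
pick = rng.choice(["python", "java", "php"])

data = np.arange(10)
rng.shuffle(data)                               # shuffles in place
```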

### <a name='a6'></a> Basic functions

In [0]:
np.exp(1)

2.718281828459045

In [0]:
np.exp(2)

7.38905609893065

In [0]:
np.sqrt(9)

3.0

In [0]:
# help(np.all)
np.all([2, 3, 2])

True

In [0]:
np.all([False, True, True])

False

In [0]:
np.any([False, False, False])

False

In [0]:
np.any([False, False, True])

True

In [0]:
A = np.random.rand(5)
A

array([0.3245175 , 0.72801164, 0.52273663, 0.73681147, 0.16540611])

In [0]:
np.argmax(A)

3

In [0]:
np.argmin(A)

4

In [0]:
# returns the indices that would sort A
np.argsort(A)

array([4, 0, 2, 1, 3])
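
Indexing the array with the result of `np.argsort` yields the sorted values. A short sketch, with the values from the output above hard-coded so that it does not depend on the random state:

```python
import numpy as np

A = np.array([0.3245175, 0.72801164, 0.52273663, 0.73681147, 0.16540611])

order = np.argsort(A)
print(order)                                 # [4 0 2 1 3]
print(np.array_equal(A[order], np.sort(A)))  # True

# Reversing the index array gives descending order.
print(A[order[::-1]][0] == A.max())          # True
```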

In [0]:
np.max(A)

0.7368114658282388

In [0]:
np.min(A)

0.16540611484625978

In [0]:
np.mean(A)

0.4954966700270667

In [0]:
np.median(A)

0.522736629636422

In [0]:
np.std(A)

0.2241569590007642
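
The reductions above collapse the whole array to a single number. On a two-dimensional array the same functions accept an `axis` argument; the sketch below uses a small illustrative matrix `M`.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

print(np.mean(M))          # 2.5, over all elements
print(np.mean(M, axis=0))  # [1.5 2.5 3.5], one value per column
print(np.max(M, axis=1))   # [2 5], one value per row
```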

### <a name='a7'></a> Indexing, slicing

In [0]:
A = np.arange(20).reshape(4, 5)
A

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [0]:
# A[start:stop, start:stop]
# A[idx1, idx2]
A[0]

array([0, 1, 2, 3, 4])

In [0]:
A[1]

array([5, 6, 7, 8, 9])

In [0]:
A[:, 0]

array([ 0,  5, 10, 15])

In [0]:
A[:, -1]

array([ 4,  9, 14, 19])

In [0]:
A[1, 1]

6

In [0]:
A[-1, -1]

19

In [0]:
A[2:4]

array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [0]:
A = np.arange(10).reshape(2, -1)
A

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [0]:
A[1, 2]

7

In [0]:
A[1, 2] = 14
A

array([[ 0,  1,  2,  3,  4],
       [ 5,  6, 14,  8,  9]])
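
Assignment through an index modifies the array in place. The same holds for slices, because basic slicing returns a view of the original data, not a copy. A minimal sketch (the names `row_view` and `row_copy` are illustrative):

```python
import numpy as np

A = np.arange(10).reshape(2, -1)

row_view = A[0]          # basic indexing returns a view, not a copy
row_view[0] = 99
print(A[0, 0])           # 99: the original array changed as well

row_copy = A[1].copy()   # an explicit copy is independent
row_copy[0] = -1
print(A[1, 0])           # 5: the original is untouched
```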

### <a name='a8'></a> Iterating over arrays

In [0]:
A = np.arange(20).reshape(4, 5)
A

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [0]:
for row in A:
    print(row)

[0 1 2 3 4]
[5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]


In [0]:
for row in A:
    print(row[0])

0
5
10
15


In [0]:
for item in A.flat:
    print(item)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
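
`A.flat` yields the values only. When the position of each element is needed as well, `np.ndenumerate` yields `(index, value)` pairs. The sketch uses a smaller array than the one above to keep the output short.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

for index, value in np.ndenumerate(A):
    print(index, value)   # first line printed: (0, 0) 0

pairs = list(np.ndenumerate(A))
```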


### <a name='a9'></a> Reshaping arrays

In [0]:
A

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [0]:
A.ravel()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [0]:
A.reshape(5, -1)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [0]:
A.T

array([[ 0,  5, 10, 15],
       [ 1,  6, 11, 16],
       [ 2,  7, 12, 17],
       [ 3,  8, 13, 18],
       [ 4,  9, 14, 19]])
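
None of the calls above modifies `A`; each returns a new array object. `ravel` returns a view when the memory layout allows it, whereas `flatten` always copies. A short sketch of the difference:

```python
import numpy as np

A = np.arange(20).reshape(4, 5)

flat_view = A.ravel()     # a view here, because A is contiguous in memory
flat_copy = A.flatten()   # always a copy

print(np.shares_memory(A, flat_view))  # True
print(np.shares_memory(A, flat_copy))  # False
print(A.shape)                         # (4, 5): A itself is unchanged
print(A.T.shape)                       # (5, 4)
```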

### Masks

In [0]:
A = np.arange(start=-10, stop=10, step=0.5)
A = A.reshape(10, -1)
A

array([[-10. ,  -9.5,  -9. ,  -8.5],
       [ -8. ,  -7.5,  -7. ,  -6.5],
       [ -6. ,  -5.5,  -5. ,  -4.5],
       [ -4. ,  -3.5,  -3. ,  -2.5],
       [ -2. ,  -1.5,  -1. ,  -0.5],
       [  0. ,   0.5,   1. ,   1.5],
       [  2. ,   2.5,   3. ,   3.5],
       [  4. ,   4.5,   5. ,   5.5],
       [  6. ,   6.5,   7. ,   7.5],
       [  8. ,   8.5,   9. ,   9.5]])

In [0]:
A > 0

array([[False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [0]:
A[A > 0]

array([0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5,
       7. , 7.5, 8. , 8.5, 9. , 9.5])

In [0]:
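# A[A > -5 and A < 5]   # raises ValueError: `and` cannot combine arrays element-wise
# A[A > -5 & A < 5]     # also fails: `&` binds tighter than `>`, so parentheses are required

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()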

In [0]:
np.bitwise_and(A > -5, A < 5)

array([[False, False, False, False],
       [False, False, False, False],
       [False, False, False,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True, False, False],
       [False, False, False, False],
       [False, False, False, False]])

In [0]:
A[np.bitwise_and(A > -5, A < 5)]

array([-4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,  0.5,
        1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5])

In [0]:
np.bitwise_or(A < -5, A > 5)

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [0]:
A[np.bitwise_or(A < -5, A > 5)]

array([-10. ,  -9.5,  -9. ,  -8.5,  -8. ,  -7.5,  -7. ,  -6.5,  -6. ,
        -5.5,   5.5,   6. ,   6.5,   7. ,   7.5,   8. ,   8.5,   9. ,
         9.5])
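
The `&` and `|` operators do the same thing as `np.bitwise_and` and `np.bitwise_or` on boolean masks, as long as each comparison is wrapped in parentheses. A sketch using the same array (the names `inside` and `outside` are illustrative):

```python
import numpy as np

A = np.arange(start=-10, stop=10, step=0.5).reshape(10, -1)

inside = (A > -5) & (A < 5)      # same as np.bitwise_and(A > -5, A < 5)
outside = (A < -5) | (A > 5)     # same as np.bitwise_or(A < -5, A > 5)

print(np.array_equal(inside, np.logical_and(A > -5, A < 5)))  # True
print(A[inside].size, A[outside].size)                        # 19 19

# ~ negates a mask; only the two boundary values match neither condition.
print(A[~inside & ~outside])                                  # [-5.  5.]
```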

### Broadcasting

In [0]:
A = np.array([2, 1, -4])
const = 2
A * const

array([ 4,  2, -8])

In [0]:
const = 2.0
A * const

array([ 4.,  2., -8.])

In [0]:
A = np.arange(12).reshape(4, 3)
A

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [0]:
B = np.arange(10, 41, 10).reshape(-1, 1)
B

array([[10],
       [20],
       [30],
       [40]])

In [0]:
A + B

array([[10, 11, 12],
       [23, 24, 25],
       [36, 37, 38],
       [49, 50, 51]])

In [0]:
A - B

array([[-10,  -9,  -8],
       [-17, -16, -15],
       [-24, -23, -22],
       [-31, -30, -29]])

In [0]:
A * B

array([[  0,  10,  20],
       [ 60,  80, 100],
       [180, 210, 240],
       [360, 400, 440]])

In [0]:
A / B

array([[0.        , 0.1       , 0.2       ],
       [0.15      , 0.2       , 0.25      ],
       [0.2       , 0.23333333, 0.26666667],
       [0.225     , 0.25      , 0.275     ]])
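
In the examples above, shapes `(4, 3)` and `(4, 1)` are combined: the axis of length 1 is stretched to length 3, so the result has shape `(4, 3)`. The sketch below makes this explicit with `np.tile` (used here only for illustration, since broadcasting itself does not copy data) and shows a pair of shapes that cannot be broadcast.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)             # shape (4, 3)
B = np.arange(10, 41, 10).reshape(-1, 1)    # shape (4, 1)

print(np.broadcast(A, B).shape)             # (4, 3)

# Broadcasting behaves as if B's single column were repeated three times.
B_repeated = np.tile(B, (1, 3))
print(np.array_equal(A + B, A + B_repeated))  # True

# Shapes (4, 3) and (4,) are incompatible: the trailing axes (3 vs 4) differ.
try:
    A + np.arange(4)
except ValueError as err:
    print("ValueError:", err)
```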