**Created by Berkay Alan**

**Numpy**

**December 5, 2020**

## Contents

- What is Numpy?
- Importing Numpy
- Numpy Arrays (Ndarrays) and Dimensions
- Creating Numpy Arrays
    - Zeros arrays
    - Ones arrays
    - Full arrays
    - Identity Matrix
    - Linear Series
    - Distribution arrays - Random
- Array Indexing
- Subsets
- reshape() function
- Flattening the Arrays
- Concatenation
- Splitting
- Sorting
- Broadcasting
- Array Math
- Dot(Scalar) Product
    
    
**Resources:**

- https://cs231n.github.io/python-numpy-tutorial/#numpy
- https://numpy.org/doc/stable/reference/

## What is Numpy?

NumPy is short for **Numerical Python**. It is a Python library for scientific computing that provides a multidimensional array object and tools for working with these arrays.

The source code of this library is available here: https://github.com/numpy/numpy

It is often used to work with large amounts of data.

![image.png](attachment:image.png)

## Importing Numpy

By convention, NumPy is imported under the short alias **np**, which is created with the **as** keyword.

In [1]:
import numpy as np

## Numpy Arrays (Ndarrays) and Dimensions

The array is the main data structure of the NumPy library. It is a grid of values, indexed by a tuple of nonnegative integers. **The elements are all of the same type**, referred to as the array dtype.

An array can be indexed by integers, by a tuple of nonnegative integers, by booleans, or by another array. The rank of an array is its number of dimensions, and the shape of an array is a tuple of integers giving the size of the array along each dimension.

**ndarray** is short for **N-dimensional array**, a term you will see often. An N-dimensional array is simply an array with any number of dimensions.

In [38]:
np.array(74) # 0 dimensional array

array(74)

In [39]:
np.array([1,2,3,4,5]) # 1 dimensional array

array([1, 2, 3, 4, 5])

In [43]:
np.array([[1,2,3,4,5],[6,7,8,9,19]]) # 2 dimensional array - do not forget brackets!

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 19]])

In [46]:
np.array([[[1,2,3,4],[6,7,8,9]],[[11,23,43,45],[63,45,284,49]]]) 
# 3 dimensional array - an array of 2-D arrays (matrices); all inner lists must have the same length

array([[[  1,   2,   3,   4],
        [  6,   7,   8,   9]],

       [[ 11,  23,  43,  45],
        [ 63,  45, 284,  49]]])

The NumPy ndarray class is used to represent both vectors and matrices. A vector is a one-dimensional array, while a matrix is a two-dimensional array with rows and columns.

The **ndim** attribute gives the number of dimensions of an array.

In [47]:
a = np.array([2,1,23,3])

In [48]:
a.ndim

1

The **shape** attribute gives the size of an array along each dimension.

In [50]:
a.shape

(4,)

The **size** attribute gives the total number of elements in an array.

In [51]:
a.size

4

The **dtype** attribute gives the data type of the elements of an array.

In [52]:
a.dtype

dtype('int64')
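
The four attributes above can also be checked together on a 2-D array. The following is only a small sketch; the array `m` and its values are made up for illustration, and the exact integer dtype depends on the platform.

```python
import numpy as np

# A small 2-D array with 2 rows and 3 columns (illustrative values)
m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.ndim)   # number of dimensions
print(m.shape)  # size along each dimension
print(m.size)   # total number of elements (rows * columns)
print(m.dtype)  # element type, commonly int64 but platform dependent
```

Note that size is always the product of the entries of shape.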

## Creating Numpy Arrays

Arrays can be created from Python lists (nested lists for more than one dimension), and their elements are accessed with square brackets, just like lists.

In [2]:
arr = np.array([1,2,3])

In [3]:
arr

array([1, 2, 3])

In [5]:
type(arr) # Type of array is ndarray

numpy.ndarray

In [6]:
arr.shape

(3,)

In [11]:
list1 = [1,6,2.7,3]

In [12]:
arr2 = np.array(list1)

In [15]:
arr2 # Mixing integers and floats converts every element to float

array([1. , 6. , 2.7, 3. ])

In [16]:
type(arr2)

numpy.ndarray

In [20]:
arr3 = np.array([1.9,23.5,32.5],dtype="int") # We can specify the data type with the dtype argument; converting to int drops the decimals

In [21]:
arr3

array([ 1, 23, 32])
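
Converting floats to integers drops the fractional part rather than rounding. As a hedged sketch of the difference (the names `values`, `truncated` and `rounded` are only illustrative), we can compare `astype(int)` with `np.rint()`:

```python
import numpy as np

values = np.array([1.9, 23.5, 32.5, -1.9])

# astype(int) drops the fractional part (truncation toward zero)
truncated = values.astype(int)

# np.rint rounds to the nearest integer; exact halves go to the even neighbour
rounded = np.rint(values).astype(int)

print(truncated)
print(rounded)
```

If rounding is what we want, it is safer to round explicitly before changing the dtype.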

### Zeros arrays

**np.zeros()** returns a new array of the given shape, filled with zeros.

In [22]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [23]:
np.zeros(5,dtype=int)

array([0, 0, 0, 0, 0])

### Ones arrays

**np.ones()** returns a new array of the given shape and type, filled with ones.

In [24]:
np.ones(3)

array([1., 1., 1.])

In [25]:
np.ones((2,4),dtype="int") # 2 dimensional arrays

array([[1, 1, 1, 1],
       [1, 1, 1, 1]])

### Full arrays

**np.full()** returns a new array of the given shape, filled with the given value.

In [28]:
np.full((2, 5), 6)

array([[6, 6, 6, 6, 6],
       [6, 6, 6, 6, 6]])

### Identity Matrix

**np.eye()** creates an identity matrix.

In [29]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

### Linear Series

In [31]:
np.arange(2,20,2) # Values from 2 up to (but not including) 20, in steps of 2

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [32]:
np.arange(5,100,7)

array([ 5, 12, 19, 26, 33, 40, 47, 54, 61, 68, 75, 82, 89, 96])

In [33]:
np.linspace(0,1,10) # It creates 10 evenly spaced values between 0 and 1, both endpoints included

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
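
The main difference between the two functions is how the end value is handled. As a small sketch (the names `stepped` and `spaced` are chosen only for this example), arange() takes a step and excludes the stop value, while linspace() takes a number of points and includes it by default:

```python
import numpy as np

# arange: start, stop (excluded), step
stepped = np.arange(0, 1, 0.25)

# linspace: start, stop (included by default), number of points
spaced = np.linspace(0, 1, 5)

print(stepped)  # 4 values, the last one is 0.75
print(spaced)   # 5 values, the last one is 1.0
```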

### Distribution arrays - Random

In [34]:
np.random.normal(10,4,(2,5)) # It creates a (2,5) array from a normal distribution with mean 10 and standard deviation 4

array([[ 3.86926818, 11.99867606, 11.9529883 ,  6.64976978, 12.67243724],
       [11.62745438, 10.33701513, 12.09609341,  8.24374913, 10.57274765]])

In [36]:
np.random.randint(2,10,9) # 9 random integers from 2 (included) to 10 (excluded)

array([8, 6, 8, 4, 3, 6, 3, 5, 7])
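
Random arrays change on every run. If we need reproducible numbers, one option is the newer Generator interface with a fixed seed. This is only a sketch: the seed value 42 and the variable names are arbitrary choices, not something used elsewhere in this notebook.

```python
import numpy as np

# A seeded generator produces the same numbers on every run
rng = np.random.default_rng(42)

normal_sample = rng.normal(10, 4, size=(2, 5))  # mean 10, standard deviation 4
int_sample = rng.integers(2, 10, size=9)        # integers from 2 (included) to 10 (excluded)

# A second generator with the same seed repeats the same sequence
repeat_sample = np.random.default_rng(42).normal(10, 4, size=(2, 5))

print(np.array_equal(normal_sample, repeat_sample))
```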

## Array Indexing

Like Python lists, NumPy arrays are indexed with square brackets.

In [356]:
a = np.arange(0,10)

In [357]:
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [358]:
a[0]

0

In [359]:
a[2:5] # It is also called slicing

array([2, 3, 4])

In [360]:
a[-1] # Last item of an array

9

In [361]:
a[1::2]

array([1, 3, 5, 7, 9])

In [362]:
a[::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In [364]:
arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

In [374]:
arr

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [375]:
arr.shape

(3, 4)

In [376]:
arr[0]

array([1, 2, 3, 4])

In [377]:
arr[1,2]

7

In [378]:
arr[0,2:]

array([3, 4])

In [379]:
arr[0][2:] # Same as above.

array([3, 4])

In [380]:
arr[2][:2]

array([ 9, 10])

In [385]:
arr[:3,:] # All columns of the first 3 rows

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [386]:
arr[:,:3] # All rows of the first 3 columns

array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])

Array elements can be changed by assigning a new value.

In [352]:
arr[2][3] = 100

In [353]:
arr

array([[  1,   2,   3,   4],
       [  5,   6,   7,   8],
       [  9,  10,  11, 100]])

Boolean indexing is also possible: a comparison returns a boolean array, which can then be used to select the matching elements.

In [354]:
arr2 = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

In [355]:
arr2

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [335]:
arr2>2

array([[False, False,  True],
       [ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

In [336]:
arr2[arr2>2]

array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
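
Boolean masks can also be combined and used for assignment. The sketch below is illustrative only (the names `grid`, `mask`, `selected` and `capped` are made up here); conditions are combined with `&` and `|`, and each condition needs its own parentheses.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# Combine two conditions: greater than 2 AND even
mask = (grid > 2) & (grid % 2 == 0)
selected = grid[mask]

# A mask can also be used to overwrite the matching elements
capped = grid.copy()
capped[capped > 9] = 9

print(selected)
print(capped)
```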

## Subsets

In [387]:
arr = np.random.randint(20, size=(4,5))

In [388]:
arr

array([[ 7, 14, 16, 14, 16],
       [11, 10, 10,  8,  8],
       [ 6,  5,  6, 18,  6],
       [ 8,  7, 19, 11,  5]])

In [389]:
sub_arr = arr[:2,:3]

In [390]:
sub_arr

array([[ 7, 14, 16],
       [11, 10, 10]])

In [391]:
sub_arr[0,0] =99

In [392]:
sub_arr

array([[99, 14, 16],
       [11, 10, 10]])

In [394]:
arr 
# When we changed the subset, the main array changed too, because a slice is a view of the same data. To avoid this, we can use the copy() method.

array([[99, 14, 16, 14, 16],
       [11, 10, 10,  8,  8],
       [ 6,  5,  6, 18,  6],
       [ 8,  7, 19, 11,  5]])

In [395]:
arr = np.random.randint(20, size=(4,5))

In [397]:
sub_arr = arr[:2,:3].copy()

In [398]:
sub_arr

array([[ 4, 18, 10],
       [ 5,  9, 12]])

In [399]:
sub_arr[0,0] =99

In [400]:
sub_arr

array([[99, 18, 10],
       [ 5,  9, 12]])

In [401]:
arr # Main array did not change.

array([[ 4, 18, 10,  2,  3],
       [ 5,  9, 12,  4, 17],
       [ 7, 15,  0, 13, 14],
       [15, 17,  0,  6, 18]])
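
Whether a subset shares its data with the main array can be checked with `np.shares_memory()`. A minimal sketch, using a fixed array instead of random numbers so the result is predictable (all names here are illustrative):

```python
import numpy as np

base = np.arange(20).reshape(4, 5)

view = base[:2, :3]           # a slice is a view of the same data
copied = base[:2, :3].copy()  # copy() makes independent data

print(np.shares_memory(base, view))    # the slice shares memory with base
print(np.shares_memory(base, copied))  # the copy does not

view[0, 0] = 99     # changes base as well
copied[0, 1] = -1   # leaves base untouched
print(base[0, :3])
```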

## reshape() function

The reshape() function lets us change the shape of an array without changing its data.

**Note:** The shape of an array is the number of elements in each dimension.

In [147]:
arry = np.arange(1,10)

In [148]:
arry

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [149]:
arry.shape

(9,)

In [150]:
reshaped_array = arry.reshape(3,3) # It converts 1-D array to 2-D array.

In [151]:
reshaped_array

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [152]:
reshaped_array.shape

(3, 3)

We can also call reshape() directly on the result of arange().

In [153]:
np.arange(1,10).reshape(3,3)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [154]:
a1= np.arange(1,10)

In [155]:
a1

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [156]:
a1.ndim # One dimensional array

1

In [157]:
a2 = np.arange(1,10).reshape((1,9))

In [158]:
a2

array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [159]:
a2.ndim # Two dimensional array

2
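
When reshaping, one dimension can be given as -1 and NumPy works it out from the number of elements. The following is a sketch under that assumption; the names are illustrative, and the last part only shows that an impossible shape raises an error.

```python
import numpy as np

data = np.arange(12)

three_rows = data.reshape(3, -1)  # 12 elements / 3 rows -> 4 columns
two_cols = data.reshape(-1, 2)    # 12 elements / 2 columns -> 6 rows

print(three_rows.shape)
print(two_cols.shape)

# 12 elements cannot be split into 5 equal rows, so this fails
try:
    data.reshape(5, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```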

## Flattening the Arrays

Flattening an array means converting a multidimensional array into a 1-D array.

We can use the **reshape(-1)** or **flatten()** method for that.

In [160]:
a2 = np.arange(1,6).reshape((1,5))

In [161]:
a2

array([[1, 2, 3, 4, 5]])

In [162]:
a2.ndim

2

In [163]:
a2.reshape(-1)

array([1, 2, 3, 4, 5])

In [164]:
a2.flatten()

array([1, 2, 3, 4, 5])
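
A related method is **ravel()**. As a hedged sketch of the difference (names are illustrative): flatten() always returns a copy, while ravel() returns a view when the memory layout allows it, which is the case for the contiguous array below.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

flat_copy = grid.flatten()  # always a new, independent array
flat_view = grid.ravel()    # a view here, because grid is contiguous

print(np.shares_memory(grid, flat_copy))
print(np.shares_memory(grid, flat_view))

flat_view[0] = 100  # also changes grid
flat_copy[1] = -1   # does not change grid
print(grid)
```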

## Concatenation

In [167]:
a1 = np.array([10,11,12])
a2 = np.array([20,21,22])

In [168]:
np.concatenate([a1,a2])

array([10, 11, 12, 20, 21, 22])

In [169]:
arr1 = np.array([[1,2,3],[4,5,6]])
arr2 = np.array([[11,12,13],[14,15,16]])

In [170]:
np.concatenate([arr1,arr2])

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 12, 13],
       [14, 15, 16]])

In [171]:
np.concatenate([arr1,arr2],axis=1) # axis=1 joins the arrays along the columns (side by side)

array([[ 1,  2,  3, 11, 12, 13],
       [ 4,  5,  6, 14, 15, 16]])
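
NumPy also has shortcut functions for the two cases above. A small sketch, with illustrative names: vstack() matches concatenate() with axis=0, hstack() matches axis=1 for 2-D arrays, and stack() joins the arrays along a new dimension instead.

```python
import numpy as np

top = np.array([[1, 2, 3], [4, 5, 6]])
bottom = np.array([[11, 12, 13], [14, 15, 16]])

stacked_rows = np.vstack([top, bottom])  # same as concatenate(..., axis=0)
stacked_cols = np.hstack([top, bottom])  # same as concatenate(..., axis=1)
new_axis = np.stack([top, bottom])       # adds a new leading dimension

print(stacked_rows.shape)
print(stacked_cols.shape)
print(new_axis.shape)
```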

## Splitting

In [172]:
arr = np.array([1,3,45,6,76,44,32,104])

In [173]:
arr

array([  1,   3,  45,   6,  76,  44,  32, 104])

In [177]:
np.split(arr,[3,5]) # The array is split at indices 3 and 5

[array([ 1,  3, 45]), array([ 6, 76]), array([ 44,  32, 104])]

In [178]:
a, b, c = np.split(arr,[3,5]) # We assigned these arrays to a, b, c

In [179]:
a

array([ 1,  3, 45])

In [180]:
b

array([ 6, 76])

In [181]:
c

array([ 44,  32, 104])

In [192]:
two_d_array = np.arange(28).reshape(7,4) #2 dimensional array

In [193]:
two_d_array

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])

In [200]:
np.vsplit(two_d_array,[2,5]) # Vertical Splitting 2 dimensional array

[array([[0, 1, 2, 3],
        [4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]]),
 array([[20, 21, 22, 23],
        [24, 25, 26, 27]])]

In [213]:
v1, v2, v3 = np.vsplit(two_d_array,[2,5])

In [214]:
v1

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [215]:
v2

array([[ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

In [216]:
v3

array([[20, 21, 22, 23],
       [24, 25, 26, 27]])

In [217]:
two_d_array

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])

In [218]:
np.hsplit(two_d_array,[2]) # Horizontal Splitting 2 dimensional array

[array([[ 0,  1],
        [ 4,  5],
        [ 8,  9],
        [12, 13],
        [16, 17],
        [20, 21],
        [24, 25]]),
 array([[ 2,  3],
        [ 6,  7],
        [10, 11],
        [14, 15],
        [18, 19],
        [22, 23],
        [26, 27]])]

In [219]:
h1, h2 = np.hsplit(two_d_array,[2]) # Assigning in Horizontal Splitting 2 dimensional array

In [220]:
h1

array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13],
       [16, 17],
       [20, 21],
       [24, 25]])

In [221]:
h2

array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15],
       [18, 19],
       [22, 23],
       [26, 27]])
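
Instead of a list of indices, split() also accepts a single integer that asks for that many equal parts. The sketch below uses illustrative names; it also shows array_split(), which allows parts of unequal size where split() would raise an error.

```python
import numpy as np

# 8 elements split into 4 equal parts of 2 elements each
equal_parts = np.split(np.arange(8), 4)

# 7 elements cannot be divided into 3 equal parts,
# so array_split makes the first parts one element longer
uneven_parts = np.array_split(np.arange(7), 3)

print([part.tolist() for part in equal_parts])
print([part.tolist() for part in uneven_parts])

try:
    np.split(np.arange(7), 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)
```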

## Sorting

In [253]:
arr = np.array([26,2,5,3,65,11,2,4,9])

In [254]:
arr

array([26,  2,  5,  3, 65, 11,  2,  4,  9])

In [255]:
np.sort(arr) # np.sort() returns a sorted copy

array([ 2,  2,  3,  4,  5,  9, 11, 26, 65])

In [256]:
arr # The original array is unchanged

array([26,  2,  5,  3, 65, 11,  2,  4,  9])

In [257]:
arr.sort() # The sort() method sorts the array in place

In [258]:
arr

array([ 2,  2,  3,  4,  5,  9, 11, 26, 65])

In [263]:
-np.sort(-arr) # If we want to sort in descending order

array([65, 26, 11,  9,  5,  4,  3,  2,  2])

In [264]:
uni = np.random.uniform(2,10,(3,3)) # It draws 9 values from a uniform distribution between 2 and 10

In [265]:
uni

array([[3.30272564, 2.96082396, 8.23757843],
       [6.30406785, 4.25671931, 6.77303936],
       [9.96721933, 9.60798343, 8.60751805]])

In [266]:
np.sort(uni,axis=1) # It sorts the values within each row

array([[2.96082396, 3.30272564, 8.23757843],
       [4.25671931, 6.30406785, 6.77303936],
       [8.60751805, 9.60798343, 9.96721933]])

In [267]:
np.sort(uni,axis=0) # It sorts the values within each column

array([[3.30272564, 2.96082396, 6.77303936],
       [6.30406785, 4.25671931, 8.23757843],
       [9.96721933, 9.60798343, 8.60751805]])

We can also sort strings.

In [268]:
ar = np.array(['Python', 'Java', 'Julia', "Visual Basic"])

In [269]:
ar

array(['Python', 'Java', 'Julia', 'Visual Basic'], dtype='<U12')

In [270]:
ar.sort()

In [271]:
ar

array(['Java', 'Julia', 'Python', 'Visual Basic'], dtype='<U12')
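
Sometimes we need the positions of the sorted elements rather than the values. A minimal sketch with **argsort()**, reusing the numbers from the first sorting example (the names `nums`, `order` and `descending` are illustrative):

```python
import numpy as np

nums = np.array([26, 2, 5, 3, 65, 11, 2, 4, 9])

# Indices that would sort the array; kind="stable" keeps equal values in their original order
order = np.argsort(nums, kind="stable")

# Reversing a sorted copy gives descending order
descending = np.sort(nums)[::-1]

print(order)
print(nums[order])
print(descending)
```

Indexing the array with the result of argsort() gives the same values as np.sort().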

## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

In [304]:
arry = np.linspace(0,50,12,dtype="int").reshape(4,3)

In [305]:
arry

array([[ 0,  4,  9],
       [13, 18, 22],
       [27, 31, 36],
       [40, 45, 50]])

In [306]:
arry.shape

(4, 3)

In [307]:
arry.ndim

2

In [308]:
adding_array = np.array([1, 0, 1])

In [309]:
adding_array

array([1, 0, 1])

In [310]:
adding_array.shape

(3,)

In [311]:
adding_array.ndim

1

In [312]:
arry + adding_array

array([[ 1,  4, 10],
       [14, 18, 23],
       [28, 31, 37],
       [41, 45, 51]])

The shape of arry is (4,3) and the shape of adding_array is (3,). The addition above works even though the shapes are not the same, because of **broadcasting**: adding_array is added to every row of arry.

Broadcasting two arrays together follows these rules:

- If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.

- The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.

- The arrays can be broadcast together if they are compatible in all dimensions.

- After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
 
- In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.

![image.png](attachment:image.png)

This image is from: https://www.google.com/url?sa=i&url=https%3A%2F%2Fblog.knoldus.com%2Fnumpy-say-bye-to-loops%2F&psig=AOvVaw2d07vEIwpcpKiKkOVunx_5&ust=1607250033807000&source=images&cd=vfe&ved=0CAIQjRxqFwoTCJitgY3Ptu0CFQAAAAAdAAAAABAZ
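
The broadcasting rules can be tried out on a few shapes. This is only a sketch with made-up arrays (`grid`, `row`, `col`, `bad`); it checks the resulting shapes and shows that incompatible shapes raise a ValueError.

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)          # shape (4, 3)
row = np.array([1, 0, 1])                   # shape (3,)  -> treated as (1, 3)
col = np.array([[10], [20], [30], [40]])    # shape (4, 1)
bad = np.arange(4)                          # shape (4,)  -> treated as (1, 4)

print((grid + row).shape)  # row is added to every row of grid
print((grid + col).shape)  # col is added to every column of grid
print((row + col).shape)   # both arrays are stretched to (4, 3)

# (4, 3) and (1, 4) disagree in the last dimension (3 vs 4), and neither is 1
try:
    grid + bad
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)
```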

## Array Math

One of the most useful features of arrays is elementwise arithmetic. The basic mathematical operations are applied to every element at once.

In [425]:
arr1 = np.array([1,2,3,4,5,6,7,8,9])

In [426]:
arr1-5 # Elementwise operations like this are performed by universal functions (ufuncs)

array([-4, -3, -2, -1,  0,  1,  2,  3,  4])

In [427]:
arr1*3 # This is also performed by a ufunc

array([ 3,  6,  9, 12, 15, 18, 21, 24, 27])

In [428]:
arr2 = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

In [429]:
arr2

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [430]:
arr2+10

array([[11, 12, 13],
       [14, 15, 16],
       [17, 18, 19],
       [20, 21, 22]])

We can also use the **add()** function for an elementwise sum.

In [431]:
np.add(arr2,10)

array([[11, 12, 13],
       [14, 15, 16],
       [17, 18, 19],
       [20, 21, 22]])

In [432]:
arr2-1

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

We can also use the **subtract()** function for an elementwise difference.

In [433]:
np.subtract(arr2,1)

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [434]:
arr2*8

array([[ 8, 16, 24],
       [32, 40, 48],
       [56, 64, 72],
       [80, 88, 96]])

We can also use the **multiply()** function for an elementwise product.

In [435]:
np.multiply(arr2,8)

array([[ 8, 16, 24],
       [32, 40, 48],
       [56, 64, 72],
       [80, 88, 96]])

In [436]:
arr2/2

array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ],
       [3.5, 4. , 4.5],
       [5. , 5.5, 6. ]])

We can also use the **divide()** function for an elementwise division.

In [437]:
np.divide(arr2,2)

array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ],
       [3.5, 4. , 4.5],
       [5. , 5.5, 6. ]])

We can raise each element of an array to a power with the **power()** function.

In [438]:
arr1 = np.array([1,2,3,4,5,6,7,8,9])

In [439]:
np.power(arr1,2)

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])

We can get the mean of an array with the **mean()** function.

In [440]:
np.mean(arr1)

5.0

We can get the sum of an array with the **sum()** function.

In [441]:
np.sum(arr1)

45

We can get the minimum value of an array with the **min()** function.

In [442]:
np.min(arr1)

1

We can get the maximum value of an array with the **max()** function.

In [443]:
np.max(arr1)

9

We can get the variance of an array with the **var()** function.

In [444]:
np.var(arr1)

6.666666666666667

We can get the standard deviation of an array with the **std()** function.

In [445]:
np.std(arr1)

2.581988897471611
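
The functions above reduce the whole array to one number. For 2-D arrays they also take an **axis** argument. A small sketch, using the same values as arr2 above under the illustrative name `table`: axis=0 works down the rows (one result per column) and axis=1 works across the columns (one result per row).

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

col_sums = np.sum(table, axis=0)    # one sum per column
row_sums = np.sum(table, axis=1)    # one sum per row
row_means = np.mean(table, axis=1)  # one mean per row
col_max = np.max(table, axis=0)     # one maximum per column

print(col_sums)
print(row_sums)
print(row_means)
print(col_max)
```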

We can get the remainder of an elementwise division with the **mod()** function.

In [419]:
np.mod(arr1,2)

array([1, 0, 1, 0, 1, 0, 1, 0, 1])
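
Note that mod() gives the remainder, while floor division is done by **floor_divide()** (or the `//` operator). A short sketch of how the two fit together, with illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

quotient = np.floor_divide(x, 2)  # same as x // 2
remainder = np.mod(x, 2)          # same as x % 2

print(quotient)
print(remainder)

# Quotient and remainder together rebuild the original values
print(np.array_equal(quotient * 2 + remainder, x))
```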

We can get the absolute value of each element with the **absolute()** function.

In [420]:
np.absolute(np.array([-1,-2,-49,-2]))

array([ 1,  2, 49,  2])

We can get the natural logarithm of each element with the **log()** function.

In [421]:
np.log(arr1)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458])

In [423]:
np.log10(arr1) # log10() uses base 10 instead of base e

array([0.        , 0.30103   , 0.47712125, 0.60205999, 0.69897   ,
       0.77815125, 0.84509804, 0.90308999, 0.95424251])
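
NumPy has ready-made functions for base e, base 2 and base 10. For any other base, one option is the change-of-base formula, log_b(x) = log(x) / log(b). The sketch below assumes base 3 purely as an example; the names are illustrative.

```python
import numpy as np

powers_of_three = np.array([1, 3, 9, 27])

# Change of base: divide the natural log by the natural log of the base
log_base3 = np.log(powers_of_three) / np.log(3)

# Base 2 has its own function
log_base2 = np.log2(np.array([1, 2, 4, 8]))

print(log_base3)
print(log_base2)
```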

## Dot(Scalar) Product

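The **dot product** or **scalar product** is an algebraic operation that takes two equal-length sequences of numbers (usually coordinate vectors) and returns a single number. For two-dimensional arrays, as in the example below, **np.dot()** performs matrix multiplication instead.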

In [279]:
arr1 = np.array([[3,5],[9,4]]) 
arr2 = np.array([[11,1],[23,12]]) 

In [280]:
np.dot(arr1,arr2)

array([[148,  63],
       [191,  57]])

Each entry of the result is the dot product of a row of arr1 with a column of arr2:

In [281]:
[[3*11+5*23, 3*1+5*12],[9*11+4*23, 9*1+4*12]]

[[148, 63], [191, 57]]
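
For two 1-D arrays, np.dot() returns a single number, which matches the definition of the scalar product. A minimal sketch (the vectors `u` and `v` are made up for illustration); it also shows the `@` operator, which gives the same matrix product as np.dot() for 2-D arrays.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6
scalar = np.dot(u, v)
print(scalar)

# For 2-D arrays, the @ operator is matrix multiplication
m1 = np.array([[3, 5], [9, 4]])
m2 = np.array([[11, 1], [23, 12]])
product = m1 @ m2
print(product)
```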