# Numpy

NumPy is a library that provides data structures and functions for working with arrays and matrices.

This makes it possible to run operations faster and on much larger data sets, which is beneficial for a data analyst.

### Why learn numpy?
- NumPy arrays are typically much faster than plain Python lists for numerical work, often by orders of magnitude on large inputs
- much of the Python technology used to analyse data is built on NumPy
- its data types come with built-in functionality that makes mathematical computations very efficient
- it supports operations on arrays and matrices
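
The speed claim above can be checked with a rough timing sketch. The array size, the repeat count and the function names below are arbitrary choices for illustration, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

n = 100_000
python_list = list(range(n))
# int64 is requested explicitly so the squares cannot overflow a 32-bit default
numpy_array = np.arange(n, dtype=np.int64)


def list_sum_of_squares():
    # element-by-element loop over a plain Python list
    return sum(x * x for x in python_list)


def array_sum_of_squares():
    # the same computation as one vectorized expression
    return int(np.sum(numpy_array * numpy_array))


list_seconds = timeit.timeit(list_sum_of_squares, number=10)
array_seconds = timeit.timeit(array_sum_of_squares, number=10)

print(f"list:  {list_seconds:.4f} s")
print(f"array: {array_seconds:.4f} s")
```

Both functions return the same number; only the time taken differs.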

### Topics
- Numpy arrays
  - Creating numpy arrays
  - get/fetch operations
  - set/modification operations
- Numpy matrices
  - Creating matrices

NumPy is written in C but is used from Python through a wrapper.

## Creating Numpy Arrays

NumPy arrays work in a similar way to Python lists, with the difference that all elements of an array must have the same type.

In [2]:
import numpy as np

list_1 = [1,2,3]

numpy_1 = np.array(list_1)
numpy_2 = np.array([2,3,4])
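
The single-type rule can be seen by building an array from mixed values. This is a small sketch; the variable names are illustrative only.

```python
import numpy as np

# mixing ints and a float: every element is stored as a float
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers, mixed_numbers.dtype)

# the element type can also be requested explicitly with the dtype argument
small_ints = np.array([1, 2, 3], dtype=np.int16)
print(small_ints, small_ints.dtype)
```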

In [8]:
array_with_zeros = np.zeros(5)
array_with_ones = np.ones(10)
print(array_with_ones)

# you can also create matrices
# a matrix is a two-dimensional array
array_with_ones = np.ones((10,10))
print(array_with_ones)

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
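
Besides `np.ones` with a shape tuple, a matrix can be built from nested lists or by reshaping a one-dimensional array. The sketch below uses illustrative names and small sizes.

```python
import numpy as np

# each inner list becomes one row
matrix_from_lists = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix_from_lists.shape)  # (rows, columns)
print(matrix_from_lists.ndim)   # number of dimensions

# reshape rearranges six elements into 2 rows and 3 columns
reshaped = np.arange(6).reshape(2, 3)
print(reshaped)
```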


In [11]:
range_array = np.arange(30,20,-1)
print(range_array)

[30 29 28 27 26 25 24 23 22 21]


In [15]:
lower_bound = 0
upper_bound = 10
num_points = 6
# the third argument is the number of points, not the step size
# lower bound and upper bound are both inclusive
linear_vector = np.linspace(lower_bound,upper_bound,num_points)
print(linear_vector)

[ 0.  2.  4.  6.  8. 10.]
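
The third argument of `np.linspace` is the number of samples, so the spacing between them is derived rather than given. As a short sketch, `retstep=True` returns that spacing alongside the array, and `np.arange` is shown for contrast because it takes the spacing directly and excludes the stop value.

```python
import numpy as np

# 6 evenly spaced points from 0 to 10, both ends included
points, spacing = np.linspace(0, 10, 6, retstep=True)
print(points)
print(spacing)

# arange takes the spacing itself and stops before the upper bound
stepped = np.arange(0, 10, 2)
print(stepped)
```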


In [68]:
# create an array of 10 random floats drawn uniformly from [0, 1)
print(np.random.rand(10))

[0.16704568 0.42386009 0.03984392 0.60245084 0.81827418 0.03991576
 0.64137376 0.19230055 0.71967193 0.22856088]
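
Random output changes on every run. When repeatable numbers are needed, one option is a seeded generator, sketched below on the assumption that the installed NumPy provides `np.random.default_rng` (version 1.17 or later). The seed 42 is arbitrary.

```python
import numpy as np

# two generators created with the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.random(10)  # 10 floats in [0, 1)
sample_b = rng_b.random(10)

print(np.array_equal(sample_a, sample_b))
```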


### Numpy Array types

In [65]:

linear_vector = np.linspace(-10000,10000, 20)
print(linear_vector)
# type() only gives you the type of the object
print(type(linear_vector))


# to see the type of the elements in the array, use the dtype attribute
print(linear_vector.dtype)

# you can change the type by casting with the astype method
linear_vector2 = linear_vector.astype(np.uint8)
print(linear_vector2)
print(linear_vector2.dtype)

# The type limits the range of values that you can store.
# The number of bits establishes the range and the resolution.
# Signed integer types use part of their range for negative values;
# unsigned types cannot represent negative values at all.
# Values outside 0-255 do not fit in uint8: in the output below they
# wrapped around modulo 256, but the result of casting out-of-range
# floats is platform-dependent and should not be relied on.

[-10000.          -8947.36842105  -7894.73684211  -6842.10526316
  -5789.47368421  -4736.84210526  -3684.21052632  -2631.57894737
  -1578.94736842   -526.31578947    526.31578947   1578.94736842
   2631.57894737   3684.21052632   4736.84210526   5789.47368421
   6842.10526316   7894.73684211   8947.36842105  10000.        ]
<class 'numpy.ndarray'>
float64
[240  13  42  70  99 128 156 185 214 242  14  42  71 100 128 157 186 214
 243  16]
uint8
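
To avoid surprises when casting, the valid range of an integer type can be read with `np.iinfo`, and values can be limited to that range with `np.clip` before calling `astype`. This is one possible approach, not the only one, and the variable names are illustrative.

```python
import numpy as np

# the smallest and largest values an 8-bit unsigned integer can hold
info = np.iinfo(np.uint8)
print(info.min, info.max)

values = np.linspace(-10000, 10000, 20)

# force every value into the valid range first, then cast
clipped = np.clip(values, info.min, info.max).astype(np.uint8)
print(clipped)
```

With clipping, every negative input becomes 0 and every input above 255 becomes 255 instead of wrapping around.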


### Fetch and basic operations

In [24]:
# Fetch operations
# to get elements from an array you can:
# 1- provide an index, as you would with a list
# 2- provide a range using lower_boundary(inclusive):upper_boundary(exclusive):step

range_array = np.arange(300,20,-4)
print(range_array)

print(range_array[3])

print(range_array[3:6])


print(range_array[:3])

print(range_array[4:])

print(range_array[1::2]) # every second element, starting at index 1

print(range_array[6:3:-1]) # negative step: from index 6 back to index 4

[300 296 292 288 284 280 276 272 268 264 260 256 252 248 244 240 236 232
 228 224 220 216 212 208 204 200 196 192 188 184 180 176 172 168 164 160
 156 152 148 144 140 136 132 128 124 120 116 112 108 104 100  96  92  88
  84  80  76  72  68  64  60  56  52  48  44  40  36  32  28  24]
288
[288 284 280]
[300 296 292]
[284 280 276 272 268 264 260 256 252 248 244 240 236 232 228 224 220 216
 212 208 204 200 196 192 188 184 180 176 172 168 164 160 156 152 148 144
 140 136 132 128 124 120 116 112 108 104 100  96  92  88  84  80  76  72
  68  64  60  56  52  48  44  40  36  32  28  24]
[296 288 280 272 264 256 248 240 232 224 216 208 200 192 184 176 168 160
 152 144 136 128 120 112 104  96  88  80  72  64  56  48  40  32  24]
[276 280 284]
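
One difference from Python lists is worth a short sketch: a basic slice of a NumPy array is a view onto the same data, not a copy, so writing through the slice changes the original. The names below are illustrative.

```python
import numpy as np

source = np.arange(10)

window = source[2:5]  # a view onto source, not a copy
window[0] = 99
print(source)         # index 2 of source changed as well

# copy() gives an independent array
independent = source[2:5].copy()
independent[1] = -1
print(source)         # unaffected by the edit to the copy
```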


In [26]:
# useful attributes and functions
sz = range_array.size
print(sz)

sh = range_array.shape
print(sh)

mx = np.max(range_array)
print(mx)

mn = np.min(range_array)
print(mn)

av = np.average(range_array)
print(av)

md = np.median(range_array)
print(md)

pc = np.percentile(range_array,85)
print(pc)

70
(70,)
300
24
162.0
162.0
258.6
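
As a small sketch of how these statistics relate to each other: without weights `np.average` matches `np.mean`, the optional `weights` argument gives some elements more influence, and the 50th percentile equals the median. The sample values are made up for illustration.

```python
import numpy as np

values = np.array([10, 20, 30, 40])

plain_mean = np.mean(values)
plain_average = np.average(values)

# the last element counts five times as much as each of the others
weighted_average = np.average(values, weights=[1, 1, 1, 5])

middle = np.percentile(values, 50)

print(plain_mean, plain_average, weighted_average, middle)
```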


### Array modification operations

In [52]:
# create an array to modify
import numpy as np
my_array = np.arange(0,6)
print(my_array)

[0 1 2 3 4 5]


In [53]:
# you can replace elements in the array by providing their index and the substitute value

my_array[4] = 10
print(my_array)

my_array[0:3] = [51, 52, 53]
print(my_array)

[ 0  1  2  3 10  5]
[51 52 53  3 10  5]
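
Two details of assignment are sketched below with an illustrative array: a single value can be assigned to a whole slice, and an assigned value is converted to the array's element type, so a float written into an integer array loses its fractional part.

```python
import numpy as np

int_array = np.arange(0, 6)

# one value is broadcast to every position in the slice
int_array[1:4] = 0
print(int_array)

# the array keeps its integer dtype, so 3.9 is stored as 3
int_array[0] = 3.9
print(int_array, int_array.dtype)
```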


In [54]:
# append, insert and delete return new arrays instead of changing the original
# you can append values to the array by using the np.append function

my_array = np.append(my_array,88)
print(my_array)

my_array = np.append(my_array,[91,92,93])
print(my_array)

# to insert values at a particular index
my_array = np.insert(my_array,5,[200,201,202])
print(my_array)

# to remove you use the delete function
my_array = np.delete(my_array, [5,6,7,8])
print(my_array)

[51 52 53  3 10  5 88]
[51 52 53  3 10  5 88 91 92 93]
[ 51  52  53   3  10 200 201 202   5  88  91  92  93]
[51 52 53  3 10 88 91 92 93]
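
Because `np.append` returns a new array, the original is unchanged unless the result is assigned back. The sketch below shows this, followed by a common alternative when many values are added one at a time: collect them in a Python list and convert once. The names are illustrative.

```python
import numpy as np

original = np.array([1, 2, 3])
extended = np.append(original, 4)
print(original)  # still three elements
print(extended)

# collecting in a list and converting once avoids copying the array on every append
collected = []
for value in range(5):
    collected.append(value * value)
squares = np.array(collected)
print(squares)
```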


In [58]:
# operations with a scalar
print(my_array+2)
print(my_array-2)
print(my_array*2)
print(my_array/2)
print(my_array//2)
print(my_array%2)
print(my_array**2)

[53 54 55  5 12 90 93 94 95]
[49 50 51  1  8 86 89 90 91]
[102 104 106   6  20 176 182 184 186]
[25.5 26.  26.5  1.5  5.  44.  45.5 46.  46.5]
[25 26 26  1  5 44 45 46 46]
[1 0 1 1 0 0 1 0 1]
[2601 2704 2809    9  100 7744 8281 8464 8649]
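
The same operators also work between two arrays of the same shape, where they are applied element by element. A minimal sketch with made-up values:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = a + b    # element-wise sum
product = a * b  # element-wise product, not a matrix product
ratio = b / a    # true division always gives floats

print(total)
print(product)
print(ratio)
```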


In [79]:
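# five random integers between 0 and 9: scale to [0, 10), then truncate by casting
my_random_array = (np.random.rand(5)*10).astype(np.uint8)
print(my_random_array)
# np.sort returns a sorted copy and leaves the original unchanged
my_sorted_array = np.sort(my_random_array)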
print(my_sorted_array)

[5 5 2 4 8]
[2 4 5 5 8]
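
A few related sorting tools, sketched with fixed values so the result is repeatable: reversing a sorted copy gives descending order, `np.argsort` returns the indices that would sort the array, and the `sort` method sorts in place. The names are illustrative.

```python
import numpy as np

values = np.array([5, 5, 2, 4, 8])
unsorted_copy = values.copy()

sorted_copy = np.sort(values)   # values itself is left untouched
descending = sorted_copy[::-1]  # reverse the sorted copy

order = np.argsort(values)      # indices that would sort values
print(values[order])

values.sort()                   # sorts values in place
print(values)
```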
