# NumPy - Numerical Python (Cheat Sheet)

### What is NumPy?
NumPy is a Python library that provides an easy way to create, navigate, and manipulate arrays and matrices of data


### Why do I need to know this?
Technically you don't. NumPy rarely comes up directly in day-to-day data science work, but it is the underlying functionality powering pandas and much of data science in Python

No need to memorize it; just be familiar with the concepts and know where to look up how to use them

NumPy's core data structure is the array (`ndarray`), which plays a role similar to a Python list

### Getting Started with NumPy

`pip install numpy`

`import numpy as np`

### Common Functions in NumPy


###### Create an Array
- Create an array (list) of a desired size

`np.arange(start, stop, step)`

- The long way to do this, which produces a plain Python list instead of a NumPy array

`[x for x in range(start, stop, step)]`
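As a quick comparison of the two approaches above, here is a minimal sketch. The values of `start`, `stop`, and `step` are assumptions chosen only for illustration.

```python
import numpy as np

start, stop, step = 0, 10, 2  # example values, not from the original text

arr = np.arange(start, stop, step)               # NumPy array; stop is excluded
as_list = [x for x in range(start, stop, step)]  # plain Python list

print(arr)                      # [0 2 4 6 8]
print(as_list)                  # [0, 2, 4, 6, 8]
print(type(arr).__name__)       # ndarray
print(arr.tolist() == as_list)  # True
```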

###### Random

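Get a single random integer, or an array of random integers, from a low value (inclusive) up to a high value (exclusive). The third argument is how many numbers to return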

`np.random.randint(1,99,10)`

`array([62, 89, 18, 41,  6, 67, 83, 48, 98, 56])`
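The bounds of `np.random.randint` are easy to get wrong, so here is a small sketch that checks them. The fixed seed is an assumption added only so reruns give the same numbers; it is not required.

```python
import numpy as np

np.random.seed(0)  # assumption: seeded only for repeatable results

samples = np.random.randint(1, 99, 10)  # 10 integers with 1 <= value < 99
single = np.random.randint(1, 99)       # no size argument returns one integer

print(samples.shape)                               # (10,)
print(samples.min() >= 1 and samples.max() < 99)   # True
```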

Create a random array; each value will be between 0 (inclusive) and 1 (exclusive)

`np.random.rand(3)`

`array([0.37556892, 0.88167555, 0.21885929])`

Create a random multi-dimensional array; each value will be between 0 (inclusive) and 1 (exclusive)

`np.random.rand(3,3)`

`array([[0.43373283, 0.81953016, 0.60483176],
       [0.46460322, 0.04923629, 0.53102037],
       [0.34119516, 0.78186692, 0.88462712]])`
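The arguments to `np.random.rand` are the dimensions of the result. A minimal sketch that checks the shapes and the value range (the variable names `flat` and `grid` are illustrative):

```python
import numpy as np

flat = np.random.rand(3)       # 1-D array with 3 values
grid = np.random.rand(3, 3)    # 2-D array, 3 rows by 3 columns

print(flat.shape)   # (3,)
print(grid.shape)   # (3, 3)

# Every value satisfies 0 <= value < 1
print(((grid >= 0) & (grid < 1)).all())   # True
```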



###### Reshape a 1-dimensional array into a multi-dimensional array

The new shape must hold exactly as many items as the 1-dimensional array (for example, 4 x 6 = 24)

`arr = np.arange(24)`

`arr.reshape(4,6)`

`array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])`
       

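Get the shape (the dimensions) of an array as (rows, columns). Note that `reshape` returns a new array and leaves `arr` unchanged, so assign the result if you want to keep it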

`arr.shape`
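A short sketch of how `reshape` and `shape` interact, including what happens when the requested shape does not match the number of items. The name `grid` is illustrative.

```python
import numpy as np

arr = np.arange(24)          # 24 elements, shape (24,)
grid = arr.reshape(4, 6)     # 4 * 6 == 24, so this works

print(arr.shape)    # (24,)  the original array is unchanged
print(grid.shape)   # (4, 6)

try:
    arr.reshape(5, 5)        # 5 * 5 == 25, which does not match 24 elements
except ValueError as err:
    print("reshape failed:", err)
```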



###### Max and Min

Get Min and Max values

`arr.min()` `arr.max()`

Get the index locations of the Min and Max values (for a multi-dimensional array, this is the index into the flattened array)

`arr.argmin()` `arr.argmax()`

Get the (row, column) location of the element with the lowest value

`np.unravel_index(arr.argmin(), arr.shape)`
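To make the difference between a flat index and a (row, column) location concrete, here is a minimal sketch. The array `grid` is an illustrative example, and the function is called through the `np` alias from `import numpy as np`.

```python
import numpy as np

grid = np.arange(0, 48, 2).reshape(6, 4)   # 6 rows, 4 columns, values 0..46

flat_pos = grid.argmax()                          # index into the flattened array
row_col = np.unravel_index(flat_pos, grid.shape)  # convert to (row, column)

print(flat_pos)                          # 23
print(tuple(int(i) for i in row_col))    # (5, 3)
print(grid[row_col])                     # 46
```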


###### Indexing

Similar to lists and strings for 1-dimensional arrays

`arr[0:3]` or `arr[:-2]`

For 2-dimensional arrays, use a single pair of brackets with a comma (you will see this again with pandas)

`arr[row, col]`

`arr[1:3,3:5]`
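A small sketch of the 2-dimensional indexing forms above, using an illustrative 4 x 6 array named `grid`:

```python
import numpy as np

grid = np.arange(24).reshape(4, 6)

print(grid[1, 3])        # single element at row 1, column 3 -> 9
print(grid[1:3, 3:5])    # rows 1-2 and columns 3-4 (slice ends are excluded)
print(grid[:, 0])        # every row, first column -> [ 0  6 12 18]
```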

###### Comparison Operations

A comparison returns a boolean array, which can then be used to select only the values that match a condition (you will see this with pandas data selection)

`arr > 5` or `arr[arr > 5]`
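The two expressions do different things: `arr > 5` produces a boolean mask, while `arr[arr > 5]` uses that mask to filter. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.arange(0, 20, 2)      # [ 0  2  4  6  8 10 12 14 16 18]

mask = arr > 5                 # boolean array with the same shape as arr
selected = arr[mask]           # keeps only the elements where mask is True

# Combine conditions with & (and) or | (or); each condition needs parentheses
between = arr[(arr > 5) & (arr < 15)]

print(selected)   # [ 6  8 10 12 14 16 18]
print(between)    # [ 6  8 10 12 14]
```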

###### Math

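We can use standard arithmetic operators; each one is applied element by element

`arr * 5`, `arr / 5`, `arr + 5`, `arr - 5`, `arr ** 5`

Division by 0 -> NumPy will continue to run (it issues a warning instead of raising an error) but will return either **nan** or **inf**

- nan -> 0 / 0
- inf -> nonzero number / 0 (-inf if the number is negative)

For more, NumPy provides a list of universal math functions (such as `np.sqrt` and `np.exp`) that also run element by element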
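Here is a minimal sketch of element-wise arithmetic and division by zero. The use of `np.errstate` is an assumption added only to silence the warning NumPy would otherwise print; it does not change the result.

```python
import numpy as np

arr = np.arange(4)      # [0 1 2 3]

print(arr + 5)          # [5 6 7 8]
print(arr * 5)          # [ 0  5 10 15]
print(arr ** 2)         # [0 1 4 9]
print(np.sqrt(arr))     # a universal function, applied element by element

# Assumption: errstate is used here only to suppress the RuntimeWarning
with np.errstate(divide="ignore", invalid="ignore"):
    result = arr / 0

print(result)           # [nan inf inf inf]
```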


###### NaN and Inf

pandas uses NumPy's `np.nan` to represent missing values

If we fill in the string 'NaN' instead, it is just text and will not match the numeric data type

`np.nan`

`np.inf`

Functions to check whether a value is NaN or Inf

NaN -> `np.isnan()`

Inf -> `np.isinf()`
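NumPy's NaN check is `np.isnan`, and it matters because NaN never compares equal to anything, including itself. A short sketch with an illustrative array named `values`; `np.isfinite` is an extra helper not mentioned above that excludes both NaN and Inf.

```python
import numpy as np

values = np.array([1.0, np.nan, np.inf, -np.inf])

print(np.isnan(values))   # [False  True False False]
print(np.isinf(values))   # [False False  True  True]
print(np.nan == np.nan)   # False, so use np.isnan instead of ==

# np.isfinite is True only for ordinary numbers (not NaN, not +/-Inf)
print(values[np.isfinite(values)])   # [1.]
```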




In [204]:
import numpy as np

In [207]:
type(np.arange(0,10,2))

numpy.ndarray

In [208]:
type([i for i in range(0,10,2)])

list

In [213]:
import random

In [236]:
# print(random.randint(0,99)) # only a single value
print(np.random.randint(0,99, 3)) # multiple values

[35 76  3]


In [237]:
[random.randint(0,99) for i in range(3)]

[0, 89, 75]

In [243]:
print('Python Way', random.random())
print('Numpy Way', np.random.rand(10))

Python Way 0.1918050913184296
Numpy Way [0.30645379 0.70987277 0.05846995 0.65470264 0.01327191 0.70869469
 0.16701761 0.24770221 0.08349398 0.76240729]


In [248]:
# Multi-dimensional array

np.random.rand(5,5)

array([[0.75404714, 0.79893542, 0.07215075, 0.42165836, 0.05452504],
       [0.41323985, 0.77436316, 0.0920994 , 0.09394923, 0.29070174],
       [0.71828116, 0.06040244, 0.76133552, 0.54604724, 0.92287656],
       [0.15792741, 0.73898026, 0.32618219, 0.59333127, 0.77988975],
       [0.66569663, 0.685403  , 0.44415641, 0.35491677, 0.66289767]])

In [279]:
arr = np.arange(0,48, 2)

In [280]:
print(arr)
arr.shape

[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]


(24,)

In [281]:
arr_reshaped = arr.reshape(6,4)

In [282]:
arr_reshaped

array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22],
       [24, 26, 28, 30],
       [32, 34, 36, 38],
       [40, 42, 44, 46]])

In [283]:
arr_reshaped.shape

(6, 4)

In [284]:
print(arr)
print('Min', arr.min())
print('Max', arr.max())

print('\n', arr_reshaped)
print('Min in multidimensional', arr_reshaped.min())
print('Max in multidimensional', arr_reshaped.max())

[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]
Min 0
Max 46

 [[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]
 [24 26 28 30]
 [32 34 36 38]
 [40 42 44 46]]
Min in multidimensional 0
Max in multidimensional 46


In [298]:
# Where are the min max values, what location?

print(arr)
print('Min Index', arr.argmin())
print('Max Index', arr.argmax())

print('Min Actual Value', arr[arr.argmin()])
print('Max Actual Value', arr[arr.argmax()])


print('\n', arr_reshaped)
print('Min in multidimensional', arr_reshaped.argmin())
print('Max in multidimensional', arr_reshaped.argmax())
# print('Min Actual Value', arr_reshaped[arr_reshaped.argmin()])
# # print('Max Actual Value', arr_reshaped[arr_reshaped.argmax()])

print('Min Index Location', np.unravel_index(arr_reshaped.argmin(), arr_reshaped.shape))
print('Max Index Location', np.unravel_index(arr_reshaped.argmax(), arr_reshaped.shape))


[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]
Min Index 0
Max Index 23
Min Actual Value 0
Max Actual Value 46

 [[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]
 [24 26 28 30]
 [32 34 36 38]
 [40 42 44 46]]
Min in multidimensional 0
Max in multidimensional 23
Min Index Location (0, 0)
Max Index Location (5, 3)


In [305]:
print(arr)
print(arr[1:10])

[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]
[ 2  4  6  8 10 12 14 16 18]


In [317]:
# Multidimensional Indexing
print(arr_reshaped)
print(arr_reshaped[3,2])

[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]
 [24 26 28 30]
 [32 34 36 38]
 [40 42 44 46]]
28


In [318]:
arr_reshaped[1:5,1:3]

array([[10, 12],
       [18, 20],
       [26, 28],
       [34, 36]])

In [326]:
arr[(arr > 10) & (arr < 40)]

array([12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38])

In [330]:
arr_reshaped[arr_reshaped > 10]

array([12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44,
       46])

In [340]:
nan_inf = arr_reshaped / 0



In [338]:
type(np.nan)

float

In [339]:
type(np.inf)

float

In [355]:
nan_inf

array([[nan, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf]])

In [356]:
np.isnan(nan_inf)

array([[ True, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False]])

In [357]:
np.isinf(nan_inf)

array([[False,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [358]:
nan_inf[np.isnan(nan_inf)]

array([nan])

In [359]:
nan_inf[np.isinf(nan_inf)]

array([inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf,
       inf, inf, inf, inf, inf, inf, inf, inf, inf, inf])