# getting started with numpy

**Additional resources:**

* Python Data Science Handbook: https://jakevdp.github.io/PythonDataScienceHandbook/
* Numpy quickstart tutorial: https://numpy.org/doc/stable/user/quickstart.html

<img src = "memes/tom.PNG">

# libraries

In [None]:
import numpy

import seaborn
import matplotlib.pyplot as plt

import time
import sys
import os

In [None]:
numpy.__version__

# python lists and numpy arrays

In [None]:
python_list = [ 56,  69,  88,  67, 113,  96,  73,  83, 172, 173]
numpy_array = numpy.array(python_list)
numpy_array_float = numpy.array(python_list, dtype = 'float')

In [None]:
python_list

In [None]:
numpy_array

In [None]:
numpy_array_float

In [None]:
type(python_list)

In [None]:
type(numpy_array), type(numpy_array_float)

*First element of each collection*

In [None]:
python_list[0], numpy_array[0], numpy_array_float[0]

In [None]:
type(python_list[0]), type(numpy_array[0]), type(numpy_array_float[0])

*Sum of elements*

In [None]:
sum(python_list), sum(numpy_array), numpy.sum(numpy_array)

*Minimum and maximum*

In [None]:
min(python_list), numpy.min(numpy_array), numpy_array.min()

In [None]:
max(python_list), numpy.max(numpy_array), numpy_array.max()

# simple operations

*Reciprocate: 1/n for every item*

In [None]:
rec_python_list = []

for value in python_list:
    rec_python_list.append(1/value)

rec_python_list

In [None]:
[1/value for value in python_list]

In [None]:
# 1/python_list

Dividing by a Python list directly (`1/python_list`) raises a `TypeError`. NumPy arrays, however, support this kind of element-wise operation.

In [None]:
numpy_array

In [None]:
1 / numpy_array

*Adding two lists and numpy arrays*

In [None]:
a = [1,2,3,4,5]
b = [10,20,30,40,50]

a + b

In [None]:
a = numpy.array([1,2,3,4,5])
b = numpy.array([10,20,30,40,50])

a + b

*Multiplication*

In [None]:
python_list * 2

In [None]:
numpy_array * 2
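
The difference in the cells above comes from how the operators are defined: for lists, `+` concatenates and `*` repeats, while for NumPy arrays both act element by element. A minimal sketch with small made-up values (the names `small_list` and `small_array` are illustrative, not part of the notebook):

```python
import numpy

small_list = [1, 2, 3]
small_array = numpy.array(small_list)

# Lists: + concatenates, * repeats the whole list
list_sum = small_list + small_list        # [1, 2, 3, 1, 2, 3]
list_twice = small_list * 2               # [1, 2, 3, 1, 2, 3]

# Arrays: + and * work element by element
array_sum = small_array + small_array     # array([2, 4, 6])
array_twice = small_array * 2             # array([2, 4, 6])

# Element-wise doubling of a list needs an explicit loop or comprehension
list_doubled = [2 * value for value in small_list]   # [2, 4, 6]
```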

# more complicated operations

*Sigmoid function*
\begin{equation*}
\frac{1}{(1+e^{-x})}
\end{equation*}

In [None]:
numpy_array = numpy.array(range(10))  # 1-D array; wrapping range in a list would add an extra dimension
numpy_array

In [None]:
sigmoid = 1 / (1 + numpy.exp(-numpy_array))  # note the minus sign, as in the formula
sigmoid
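
One way to check an implementation against the formula is to test a few known properties: the sigmoid equals 0.5 at x = 0, it satisfies sigmoid(-x) = 1 - sigmoid(x), and it is increasing. The helper name `sigmoid_fn` and the input range below are assumptions made for this sketch:

```python
import numpy

def sigmoid_fn(x):
    """Element-wise 1 / (1 + e^(-x)); note the minus sign in the exponent."""
    return 1 / (1 + numpy.exp(-x))

x = numpy.arange(-3, 4)            # array([-3, -2, -1, 0, 1, 2, 3])
y = sigmoid_fn(x)

value_at_zero = y[3]                                   # x = 0 sits at index 3 -> 0.5
is_symmetric = numpy.allclose(sigmoid_fn(-x), 1 - y)   # True
is_increasing = bool(numpy.all(numpy.diff(y) > 0))     # True
```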

# multidimensional arrays

In [None]:
numpy_array = numpy.array(range(100))
numpy_array = numpy_array.reshape(20,5)
numpy_array

In [None]:
numpy_array[5,3]

In [None]:
numpy_array.T

In [None]:
numpy_array @ numpy_array.T
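
The `@` operator is the matrix product, not element-wise multiplication, so the inner dimensions have to match: a (20, 5) array times its (5, 20) transpose gives a (20, 20) result. A smaller sketch of the same idea, using a hypothetical 2 x 3 array named `m` so the numbers are easy to verify by hand:

```python
import numpy

m = numpy.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

elementwise = m * m        # same shape as m: [[0, 1, 4], [9, 16, 25]]
product = m @ m.T          # (2, 3) @ (3, 2) -> shape (2, 2)

# Each entry of the product is a dot product of two rows of m,
# e.g. product[0, 1] = 0*3 + 1*4 + 2*5 = 14
```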

# indexing

In [None]:
example_array = numpy.random.randint(20, 500, 200)
example_array

In [None]:
example_array > 200

In [None]:
example_array[example_array > 200]
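
The comparison produces a boolean array (a mask), and indexing with that mask keeps only the positions where it is `True`. The sketch below repeats the idea on the fixed values from the first list of this notebook so the results are reproducible; the thresholds 80 and 120 are arbitrary choices for illustration:

```python
import numpy

values = numpy.array([56, 69, 88, 67, 113, 96, 73, 83, 172, 173])

mask = values > 80                  # boolean array, one entry per element
selected = values[mask]             # keeps only the entries where mask is True

# Combine conditions with & (and) and | (or); the parentheses are required
in_range = values[(values > 80) & (values < 120)]

# True counts as 1, so summing a mask counts the matches
n_matches = int(mask.sum())
```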

# attributes, methods and functions

In [None]:
example_array = numpy.random.normal(20, 5, 1000)

In [None]:
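# distplot is deprecated in recent seaborn releases; histplot with a KDE curve is the replacement
seaborn.histplot(example_array, kde = True)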

In [None]:
example_array = example_array.reshape(250,4)
example_array

*some numpy array attributes*

In [None]:
example_array.size

In [None]:
example_array.shape

In [None]:
example_array.ndim

*some numpy array methods*

In [None]:
example_array.sum()

In [None]:
example_array.min()

In [None]:
example_array.argmin()

*some numpy functions*

In [None]:
numpy.mean(example_array)

In [None]:
numpy.sort(example_array)

In [None]:
numpy.sort(example_array, axis = 0)

In [None]:
numpy.sum(example_array)

*Sum of each column (`axis = 0` collapses the rows)*

In [None]:
numpy.sum(example_array, axis = 0)

*Sum of each row (`axis = 1` collapses the columns)*

In [None]:
numpy.sum(example_array, axis = 1)
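
The `axis` argument names the dimension that disappears from the result, which is easy to mix up. A small sketch with a hypothetical 2 x 3 array named `table`, where the totals can be checked by hand:

```python
import numpy

table = numpy.array([[1, 2, 3],
                     [4, 5, 6]])          # shape (2, 3)

column_totals = table.sum(axis=0)   # collapses the 2 rows    -> [5, 7, 9]
row_totals = table.sum(axis=1)      # collapses the 3 columns -> [6, 15]
grand_total = table.sum()           # no axis: sums everything -> 21
```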

# data types

In [None]:
python_list = [25, "Manuel", [1,2,3,4,5], 0.45]
python_list

In [None]:
[type(value) for value in python_list]

In [None]:
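# dtype = object is needed on recent NumPy versions, which refuse to guess a type for ragged input like the nested list
numpy_array = numpy.array(python_list, dtype = object)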
numpy_array

In [None]:
[type(value) for value in numpy_array]

In [None]:
# numpy_array / 5

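A NumPy array stores every element with one shared data type, so mixed input is converted to the most general ("defensive") type that can hold all the values, such as a string or the generic `object` type. This can cause unwanted behaviour: arithmetic like `numpy_array / 5` fails here.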
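
A minimal sketch of this type promotion, using made-up inputs without the nested list (the variable names are illustrative). Mixing integers with a float yields a float array, and adding a string turns every element into text:

```python
import numpy

ints = numpy.array([1, 2, 3])
mixed_numbers = numpy.array([1, 2.5, 3])     # ints are promoted to float64
with_text = numpy.array([1, 2.5, "three"])   # everything becomes a string

ints_kind = ints.dtype.kind                  # 'i' (integer)
numbers_dtype = str(mixed_numbers.dtype)     # 'float64'
text_kind = with_text.dtype.kind             # 'U' (unicode string)
first_item = with_text[0]                    # '1', no longer a number

# Arithmetic on the string array is not defined
try:
    with_text / 5
    division_worked = True
except TypeError:
    division_worked = False
```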

# performance comparison

What happens when we have really long lists?

In [None]:
lista = list(range(100000000))
numpy_array = numpy.array(lista)

In [None]:
len(lista), len(numpy_array)

*How much time does it take to calculate the sum?*

In [None]:
def suma(lista):
    total = 0

    # Add the values one at a time in a plain Python loop
    for value in lista:
        total += value

    return total

%timeit suma(lista)

In [None]:
%timeit sum(lista)

In [None]:
%timeit numpy.sum(numpy_array)
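
`%timeit` is an IPython magic, so it only works inside a notebook or IPython session. Outside of one, a rough comparison can be sketched with `time.perf_counter`. The size `n` below is an assumption, chosen much smaller than the list above so the sketch runs quickly, and a single run like this is noisier than `%timeit`, which repeats the measurement:

```python
import time
import numpy

n = 1_000_000                                   # assumed size, for a quick run
values_list = list(range(n))
values_array = numpy.arange(n, dtype=numpy.int64)

# Time the built-in sum over a Python list
start = time.perf_counter()
list_total = sum(values_list)
list_seconds = time.perf_counter() - start

# Time the vectorized NumPy sum over the array
start = time.perf_counter()
array_total = int(values_array.sum())
array_seconds = time.perf_counter() - start

print(f"list: {list_seconds:.4f} s, array: {array_seconds:.4f} s")
print("same result:", list_total == array_total)
```

On most machines the NumPy version should be noticeably faster, but the exact ratio depends on hardware and array size.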

<img src= "memes/vectorized.PNG">

# be careful with copies

In [None]:
a = numpy.array(1)
b = a

a,b

In [None]:
b = b + 1
a, b

In [None]:
a = numpy.array(1)
b = a

In [None]:
b += 1
a,b 

In [None]:
a = numpy.array(1)
b = a.copy()
a, b

In [None]:
b +=1
a, b

<img src="https://i.redd.it/xz0rz5gw4lf21.png">

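Plain assignment (`b = a`) only creates a second name for the same array, so in-place operations such as `b += 1` also change `a`. When you need an independent array, make the copy explicit with the `.copy()` method.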
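
The same caution applies to slices, which return a view on the original data and not a copy. The sketch below uses a hypothetical five-element array and `numpy.shares_memory` to show which objects are linked:

```python
import numpy

original = numpy.array([1, 2, 3, 4, 5])

alias = original                 # same object, no data copied
view = original[1:4]             # slice: new array object, same underlying data
independent = original.copy()    # separate data

view[0] = 99                     # also changes original[1]
independent[0] = -1              # leaves original untouched

result = original.tolist()                                       # [1, 99, 3, 4, 5]
alias_is_same = alias is original                                # True
view_shares = numpy.shares_memory(original, view)                # True
copy_shares = numpy.shares_memory(original, independent)         # False
```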

# Contact

Manuel Montoya 
* Mail: manuel.montoya@pucp.edu.pe
* Linkedin: https://www.linkedin.com/in/manuel-montoya-gamio/

Numpy pain: https://www.reddit.com/r/ProgrammerHumor/comments/aouyj1/when_you_program_python_for_a_year_and_realize/ 