# NumPy

 - What is NumPy?
 - Creating NumPy arrays
 - Indexing and advanced indexing
 - Slicing
 - Universal functions (Ufuncs)
 - Broadcasting
 - Masking, sorting and comparison
 - More

## What is NumPy? 

NumPy underlies Pandas and many other packages.
What makes NumPy so powerful is its core data type, the ndarray.
ndarray stands for "n-dimensional array"; it resembles a Python list.

However, operations on it are typically much faster than on a regular Python list.
A Python list can hold elements of different types, such as integers, strings, booleans, and even other lists.

A NumPy array, on the other hand, holds elements of a single data type, so there is no need to check the type of each element when performing calculations.

This feature makes NumPy a great tool for research and data science projects.
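
As a small illustration of the single-data-type rule, the sketch below compares a mixed Python list with NumPy arrays. The variable names are made up for this example, and the exact integer dtype depends on the platform.

```python
import numpy as np

# A Python list can mix types freely.
mixed_list = [1, "two", 3.0, True]

# A NumPy array stores a single dtype; mixed numeric input is upcast.
int_arr = np.array([1, 2, 3])
float_arr = np.array([1, 2.5, 3])

print(int_arr.dtype)    # an integer dtype (platform dependent)
print(float_arr.dtype)  # float64

# Because the dtype is known up front, arithmetic runs over the whole array at once.
doubled = int_arr * 2
print(doubled.tolist())  # [2, 4, 6]

# The equivalent list operation needs an explicit Python loop.
doubled_list = [v * 2 for v in [1, 2, 3]]
print(doubled_list)      # [2, 4, 6]
```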

Before we begin, let's check the version of NumPy and Python.

In [None]:
# import numpy
import numpy as np

# sys was imported to check the python version
import sys 

# check the version of python and numpy
print('NumPy version:', np.__version__)
print('Python version:', sys.version)

### Creating NumPy arrays

There are many ways to create arrays in NumPy. Here we will look at the most important ones:

In [None]:
# create a one-dimensional numpy array:
x = np.array([1, 2, 3])
print(x)
type(x)

In [None]:
# create an array of zeros:
np.zeros(3)

In [None]:
# create array of ones:
y = np.ones((3,5,7))
print(y)
y.shape

In [None]:
# create an array of 3 random integers from 1 to 9 (the upper bound, 10, is excluded):
np.random.randint(1,10, 3)

In [None]:
# create linearly spaced array:
np.linspace(0, 10, 5 )

In [None]:
# create two-dimensional array:
x = np.array([[1,2,3],
         [4,5,6],
         [7,8,9],
         [0,1,2]])
x

In [None]:
x[1][-1]

In [None]:
x[1,-1]

In [None]:
# create 3x4 array with random values between 0 and 1 (uniform distribution):
np.random.random((3,4))

In [None]:
# create 1D and 2D arrays a and b respectively:
a = np.array([1,2,3])
b = np.random.randint(0,10, (3,3))

print(a)
print(b)

In [None]:
a

In [None]:
# extend the array a:
a = np.append(a, 4)
a

In [None]:
b

In [None]:
np.append(b, [1,2,3])
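
Note that `np.append` flattens both inputs unless an `axis` is given. The sketch below uses a fixed 3x3 array as a stand-in for the random `b` above, so the shapes are reproducible.

```python
import numpy as np

b = np.arange(9).reshape(3, 3)

# Without axis, both inputs are flattened first.
flat = np.append(b, [1, 2, 3])
print(flat.shape)       # (12,)

# With axis=0, the new values must have matching dimensions (here: one row of 3).
with_row = np.append(b, [[1, 2, 3]], axis=0)
print(with_row.shape)   # (4, 3)

# With axis=1, supply one new value per existing row.
with_col = np.append(b, [[1], [2], [3]], axis=1)
print(with_col.shape)   # (3, 4)
```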

In [None]:
# print shape and number of dimensions of the arrays:
print("Shape of a:", np.shape(a))
print("Shape of b:", np.shape(b))

print('Number of dimensions of array a:', np.ndim(a))
print('Number of dimensions of array b:', np.ndim(b))

In [None]:
# number of elements in the arrays:
print('Number of elements in a:', np.size(a))
print('Number of elements in b:', np.size(b))

### Indexing

In [None]:
# a is a 1D array that we created earlier:
a

In [None]:
# Access the first element of a
# The next two lines give the same result:
print(a[0])
print(a[-4])

In [None]:
# Access the last element of a
# The next two lines give the same result:
print(a[-1])
print(a[3])

In [None]:
# b is a 2D array created earlier:
b

In [None]:
# Access the first row of b
# The following two lines give the same result:
print(b[0]) 
print(b[0,:])

In [None]:
# Access the second column of b
b[:,1]

### Advanced indexing

In [None]:
# We will now create two new arrays: 
x = np.array(['a', 'b', 'c'])
y = np.array([['d','e','f'], 
              ['g', 'h', 'k']])

print(x)
print(y)

In [None]:
x

In [None]:
# Access values c and a (indices 2 and 0):
ind = [2,0]
x[ind]

In [None]:
y

In [None]:
# Advanced 2D array indexing
# Access values e and h of y (the single column index 1 is broadcast to both rows):
ind2 = ([0,1],[1])
y[ind2]
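
The following sketch shows how the row and column index lists are paired in this kind of indexing. It rebuilds `y` so that it runs on its own; `picked` is a name introduced only for this example.

```python
import numpy as np

y = np.array([['d', 'e', 'f'],
              ['g', 'h', 'k']])

# Row and column index lists are paired element by element:
# (0, 1) -> 'e' and (1, 1) -> 'h'.
print(y[[0, 1], [1, 1]].tolist())   # ['e', 'h']

# A length-1 index list is broadcast against the longer one,
# so this selects the same two elements.
print(y[[0, 1], [1]].tolist())      # ['e', 'h']

# Different columns per row: (0, 2) -> 'f' and (1, 0) -> 'g'.
picked = y[[0, 1], [2, 0]]
print(picked.tolist())              # ['f', 'g']
```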

### Slicing
###### The character : is used for slicing

In [None]:
# create array of integers from 1 to 10:
X = np.arange(1, 11, dtype=int)
X

In [None]:
# Access the first two elements:
X[:2]

In [None]:
# Access third to fifth elements: 
X[2:5]

In [None]:
# Access elements at even indices (0, 2, 4, ...):
X[::2]

In [None]:
# Access elements at odd indices (1, 3, 5, ...):
X[1::2]

In [None]:
# Create 2D array:
Y= np.arange(1,10).reshape(3,3)
Y

In [None]:
# Access first and second rows:
Y[:2,:]

In [None]:
# Access second and third columns:
Y[:, 1:]

In [None]:
# Access elements 5 and 6:
Y[1,1:]
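
All of the slices above are instances of the general form `start:stop:step`, where `stop` is exclusive and omitted parts take default values. A brief sketch, recreating `X` with the values 1 to 10:

```python
import numpy as np

X = np.arange(1, 11)

# start=2, stop=8 (exclusive), step=3 -> indices 2 and 5
print(X[2:8:3].tolist())   # [3, 6]

# Negative start counts from the end: the last three elements
print(X[-3:].tolist())     # [8, 9, 10]

# A negative step walks backwards, which reverses the array
print(X[::-1].tolist())    # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```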

### Universal Functions (Ufuncs)

##### Press TAB after np. to see the list of available ufuncs. np.{TAB}

These functions allow fast calculations on NumPy arrays.
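
To make the idea concrete, the sketch below applies the sine function once with an explicit Python loop and once with the ufunc `np.sin`; both give the same values. The names `loop_result` and `ufunc_result` are only illustrative.

```python
import math
import numpy as np

X = np.arange(1, 11)

# Element-by-element version with a Python loop.
loop_result = [math.sin(v) for v in X]

# The ufunc applies the same operation to every element in one call.
ufunc_result = np.sin(X)

print(np.allclose(loop_result, ufunc_result))   # True
print(isinstance(np.sin, np.ufunc))             # True
```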

In [None]:
# We use the array we created earlier:
X

In [None]:
# Largest element of X:
np.max(X)

In [None]:
# Average value in X:
np.mean(X)

In [None]:
# Fourth power of each value:
a = np.power(X, 4)
print(a)
a.dtype

In [None]:
# Calculation of trigonometric functions:
print(np.sin(X))
print(np.tan(X))

In [None]:
# sin^2(x) + cos^2(x) = 1, so every element should be (approximately) 1:
np.square(np.sin(X)) + np.square(np.cos(X))

In [None]:
# A different way of computing the same:
np.sin(X)**2 + np.cos(X)**2

In [None]:
# All this can be applied to 2D arrays:
Y

In [None]:
np.multiply(Y, 2)

In [None]:
np.power(Y,4)

In [None]:
np.sin(Y)

In [None]:
np.square(np.sin(Y)) + np.square(np.cos(Y))

### Unique values

In [None]:
X = np.random.randint(1, 4, 15)
X

In [None]:
np.unique(X)

In [None]:
np.unique(X, return_counts=True)

In [None]:
uniq, counts = np.unique(X, return_counts=True)

In [None]:
print("Unique values:", uniq)
print("%:            ", 100*counts / counts.sum())

### Matrix operations

In [None]:
X

In [None]:
# Add 5 to each element:
X + 5

In [None]:
# A different way:
np.add(X, 5)

In [None]:
np.arange(3)

In [None]:
np.arange(3).reshape(3,1)

In [None]:
np.expand_dims(np.arange(3), 1)

In [None]:
np.expand_dims(np.arange(3), -1)

In [None]:
np.arange(3)[:, np.newaxis]

In [None]:
# Create new array Z: 
Z = np.arange(3)[:, np.newaxis]
Z

In [None]:
Y

In [None]:
# Element-wise multiplication of Y and Z (Z is broadcast across the columns of Y):
np.multiply(Y, Z)

In [None]:
# Matrix multiplication:
Y.dot(Z)
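
The difference between the two products comes down to shapes. The sketch below assumes `Y` is the 3x3 array of the values 1 to 9 from the slicing section and rebuilds `Z`; `elementwise` and `matrix_product` are names chosen for this example.

```python
import numpy as np

Y = np.arange(1, 10).reshape(3, 3)   # shape (3, 3)
Z = np.arange(3)[:, np.newaxis]      # shape (3, 1)

# Element-wise: Z is stretched along its length-1 axis to match Y,
# so row i of Y is multiplied by Z[i, 0].
elementwise = Y * Z
print(elementwise.tolist())      # [[0, 0, 0], [4, 5, 6], [14, 16, 18]]

# Matrix product: (3, 3) @ (3, 1) -> (3, 1).
matrix_product = Y @ Z
print(matrix_product.tolist())   # [[8], [17], [26]]
```

The `@` operator gives the same result as `Y.dot(Z)` for these 2D arrays.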

### Sorting, Comparison and Masking

In [None]:
# create an array of 10 random integers from 1 to 4 (the upper bound, 5, is excluded):
x = np.random.randint(1, 5, 10)
x

In [None]:
# Ordering of elements in array x:
np.sort(x)

In [None]:
# create an array of shape (3,3) with random integers from 1 to 4 (the upper bound, 5, is excluded)
y = np.random.randint(1, 5, (3,3))
y

In [None]:
# sort along axis 0 (sort values within each column)
np.sort(y, axis=0)

In [None]:
# sort along axis 1 (sort values within each row)
np.sort(y, axis=1)
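
Since `y` above is random, here is the same pair of calls on a fixed array (chosen only for illustration), so the effect of each axis can be checked by hand:

```python
import numpy as np

y_fixed = np.array([[3, 1, 2],
                    [1, 4, 1],
                    [2, 2, 3]])

# axis=0: each column is sorted from top to bottom.
by_column = np.sort(y_fixed, axis=0)
print(by_column.tolist())   # [[1, 1, 1], [2, 2, 2], [3, 4, 3]]

# axis=1: each row is sorted from left to right.
by_row = np.sort(y_fixed, axis=1)
print(by_row.tolist())      # [[1, 2, 3], [1, 1, 4], [2, 2, 3]]
```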

In [None]:
x

In [None]:
# Comparison operations on arrays:
# == , !=, < , >, >=, <=

# Results are Boolean
x > 3

In [None]:
# Filtering by masking:
x[x>3]

In [None]:
# Multiple comparisons:
x[(x <= 3) & (x>1)]
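
Boolean masks can be combined with `&` (and), `|` (or) and `~` (not); each comparison needs its own parentheses. A short sketch on a fixed array, used here in place of the random `x`:

```python
import numpy as np

x_fixed = np.array([1, 4, 2, 3, 4, 1])

mask = (x_fixed <= 3) & (x_fixed > 1)        # element-wise AND
print(x_fixed[mask].tolist())                # [2, 3]

# NOT: everything the mask rejects
print(x_fixed[~mask].tolist())               # [1, 4, 4, 1]

# OR: values equal to 1 or to 4
either = (x_fixed == 1) | (x_fixed == 4)
print(x_fixed[either].tolist())              # [1, 4, 4, 1]

# Number of elements that satisfy the mask
print(np.count_nonzero(mask))                # 2
```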

### Stacking numpy arrays (hstack and vstack)

In [None]:
A = np.array([[1,2], [3,4]])
A

In [None]:
B = np.array([[5,6], [7,8]])
B

In [None]:
C = np.array([[9,10], [11,12]])
C

In [None]:
np.hstack((A,B,C))

In [None]:
np.vstack((A,B,C))
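
For 2D arrays like these, `hstack` and `vstack` give the same result as `np.concatenate` along axis 1 and axis 0 respectively. A minimal check using two of the arrays above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

h = np.hstack((A, B))   # side by side: shape (2, 4)
v = np.vstack((A, B))   # on top of each other: shape (4, 2)
print(h.shape, v.shape)

# Same results with np.concatenate and an explicit axis.
print(np.array_equal(h, np.concatenate((A, B), axis=1)))   # True
print(np.array_equal(v, np.concatenate((A, B), axis=0)))   # True
```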

### Reshaping

In [None]:
A

In [None]:
A.reshape(4)

In [None]:
A.reshape(1,4)

In [None]:
A.reshape((1,4))

In [None]:
A.reshape(4,1)

In [None]:
A.reshape(-1) # -1 tells NumPy to infer that dimension from the array size

In [None]:
A.reshape(1,-1)

In [None]:
A.reshape(-1,1)

In [None]:
# Flatten:
A.flatten()
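
Two details are worth a quick sketch: `-1` can stand in for one dimension of a larger shape, and `flatten()` always returns a copy, while the related method `ravel()` returns a view when it can (as it does for this contiguous array). `ravel()` is not used elsewhere in this notebook; it is shown here only for comparison.

```python
import numpy as np

# -1 is inferred from the total size: 12 elements / 3 rows = 4 columns.
inferred = np.arange(12).reshape(3, -1)
print(inferred.shape)     # (3, 4)

A = np.array([[1, 2], [3, 4]])

flat_copy = A.flatten()   # always a copy
flat_view = A.ravel()     # a view when possible

flat_copy[0] = 100        # does not touch A
print(A[0, 0])            # 1

flat_view[0] = 100        # writes through to A
print(A[0, 0])            # 100
```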

### Copies and views

The difference between a copy and a view of an array is that the **copy** is a new array **independent of the original**, while **the view is not independent of the original**.

**The copy owns the data**, so that:
- Any changes made to the copy will not affect the original array.
- Any changes made to the original array will not affect the copy.

**The view does not own the data**, therefore:
- Any changes made to the view will affect the original array.
- Any change made to the original array will affect the view.

In [None]:
# Copy:

X = np.array([1, 2, 3, 4, 5])
Y = X.copy()
X[0] = 100

print(X)
print(Y)

In [None]:
# View:

X = np.array([1, 2, 3, 4, 5])
Y = X.view()
X[0] = 100 # we make changes in original

print(X)
print(Y)

In [None]:
# View (this time modifying the view):

X = np.array([1, 2, 3, 4, 5])
Y = X.view()
Y[0] = 100 # we make changes in the view

print(X)
print(Y)

In [None]:
# To check whether a NumPy array owns its data,
# check whether its base attribute is None.
# Use "is None": "== None" would be compared element-wise when base is an array.

X = np.array([1, 2, 3, 4, 5])

Y = X.copy()
Z = X.view()

print(X.base is None)
print(Y.base is None)
print(Z.base is None)
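
Views and copies also arise implicitly from indexing: basic slicing returns a view, while indexing with a list of integers returns a copy. The sketch below checks this with `np.shares_memory`; `sliced` and `picked` are names introduced for this example.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5])

sliced = X[1:4]          # basic slicing -> view
picked = X[[1, 2, 3]]    # integer-list indexing -> copy

print(np.shares_memory(X, sliced))   # True
print(np.shares_memory(X, picked))   # False

# Writing through the slice changes the original array.
sliced[0] = 100
print(X.tolist())        # [1, 100, 3, 4, 5]
print(picked.tolist())   # [2, 3, 4]
```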

### Traversal over a numpy array

In [None]:
X

In [None]:
for e in X:
    print(e)

In [None]:
A

In [None]:
# Row-wise:
for e in A:
    print(e)
    print()

In [None]:
# Column-wise:
for e in A.T: # T: transposed
    print(e)
    print()
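
When individual elements are needed rather than whole rows or columns, `np.ndenumerate` and the `flat` attribute are two options. A small sketch on the same 2x2 array; `total` is a name chosen for this example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

# np.ndenumerate yields (index tuple, value) pairs in row-major order.
for index, value in np.ndenumerate(A):
    print(index, value)

# A.flat iterates over every element as if the array were 1D.
total = 0
for value in A.flat:
    total += value
print(total)   # 10
```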

### Saving and loading files

#### Using NumPy:

In [None]:
A

In [None]:
np.save("data_A", A)

In [None]:
C = np.load("data_A.npy")
C

#### Using pickle:

Save several matrices to the same file:

In [None]:
print(A, "\n\n", B)

In [None]:
import pickle
with open("data.pkl", "wb") as f:
    pickle.dump(A, f)
    pickle.dump(B, f)

In [None]:
with open("data.pkl", "rb") as f:
    C = pickle.load(f)
    D = pickle.load(f)

In [None]:
print(C, "\n\n", D)
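
As an alternative to pickle, NumPy's own `np.savez` can also store several arrays in one file and retrieve them by name. The sketch below writes to a temporary directory so it leaves no files behind; the file name `data_AB.npz` and the keys `A` and `B` are arbitrary choices for this example.

```python
import os
import tempfile
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_AB.npz")
    # Keyword names become the keys of the archive.
    np.savez(path, A=A, B=B)
    # np.load returns a dict-like archive for .npz files.
    with np.load(path) as archive:
        C = archive["A"]
        D = archive["B"]

print(np.array_equal(A, C), np.array_equal(B, D))   # True True
```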