<img src="../../figs/holberton_logo.png" alt="logo" width="500"/>

# Introduction to NumPy

`NumPy` (or Numpy) is a linear algebra library for Python. It is so important for data science with Python because almost all of the libraries in the `PyData` ecosystem rely on `NumPy` as one of their main building blocks.

## Using NumPy

Once you've installed `NumPy` you can import it as a library:

In [1]:
import numpy as np

Numpy has many built-in functions and capabilities. We won't cover them all; instead, we will focus on some of the most important aspects of Numpy: vectors, arrays, matrices, and number generation. Let's start by discussing arrays.

## Numpy Arrays

NumPy arrays are the main way we will use Numpy throughout the course. Numpy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly `1-d` arrays and matrices are `2-d` (but you should note a matrix can still have only one row or one column).

Let's begin our introduction by exploring how to create NumPy arrays.

### Creating NumPy Arrays

#### From a Python List

We can create an array by directly converting a list or list of lists:

In [2]:
my_list = [1,2,3]
print(my_list)
np.array(my_list)

[1, 2, 3]


array([1, 2, 3])

In [3]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
print(my_matrix)
np.array(my_matrix)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
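
When a list is converted, NumPy picks a single data type for the whole array. The sketch below illustrates this with made-up variable names (`ints`, `mixed`, `forced`); it is only an illustration of the conversion described above, not part of the original notebook.

```python
import numpy as np

# All-integer list: NumPy keeps an integer dtype
ints = np.array([1, 2, 3])

# Mixing ints and floats upcasts every element to float
mixed = np.array([1, 2.5, 3])
print(mixed)        # [1.  2.5 3. ]
print(mixed.dtype)  # float64

# The dtype can also be requested explicitly
forced = np.array([1, 2, 3], dtype=float)
print(forced)       # [1. 2. 3.]
```

Unlike a Python list, an array holds elements of one type, which is part of what makes element-wise operations fast.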

### Built-in Methods

There are lots of built-in ways to generate arrays:

#### arange

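Return evenly spaced values within a given interval. The start value is included, the stop value is excluded, and an optional third argument sets the step.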

In [4]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [5]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])
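
A few more `arange` calls may help make the start, stop, and step behavior concrete. This is a small sketch with illustrative variable names, not code from the original notebook.

```python
import numpy as np

# The stop value is excluded, so 10 never appears here
evens = np.arange(0, 10, 2)
print(evens)       # [0 2 4 6 8]

# With a single argument, counting starts at 0
first_five = np.arange(5)
print(first_five)  # [0 1 2 3 4]

# A negative step counts down; the stop value (0) is still excluded
countdown = np.arange(5, 0, -1)
print(countdown)   # [5 4 3 2 1]
```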

#### zeros and ones

Generate arrays of zeros or ones. Pass an integer for a `1-d` array, or a tuple giving the shape of a multi-dimensional array.

In [6]:
np.zeros(3)

array([0., 0., 0.])

In [7]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [8]:
np.ones(3)

array([1., 1., 1.])

In [9]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
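
The outputs above contain floats (`0.`, `1.`) because that is the default data type. As a hedged sketch, the `dtype` parameter can request integers instead, and an array of ones can be scaled to any constant; the variable names are illustrative.

```python
import numpy as np

# Default dtype is float
float_zeros = np.zeros(3)
print(float_zeros.dtype)  # float64

# Ask for integers explicitly
int_zeros = np.zeros(3, dtype=int)
print(int_zeros)          # [0 0 0]

# Scaling an array of ones gives a constant-filled array
sevens = np.ones((2, 3)) * 7
print(sevens)
# [[7. 7. 7.]
#  [7. 7. 7.]]
```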

#### linspace
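Return evenly spaced numbers over a specified interval. The third argument is the number of points to generate, and both endpoints are included by default.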

In [10]:
np.linspace(0,10,3)

array([ 0.,  5., 10.])

In [11]:
np.linspace(0,10,50)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
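
It is easy to confuse `linspace` with `arange`: the third argument of `linspace` is a count, not a step. The sketch below, with illustrative names, shows the default behavior and the `endpoint=False` variant.

```python
import numpy as np

# Five points from 0 to 1, both ends included
points = np.linspace(0, 1, 5)
print(points)  # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False leaves out the stop value:
# roughly 0, 0.2, 0.4, 0.6, 0.8
points_open = np.linspace(0, 1, 5, endpoint=False)
print(points_open)
```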

#### eye

Creates an identity matrix (ones on the diagonal, zeros elsewhere).

In [12]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
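
As a quick check of what "identity" means, multiplying a matrix by the identity matrix leaves it unchanged. This sketch uses the `@` matrix-multiplication operator and a small made-up matrix `m`.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
identity = np.eye(2)

# Matrix product with the identity returns the same values (as floats)
product = identity @ m
print(product)
# [[1. 2.]
#  [3. 4.]]
```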

### Random 

Numpy also has lots of ways to create random number arrays:

#### rand
Create an array of the given shape and populate it with random samples from a uniform distribution over ``[0, 1)``.

In [13]:
np.random.rand(2)

array([0.52844475, 0.55786671])

In [14]:
np.random.rand(5,5)

array([[0.74359007, 0.60006596, 0.09022962, 0.88723963, 0.90472911],
       [0.82274633, 0.05747631, 0.00734971, 0.35069151, 0.15836297],
       [0.40780001, 0.73008254, 0.50147471, 0.55720255, 0.7637976 ],
       [0.2867937 , 0.54633798, 0.27405089, 0.90756191, 0.01493029],
       [0.61959802, 0.28245632, 0.38044427, 0.26051434, 0.31757599]])

#### randn

Return a sample (or samples) from the "standard normal" distribution (mean 0, variance 1), unlike `rand`, which samples from a uniform distribution:

In [15]:
np.random.randn(2)

array([0.80433185, 0.37769725])

In [16]:
np.random.randn(5,5)

array([[-0.21208574, -1.03676064,  1.22190833,  0.75699079, -0.10362347],
       [-0.76155307,  0.09825306,  0.19556268,  1.12598011,  0.76910256],
       [ 1.2344827 , -0.40188102, -2.44294151, -0.1547488 , -0.39189008],
       [ 0.95737179,  0.74502995, -0.47471173,  1.36329749, -0.8726371 ],
       [-1.28017523, -0.5769546 ,  0.54340137,  0.63067699,  1.14718739]])

#### randint
Return random integers from `low` (inclusive) to `high` (exclusive). The third argument is the number of integers to return.

In [19]:
np.random.randint(1,100,10)

array([12, 28, 33, 28, 79, 53, 95, 28, 27, 13])
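
Because these functions are random, the outputs above will differ on every run. One way to get repeatable results is to seed the generator first; the sketch below assumes an arbitrary seed of `42`. Newer NumPy code often uses `np.random.default_rng` instead, but the functions shown in this notebook work with `np.random.seed`.

```python
import numpy as np

# Seeding makes the "random" sequence repeatable
np.random.seed(42)
first = np.random.randint(1, 100, 10)

np.random.seed(42)
second = np.random.randint(1, 100, 10)

print((first == second).all())                 # True
print(first.min() >= 1 and first.max() < 100)  # True
```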

### Array Attributes and Methods

Let's discuss some useful attributes and methods of an array:

In [20]:
arr = np.arange(25)
ranarr = np.random.randint(0,50,10)

print("Arr = ", arr)
print("Ran Arr = ", ranarr)

Arr =  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]
Ran Arr =  [28 36 23 46  1 22  7 45 28 43]


### Reshape
Returns an array containing the same data with a new shape. The new shape must hold exactly the same number of elements as the original.

In [21]:
arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
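
A reshape only works when the element counts match. The following sketch shows the `-1` shorthand, which lets NumPy infer one dimension, and what happens when the sizes do not fit.

```python
import numpy as np

arr = np.arange(25)

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(5, -1)
print(inferred.shape)  # (5, 5)

# 25 elements cannot fill a 4x6 grid, so this raises ValueError
try:
    arr.reshape(4, 6)
except ValueError as err:
    print("reshape failed:", err)
```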

#### max,min,argmax,argmin

These are useful methods for finding the max or min values, or their index locations using `argmax` or `argmin`:

In [22]:
print("Data: ", ranarr)
print("Max = ", ranarr.max())
print("Min = ", ranarr.min())
print("Argmax = ", ranarr.argmax())
print("Argmin = ", ranarr.argmin())

Data:  [28 36 23 46  1 22  7 45 28 43]
Max =  46
Min =  1
Argmax =  3
Argmin =  4
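
To connect the two ideas: indexing the array with `argmax()` gives back `max()`. The sketch below reuses the values printed above and adds a small made-up `grid` to show the `axis` argument on a `2-d` array.

```python
import numpy as np

ranarr = np.array([28, 36, 23, 46, 1, 22, 7, 45, 28, 43])

# The value stored at the argmax index is the maximum
print(ranarr[ranarr.argmax()] == ranarr.max())  # True

grid = np.array([[1, 9, 3],
                 [7, 2, 8]])
print(grid.max(axis=0))  # [7 9 8]  (max of each column)
print(grid.max(axis=1))  # [9 8]    (max of each row)
print(grid.argmax())     # 1        (index into the flattened array)
```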


### Shape

Shape is an attribute that arrays have (not a method):

In [23]:
# Vector
arr.shape

(25,)
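
For comparison, here is a sketch of how `shape` changes after a reshape, along with the related `ndim` and `size` attributes. It also shows a matrix with only one row, as mentioned earlier.

```python
import numpy as np

arr = np.arange(25)
grid = arr.reshape(5, 5)

print(arr.shape)            # (25,)
print(grid.shape)           # (5, 5)
print(grid.ndim)            # 2
print(arr.size, grid.size)  # 25 25

# A 2-d array (matrix) with a single row
row_vector = arr.reshape(1, 25)
print(row_vector.shape)     # (1, 25)
```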

## NumPy Indexing and Selection

In this part we will discuss how to select elements or groups of elements from an array.

In [24]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

### Bracket Indexing and Selection
The simplest way to pick one or several elements of an array looks very similar to indexing Python lists:

In [26]:
arr[0]

0

In [27]:
arr[1:5]

array([1, 2, 3, 4])

In [29]:
arr[3:]

array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24])
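
The other list-style slicing forms carry over as well. A short sketch, assuming the same 25-element array:

```python
import numpy as np

arr = np.arange(25)

print(arr[-1])   # 24 (negative indices count from the end)
print(arr[:5])   # [0 1 2 3 4]
print(arr[-3:])  # [22 23 24]
print(arr[::5])  # [ 0  5 10 15 20] (every fifth element)
```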

### Broadcasting

Numpy arrays differ from normal Python lists in their ability to broadcast: a single value assigned to a slice is applied to every element of that slice.

In [30]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24])
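
To see the difference from a Python list, the sketch below tries the same slice assignment on both. The list version raises a `TypeError`, while the array version broadcasts the value. The variable name `numbers` is illustrative.

```python
import numpy as np

# A list slice needs an iterable on the right-hand side
numbers = list(range(10))
try:
    numbers[0:5] = 100
except TypeError as err:
    print("list assignment failed:", err)

# An array broadcasts the single value across the slice
arr = np.arange(10)
arr[0:5] = 100
print(arr)  # [100 100 100 100 100   5   6   7   8   9]

# Broadcasting also applies to arithmetic with a scalar
print(np.arange(3) + 10)  # [10 11 12]
```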

In [31]:
# Reset array; we'll see why in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [32]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [33]:
#Change Slice
slice_of_arr[:]=99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [34]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

The data is not copied; the slice is a view of the original array! This avoids memory problems, because slicing a large array does not duplicate its data.

In [35]:
#To get a copy, we need to be explicit
arr_copy = arr.copy()

arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
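
One way to check whether two arrays share data is `np.shares_memory`. The sketch below, with illustrative names `view` and `independent`, contrasts a slice with an explicit copy.

```python
import numpy as np

arr = np.arange(0, 11)
view = arr[0:6]                # slice: a view onto arr's data
independent = arr[0:6].copy()  # explicit copy: its own data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

# Changing the copy leaves the original untouched
independent[:] = -1
print(arr[:6])  # [0 1 2 3 4 5]
```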

### Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I usually recommend the comma notation for clarity.

In [36]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [37]:
#Indexing row
arr_2d[1]

array([20, 25, 30])

In [38]:
# Getting individual element value
arr_2d[1][0]

20

In [39]:
# Getting individual element value
arr_2d[1,0]

20

In [40]:
# 2D array slicing

#Shape (2,2) from top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])
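
The comma notation matters most when slicing. In the sketch below, chained brackets slice the rows twice, while the comma form slices rows and then columns, so the two give different results.

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# A whole column and a whole row
print(arr_2d[:, 1])  # [10 25 40]
print(arr_2d[2, :])  # [35 40 45]

# Chained brackets: the second slice is applied to rows again
print(arr_2d[:2][1:])  # [[20 25 30]]

# Comma notation: first slice picks rows, second picks columns
print(arr_2d[:2, 1:])
# [[10 15]
#  [25 30]]
```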

#### Fancy Indexing

Fancy indexing allows you to select entire rows or columns out of order. To show this, let's quickly build a numpy array:

In [43]:
#Set up matrix
arr2d = np.zeros((10,10))
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [44]:
#Number of rows (the loop below fills one row per iteration)
arr_length = arr2d.shape[0]
arr_length

10

In [45]:
#Set up array

for i in range(arr_length):
    arr2d[i] = i
    
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])
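
The loop above works, but the same matrix can be built without a loop by broadcasting a column of row numbers across the array. This is an alternative sketch, not the notebook's own approach; `n`, `looped`, `column`, and `broadcast` are illustrative names.

```python
import numpy as np

n = 10

# Loop version, as in the cell above
looped = np.zeros((n, n))
for i in range(n):
    looped[i] = i

# Broadcast version: a (10, 1) column is stretched across all 10 columns
column = np.arange(n).reshape(n, 1)
broadcast = np.zeros((n, n))
broadcast[:] = column

print(np.array_equal(looped, broadcast))  # True
```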

Fancy indexing allows the following:

In [46]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

In [47]:
#Allows in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
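
The same idea applies to columns by putting the index list after the comma. The sketch below uses a small made-up `grid` so the reordering is easy to see, and it also shows that fancy indexing returns a copy rather than a view.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Rows 2 and 0, in that order
print(grid[[2, 0]])
# [[ 8  9 10 11]
#  [ 0  1  2  3]]

# Columns 3 and 1, in that order
print(grid[:, [3, 1]])
# [[ 3  1]
#  [ 7  5]
#  [11  9]]

# Unlike a slice, the result is a copy: editing it leaves grid alone
picked = grid[[2, 0]]
picked[:] = -1
print(grid[0])  # [0 1 2 3]
```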

### Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [48]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [49]:
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [None]:
arr[arr>2]
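
The comparison produces a boolean array (a mask), and indexing with that mask keeps only the elements where it is `True`. The sketch below shows this, plus combining conditions with `&` (and) and `|` (or); each condition needs its own parentheses.

```python
import numpy as np

arr = np.arange(1, 11)

# Keep only the elements where the mask is True
mask = arr > 2
print(arr[mask])  # [ 3  4  5  6  7  8  9 10]

# Combine conditions element-wise
print(arr[(arr > 2) & (arr < 8)])  # [3 4 5 6 7]
print(arr[(arr < 3) | (arr > 8)])  # [ 1  2  9 10]

# True counts as 1, so summing a mask counts the matches
print(mask.sum())  # 8
```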

### Happy Coding