# Intro to NumPy

The core data structure in NumPy is the **ndarray**, or **n-dimensional array**. An array holds a collection of elements, similar to a list.

1. First we import NumPy:

In [6]:
import numpy as np

2. Secondly, we convert a list to an ndarray using the **numpy.array()** constructor. Passing a flat list creates a 1D ndarray:

In [2]:
my_list = [5, 10, 15, 20]

# pass the list directly; wrapping it in extra brackets would create a 2D ndarray
data_ndarray = np.array(my_list)
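To check that the conversion produced a one-dimensional array, the `ndim` and `shape` attributes can be inspected. The sketch below also shows, as a contrast, that wrapping the list in an extra pair of brackets yields a 2D array with a single row. The name `wrapped` is only illustrative and does not come from the notes.

```python
import numpy as np

my_list = [5, 10, 15, 20]

# Passing the list directly gives a 1D ndarray
data_ndarray = np.array(my_list)
print(type(data_ndarray))   # <class 'numpy.ndarray'>
print(data_ndarray.ndim)    # 1
print(data_ndarray.shape)   # (4,)

# An extra pair of brackets adds a dimension: one row, four columns
wrapped = np.array([my_list])
print(wrapped.shape)        # (1, 4)
```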

The NumPy library takes advantage of a processor feature called **Single Instruction Multiple Data (SIMD)** to process data faster. SIMD allows a processor to perform the same operation on multiple data points in a single processor cycle.

As a result, the NumPy version of our code would take only two processor cycles, a fourfold speed-up over the loop version! This concept of replacing for loops with operations applied to multiple data points at once is called **vectorization**, and ndarrays make vectorization possible.
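As a rough illustration of vectorization, the sketch below adds pairs of numbers first with a for loop and then with a single ndarray expression. The sample values and the names `rows`, `loop_sums`, and `vector_sums` are assumptions made for this example. Whether SIMD instructions are actually used depends on the hardware and on how NumPy was built, so this shows only the change in coding style, not a measured speed-up.

```python
import numpy as np

# Small illustrative dataset: each inner list is one row of two values
rows = [[6, 5], [1, 3], [5, 6], [1, 4]]

# Pure-Python version: one addition per loop iteration
loop_sums = []
for a, b in rows:
    loop_sums.append(a + b)

# Vectorized version: one expression adds the two columns element by element
arr = np.array(rows)
vector_sums = arr[:, 0] + arr[:, 1]

print(loop_sums)             # [11, 4, 11, 5]
print(vector_sums.tolist())  # [11, 4, 11, 5]
```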

It's often useful to know the number of rows and columns in an ndarray. We can use the ndarray.shape attribute to see this. It returns a tuple with the size of each dimension.

In [4]:
data_ndarray = np.array([[5, 10, 15], 
                         [20, 25, 30]])

In [6]:
print(data_ndarray.shape)  # 2 rows & 3 columns

(2, 3)


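When selecting a single item by row and column in an ndarray, follow the syntax ndarray[row_index, column_index]. Indexes start at 0, so the following selects the item in the second row and fourth column:

sel_np = data_np[1, 3]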

**Alternate formats**

In [2]:
# columns_1_4_7 = taxi[:,(1,4,7)]
# row_99_columns_5_to_8 = taxi[99,5:9]
# rows_100_to_200_column_14 = taxi[100:201,14]
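Because `taxi` is not defined in these notes, the selection patterns above can be tried on a small stand-in array. The sketch below uses a hypothetical 4 x 5 array named `grid`, built with `np.arange` and `reshape`, and applies the same kinds of indexing: a single item, a list of columns, a column slice within one row, and a row slice within one column.

```python
import numpy as np

# Hypothetical 4 x 5 array holding the values 0..19, filled row by row
grid = np.arange(20).reshape(4, 5)

# Single item: row index 1, column index 3
item = grid[1, 3]                 # 8

# All rows, columns 1 and 4
cols_1_4 = grid[:, [1, 4]]        # shape (4, 2)

# Row 2, columns 1 up to (not including) 4
row_2_cols_1_to_3 = grid[2, 1:4]  # values 11, 12, 13

# Rows 1 up to (not including) 3, column 4
rows_1_to_2_col_4 = grid[1:3, 4]  # values 9, 14

print(item)
print(cols_1_4)
print(row_2_cols_1_to_3)
print(rows_1_to_2_col_4)
```

Slices exclude their end index, which is why the notes write 5:9 to reach column 8 and 100:201 to reach row 200.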

-------------------------------------------------------

As we saw above, NumPy ndarrays make it much easier to select data. Beyond this, calculations on the selected data are a lot faster when we use **vectorized operations**, because the operations are applied to multiple data points at once.

In [4]:
my_numbers = [
    [6, 5],
    [1, 3],
    [5, 6],
    [1, 4],
    [3, 7],
    [5, 8],
    [3, 5],
    [8, 4]
]

The result of adding two 1D ndarrays is a 1D ndarray of the same shape (or dimensions) as the originals. In this context, ndarrays can also be called **vectors**, a term taken from a branch of mathematics called linear algebra. Adding two vectors together is called **vector addition**.

In [9]:
# convert the list of lists to an ndarray
my_numbers = np.array(my_numbers)

# select each of the columns - the result
# of each will be a 1D ndarray
col1 = my_numbers[:,0]
col2 = my_numbers[:,1]

# add the two columns
sums = col1 + col2

In [19]:
sums = my_numbers[:,0] + my_numbers[:,1]
# tolist() converts the ndarray to a list of plain Python ints
added_numbers = sums.tolist()
print("Added numbers: {} ".format(added_numbers))

Added numbers: [11, 4, 11, 5, 10, 13, 8, 12] 


- When we selected each column, we used the syntax ndarray[:,c], where c is the index of the column we wanted to select. As we saw earlier, the colon selects all rows.
- To add the two 1D ndarrays, col1 and col2, we simply use the addition operator (+) between them.

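**Functions vs Methods in NumPy**

Many calculations are available both as a NumPy function and as an ndarray method. For example, to calculate the minimum value of trip_mph:

- Function: np.min(trip_mph)
- Method: trip_mph.min()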
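The two forms can be compared side by side. In the sketch below, `trip_mph` is filled with made-up speeds, since the real data is not part of these notes, and `min_via_function` and `min_via_method` are names chosen for this example.

```python
import numpy as np

# Made-up speeds in miles per hour, standing in for the real trip_mph data
trip_mph = np.array([37.0, 38.5, 31.25, 25.5])

# Function form: the ndarray is passed as an argument
min_via_function = np.min(trip_mph)

# Method form: called on the ndarray itself
min_via_method = trip_mph.min()

print(min_via_function)  # 25.5
print(min_via_method)    # 25.5
```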

------------------------------

**Statistics for 2D ndarrays**

To calculate a statistic along one dimension of a 2D ndarray, we use the axis parameter.

To find the maximum value of each row, we specify an axis **value of 1**.

To find the maximum value of each column, we specify an axis **value of 0**.

Format (sum of each row): fare_sums = fare_components.sum(axis=1)
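A minimal sketch of the axis parameter follows. The `fare_components` values are invented for illustration (two trips with three fare parts each), and `row_max` and `col_max` are names introduced here.

```python
import numpy as np

# Invented data: 2 rows (trips) x 3 columns (fare parts)
fare_components = np.array([[52.0, 0.5, 5.5],
                            [45.0, 1.5, 8.0]])

# axis=1 works across the columns, giving one result per row
row_max = fare_components.max(axis=1)    # [52.0, 45.0]
fare_sums = fare_components.sum(axis=1)  # [58.0, 54.5]

# axis=0 works down the rows, giving one result per column
col_max = fare_components.max(axis=0)    # [52.0, 1.5, 8.0]

print(row_max)
print(fare_sums)
print(col_max)
```

One way to remember the direction: the result has one value for each position along the axis that was not named, so axis=1 on a (2, 3) array returns 2 values and axis=0 returns 3.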

**We learned** --

- How vectorization makes our code faster.
- About n-dimensional arrays, and NumPy's ndarrays.
- How to select specific items, rows, columns, 1D slices, and 2D slices from ndarrays.
- How to apply simple calculations to entire ndarrays.
- How to use vectorized methods to perform calculations across either axis of ndarrays.