# Pandas and NumPy Fundamentals

To use the NumPy library, we first need to import it into our Python environment. NumPy is commonly imported using the alias np:

In [1]:
import numpy as np

The NumPy library takes advantage of a processor feature called Single Instruction Multiple Data (SIMD) to process data faster. SIMD allows a processor to perform the same operation on multiple data points in a single processor cycle:

![SIMD](https://s3.amazonaws.com/dq-content/289/vectorized.gif)

As a result, the NumPy version of our code would only take two processor cycles, a fourfold speed-up! This concept of replacing for loops with operations applied to multiple data points at once is called vectorization, and ndarrays make vectorization possible.
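As a minimal sketch of vectorization, the cell below adds two columns of numbers, first with a plain Python loop and then with a single NumPy expression. The sample values and the names `col_a`, `col_b`, `loop_sums`, and `vector_sums` are illustrative assumptions, not data from this lesson.

```python
import numpy as np

# Hypothetical sample data: two columns of numbers to add row by row.
col_a = [6, 1, 5, 5, 9, 3, 2, 7]
col_b = [5, 3, 7, 1, 3, 4, 6, 2]

# Loop version: one addition per iteration.
loop_sums = []
for a, b in zip(col_a, col_b):
    loop_sums.append(a + b)

# Vectorized version: one expression adds the two arrays element by element.
vector_sums = np.array(col_a) + np.array(col_b)

print(loop_sums)
print(vector_sums)
```

Both versions produce the same sums. Whether SIMD instructions are actually used for the vectorized version depends on the NumPy build and the hardware it runs on.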

So far, we've only practiced creating one-dimensional ndarrays, but ndarrays can also be two-dimensional:

![NDimensional](https://s3.amazonaws.com/dq-content/289/Two_Dim.svg)
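To illustrate, here is a short sketch that builds a two-dimensional ndarray from a list of lists and inspects its dimensions. The values and the name `data_2d` are made up for this example.

```python
import numpy as np

# Hypothetical values: each inner list becomes one row of the 2D ndarray.
data_2d = np.array([[1, 2, 3],
                    [4, 5, 6]])

print(data_2d.ndim)   # number of dimensions
print(data_2d.shape)  # (rows, columns)
```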

### NYC Taxi-Airport Data
We'll work with a subset of this data: approximately 90,000 yellow taxi trips to and from New York City airports between January and June 2016. Below is information about selected columns from the data set:
* `pickup_year`: The year of the trip.
* `pickup_month`: The month of the trip (January is 1, December is 12).
* `pickup_day`: The day of the month of the trip.
* `pickup_location_code`: The airport or borough where the trip started.
* `dropoff_location_code`: The airport or borough where the trip finished.
* `trip_distance`: The distance of the trip in miles.
* `trip_length`: The length of the trip in seconds.
* `fare_amount`: The base fare of the trip, in dollars.
* `total_amount`: The total amount charged to the passenger, including all fees, tolls and tips.

**Import the Data**

In [6]:
import csv
import numpy as np

# open the file in a context manager so it is closed after reading
with open('data/nyc_taxis.csv', newline='') as f:
    taxi_list = list(csv.reader(f))

# remove header
taxi_list = taxi_list[1:]

# convert all values to floats
converted_taxi_list = []
for row in taxi_list:
    converted_row = []
    for item in row:
        converted_row.append(float(item))
    converted_taxi_list.append(converted_row)
    
# convert to numpy array
taxi = np.array(converted_taxi_list)

In [7]:
type(taxi)

Out[7]:
numpy.ndarray
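The manual loading steps above can also be sketched with NumPy's own `np.genfromtxt` function, which reads delimited text straight into an ndarray. This is an alternative approach, not the one used in this lesson. To keep the example self-contained, it reads from a small in-memory string rather than `data/nyc_taxis.csv`; the three columns and two rows in `csv_text` are hypothetical stand-ins.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: a header row plus two data rows.
csv_text = "pickup_year,pickup_month,trip_distance\n2016,1,21.0\n2016,1,16.29\n"

# skip_header=1 drops the header row; values are parsed as floats by default.
taxi_sample = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(type(taxi_sample))
print(taxi_sample.shape)
```

To read the real file, a path such as `'data/nyc_taxis.csv'` could be passed in place of the `io.StringIO` object, assuming every column in that file is numeric.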