<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Introduction to NumPy

_Authors: Hank Butler (ATX), James Hampton (SEA)_


In [None]:
import numpy as np

____

## Address Students: Initial check for understanding

Students should be able to explain what the first line of code did.

Import does what? Telling Python to load an external module/library

What is NumPy? Numerical Python, used for arrays/array aggregations and calculations, etc.

What is the "as np" doing? Importing numpy under the alias np. Flesh out why we do this!

We imported the library NumPy (short for 'Numerical Python') and gave it the alias 'np'

____

# Learning Objectives

## By the end of this lesson you should be able to:

- Understand what an array is and why it's important
- Create an array in NumPy
- Perform operations and aggregations on arrays:
    - Loops
    - Slicing
    - Aggregations
    - Array Arithmetic (MATH)


# Table of Contents

#### 0. Definition + Purpose of an ndarray
    - rectangular array of numbers (rows + columns, tabular data)
    - slowness of loops
    - built on C, which is faster than Python
    - data types
    
#### 1. Creating arrays
    - from lists (and nested lists)
    - np.ones, np.zeros, np.linspace
    - np.random

#### 2. Array attributes
    - dtype
    - size
    - shape
    
#### 3. Array Computations
    - arithmetic operations (+, -, *, /)
    - numpy operations (np.square, np.sin, np.cos)
    - aggregating functions (.mean, .median, .unique + np. versions)

#### 4. Array Slicing and Ordering
    - slicing
    - boolean arrays / conditions
    - sorting arrays
    

___

## 0. Definition + Purpose of Arrays

    - rectangular array of numbers
    - Slowness of Loops, Speed
    - Built on C, faster than Python
    - Data Types
    
___

### Rectangular Array of Numbers

- Helps to think of all data fundamentally as an array of numbers
- What do I mean by an array?
    - A matrix in two dimensions (rows x columns)
- Slowness:
    - Efficient storage and manipulation of data is a fundamental part of data science
    - NumPy is built on top of C. What do I mean by this?
- Data Types:
    - NumPy Arrays are similar to Python Lists, but provide more efficient storage.




In [None]:
# Display built-in documentation (pop up window)

np?

# Arrays in NumPy

NumPy arrays can be built in lots of ways (more on this later), but the most basic arrays can be built from simple lists

In [None]:
# One-dimensional arrays are written with lowercase variables
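A minimal sketch of building a one-dimensional array from a plain Python list. The values and the variable name `x` are illustrative choices, not taken from the original lesson.

```python
import numpy as np

# Build a one-dimensional array from an ordinary Python list
x = np.array([1, 2, 3, 4])

print(x)        # [1 2 3 4]
print(type(x))  # <class 'numpy.ndarray'>
```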


In [None]:
# Arrays have some important attributes
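One way to inspect the attributes listed in the table of contents (`dtype`, `size`, `shape`), plus `ndim`. The array `x` here is an assumed example.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

print(x.dtype)  # an integer type; the exact name (e.g. int64) depends on the platform
print(x.size)   # 4    -> total number of elements
print(x.shape)  # (4,) -> a tuple with one entry per dimension
print(x.ndim)   # 1    -> number of dimensions
```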


In [None]:
# Multidimensional arrays are lists of lists and are written with capital variables
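A short sketch of a two-dimensional array built from a list of lists. The values are assumed for illustration; they are chosen so that the array has 2 rows and 3 columns.

```python
import numpy as np

# Each inner list becomes one row of the array
X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X.shape)  # (2, 3)
print(X.ndim)   # 2
```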


Notice that our shape `(2, 3)` is now a tuple of length **2**. This is because our array `X` has **2** rows and **3** columns, and so has two dimensions (rows and columns).

In [None]:
# A complicated example
# ALL sublists must be the same length (in this case, length 3)
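A possible three-dimensional example, assuming illustrative values and the hypothetical name `Z`. It nests lists three levels deep: 2 blocks, each with 2 rows of 3 values.

```python
import numpy as np

# 2 blocks, each containing 2 rows of length 3
Z = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

print(Z.shape)  # (2, 2, 3)
print(Z.ndim)   # 3
```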


Our array now has **3** dimensions--our shape tuple `(2,2,3)` is length **3**.

**Nearly all** of the arrays we will use will be **two dimensional** (rows and columns, i.e. "tabular data"). So don't worry too much about 3dim or higher-dim arrays for now.

----
### PRACTICE

Create a one-dimensional array with your own values

Create a two-dimensional array with your own values

- Note: you can start with a list then convert it to an array
----

# Array Computation

This is the **bread and butter** of NumPy. The entire reason NumPy exists is because it can perform these mathematical operations **with X-TREME efficiency**.

## Arrays And Single Values

In [None]:
# We can use standard Python arithmetic operations on numpy arrays



In [None]:
# Using a math operation with a single value "broadcasts" the operation to EACH ELEMENT in the array
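A small sketch of scalar broadcasting with an assumed array `x`: each operator is applied to every element without writing a loop.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

print(x + 10)  # [11 12 13 14]
print(x * 2)   # [2 4 6 8]
print(x ** 2)  # [ 1  4  9 16]
print(x / 2)   # values 0.5, 1.0, 1.5, 2.0 (true division always yields floats)
```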


In [None]:
# Can also do modulo
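The modulo operator broadcasts the same way. This sketch uses an assumed array and also shows a common use, testing for even numbers.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])

print(x % 2)       # [1 0 1 0 1 0] -> remainder after dividing by 2
print(x % 2 == 0)  # [False  True False  True False  True] -> which entries are even
```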


----
### PRACTICE

1. Create an array from 1 to 10 (INCLUSIVE!) with `np.arange`

2. Add 2 to each element in your array

3. Multiply each element by 4

4. Subtract 5 from each element

5. Divide each element by 3

6. Divide each element by 2 using floor division

----

In [None]:
###---SOLUTION---###
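One possible solution sketch. It assumes each step is applied to the original array rather than chained onto the previous result; the name `a` is arbitrary.

```python
import numpy as np

# np.arange excludes the stop value, so stop=11 yields 1 through 10
a = np.arange(1, 11)

print(a + 2)   # add 2 to each element
print(a * 4)   # multiply each element by 4
print(a - 5)   # subtract 5 from each element
print(a / 3)   # true division (floats)
print(a // 2)  # floor division (integers)
```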



## Arrays of Equal Size
Operations between arrays of equal size are done component-wise
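A brief sketch of component-wise operations, using assumed arrays `A` and `B` (different from the practice arrays). Note that `*` multiplies matching entries; it is not matrix multiplication.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

print(A + B)  # entries 11, 22 / 33, 44
print(A * B)  # entries 10, 40 / 90, 160 (element-wise product)
```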

----
### PRACTICE

With `X = np.array([[1,1], [2, 2]])` and `Y = np.array([[1, 2], [1, 2]])`, 

compute:

- `X + Y`
- `X / Y`
----

## Summary Statistics and Aggregate functions

NumPy will calculate means and standard deviations for us!

In [None]:
# Two dimensions
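A sketch of means and standard deviations on an assumed 2x3 array. The `axis` argument selects the direction to aggregate along: `axis=0` collapses the rows (one result per column), `axis=1` collapses the columns (one result per row).

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X.mean())        # 3.5 -> mean of all six values
print(X.mean(axis=0))  # [2.5 3.5 4.5] -> one mean per column
print(X.mean(axis=1))  # [2. 5.] -> one mean per row
print(X.std())         # population standard deviation (ddof=0 by default)
```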


NumPy can also compute sums:
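For instance, with an assumed 2x3 array, `sum` works on the whole array or along one axis:

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X.sum())        # 21 -> all elements
print(X.sum(axis=0))  # [5 7 9] -> column sums
print(X.sum(axis=1))  # [ 6 15] -> row sums
```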

# Slicing Arrays

Each dimension of an array behaves a lot like a Python list
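A small sketch of list-style indexing and slicing on assumed arrays. For a two-dimensional array, the row index and column index are separated by a comma.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50])
print(x[0])    # 10 -> first element
print(x[-1])   # 50 -> last element
print(x[1:4])  # [20 30 40] -> stop index is excluded

X = np.array([[1, 2, 3],
              [4, 5, 6]])
print(X[0, 1])  # 2 -> row 0, column 1
print(X[:, 0])  # [1 4] -> every row, first column
```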

----
### PRACTICE

Use indexing on `X = np.array([[1,2,3,4,5], [6,7,8,9,10], [11,12, 13, 14, 15], [16, 17, 18, 19, 20]])` to:

- Get `10` from X
- Get only the 2nd column of X
- Get the second and third rows, along with the first and second columns, of X
----

In [None]:
# Solutions
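One possible set of solutions for the indexing practice. Other index expressions give the same results; this is a sketch, not the only answer.

```python
import numpy as np

X = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15],
              [16, 17, 18, 19, 20]])

print(X[1, 4])      # 10 -> second row, fifth column
print(X[:, 1])      # [ 2  7 12 17] -> the 2nd column
print(X[1:3, 0:2])  # rows 2-3, columns 1-2: entries 6, 7 / 11, 12
```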


# Intermediate NumPy

## Creating Arrays Automatically

In [None]:
# An array of all 0s


In [None]:
# An array of all 1s


In [None]:
# An array of all the same value
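A sketch of the three constructors for constant arrays. The shape `(2, 3)` and the fill value `7` are assumed for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))     # 2x3 array of 0.0 (float64 by default)
ones = np.ones((2, 3))       # 2x3 array of 1.0 (float64 by default)
sevens = np.full((2, 3), 7)  # 2x3 array where every entry is 7

print(zeros)
print(ones)
print(sevens)
```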


One particular kind of array-creation is especially important: **ranges**.  Creating ranges in NumPy is similar to how we've seen ranges in Python.

In [None]:
# From start to stop - 1


In [None]:
# Step through by 2s


In [None]:
# Create a range of evenly-spaced values from START to STOP
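A sketch contrasting `np.arange` and `np.linspace` with assumed endpoints. `np.arange` takes a step size and excludes the stop value; `np.linspace` takes a number of points and includes the stop value by default.

```python
import numpy as np

print(np.arange(0, 5))       # [0 1 2 3 4] -> stop value 5 is excluded
print(np.arange(0, 10, 2))   # [0 2 4 6 8] -> step through by 2s
print(np.linspace(0, 1, 5))  # 5 evenly spaced values: 0, 0.25, 0.5, 0.75, 1
```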


`np.linspace` is going to be very helpful later in the course when we begin performing Gridsearch on model hyperparameters. 

**You will understand all of these terms in a few weeks!**

----
### PRACTICE

1. Create an array of 10 values, evenly spaced between 0 and 100
- Hint: Use np.linspace()

2. Create a 3x3 array of all zeros

3. Create a 2x4 array of your favorite number
----

In [None]:
### --- SOLUTION --- ###
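One possible solution sketch. The variable names and the favorite number `7` are arbitrary choices.

```python
import numpy as np

evenly_spaced = np.linspace(0, 100, 10)  # 10 values from 0 to 100, both ends included
zeros_3x3 = np.zeros((3, 3))             # 3x3 array of zeros
favorite = np.full((2, 4), 7)            # 2x4 array filled with 7

print(evenly_spaced)
print(zeros_3x3)
print(favorite)
```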



## Random number generation with NumPy

We can use NumPy to generate **arrays filled with random values**.

In [None]:
# Create a 3x3 array of uniformly distributed random values between 0 and 1



In [None]:
# Create a 3x3 array of normally distributed (bell-curve) random values
# with mean 0 and std dev 1



In [None]:
# 3x3 array of random integers from the interval [0, 10]
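A sketch of the three kinds of random arrays described in this section, using the `np.random` functions `rand`, `randn`, and `randint`. The seed is an assumption added only to make the run repeatable. Because `randint` excludes its upper bound, `11` is passed to cover the closed interval [0, 10].

```python
import numpy as np

np.random.seed(42)  # optional: fixes the random sequence for reproducibility

U = np.random.rand(3, 3)                   # uniform on [0, 1)
N = np.random.randn(3, 3)                  # standard normal: mean 0, std dev 1
R = np.random.randint(0, 11, size=(3, 3))  # integers 0 through 10 inclusive

print(U)
print(N)
print(R)
```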



## Special Values in NumPy

### np.nan

Stands for **"not a number"**

Commonly used to represent **missing values.**

This will appear later in Pandas (as NaN)

In [None]:
# NaN doesn't like to cooperate with numbers
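A short sketch of how `np.nan` behaves, using an assumed array. Any arithmetic involving NaN yields NaN, and NaN is not even equal to itself, so `np.isnan` is the way to detect it. The NaN-aware `np.nanmean` skips missing values.

```python
import numpy as np

print(np.nan + 1)        # nan
print(np.nan == np.nan)  # False

x = np.array([1.0, np.nan, 3.0])
print(x.mean())       # nan -> one missing value poisons the mean
print(np.isnan(x))    # [False  True False]
print(np.nanmean(x))  # 2.0 -> mean of the non-NaN values
```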


### np.inf

'infinity'

Useful when you need a comparison that is guaranteed to succeed or fail.

E.g., set your max value to `np.inf` and any finite number will be less than your max value.

In [None]:
# Check if element in array is pos or neg inf
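A sketch of the infinity checks on an assumed array: `np.isinf` flags either sign, while `np.isposinf` and `np.isneginf` distinguish them.

```python
import numpy as np

x = np.array([1.0, np.inf, -np.inf, 0.0])

print(np.isinf(x))     # [False  True  True False]
print(np.isposinf(x))  # [False  True False False]
print(np.isneginf(x))  # [False False  True False]
print(5 < np.inf)      # True -> every finite number is below np.inf
```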



## Intermediate NumPy Slicing

You can slice NumPy arrays using _boolean masks_. We'll go through some simple examples here.

In [None]:
# When combining conditions in one line, wrap each condition in parentheses
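A minimal sketch of boolean masking with an assumed array. Conditions are combined with `&` (and) and `|` (or); each condition needs its own parentheses because these operators bind more tightly than the comparisons.

```python
import numpy as np

x = np.array([1, 5, 8, 12, 15])

mask = x > 4                  # boolean array, one True/False per element
print(mask)                   # [False  True  True  True  True]
print(x[mask])                # [ 5  8 12 15]
print(x[(x > 4) & (x < 13)])  # [ 5  8 12]
print(x[(x < 4) | (x > 13)])  # [ 1 15]
```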


## Intermediate NumPy functions

NumPy has efficient implementations of standard arithmetic operations (multiplication, addition, exponentiation, division). It also gives us access to some useful math functions.

In [None]:
# Absolute value
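For example, with an assumed array, `np.abs` takes the absolute value of every element:

```python
import numpy as np

x = np.array([-3, 0, 2])
print(np.abs(x))  # [3 0 2]
```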


In [None]:
# Trigonometric Functions
# Also cos and tan
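A sketch of the trigonometric functions on assumed angles (in radians). Results are floating point, so values that are mathematically 0 may print as something tiny like `1.2e-16`.

```python
import numpy as np

theta = np.array([0, np.pi / 2, np.pi])

print(np.sin(theta))  # approximately 0, 1, 0
print(np.cos(theta))  # approximately 1, 0, -1
print(np.tan(0.0))    # 0.0
```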


In [None]:
# Logs

# Values must be strictly positive (greater than 0)
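A sketch of the three common logarithm functions, using assumed inputs chosen to give round results: `np.log` is the natural log, `np.log2` is base 2, and `np.log10` is base 10.

```python
import numpy as np

print(np.log(np.array([1.0, np.e])))  # approximately 0, 1
print(np.log2(np.array([1, 2, 8])))   # approximately 0, 1, 3
print(np.log10(np.array([1, 10, 100])))  # approximately 0, 1, 2
```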

In [None]:
# exponents -- 2 ** x, 3 ** x
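For instance, with an assumed array `x`, a scalar base raised to an array power is computed element by element:

```python
import numpy as np

x = np.array([1, 2, 3])

print(2 ** x)  # [2 4 8]
print(3 ** x)  # [ 3  9 27]
```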


### Exponents and Logs

In [None]:
# The inverses of exponentials are called logarithms; np.log gives the natural log
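A small check, on an assumed array, that `np.log` undoes `np.exp` and vice versa. The comparison uses `np.allclose` because floating-point round trips are only approximately exact.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

print(np.exp(x))          # e**1, e**2, e**3
print(np.log(np.exp(x)))  # approximately 1, 2, 3 again
print(np.allclose(np.exp(np.log(x)), x))  # True
```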



### Aggregate Functions

In [None]:
#Calling .reduce on .add returns the sum of all elements
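A sketch with an assumed array: `np.add.reduce` repeatedly applies addition until one value remains, which matches `x.sum()`.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

print(np.add.reduce(x))  # 10 -> 1 + 2 + 3 + 4
print(x.sum())           # 10 -> same result
```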



# Not sure if we need to show these two, they were the first that popped up on my tutorial

In [None]:
# Calling .reduce on .multiply results in the product of all elements
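Likewise, with an assumed array, `np.multiply.reduce` gives the same answer as `np.prod`:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

print(np.multiply.reduce(x))  # 24 -> 1 * 2 * 3 * 4
print(np.prod(x))             # 24 -> same result
```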


In [None]:
# Show you how much faster arrays are
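One rough way to compare speeds, assuming a simple sum over 100,000 integers as the benchmark. Exact timings depend on the machine, so no expected numbers are given; the NumPy version is typically much faster because the loop runs in compiled C code.

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # explicit int64 avoids overflow on any platform

# Time 20 repetitions of each approach
list_time = timeit.timeit(lambda: sum(py_list), number=20)
array_time = timeit.timeit(lambda: np_arr.sum(), number=20)

print(f"Python list sum: {list_time:.4f} s")
print(f"NumPy array sum: {array_time:.4f} s")
```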



### Other useful agg functions

- np.std => std deviation
- np.mean => mean of elements
- np.var => compute variance
- np.median => compute median
- np.percentile => compute rank-based stats of elements
- np.any => evaluate whether any elements are true
- np.all => evaluate whether all elements are true
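A quick tour of the functions listed above, applied to an assumed array of the integers 1 through 5:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

print(np.mean(x))            # 3.0
print(np.median(x))          # 3.0
print(np.var(x))             # 2.0 -> population variance
print(np.std(x))             # about 1.414 -> square root of the variance
print(np.percentile(x, 50))  # 3.0 -> the 50th percentile is the median
print(np.any(x > 4))         # True  -> at least one element exceeds 4
print(np.all(x > 4))         # False -> not every element exceeds 4
```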

---

## 4. Slicing, Ordering, Comparison Operators

---

### Slicing

Basically the same as the slicing we did with lists

In [None]:
# 2-dimensional array


### Comparison Operators / Boolean Arrays

In [None]:
# Two dimensional example
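A sketch with an assumed 2x3 array: a comparison against a scalar returns a boolean array of the same shape.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X > 2)   # first row: False False True; second row: True True True
print(X == 5)  # True only where the entry equals 5
```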



In [None]:
# Counting entries
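Because `True` counts as 1 and `False` as 0, summing a boolean array counts the entries that satisfy a condition. A sketch with an assumed array:

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.count_nonzero(X > 2))  # 4 -> entries greater than 2
print(np.sum(X > 2))            # 4 -> same count via sum
print(np.sum(X > 2, axis=1))    # [1 3] -> count per row
```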


In [None]:
#Boolean arrays and masks
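A sketch of using a boolean array as a mask on an assumed 2-D array. Selecting with a mask returns a flat one-dimensional array of the matching values; assigning through a mask changes only the matching entries.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

print(X[X > 2])  # [3 4 5 6] -> matching values, flattened to 1-D

Y = X.copy()     # copy so the original X is left untouched
Y[Y > 2] = 0     # overwrite only the entries greater than 2
print(Y)         # rows become 1, 2, 0 and 0, 0, 0
```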



### Sorting Arrays

In [None]:
# Sorting rows / columns
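A sketch of the basic sorting tools on an assumed 1-D array before moving to rows and columns. `np.sort` returns a sorted copy, `np.argsort` returns the indices that would sort the array, and the `.sort()` method sorts in place.

```python
import numpy as np

x = np.array([3, 1, 2])

print(np.sort(x))     # [1 2 3] -> sorted copy
print(x)              # [3 1 2] -> original is unchanged
print(np.argsort(x))  # [1 2 0] -> positions of the smallest, next, largest

x.sort()              # in-place sort modifies x itself
print(x)              # [1 2 3]
```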



In [None]:
# Sort Each Column of X
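A sketch with an assumed 3x3 array `X`: passing `axis=0` sorts down the rows, so each column ends up in ascending order independently.

```python
import numpy as np

X = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

# axis=0 sorts within each column
print(np.sort(X, axis=0))  # rows become 3 1 2 / 6 4 5 / 9 7 8
```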


In [None]:
# Sort each row of X
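With the same kind of assumed 3x3 array, `axis=1` sorts across the columns, so each row ends up in ascending order independently.

```python
import numpy as np

X = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

# axis=1 sorts within each row
print(np.sort(X, axis=1))  # rows become 1 2 3 / 7 8 9 / 4 5 6
```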


### Reshaping Arrays

In [None]:
# -1 tells NumPy to work out the number of columns automatically
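A sketch of `reshape` on an assumed array of 12 values. Passing `-1` for one dimension asks NumPy to infer it from the total number of elements.

```python
import numpy as np

x = np.arange(12)     # 12 values: 0 through 11

A = x.reshape(3, 4)   # explicit: 3 rows, 4 columns
B = x.reshape(3, -1)  # 3 rows; NumPy infers 4 columns because 12 / 3 = 4

print(A.shape)  # (3, 4)
print(B.shape)  # (3, 4)
```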

---