# Introduction to NumPy arrays

We've done some cool things with Python `list`s, but doing math with them gets tedious.  Today we'll learn the basics of NumPy, which is a numerical computation library for Python (hence the name).

First we'll import the NumPy library, using the short name `np` (just like using `plt` for `matplotlib.pyplot`):

In [None]:
import numpy as np

## First steps

In NumPy, the equivalent of a Python list is an array. In fact, we can create an array from a list by passing it as the argument to the `np.array()` function:

In [None]:
my_list = [1, 2, 3, 4, 5, 10]
my_array = np.array(my_list)

* What is the type of `my_array`?
* Does `len()` work like you expect?
* What happens if you index the array using square brackets (like you would for a list)?

*Challenge*:
* What does `my_array.size` do?  What about `my_array.shape`?
* What about this array: `np.array([[1, 2, 3],[4, 5, 10]])`?

## Easier math

Assume we have `my_array` as defined above.  Try the following statements and figure out what they do:
* `my_array + 5`
* `my_array ** 2`
* `my_array + my_array`

In [None]:
# Your code here...

*Challenge*:
* What happens if you put a string into a NumPy array?  Why do you think this is?
* What happens if you put a floating-point value into `my_array`?  Why?  What if you add a floating-point number to an array?  (Try printing `my_array.dtype`.)

## Creating arrays

NumPy has many ways to create special arrays.  A couple of them are:
* `linspace()` will create a linearly-spaced array of numbers.  It takes three arguments: `start`, `stop`, and `num`.  `start` and `stop` specify the ends of the range.  Unlike `range`, it is *inclusive* of the endpoint.  `num` specifies the number of points to use in the spacing (*not* the step).
* `zeros()` will create an array of the given size, containing all zeros.

In [None]:
x = np.linspace(-1, 1, 21)
print(x)
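
The two functions described above can be sketched side by side.  This is a minimal illustration, not the only way to call them; the variable names `points`, `flat`, and `grid` are made up for this example.

```python
import numpy as np

# linspace includes both endpoints; the third argument is a count, not a step.
points = np.linspace(0, 1, 5)
print(points)  # 5 evenly spaced values, starting at 0.0 and ending at 1.0

# By contrast, range stops *before* its end value.
print(list(range(0, 5)))

# zeros takes the number of elements, or a tuple giving the shape.
flat = np.zeros(4)
grid = np.zeros((2, 3))
print(flat)
print(grid.shape)
```

Passing a tuple such as `(2, 3)` to `zeros()` gives a two-dimensional array, which ties back to the `shape` challenge earlier.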

## Plotting with NumPy

Matplotlib was created with NumPy in mind, so you can pass NumPy arrays directly to `plot()`, just like we did with lists.

With that, we've got enough to replace your graphing calculator:

In [None]:
x = np.linspace(-5, 5, 20)
y = x ** 3 - 3 * x**2 - 15 * x + 4

import matplotlib.pyplot as plt
plt.plot(x, y, '-x')
plt.show()

## Booleans in NumPy

What happens if you use a comparison operator on a NumPy array?

In [None]:
b = np.array([-10, 5, 14, -2, 11, 12])

Now we get to one of the coolest features of NumPy: logical indexing.

If you use an *array of boolean (True/False) values* as the index for another array, then NumPy will select the elements where the index array is True:

In [None]:
select = np.array([True, True, False, False, False, True])
b[select]
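
As a further sketch of logical indexing, the same boolean array can be used to index several arrays of the same length.  The data below (`days`, `temps`) and the hand-written mask `keep` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical data: two parallel arrays of the same length.
days = np.array([1, 2, 3, 4, 5])
temps = np.array([18.5, 21.0, 19.5, 25.0, 23.5])

# A hand-written mask: True marks the positions to keep.
keep = np.array([False, True, False, True, True])

# The same mask selects matching positions from both arrays.
print(days[keep])
print(temps[keep])

# Indexing with a mask builds a new array; the original is unchanged.
print(temps)
```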

Write NumPy code to select only the values >= 0.

## What NumPy can't do

NumPy arrays have many benefits over Python lists, but they don't do everything.  Specifically:
* All elements in the array must have the same type.
* You cannot remove items from an array.  However, you can create a new array, and just select the items you want to keep.
* You cannot `append` to an array.  However, you can create a new array by mashing two arrays together ("concatenation").

In [None]:
a = np.array([1, 2, 3])

# Note the extra parentheses around the things being concatenated together!
# This is because the function expects the arrays as a single argument: a
# "bundle" (technically called a "tuple") of things to concatenate.
print(np.concatenate((a, [5])))

# Assigning an empty list does not remove an item; NumPy raises an error.
# Uncomment the next line to see it:
# a[1] = []
print(a)
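
The "create a new array and select the items you want to keep" idea from the list above might look like the sketch below.  It shows two possible approaches, a boolean mask and `np.delete()`; the variable names are made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3])

# Option 1: a boolean mask that is False for the item to drop.
keep = np.array([True, False, True])
without_middle = a[keep]

# Option 2: np.delete returns a *new* array with index 1 left out.
also_without_middle = np.delete(a, 1)

print(without_middle)
print(also_without_middle)
print(a)  # the original array still has all three items
```

In both cases `a` itself is untouched, so the result has to be assigned to a name if you want to use it later.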

## Bonus: How fast is NumPy (versus lists)?

NumPy is faster than lists for certain operations.  To find out how much faster, we can use the IPython "magic" command `%%timeit`, which runs a notebook cell many times and reports how long it takes on average.  Note that results are not saved (i.e., you'll have to run the cell without the magic if you want to use the results in a later cell).
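
`%%timeit` only works inside IPython and Jupyter.  Outside a notebook, Python's standard-library `timeit` module offers a rough equivalent.  The sketch below is an assumption about how you might use it, not part of this notebook's exercises; the timed statement is arbitrary.

```python
import timeit

# Run a small, arbitrary statement 1000 times; timeit returns the total
# elapsed time in seconds for all runs combined.
runs = 1000
total = timeit.timeit("sum(range(1000))", number=runs)

# Convert to an average per run, in microseconds.
print(f"{total / runs * 1e6:.2f} microseconds per run")
```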

First, let's see how long it takes to create a list/array containing integers from 0 to 1000000:

In [None]:
%%timeit
x_list = list(range(1000001))

# There are worse ways to do this... how long do they take?

In [None]:
%%timeit
x_arr = np.linspace(0, 1000000, 1000001)

Write code to multiply every item in a list/array by 2, and compare the speeds.

In [None]:
# Run this cell to set up the lists for testing
x_list = list(range(1000001))
x_arr = np.linspace(0, 1000000, 1000001)

In [None]:
%%timeit
# Your code to multiply every item in a list by 2...

In [None]:
%%timeit
# Your code to multiply the values in a NumPy array by 2...