# Module 1 - Introducing Libraries: NumPy

## Introduction

#### _Our goals today are to be able to_: <br/>

- Identify and import Python libraries
- Identify differences between NumPy and base Python in usage and operation
- Create a new library of our own

#### _Big questions for this lesson_: <br/>
- What is a package, what do packages do, and why might we want to use them?
- When do we want to use NumPy?

### Activation:

![excel](excelpic.jpg)

Most people have used Microsoft Excel or Google Sheets. But what are the limitations of Excel?

- [Take a minute to read this article](https://www.bbc.com/news/magazine-22223190)
- Make a list of the problems Excel presents

How is using Python different?

## 1. Importing Python Libraries


In an earlier lesson, we wrote a function to calculate the mean of a list. That was **tedious**.

Thankfully, other people have written and optimized functions and wrapped them into **libraries** that we can call and use in our analysis.

![numpy](https://raw.githubusercontent.com/donnemartin/data-science-ipython-notebooks/master/images/numpy.png)

[NumPy](https://www.numpy.org/) is the fundamental package for scientific computing with Python. 


To import a package, type `import` followed by the name of the library, as shown below.

In [None]:
import numpy 

x=numpy.array([1,2,3])
print(x)

# Many packages have a canonical way to import them
import numpy as np

y=np.array([4,5,6])
print(y)
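
Both import forms bind a name to the same module object, so the alias only changes what we type. A minimal sketch of this (the `from numpy import mean` form is an extra variation not used in the cell above, shown only for comparison):

```python
import numpy
import numpy as np
from numpy import mean  # imports a single function by name

# Both names refer to the very same module object
same_module = numpy is np
# The directly imported function is the same object as np.mean
same_function = mean is np.mean

print(same_module, same_function)
```

The `np` alias is a community convention, so following it makes our code easier for others to read.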

With NumPy, we can now compute the **mean** and other quick statistics of lists and arrays.

In [None]:
example = [4,3,25,40,62,20]
print(np.mean(example))
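
For comparison, here is roughly what the hand-written approach looks like next to the NumPy call. This is only a sketch: `mean_by_hand` is a hypothetical stand-in for the function from the earlier lesson, not the exact code we wrote.

```python
import numpy as np


def mean_by_hand(values):
    """Hypothetical stand-in for the mean function from the earlier lesson."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


example = [4, 3, 25, 40, 62, 20]

manual_result = mean_by_hand(example)
numpy_result = np.mean(example)

# Both approaches should agree (up to floating-point rounding)
print(manual_result, numpy_result)
```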

Now let's import some other packages. We will cover some of the more fun NumPy features in detail later.

In [None]:
import scipy
import pandas as pd
import matplotlib as mpl

In [None]:
# sometimes we will want to import a specific module from a library
import matplotlib.pyplot as plt

# What happens when we uncomment the next line?
# %matplotlib inline

plt.plot(x,y)

In [None]:
# OR we can also import it this way
from matplotlib import pyplot as plt 
plt.plot(x,y)

Try importing the seaborn library as ['sns'](https://en.wikipedia.org/wiki/Sam_Seaborn), which is the convention.

In [None]:
#type your code here!

What happens if we mess with naming conventions? For example, import one of our previous libraries as `print`.


**PLEASE NOTE THAT WE WILL HAVE TO RESET THE KERNEL AFTER RUNNING THIS.**<br> Comment out your code after running it.


In [None]:
#your code here!

In [None]:
#Did we get an error? What about when we run the following command?

print(x)

#Restart your kernel and clear cells
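
If you would like to see the same effect without touching the notebook's global names, the sketch below confines the bad import to a function. The function name `shadow_print_demo` is made up for this illustration, and `math` is just an arbitrary library to import. Because the import happens inside the function, the built-in `print` is only hidden within that function.

```python
def shadow_print_demo():
    # Inside this function, the name `print` refers to the math module,
    # which hides the built-in print function.
    import math as print

    try:
        print("hello")  # calling a module is not allowed
    except TypeError as err:
        return str(err)
    return "no error"


message = shadow_print_demo()

# Outside the function, the built-in print still works
print(message)
```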

#### Helpful links: library documentation

Libraries have associated documentation to explain how to use the different tools included in a library.

- [NumPy](https://docs.scipy.org/doc/numpy/)
- [SciPy](https://docs.scipy.org/doc/scipy/reference/)
- [Pandas](http://pandas.pydata.org/pandas-docs/stable/)
- [Matplotlib](https://matplotlib.org/contents.html)

## 2. NumPy versus base Python

Now that we know libraries exist, why do we want to use them? Let us compare base Python with NumPy.

Base Python has lists and can do basic math. NumPy, however, provides helpful objects called arrays.

NumPy has a few advantages over base Python, which we will look at below.

In [None]:
import numpy as np  # import again, since we restarted the kernel above

names_list = ['Bob', 'John', 'Sally']
names_array = np.char.array(['Bob', 'John', 'Sally'])  # use np.array for numbers and np.char.array for strings
print(names_list)
print(names_array)

In [None]:
# Make a list and an array of three numbers

#your code here
numbers_list =
numbers_array = 

In [None]:
# divide your array by 2

numbers_array/2

In [None]:
# divide your list by 2

numbers_list/2

NumPy arrays support the `/` operator (implemented by the `__truediv__()` method), while Python lists do not. NumPy has other features that make it more useful than base Python for evaluating data.
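
To make the difference concrete, here is a small sketch with made-up numbers (your own list and array will differ). Dividing the list raises a `TypeError`, so in base Python we need a loop or list comprehension to get the same result.

```python
import numpy as np

numbers_list = [2, 4, 6]            # illustrative values only
numbers_array = np.array(numbers_list)

# Arrays apply the division to every element at once
halved_array = numbers_array / 2

# Lists do not define division, so this raises a TypeError
try:
    numbers_list / 2
except TypeError as err:
    list_error = str(err)

# The base Python equivalent needs an explicit loop
halved_list = [n / 2 for n in numbers_list]

print(halved_array)
print(list_error)
print(halved_list)
```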

In [None]:
# shape tells us the size of the array

numbers_array.shape

In [None]:
# Selection and assignment work as you might expect
numbers_array[1]

In [None]:
numbers_array[1] = 10
numbers_array

Take 5 minutes and explore each of the following functions.  What does each one do?  What is the syntax of each?
- `np.zeros()`
- `np.ones()`
- `np.full()`
- `np.eye()`
- `np.random.random()`
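
Once you have explored on your own, you can compare your findings with the sketch below. The shapes and the fill value are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))             # 2x3 array filled with 0.0
ones = np.ones(4)                    # length-4 array filled with 1.0
sevens = np.full((2, 2), 7)          # 2x2 array filled with the value 7
identity = np.eye(3)                 # 3x3 identity matrix
randoms = np.random.random((2, 2))   # 2x2 array of random floats in [0.0, 1.0)

for array in (zeros, ones, sevens, identity, randoms):
    print(array.shape)
    print(array)
```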

### Slicing in NumPy

In [None]:
# We remember slicing from lists
numbers_list = list(range(10))
numbers_list[3:7]

In [None]:
# Slicing in NumPy Arrays is very similar!
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
a

In [None]:
# first 2 rows, columns 1 & 2 (remember 0-index!)
b = a[:2, 1:3]
b
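
One difference from lists is worth knowing: a basic NumPy slice is a view onto the original array, while a list slice is a new list. The sketch below reuses the same `a` and `b` to show what that means in practice.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
b = a[:2, 1:3]           # view of rows 0-1, columns 1-2

b[0, 0] = 77             # writes through to a[0, 1]
print(a[0, 1])

numbers_list = list(range(10))
list_slice = numbers_list[3:7]
list_slice[0] = 77       # the original list is not affected
print(numbers_list[3])

# Use .copy() when you want an array slice that is independent of the original
independent = a[:2, 1:3].copy()
independent[0, 0] = -1
print(a[0, 1])
```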

### Datatypes in NumPy

In [None]:
a.dtype

In [None]:
# Python lists have no dtype attribute, so this line raises an AttributeError
names_list.dtype

In [None]:
a.astype(np.float64)
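
Two details are easy to miss here, sketched below: `astype` returns a new array and leaves the original unchanged, and an array holds a single dtype, so mixed numeric input is converted to a common type. The `mixed` array is an extra example added for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

a_float = a.astype(np.float64)   # new array; a itself keeps its integer dtype
print(a.dtype, a_float.dtype)

# Mixing ints and floats gives a float array
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```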

### More Array Math

In [None]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

In [None]:
# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

In [None]:
# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

In [None]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

In [None]:
# Elementwise square root; both produce the same array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(x ** .5)
print(np.sqrt(x))
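
Note that `*` multiplies element by element. If you are expecting a matrix product instead, NumPy provides the `@` operator (equivalently `np.matmul`). A short sketch using the same `x` and `y`:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

elementwise = x * y   # [[ 5. 12.] [21. 32.]]
matrix = x @ y        # [[19. 22.] [43. 50.]]

print(elementwise)
print(matrix)
```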

Below, you will find a piece of code we will use to compare the speed of operations on a list and operations on an array. In this speed test, we will use the library [time](https://docs.python.org/3/library/time.html).

In [None]:
import time
import numpy as np

size_of_vec = 1000

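def pure_python_version():
    # perf_counter is a high-resolution clock meant for timing short durations
    t1 = time.perf_counter()
    X = range(size_of_vec)
    Y = range(size_of_vec)
    Z = [X[i] + Y[i] for i in range(len(X))]  # add element by element in a Python loop
    return time.perf_counter() - t1

def numpy_version():
    t1 = time.perf_counter()
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    Z = X + Y  # add the whole arrays in one vectorized operation
    return time.perf_counter() - t1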


t1 = pure_python_version()
t2 = numpy_version()
print("python: " + str(t1), "numpy: " + str(t2))
print("In this example, NumPy is " + str(t1/t2) + " times faster!")

In pairs, run the speed test with a different number, and share your results with the class.
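
A single timing can be noisy, so you may see the ratio jump around between runs. One alternative, sketched below, is the standard library's `timeit` module, which repeats each operation many times. The helper names `add_lists` and `add_arrays` and the repeat count are arbitrary choices for this sketch, and it times only the addition itself, not the creation of the data.

```python
import timeit

import numpy as np

size_of_vec = 1000

# Build the inputs once so that only the addition is timed
X_list = list(range(size_of_vec))
Y_list = list(range(size_of_vec))
X_arr = np.arange(size_of_vec)
Y_arr = np.arange(size_of_vec)


def add_lists():
    return [x + y for x, y in zip(X_list, Y_list)]


def add_arrays():
    return X_arr + Y_arr


repeats = 200
list_time = timeit.timeit(add_lists, number=repeats)
array_time = timeit.timeit(add_arrays, number=repeats)

print("python:", list_time, "numpy:", array_time)
print("ratio:", list_time / array_time)
```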

## 3. Making our own library
![belle](https://media1.giphy.com/media/14ouz31oYQe1BS/giphy.gif?cid=790b76115d2fa99c4c4c535263dea9d0&rid=giphy.gif)

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import temperizer

## Example: Convert F to C

1. This function is already implemented in `temperizer.py`.
2. Notice that we can call the imported function and see the result.

In [None]:
# 32F should equal 0C
temperizer.convert_f_to_c(32)

In [None]:
# -40F should equal -40C
temperizer.convert_f_to_c(-40)

In [None]:
# 212F should equal 100C
temperizer.convert_f_to_c(212)
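
If you have not opened `temperizer.py` yet, a library module is just a file of function definitions. The sketch below shows what such a function could look like; it is an assumption for illustration and may differ from the actual contents of `temperizer.py`.

```python
def convert_f_to_c(temperature_f):
    """Convert Fahrenheit to Celsius (illustrative sketch, not the library's actual code)."""
    return (temperature_f - 32) * 5 / 9


# The same three checks as in the cells above
freezing = convert_f_to_c(32)
crossover = convert_f_to_c(-40)
boiling = convert_f_to_c(212)

print(freezing, crossover, boiling)
```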

## Your turn: Convert C to F

1. Find the stub function in `temperizer.py`
2. The word `pass` means "this space intentionally left blank."
3. Add your code _in place of_ the `pass` keyword, _below_ the docstring.
4. Run these cells and make sure that your code works.

In [None]:
# 0C should equal 32F
temperizer.convert_c_to_f(0)

In [None]:
# -40C should equal -40F
temperizer.convert_c_to_f(-40)

In [None]:
# 100C should equal 212F
temperizer.convert_c_to_f(100)

## Next: Adding New Functions

You need to add support for Kelvin to the `temperizer` library.

1. Create new _stub functions_ in `temperizer.py`:

    * `convert_c_to_k`
    * `convert_f_to_k`
    * `convert_k_to_c`
    * `convert_k_to_f`

    Start each function with a docstring and the `pass` keyword, e.g.:

    ```
    def convert_f_to_k(temperature_f):
        """Convert Fahrenheit to Kelvin."""
        pass
    ```

2. Add cells to this notebook to test and validate these functions, similar to the ones above.

3. Then, go back to `temperizer.py` to replace `pass` with your code.

4. Run the notebook cells to make sure that your new functions work.
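
For step 2, one possible way to write validation cells is to compare each result against a known value with a small tolerance, since floating-point arithmetic is not always exact. In the sketch below, `check` is a hypothetical helper and `convert_c_to_k` is a local stand-in so the example runs on its own; in the notebook you would call `temperizer.convert_c_to_k` instead.

```python
import math


def convert_c_to_k(temperature_c):
    """Convert Celsius to Kelvin (local stand-in for the library function)."""
    return temperature_c + 273.15


def check(func, value, expected):
    """Return True when func(value) matches expected within a small tolerance."""
    return math.isclose(func(value), expected, abs_tol=1e-9)


results = [
    check(convert_c_to_k, 0, 273.15),      # freezing point of water
    check(convert_c_to_k, -273.15, 0),     # absolute zero
    check(convert_c_to_k, 100, 373.15),    # boiling point of water
]

print(results)
```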