# Day 6 Pre-Class: Introduction to NumPy
<img src="https://www.kdnuggets.com/wp-content/uploads/numpy-logo.jpg" width=300px>

### Goals for this pre-class assignment
By the end of this assignment, you should be able to:
* Import NumPy into Python, then create and manipulate NumPy arrays.
* Use NumPy to do simple calculations with arrays.

### Assignment instructions

___
### A brief aside on Jupyter notebooks
Before you get started with this notebook, take a few minutes to make sure you are feeling comfortable with **Jupyter notebooks** at this point in the course. 

* It is really important to keep in mind that Jupyter notebooks are based on cells. This is a markdown cell, for example. Because you can execute the cells in any order you wish, *it is up to you to keep track of what has been done, or not*. If it helps, you can make an initial habit of running all cells from the top down, although there will be cases where you won't want to do that. 

* It is worth taking a few minutes to get better at keyboard shortcuts, which make working in notebooks much more efficient. Learning some markdown is also helpful. Both are covered on this [webpage](https://murillogroupmsu.com/introduction-to-jupyter-notebooks/), and you can also do a web search to find other tutorials on these topics. 

* One of the best things about the notebooks is **rapid prototyping of code**. This means: try something quickly, delete it and move on. With the keyboard shortcuts, this becomes quite fast. Suppose you want to know what "4 + 5 * 3" will give. The steps are simple: create a new cell, type that expression in, shift-enter, examine the result, delete that cell, use the result. If you learn the keyboard shortcuts, this prototyping/debugging is fast:
    * press "esc" to get out of the cell you are in (go into command mode)
    * press "b" to create a cell below (or "a" for above - your choice)
    * press "return" to enter/activate the cell (go into edit mode)
    * type in your test (e.g., 4 + 5 * 3 or print(5//3))
    * press "shift-enter"
    * press "esc" to get out of the interior of the cell (go from edit to command mode)
    * press "x" to delete that temporary cell
    
This may seem like a lot of steps, but once you memorize the shortcuts, you can rapidly create, use and delete cells. When you have an error, this is the best way to quickly isolate parts of the code and test them.
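
As a concrete illustration, the contents of such a throwaway cell might look like the sketch below. The variable names are made up for this example; in practice you would often just type the bare expression and read the output.

```python
# Scratch cell: check how Python evaluates two quick expressions.
precedence_check = 4 + 5 * 3   # multiplication happens before addition
floor_division = 5 // 3        # // drops the fractional part of the quotient

print(precedence_check)  # 19
print(floor_division)    # 1
```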

Try it now!

___
## Getting familiar with NumPy

Okay, now let's get to the subject of this pre-class assignment: **NumPy**. 

As you learned in the previous course material, Python comes with a large number of extremely useful libraries. In fact, Python itself is a rather small language on its own; its true power is the myriad of libraries developed for it. A core library is [NumPy](https://en.wikipedia.org/wiki/NumPy), which is short for "Numerical Python". NumPy allows you to do mathematical operations both more easily (from a coding perspective) and faster (in the sense of how long you need to wait for the result). 

As with other libraries, you include NumPy through an import command. Execute the next cell.

In [None]:
import numpy as np

Note that NumPy gets imported and is also given the shorter name "np". You don't need to use "np" if you don't want to, but nearly everyone in the Python community does, and it is easier for people to read each other's code if we all use the same conventions. NumPy is vast, so this assignment covers only a small part of it.

Why do we use the dot notation, in which calls to a library start with `np.`? The reason is that Python's collection of libraries is huge and some libraries have overlapping functionality. For example, the `math` module you already learned about contains functions that *also* exist in NumPy. By using the dot notation, you can be sure you are using the library of _your_ choice, and _you_ are free to switch between `math` and `np` throughout your code. Sometimes you will use dots _twice_, as in `np.random.randn(500)`.
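
To make the dot notation concrete, here is a small sketch that calls a square-root function from each library. The variable names and input values are chosen only for this illustration.

```python
import math
import numpy as np

# math.sqrt works on a single number ...
single_root = math.sqrt(16.0)

# ... while np.sqrt works on a whole array, element by element.
values = np.array([1.0, 4.0, 9.0])   # example values chosen for illustration
array_roots = np.sqrt(values)

# Two dots: the randn function lives in the random module inside NumPy.
random_values = np.random.randn(5)

print(single_root)
print(array_roots)
print(random_values.shape)
```

Both functions are called `sqrt`, and the prefix before the dot is what tells Python which one you mean.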

For now, we will focus on the key element of NumPy: the **array**.

**Now**, watch the following video to learn about the basics of *NumPy arrays* and how they differ from *lists*, which have been your main tool for storing information up to this point.

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo("g7epZeDA_lQ",width=640,height=360)

**Question**: Explain, in your own words, some of the similarities and differences between standard Python lists and the NumPy arrays.

< Answer here >

---
## Manipulating NumPy arrays and performing mathematical operations

Now that you understand a bit more about the NumPy array object in Python, watch the following video to understand how we can manipulate NumPy arrays and use them to perform mathematical operations, which is precisely what NumPy arrays were built for!

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo("V2C9expTF1o",width=640,height=360)

**Question**: If you had two arrays of the same length and wanted to create a new array that is the product of the first two, how would you do that?

In [None]:
# Write your code example here


**Question**: What functions can I use to find the sum, minimum value, maximum value, and average value of a NumPy array? How do I call these functions? (Provide some example code in your response).

In [None]:
# Write your code example here


___
## Working with NumPy Arrays ##
___

As you've learned at this point, at the core of NumPy is a data type called an array. An array is like a Python list, but has some very different features. It is best not to confuse them, even if they are sometimes interchangeable. All other operations in NumPy, and in many other Python libraries used for computations, will assume you are using this array type.

The first thing you need to learn to do is create an array, and there are several ways of doing this depending on your goals. Let's learn two related methods now.
* `linspace`
* `arange`

Note that there is a lot of documentation on the web, such as [the NumPy reference page for `linspace`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html), and you should get used to using a search engine to find help. For example, right now, type into your search engine "python arange example code"; you should see lots of places to find ideas/help. 

Okay, let's give it a try. (If you get an error that says 'np' is not defined, you need to make sure you've executed a cell that contains the import code, `import numpy as np`)

In [None]:
# comparing arange to linspace
my_array_range = np.arange(0,10,1) # Make sure you understand how this line is different...
my_array_linspace = np.linspace(0,9,10) # ... than this line
print("Using arange I get:", my_array_range)
print("Using linspace I get:", my_array_linspace)
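
One way to start comparing the two results is to check whether they hold the same values and what element type each one uses. The sketch below repeats the two calls from the cell above so that it runs on its own; `same_values` is just a name chosen for this example.

```python
import numpy as np

my_array_range = np.arange(0, 10, 1)
my_array_linspace = np.linspace(0, 9, 10)

# Do the two arrays contain the same numbers?
same_values = np.array_equal(my_array_range, my_array_linspace)
print("Same values:", same_values)

# The element types differ: integer arguments to arange give integers,
# while linspace returns floats by default.
print("arange dtype:", my_array_range.dtype)
print("linspace dtype:", my_array_linspace.dtype)
```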

Now, your turn. Notice that in the cell above both `linspace` and `arange` take three arguments (the values in the parentheses (...)). Using Google and some trial and error, write some code cells below that illustrate what these three arguments do in each case, and give some examples of your own. Don't forget to use the "dot notation" of putting `np.` before the function name to tell Python to look for the function in NumPy, not somewhere else.

Give examples for both that vary each of the three parameters (comment them!). Put your code here:

In [None]:
# Put your code here

# What happens in arange and linspace when you only give two arguments? Try that here:


Now, in your own words, describe what these functions do. When do you think you would choose to use one over the other? Write your answer in this markdown cell.

< Answer here >

___
#### Array operations
___

Arrays look a lot like lists, but a key difference is that you **cannot** have mixed types inside of an array: every element has the same type. For example, try this code:

In [None]:
my_list = [1, 3.1415, 'CMSE'] # this list has three different types in it: integer, float and string
conv_to_array = np.array(my_list) # this converts a list to an array
print(conv_to_array)

What are the types of the elements in the new array? Are they the same as in the original list? Are they all the same as each other? Try modifying the list with different initial variable types to see if you can figure out the rule NumPy uses for setting the element type in the array when the conversion step happens.

In [None]:
# Code to examine different types of lists to see what np.array does to them
my_list = [1, 3.1415, 'a']    # an integer, a float and a string
conv_to_array = np.array(my_list)
print(conv_to_array)    # every element becomes a string

my_list = [1, 3.1415, 400]    # integers and a float, no strings
conv_to_array = np.array(my_list)
print(conv_to_array)    # every element becomes a float

my_list = [1, 3, 5]    # integers only
conv_to_array = np.array(my_list)
print(conv_to_array)    # every element stays an integer

my_list = [1, 3, 'a']    # integers and a string, no floats
conv_to_array = np.array(my_list)
print(conv_to_array)    # every element becomes a string

# NumPy picks the most general type present: string > float > integer
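
Rather than inferring the element type from the printed values, you can ask the array directly through its `dtype` attribute. The sketch below is only an illustration; the variable names are made up for this example, and the exact integer and string type names printed can vary between systems.

```python
import numpy as np

# Example lists chosen for illustration
only_integers = np.array([1, 3, 5])
with_a_float = np.array([1, 3.1415, 400])
with_a_string = np.array([1, 3.1415, 'a'])

# dtype reports the single element type shared by the whole array
print(only_integers.dtype)   # an integer type
print(with_a_float.dtype)    # a float type
print(with_a_string.dtype)   # a Unicode string type
```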

Let's try some mathematical operations on lists and arrays. 

* Make a list that contains the numbers $0$ through $9$ and do the same thing in NumPy; store them in variables with well-chosen names. 
* Divide the list and the array by $3$.
* `print` the result for both.
* Open a new markdown cell and describe the behavior you see. 

Do that in this cell:

In [None]:
# Put your code here.


NumPy arrays provide fast mathematical operations on a single data type (e.g., floats), whereas lists are a more general container for information that is not designed for element-by-element math. For this reason, in CMSE 201, we will use the array much more than the list, although courses on other topics might do the reverse. 
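
Multiplication shows the same contrast in a different way. The sketch below uses example values and variable names chosen only for illustration.

```python
import numpy as np

numbers_list = [0, 1, 2, 3]              # example values
numbers_array = np.array(numbers_list)

# For a list, * means "repeat the container".
print(numbers_list * 2)    # [0, 1, 2, 3, 0, 1, 2, 3]

# For an array, * means "multiply every element".
print(numbers_array * 2)   # [0 2 4 6]
```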

___
#### Let's see the results!
___

Another feature of NumPy is that it contains its own math functions, tuned to work on NumPy arrays. As always, to access those you need to use the "dot notation". For example, run this cell:

In [None]:
# This code cell uses three NumPy functions (linspace, sin and cos)
# and one NumPy constant (pi)
my_numbers = np.linspace(0, 2*np.pi, 100)
my_simple_function = np.sin(my_numbers) * np.cos(my_numbers)
print( my_simple_function )

Describe what you see:

< Answer here >

We recently learned another library: `matplotlib`. Let's combine them. The first thing you need to do is `import` that library; do that here, ensuring that the plots appear in the notebook itself: 

In [None]:
# import matplotlib and make it run in the notebook
import matplotlib.pyplot as plt
%matplotlib inline

Now, plot your `my_simple_function` array versus your original `my_numbers` array.

In [None]:
# plot of my function
plt.plot(my_numbers, my_simple_function)
plt.grid()
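
The curve completes two full oscillations between $0$ and $2\pi$ with an amplitude of $0.5$, which is consistent with the double-angle identity $\sin(x)\cos(x) = \tfrac{1}{2}\sin(2x)$. As an optional check, the sketch below rebuilds the arrays so that it runs on its own and compares the two sides numerically; `half_sine_of_double` is a name introduced only for this example.

```python
import numpy as np

my_numbers = np.linspace(0, 2*np.pi, 100)
my_simple_function = np.sin(my_numbers) * np.cos(my_numbers)

# Double-angle identity: sin(x) * cos(x) equals 0.5 * sin(2x)
half_sine_of_double = 0.5 * np.sin(2 * my_numbers)

# allclose compares element by element, allowing for tiny round-off differences
identity_holds = np.allclose(my_simple_function, half_sine_of_double)
print(identity_holds)
```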

___
#### Some statistics operations.
___

Finally, let's learn just a few of the easy operations in NumPy for doing statistics. Run the cell below to get started, and add comments to each line to describe what it does.

In [None]:
bell_curve = np.random.randn(500) # Array of 500 random values drawn from a normal distribution with mean 0 and standard deviation 1
plt.hist(bell_curve)  # Plot a histogram of the data

One very useful strategy for developing your code is to create fake data, which is essentially what we just did. This is useful when you want to test your code using something for which you know the answer; then, you can apply the working code to real-world data. 
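
Here is a minimal sketch of that strategy, under assumptions that are not part of the assignment: the target mean and standard deviation are arbitrary example values, and the data come from NumPy's `default_rng` generator with a fixed seed rather than from `np.random.randn`.

```python
import numpy as np

# Assumed values for this illustration (not from the assignment)
true_mean = 10.0
true_std = 2.0

rng = np.random.default_rng(seed=201)   # fixed seed so reruns give the same data
fake_data = rng.normal(loc=true_mean, scale=true_std, size=5000)

# With 5000 samples, the estimates should land close to the known answers
estimated_mean = np.mean(fake_data)
estimated_std = np.std(fake_data)
print("mean:", estimated_mean)
print("std: ", estimated_std)
```

If the printed values were far from 10 and 2, that would point to a bug in the code rather than a surprise in the data.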

Using the fake data you just created with `randn`, try these using the dot notation. Print the result of each of these operations on the bell curve "data": 
* `np.sum`
* `np.mean`
* `np.median`
* `np.std`

In [None]:
# Put your commented code here:


---
## Watch this video to learn more

Almost done! Now that you are familiar with some pieces of NumPy, **watch this video** to learn even more. While parts of this video will cover some of the content you've already learned, there are a few new pieces embedded in it as well. While you watch the video, **make note of the new things you learn** that go beyond what has already been covered in this assignment. Finally, don't forget where all of these NumPy videos live. As we progress further in the course, you may wish to come back to these videos and watch them again to remind yourself how NumPy arrays work.

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo("BTXyE3KLIOs",width=640,height=360)

**Question**: What new bits of information about NumPy arrays were included in this video? Did you learn about any new functions or features of NumPy arrays? Write down all of the new things you discovered in the cell below.

< Answer here >

---
## Assignment wrap-up

I hope you enjoy all these videos and exercises! Make sure you try everything (it doesn't matter if you fail along the way!) and **take notes** on what you're confused about.

Be sure to **send me an email or text** listing *all* the things you understand and, most importantly, the things you don't understand! I'll make sure to address them and give them more emphasis in our in-class session.

-----
# Congratulations, you're done with this pre-class assignment!

&#169; Copyright 2020,  Amani Ahnuar