# Arrays & `numpy`

### Overview:
This module covers the basics of `array` data types using the `numpy` module, including common `numpy` functions like:
1. `sum`
2. `average`
3. `median`
4. `min`
5. `max`
6. `sort`
7. `linspace`
8. `arange`

### Background requirements:
1. Printing
2. Variables
3. Lists and indexing

### Lesson:

`numpy` is a python `module`, which is a fancy word for a set of code (developed by others) which you can use yourself. 

**Modules are collections of tools written in Python** that are maintained and developed by other people. Some modules come by default with any installation of Python. Others do not, but if a module exists on the internet, you can likely download it and use it.

Two common modules that are included with many scientific Python distributions (such as Anaconda) are `numpy` and `matplotlib`. We will focus on `numpy` here, but you will get an introduction to plotting with `matplotlib` in a later lesson.

### So `numpy` is a module, or a collection of code. How do we use it?

To use a module, you use the keyword `import`, which will load that module. Importing gives you access to all the code in that module. Python only loads the modules you ask for, which keeps load time short since it doesn't need to load every module possible.

It is traditional when importing `numpy` to rename it to `np` using the `as` keyword. For example, the code would be
```python
import numpy as np
```
Notice that the green words are Python "keywords": words that have a special meaning (so you can't use them as variable names).

People rename `numpy` as `np` literally because everyone does it. In fact, you can rename any module to anything you like using the `as` keyword. That does not mean you should, however.

Another way to use a module is to just import it and not use `as`, which would not rename the module. While you can do this for numpy, `np` is how anyone online will reference it.
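
As a quick sketch of the difference, the two import styles below load the same module; only the name you type afterward changes. The variable names `a` and `b` are made up for this illustration.

```python
import numpy            # no renaming: use the full name `numpy`
import numpy as np      # renamed: use the short name `np`

a = numpy.array([1, 2, 3])   # access the tool through the full name
b = np.array([1, 2, 3])      # same tool, accessed through the short name

print(np is numpy)           # True: both names point to the same module
print(a, b)
```

Either style works, but the rest of this lesson sticks with `np`.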

*Let's try it out.*

Typically, people put all their imports in a cell at the top of their notebooks. So if you want to import any modules in the future, you'll want to put them here.

In [4]:
import numpy as np

## Why do we care about `numpy`?

`numpy` can do a lot of cool things much, much, *much* faster than if you did them by hand in Python. It is faster to write, because you can do things in one line that typically require loops, and faster to run, because `numpy` does the looping for you in optimized, pre-compiled code. Most scientists use `numpy`.

Let's explore some of these benefits.

**Exercise:** Make a new list where every element is twice that of the original list. To do so, use a loop to iterate through every element in `mylist` and multiply that element by 2. Save that value to a new list and print it at the end.

In [None]:
mylist = [1, 2, 3, 4, 5]
# double the above list element by element and print it here

In [None]:
# workspace as necessary

To use `numpy` in your code, use `np` (you renamed it) and a period `.` afterward, which specifies you want to access code in `numpy`. Then type the function or tool you want to use from the `numpy` module. 

For example, that would look like
```python
np.array()
```
which would access `numpy`'s `array` tool.

The `array` function is how you convert a list to an array, which is a different *data type*. See below.

In [None]:
myarr = np.array(mylist)
print(type(mylist), type(myarr))

<class 'list'> <class 'numpy.ndarray'>


Note: `np.array` converts a list to an `ndarray`, which is short for N-dimensional array (hence the "nd" in front of "array"). We will only focus on one-dimensional arrays.

Okay. Now that we have our list as a `numpy` array, let's repeat the previous exercise (doubling a list) using the beautiful tools in `numpy`. Here, we can do all that in just one line! See below.

In [None]:
print(myarr * 2)

[ 2  4  6  8 10]


Wow, that was easy! `numpy` assumes that any operation you apply to an array should be done to every element in that array. Thus, you can apply any mathematical operation (`+-*/`) to an array and `numpy` will automatically do that operation for every element in the array (instead of you looping through it yourself).
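
To see why converting to an array matters, here is a small sketch comparing the same operation on a list and on an array. The names `small_list` and `small_arr` are made up for this illustration.

```python
import numpy as np

small_list = [1, 2, 3]
small_arr = np.array(small_list)

print(small_list * 2)   # a list is repeated: [1, 2, 3, 1, 2, 3]
print(small_arr * 2)    # an array is doubled element by element: [2 4 6]
print(small_arr + 10)   # 10 is added to every element: [11 12 13]
print(small_arr / 2)    # division gives decimal numbers (floats)
```

Multiplying a list by 2 glues two copies of the list together, which is usually not what you want when doing math.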

Try some out below!

In [None]:
x = [1, 2, 3]
# convert x to a numpy array
# either add, subtract, divide, or multiply every element in x by 2
# test out any other mathematical operations with your array here

In [None]:
# extra workspace here

Here are some other useful `numpy` tools that you can use.

Say you have a `numpy` array called `x`.

1. `np.sum(x)` -- find the sum (total) of all elements of the array `x`
2. `np.average(x)` -- find the mean, or average, of all elements of the array `x`
3. `np.median(x)` -- find the median, or middle number, of all elements of the array `x`
4. `np.min(x)` -- find the smallest, or minimum, of all elements of the array `x`
5. `np.max(x)` -- find the largest, or maximum, of all elements of the array `x`
6. `np.sort(x)` -- sort all elements of the array `x` from lowest to highest
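
As a rough illustration of the tools listed above, here they are applied to a small made-up array called `scores` (not part of the exercises):

```python
import numpy as np

scores = np.array([3, 1, 4, 1, 5])

print(np.sum(scores))      # 14
print(np.average(scores))  # 2.8
print(np.median(scores))   # 3.0
print(np.min(scores))      # 1
print(np.max(scores))      # 5
print(np.sort(scores))     # [1 1 3 4 5]

# np.sort returns a new, sorted array; the original is left unchanged
print(scores)              # [3 1 4 1 5]
```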

**Exercise:** Use three of the above six `numpy` tools on the array `x` below. Print all your results.

Extension: What happens when you use the above functions on a list instead of an array?

In [None]:
x = np.array([2, 9, 4, 3, 6, 2, 7, 9])
# your code here

In [None]:
# extra workspace here

### Other ways to get a numpy array include the `arange` and `linspace` functions.

Besides converting a list using `np.array`, you can also get a `numpy` array from `np.arange` and `np.linspace`. Let's take a closer look here.

The `np.arange()` tool works just like `range` (from the loops lesson). However, it saves all values into an array.

Let's check this out.

In [9]:
start = 1 # feel free to change the numbers here
stop = 6
# default step is 1
x = np.arange(start, stop) 
print(x)

start = 10
stop = 100
step = 12
x = np.arange(start, stop, step) 
print(x)

[1 2 3 4 5]
[10 22 34 46 58 70 82 94]
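
To make the comparison with `range` concrete, here is a minimal sketch that puts the two side by side. The names `from_range` and `from_arange` are made up for this illustration.

```python
import numpy as np

from_range = list(range(1, 6))    # range gives the values 1, 2, 3, 4, 5
from_arange = np.arange(1, 6)     # arange gives the same values, as an array

print(from_range, type(from_range))
print(from_arange, type(from_arange))

# Like range, arange stops *before* the stop value
print(np.arange(0, 10, 5))        # [0 5]
```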


The second way to generate a numpy array is using

```python
min_value = 8
max_value = 12
number_of_elements = 30
np.linspace(min_value, max_value, number_of_elements)
```
This tool creates an array that goes from the min value to the max value (both included), with elements evenly spaced so that the total length of the array equals the number of elements you asked for. This lets you easily create elements that aren't integers, unlike the whole-number steps we used with `np.arange()` above.

See below for an example.

In [None]:
y = np.linspace(1, 6, 11)
print(y)

[1.  1.5 2.  2.5 3.  3.5 4.  4.5 5.  5.5 6. ]
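
The spacing in the output above can be worked out by hand: `n` points have `n - 1` gaps between them, so the step is `(max - min) / (n - 1)`. The sketch below checks this using `np.diff`, which returns the difference between neighboring elements. The names `pts`, `expected_step`, and `gaps` are made up for this illustration.

```python
import numpy as np

min_value, max_value, number_of_elements = 1, 6, 11
pts = np.linspace(min_value, max_value, number_of_elements)

# 11 points have 10 gaps between them
expected_step = (max_value - min_value) / (number_of_elements - 1)
print(expected_step)                      # 0.5

gaps = np.diff(pts)                       # difference between neighbors
print(np.allclose(gaps, expected_step))   # True: every gap is the same size
print(len(pts))                           # 11
```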


**Exercise:** Change `z` such that the elements in `z` are the same as the elements in `x`.

In [None]:
x = np.arange(1, 6)

### change z here
z = np.linspace(1, 1, 5)
###

print(f'x = {x}\nz = {z}')
print(z == x)

x = [1 2 3 4 5]
z = [1. 1. 1. 1. 1.]
[ True False False False False]


### Operations with more than one array

You can also perform mathematical operations between arrays.

In [10]:
length = 5
x = np.arange(1, length + 1)
y = np.linspace(1, 2, length)
print(f'x = {x}\ny = {y}\n')
print(f'x + y = {x + y}')
print(f'x - y = {x - y}')
print(f'x / y = {x / y}')
print(f'x * y = {x * y}')

x = [1 2 3 4 5]
y = [1.   1.25 1.5  1.75 2.  ]

x + y = [2.   3.25 4.5  5.75 7.  ]
x - y = [0.   0.75 1.5  2.25 3.  ]
x / y = [1.         1.6        2.         2.28571429 2.5       ]
x * y = [ 1.   2.5  4.5  7.  10. ]


When doing mathematical operations between two arrays, it's important that their shapes match. You can check an array's shape using `myarr.shape`, where `myarr` is the variable name of an array.

In [None]:
print(x.shape, y.shape)

(5,) (5,)
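
For contrast, here is a small sketch of what happens when the shapes do not match. The names `five`, `four`, and `message` are made up for this illustration, and `try`/`except` is only used so the error is printed instead of stopping the cell.

```python
import numpy as np

five = np.arange(1, 6)   # shape (5,)
four = np.arange(1, 5)   # shape (4,)
print(five.shape, four.shape)

message = None
try:
    print(five + four)
except ValueError as err:
    # numpy refuses to add a 5-element array to a 4-element array
    message = str(err)
    print("Could not add the arrays:", message)
```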


Test some out on your own below, either using `x` and `y` as specified above, or make your own arrays.

For example, try to find the total sum of two arrays. Remember the `numpy` tools we tried out earlier as well!

In [None]:
# workspace

*An aside:* Note that `shape` is an `attribute` of every array. That means that every `numpy` array has the property `shape`.

Getting this property (which is called an `attribute` in computer science lingo) is syntactically different from calling a `numpy` tool using `np.`. An attribute has no parentheses after its name. Using `np.some_tool_name()` instead "calls a *function*", where `some_tool_name` is the name of the function you want.

More on functions in the next lesson. 

In [14]:
# For example
x = np.arange(1, 6)
print(x.ndim) # Accesses an attribute (no parentheses)
# vs
print(x.mean()) # Calls a function attached to the array (parentheses)

1
3.0
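
As a further sketch, several `numpy` tools can be called either as `np.tool(array)` or as `array.tool()`, but not all of them. The name `vals` is made up for this illustration.

```python
import numpy as np

vals = np.arange(1, 6)

print(np.sum(vals), vals.sum())   # 15 15
print(np.max(vals), vals.max())   # 5 5
print(vals.size)                  # attribute, no parentheses: 5

# Not every numpy function has a version attached to the array
print(hasattr(vals, "median"))    # False
print(np.median(vals))            # 3.0
```

When in doubt, the `np.` form used throughout this lesson is the safer choice.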
