# [AHA! Activity Health Analytics](http://casas.wsu.edu/)
[Center for Advanced Studies of Adaptive Systems (CASAS)](http://casas.wsu.edu/)

[Washington State University](https://wsu.edu)
# L3 Numpy and Scipy: Part 1

## Learner Objectives
At the conclusion of this lesson, participants should have an understanding of:
* Numpy arrays and notation
* Utilizing Scipy for scientific computing

## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* [Scipy website](https://www.scipy.org/)
* [Numpy website](http://www.numpy.org/)
* Python for Data Analysis by Wes McKinney

## Scipy Ecosystem Overview
From the [Scipy website](https://www.scipy.org/):
>SciPy (pronounced "Sigh Pie") is a Python-based ecosystem of open-source software for mathematics, science, and engineering. In particular, these are some of the core packages:
* [Numpy](http://www.numpy.org/): Base N-dimensional array package
* Scipy library: Fundamental library for scientific computing
* Matplotlib: Comprehensive 2D plotting
* IPython: Enhanced interactive console
* Sympy: Symbolic mathematics
* Pandas: Data structures and analysis

In this class, we will use all of the above, except for Sympy.

## Numpy
From the [Numpy website](http://www.numpy.org/):
>NumPy is the fundamental package for scientific computing with Python. It contains among other things:
* a powerful N-dimensional array object (`ndarray`)
* sophisticated (broadcasting) functions
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities

>Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

Typically, `numpy` is imported as `np`:

In [1]:
import numpy as np

## `ndarray` Object
Numpy's N-dimensional array object, `ndarray`, is one of the main reasons to use Numpy for data analytics. `ndarray` is a fast, flexible container for large data sets in Python. We will often use `ndarray` objects in lieu of Python list objects because `ndarray` supports mathematical operations on whole blocks of data using similar syntax to the equivalent operations on scalars. 

Let's take a look at an example. Let's say we have a list of the numbers 0 through 10:

In [2]:
x = list(range(11))
print(x)

x = np.array(x)
print(x)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[ 0  1  2  3  4  5  6  7  8  9 10]


Note: We can omit the conversion to a list, because `np.array()` accepts a `range` object directly.
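
As a minimal sketch of why whole-array operations are convenient, the cell below builds the array straight from a `range` object and compares element-wise arithmetic with the equivalent list comprehension. The variable names (`doubled`, `shifted`, `doubled_list`) are illustrative only and are not part of the original lesson.

```python
import numpy as np

# A range object can be passed to np.array() directly, without list()
x = np.array(range(11))

# Arithmetic applies to every element at once (no explicit loop)
doubled = x * 2
shifted = x + 0.5

# The list equivalent needs a comprehension
doubled_list = [value * 2 for value in range(11)]

print(doubled)
print(shifted)
# The two approaches produce the same values
print(doubled_list == doubled.tolist())
```

Note that adding a float to an integer array produces a new float array; `x` itself is left unchanged.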

We can also make 2-D and N-D arrays. Numpy "pretty prints" an `ndarray` in a matrix layout, rather than on a single line as lists are printed by default:

In [3]:
x = [[1, 2, 3], [4, 5, 6]]
print(x)

x = np.array(x)
print(x)
print("Number of dimensions: %d" %(x.ndim))
print("Shape (rows, cols): %s" %(str(x.shape)))
print("Datatype of items: %s" %(x.dtype))

# converting int items to float items
# astype() creates a new array
# (the np.float alias was removed in newer Numpy releases; use np.float64)
x_floats = x.astype(np.float64)
print(x_floats)
print("Datatype of items: %s" %(x_floats.dtype))

[[1, 2, 3], [4, 5, 6]]
[[1 2 3]
 [4 5 6]]
Number of dimensions: 2
Shape (rows, cols): (2, 3)
Datatype of items: int32
[[ 1.  2.  3.]
 [ 4.  5.  6.]]
Datatype of items: float64
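
The same attributes work for arrays with more than two dimensions. The following sketch uses a hypothetical 3-D array named `cube` to show `ndim` and `shape`, and to confirm that `astype()` returns a new array rather than changing the original.

```python
import numpy as np

# A 3-D array: 2 blocks, each with 2 rows and 3 columns
cube = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])
print(cube.ndim)   # 3
print(cube.shape)  # (2, 2, 3)

# astype() returns a new array; the original keeps its dtype and values
cube_floats = cube.astype(np.float64)
cube_floats[0, 0, 0] = -1.0
print(cube[0, 0, 0])      # still 1
print(cube_floats.dtype)  # float64
```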


### `arange()`, `ones()`, `zeros()`, and `full()`
Instead of using `range()` and then converting to an `ndarray`, we can create an `ndarray` object directly in a few ways:

In [6]:
x = np.arange(0, 11)
print(x)

x1 = np.arange(10)
print(x1)

x2 = np.ones(10)
print(x2)

x3 = np.zeros(10)
print(x3)

x4 = np.full(10, 5.0)
print(x4)

[ 0  1  2  3  4  5  6  7  8  9 10]
[0 1 2 3 4 5 6 7 8 9]
[ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 5.  5.  5.  5.  5.  5.  5.  5.  5.  5.]
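
These creation functions take a few optional arguments worth knowing. The sketch below (variable names are illustrative) shows a step for `arange()`, a tuple shape for a 2-D result, and the `dtype` argument for choosing the item type.

```python
import numpy as np

# arange() accepts a step, like range()
evens = np.arange(0, 11, 2)
print(evens)

# Passing a tuple as the shape creates a multi-dimensional array
grid = np.zeros((2, 3))
print(grid.shape)  # (2, 3)

# dtype overrides the default float64 item type
int_ones = np.ones(4, dtype=np.int64)
print(int_ones.dtype)  # int64

# full() also accepts a tuple shape
fives = np.full((2, 2), 5.0)
print(fives)
```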


### Indexing
Indexing an `ndarray` works the same way as indexing a list:

In [13]:
x = np.arange(10)
print(x)
print(x[3])

[0 1 2 3 4 5 6 7 8 9]
3


We can also index into N-dimensional `ndarray` objects by separating the index for each dimension with a comma:

In [14]:
ones = np.ones((2, 3))
print(ones[0][0])
# using a comma
print(ones[0, 0])

1.0
1.0
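
Because every item in `ones` is identical, it is hard to see which item an index selects. Here is a small sketch with distinct values; it assumes `reshape()`, which this lesson has not introduced, to lay out the numbers 0 through 5 in 2 rows and 3 columns.

```python
import numpy as np

# reshape() is used here only to build a 2x3 array with distinct values:
# [[0 1 2]
#  [3 4 5]]
grid = np.arange(6).reshape(2, 3)

# row 1, column 2
print(grid[1, 2])    # 5
# negative indices count from the end, as with lists
print(grid[-1, -1])  # 5
# a single index selects a whole row
print(grid[0])
```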


### Assignment
Just like with lists, we can update values in an `ndarray` using the assignment operator. For this example, we will work with a 3x4 array of random data:

In [15]:
from numpy.random import randn
rand_data = randn(3, 4)
print(rand_data)
rand_data[2][0] = 100
print(rand_data)

# Boolean array for negative values
negatives = rand_data < 0
print(negatives)
# set the negative values to 0
rand_data[negatives] = 0
print(rand_data)

[[ 0.8067337   2.32813807 -0.29061593  1.03325199]
 [ 0.24049094  0.42533723 -0.30862441 -0.12053085]
 [ 0.47168335 -0.89400941 -0.21375496  0.30472726]]
[[   0.8067337     2.32813807   -0.29061593    1.03325199]
 [   0.24049094    0.42533723   -0.30862441   -0.12053085]
 [ 100.           -0.89400941   -0.21375496    0.30472726]]
[[False False  True False]
 [False False  True  True]
 [False  True  True False]]
[[   0.8067337     2.32813807    0.            1.03325199]
 [   0.24049094    0.42533723    0.            0.        ]
 [ 100.            0.            0.            0.30472726]]
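
Since `randn()` output changes on every run, the sketch below repeats the Boolean-mask idea on a small fixed array (the values are made up for illustration). It also shows selecting values with a mask and combining two conditions.

```python
import numpy as np

data = np.array([[1.5, -2.0, 0.3],
                 [-0.7, 4.0, -1.1]])

# Boolean array marking the negative values
negatives = data < 0
# True counts as 1, so sum() gives the number of negative values
print(negatives.sum())  # 3

# Indexing with the mask returns the selected values as a 1-D array
selected = data[negatives]
print(selected)

# Conditions are combined with & (and) and | (or);
# each condition needs its own parentheses
data[(data < 0) | (data > 3)] = 0
print(data)
```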


### Slicing
`ndarray` slicing works similarly to list slicing; however, there are a few subtle differences:
* Slices are "views" of the `ndarray`, not copies
* Assigning a scalar (or an `ndarray` with fewer dimensions than the slice) to a slice broadcasts the value across the slice

In [16]:
x_list = list(range(10))
print("x_list: %s" %(x_list))
chunk = x_list[3:7]
print("chunk: %s" %(chunk))
# doesn't modify x_list because chunk is a copy
chunk[0] = 50
print("chunk: %s" %(chunk))
print("x_list: %s" %(x_list))


x = np.arange(10)
print(x)
print("x: %s" %(x))
chunk = x[3:7]
print("chunk: %s" %(chunk))
# does modify x because chunk is a view
chunk[0] = 50
print("chunk: %s" %(chunk))
print("x: %s" %(x))

# broadcasts
x[2:5] = 100
print(x)

x_list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
chunk: [3, 4, 5, 6]
chunk: [50, 4, 5, 6]
x_list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0 1 2 3 4 5 6 7 8 9]
x: [0 1 2 3 4 5 6 7 8 9]
chunk: [3 4 5 6]
chunk: [50  4  5  6]
x: [ 0  1  2 50  4  5  6  7  8  9]
[  0   1 100 100 100   5   6   7   8   9]
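
Two follow-up checks may help here. This sketch uses `np.shares_memory()` to confirm that a slice is a view of the original array, and then broadcasts a 1-D array (rather than a scalar) across the rows of a 2-D slice. The array names are illustrative.

```python
import numpy as np

x = np.arange(10)
chunk = x[3:7]
# shares_memory() reports whether two arrays overlap in memory
print(np.shares_memory(x, chunk))  # True

grid = np.zeros((3, 4))
# A 1-D array of length 4 is broadcast across every row of the slice
grid[1:, :] = np.array([1, 2, 3, 4])
print(grid)
```

The first row of `grid` stays all zeros because the slice `1:` starts at the second row.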


Note: if you want a copy of an `ndarray` slice instead of a view, call the `ndarray` method `copy()` on the slice:

In [17]:
x = np.arange(10)
print(x)
print("x: %s" %(x))
chunk = x[3:7].copy()
print("chunk: %s" %(chunk))
# does not modify x because chunk is now a copy
chunk[0] = 50
print("chunk: %s" %(chunk))
print("x: %s" %(x))

[0 1 2 3 4 5 6 7 8 9]
x: [0 1 2 3 4 5 6 7 8 9]
chunk: [3 4 5 6]
chunk: [50  4  5  6]
x: [0 1 2 3 4 5 6 7 8 9]
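
One way to tell a view from a copy without modifying anything is the `base` attribute. In this sketch (names are illustrative), a view's `base` refers to the array it was sliced from, while an array that owns its own data has a `base` of `None`.

```python
import numpy as np

x = np.arange(10)
view = x[3:7]
copied = x[3:7].copy()

# A view keeps a reference to the array it was sliced from
print(view.base is x)       # True
# A copy owns its own data, so it has no base
print(copied.base is None)  # True
```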
