# [AHA! Activity Health Analytics](http://casas.wsu.edu/)
[Center for Advanced Studies of Adaptive Systems (CASAS)](http://casas.wsu.edu/)

[Washington State University](https://wsu.edu)
# L4 Numpy and Scipy: Part 2

## Learner Objectives
At the conclusion of this lesson, participants should have an understanding of:
* Numpy arrays and notation
* Utilizing Scipy for scientific computing

## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* [Scipy website](https://www.scipy.org/)
* [Numpy website](http://www.numpy.org/)
* Python for Data Analysis by Wes McKinney

## `ndarray` Object Continued

In [3]:
import numpy as np

### Vectorization
Now, suppose we want to add two equal-length sequences together. Using lists we have to write a loop, such as the following:

In [4]:
x = range(1, 11)
y = [10] * 10
z = []
for i in range(len(x)):
    z.append(x[i] + y[i])
print(z)

[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]


Using an `ndarray`, we can *vectorize* the addition, applying it to every item in the sequences without writing a loop:

In [5]:
x = np.arange(1, 11)
y = [10] * 10
z = x + y
print(z)

x = np.arange(10)
print(x)
x += 1
print(x)

[11 12 13 14 15 16 17 18 19 20]
[0 1 2 3 4 5 6 7 8 9]
[ 1  2  3  4  5  6  7  8  9 10]


Vectorization enables you to express batch operations on data without writing any loops.
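
To make the comparison concrete, here is a minimal sketch that computes the same element-wise sum both ways and checks that the results agree. The names `values` and `offsets` and their contents are made up for illustration and are not part of the lesson's data.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
values = list(range(1, 6))
offsets = [10, 20, 30, 40, 50]

# Loop version: one Python-level addition per pair of elements.
loop_sum = [v + o for v, o in zip(values, offsets)]

# Vectorized version: a single expression over whole arrays.
vec_sum = np.array(values) + np.array(offsets)

print(loop_sum)  # [11, 22, 33, 44, 55]
print(vec_sum)   # [11 22 33 44 55]

# Other arithmetic operators vectorize in the same way.
scaled = np.array(values) * 2 - 1
print(scaled)    # [1 3 5 7 9]
```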

An operation between arrays of different shapes is called *broadcasting*. For example, we can broadcast a scalar (here, an array of length one) to each item in an array:

In [10]:
x = np.arange(11)
x *= np.array([2])
print(x)

[ 0  2  4  6  8 10 12 14 16 18 20]


Note: See Chapter of Python for Data Analysis or the [Numpy docs](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) if you want to learn more about broadcasting.
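
Broadcasting is not limited to scalars. The sketch below, which assumes small made-up arrays named `grid`, `row`, and `col`, shows a 1D row and a 2D column being stretched across a 2D array, and what happens when the shapes are not compatible.

```python
import numpy as np

grid = np.array([[0, 1, 2],
                 [3, 4, 5]])        # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# The row is added to every row of grid.
row_sum = grid + row
print(row_sum)
# [[10 21 32]
#  [13 24 35]]

# The column is added to every column of grid.
col_sum = grid + col
print(col_sum)
# [[100 101 102]
#  [203 204 205]]

# Shapes (2, 3) and (2,) do not line up on the trailing axis, so NumPy refuses.
broadcast_failed = False
try:
    grid + np.array([1, 2])
except ValueError as err:
    broadcast_failed = True
    print("ValueError:", err)
```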

Relational operators (==, !=, <, <=, >, >=) can be vectorized:

In [11]:
m_names = np.array(["Mary", "Michael", "Margaret", "Mary", "Marcus", "Molly"])
m_ages =  np.array([28    , 72       , 12        , 34    , 40      , 68])
# marys is a Boolean array
marys = m_names == "Mary"
print(m_names)
print(marys)

print(m_ages[marys])

['Mary' 'Michael' 'Margaret' 'Mary' 'Marcus' 'Molly']
[ True False False  True False False]
[28 34]


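Boolean operators (`and`, `or`, `not`) can be vectorized as well. For vectorized `and`, use `&`. For vectorized `or`, use `|`. For vectorized `not`, use `~`. Wrap each comparison in parentheses, because `&`, `|`, and `~` bind more tightly than the relational operators.

Note: The reserved keywords `and`, `or`, and `not` do not work element-wise on Boolean arrays. Applying them to an array with more than one element raises a `ValueError`.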

In [12]:
m_names = np.array(["Mary", "Michael", "Margaret", "Mary", "Marcus", "Molly"])
m_ages =  np.array([28    , 72       , 12        , 34    , 40      , 68])
mary_marcus = (m_names == "Mary") | (m_names == "Marcus")
print(m_names)
print(mary_marcus)

print(m_ages[mary_marcus])

['Mary' 'Michael' 'Margaret' 'Mary' 'Marcus' 'Molly']
[ True False False  True  True False]
[28 34 40]
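
The cell above covers `|`. As a complement, the following sketch reuses the same names and ages to show `&` and `~`, along with the error raised by the `and` keyword. The variable names `older_marys`, `not_marys`, and `keyword_error` are introduced here only for illustration.

```python
import numpy as np

m_names = np.array(["Mary", "Michael", "Margaret", "Mary", "Marcus", "Molly"])
m_ages = np.array([28, 72, 12, 34, 40, 68])

# Vectorized "and": named Mary AND older than 30.
older_marys = (m_names == "Mary") & (m_ages > 30)
print(m_ages[older_marys])  # [34]

# Vectorized "not": everyone who is not named Mary.
not_marys = ~(m_names == "Mary")
print(m_names[not_marys])  # ['Michael' 'Margaret' 'Marcus' 'Molly']

# The keyword form fails: the truth value of a multi-element array is ambiguous.
keyword_error = None
try:
    (m_names == "Mary") and (m_ages > 30)
except ValueError as err:
    keyword_error = str(err)
    print("ValueError:", err)
```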


### Reshaping
We can change the shape of an `ndarray` object, i.e. we can change the dimensions. For example, say we have a 1D array that we want to change into a 2D array:

In [18]:
ints = np.arange(10)
print(ints.shape)
print(ints)
ints = ints.reshape(5, 2)
print(ints.shape)
print(ints)

(10,)
[0 1 2 3 4 5 6 7 8 9]
(5, 2)
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
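
Two details of `reshape()` are worth seeing in action: one dimension may be given as `-1` so that NumPy infers it, and the new shape must account for exactly the same number of elements. A minimal sketch, with `two_rows` and `flat` as illustrative names:

```python
import numpy as np

ints = np.arange(10)

# -1 asks NumPy to infer the dimension: 10 elements / 2 rows = 5 columns.
two_rows = ints.reshape(2, -1)
print(two_rows.shape)  # (2, 5)

# reshape(-1) flattens back to one dimension.
flat = two_rows.reshape(-1)
print(flat.shape)  # (10,)

# 3 * 4 = 12 does not match the 10 available elements, so this raises.
reshape_failed = False
try:
    ints.reshape(3, 4)
except ValueError as err:
    reshape_failed = True
    print("ValueError:", err)
```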


### Transposing
Matrix transposition turns the rows of a matrix into columns and the columns into rows. `ndarray` supports transposing through its `T` attribute:

In [19]:
x = np.arange(6).reshape((2, 3))
print(x)
print(x.shape)
x_t = x.T
print(x_t)
print(x_t.shape)

[[0 1 2]
 [3 4 5]]
(2, 3)
[[0 3]
 [1 4]
 [2 5]]
(3, 2)
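
One point the output above does not reveal is that `x.T` is a view on the same data rather than a copy, so writing through the transpose changes the original array. The sketch below illustrates this. `x_copy` is a name introduced here, and `np.shares_memory()` is used only to make the sharing visible.

```python
import numpy as np

x = np.arange(6).reshape((2, 3))
x_t = x.T

# The transpose shares its underlying buffer with x.
print(np.shares_memory(x, x_t))  # True

# Element [0, 1] of the transpose is element [1, 0] of the original.
x_t[0, 1] = 99
print(x)
# [[ 0  1  2]
#  [99  4  5]]

# An explicit copy is independent of x.
x_copy = x.T.copy()
x_copy[0, 0] = -1
print(x[0, 0])  # 0
```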


### `ndarray` Functions
Numpy provides many fast, vectorized universal functions (ufuncs) that perform element-wise operations on the data in `ndarray` objects.

#### Unary ufuncs
Unary ufuncs accept a single `ndarray` and apply an operation element-wise. Example ufuncs include:
* `np.sqrt()`: Element-wise square root
* `np.absolute()`: Element-wise absolute value
* `np.sin()`: Element-wise trigonometric sine

For a full list of available ufuncs, please read the [Numpy docs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs). There are over 60 of them!

In [21]:
nums = np.arange(10)
print(nums)
print(np.sqrt(nums))

nums2 = np.random.randn(4, 4)
print(nums2)
print(np.absolute(nums2))

[0 1 2 3 4 5 6 7 8 9]
[ 0.          1.          1.41421356  1.73205081  2.          2.23606798
  2.44948974  2.64575131  2.82842712  3.        ]
[[ 0.09974505  0.12077332  0.78635635 -0.00338038]
 [ 0.27127806 -0.64444694 -1.35002247 -0.79343794]
 [-0.10411761  0.26385509  0.76403851 -0.61235238]
 [ 0.69307445  0.05751858  0.82928344  0.37826543]]
[[ 0.09974505  0.12077332  0.78635635  0.00338038]
 [ 0.27127806  0.64444694  1.35002247  0.79343794]
 [ 0.10411761  0.26385509  0.76403851  0.61235238]
 [ 0.69307445  0.05751858  0.82928344  0.37826543]]
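
The cell above exercises `np.sqrt()` and `np.absolute()`. For completeness, here is a small sketch of `np.sin()`, which expects angles in radians. The `angles` array is made up for illustration.

```python
import numpy as np

# Angles in radians: 0, 30 degrees, and 90 degrees.
angles = np.array([0.0, np.pi / 6, np.pi / 2])

# np.sin is applied to each element independently.
sines = np.sin(angles)
print(sines)  # approximately 0, 0.5, and 1

# Degrees can be converted first with np.radians.
sines_from_degrees = np.sin(np.radians(np.array([0.0, 30.0, 90.0])))
print(np.allclose(sines, sines_from_degrees))  # True
```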


#### Binary ufuncs
Binary ufuncs accept two `ndarray` objects, apply an operation element-wise, and return a single array as a result. Example binary ufuncs include:
* `np.power()`: Element-wise exponentiation
* `np.maximum()`: Element-wise maximum comparison
* `np.minimum()`: Element-wise minimum comparison

For a full list of available ufuncs, please read the [Numpy docs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs). There are over 60 of them!

In [6]:
nums = np.arange(5)
print(nums)
powers = np.full(5, 2.0)
powers[-1] = 3
print(powers)
nums2 = np.arange(5) + 1

print(np.power(nums, powers)) # a scalar exponent also works via broadcasting, e.g. np.power(nums, 2)
print(np.power(nums, nums2))
nums[2] = 100
print(nums)
print(nums2)
print(np.maximum(nums, nums2))

[0 1 2 3 4]
[ 2.  2.  2.  2.  3.]
[  0.   1.   4.   9.  64.]
[   0    1    8   81 1024]
[  0   1 100   3   4]
[1 2 3 4 5]
[  1   2 100   4   5]
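
The cell above does not call `np.minimum()`, and it only mentions scalar broadcasting in a comment. The sketch below fills both gaps, using the same values that `nums` and `nums2` hold at the end of that cell. `smaller`, `squares`, and `floor_two` are illustrative names.

```python
import numpy as np

nums = np.array([0, 1, 100, 3, 4])
nums2 = np.arange(5) + 1  # [1 2 3 4 5]

# Element-wise minimum of two arrays.
smaller = np.minimum(nums, nums2)
print(smaller)    # [0 1 3 3 4]

# A scalar second argument is broadcast to every element.
squares = np.power(nums2, 2)
print(squares)    # [ 1  4  9 16 25]

# Broadcasting a scalar into np.maximum sets a lower bound of 2.
floor_two = np.maximum(nums, 2)
print(floor_two)  # [  2   2 100   3   4]
```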
