## CMPINF 2100 Week 03

### Manipulating 2D arrays

We can manipulate 2D arrays just like 1D arrays!!

But we need to be careful!

### Import NumPy

In [3]:
import numpy as np

### Create arrays
We will use a 1D array to contrast with manipulating 2D arrays.

In [5]:
my_1d = np.arange(6)

In [6]:
my_1d

array([0, 1, 2, 3, 4, 5])

In [8]:
my_2d = np.array([[0, 0, 0],
                  [1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])

## Slicing

In [9]:
my_1d[0]

0

What happens if we slice a 2D array just like a 1D array?

In [10]:
my_2d[0]

array([0, 0, 0])

In [11]:
my_2d[1]

array([1, 1, 1])

Providing a single index to SLICE causes NumPy to slice the ROWS!!

It returns ALL columns from that row!!

The more formal syntax uses the `:` operator after the row index to specify ALL columns.

In [14]:
my_2d[0, :]

array([0, 0, 0])

In [15]:
my_2d[-1, :]

array([3, 3, 3])

In [16]:
my_2d[1, :]

array([1, 1, 1])
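As a quick check that the short and formal syntaxes agree, here is a minimal sketch. The array is rebuilt so the snippet runs on its own, and the names `row_short` and `row_formal` are illustrative only (they are not part of the original notebook).

```python
import numpy as np

my_2d = np.array([[0, 0, 0],
                  [1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])

row_short = my_2d[1]      # single index: row 1, all columns implied
row_formal = my_2d[1, :]  # explicit: row 1, `:` for all columns

print(np.array_equal(row_short, row_formal))  # True
print(row_short.shape)                        # (3,)
```

Both forms return the same 1D array with one value per column.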

This highlights that slicing or indexing 2D arrays requires an index for ROWS and COLUMNS!!!

In [18]:
my_2d

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

In [21]:
my_2d[1,1]

1

In [23]:
my_2d[2,0]

2

It's easy to SLICE a subset of rows and columns!

In [24]:
my_2d[:2, :]

array([[0, 0, 0],
       [1, 1, 1]])

In [25]:
my_2d[:2, :1]

array([[0],
       [1]])
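Notice that `my_2d[:2, :1]` came back as a 2D array with a single column. A small sketch of why, assuming the same `my_2d` as above (the names `col_as_2d` and `col_as_1d` are hypothetical, introduced only for this comparison):

```python
import numpy as np

my_2d = np.array([[0, 0, 0],
                  [1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])

# A slice on the column axis keeps that axis, so the result is still 2D.
col_as_2d = my_2d[:2, :1]

# An integer on the column axis drops that axis, so the result is 1D.
col_as_1d = my_2d[:2, 0]

print(col_as_2d.shape)  # (2, 1)
print(col_as_1d.shape)  # (2,)
```

Use a slice when you need to keep the 2D shape, and an integer when you want a plain 1D array.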

In [26]:
my_2d

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

## Conditional subsetting

We can apply conditional subsetting to 2D arrays just like 1D arrays, but we need to be careful which dimension we are applying the MASK to!!

In [27]:
my_2d

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

Let's find all rows where the column at index 1 has values greater than 1.

In [28]:
my_2d[:, 1]

array([0, 1, 2, 3])

In [30]:
my_2d[:, 1] > 1

array([False, False,  True,  True])

In [31]:
my_1d > 1

array([False, False,  True,  True,  True,  True])

In [32]:
my_2d[my_2d[:, 1] > 1, :]

array([[2, 2, 2],
       [3, 3, 3]])
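The steps above can be sketched in one place. This is a minimal illustration, with `mask` and `rows` as names chosen here rather than taken from the notebook. The key point is that the MASK has one True/False entry per ROW, so it belongs in the row position of the index.

```python
import numpy as np

my_2d = np.array([[0, 0, 0],
                  [1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])

# Step 1: pull out one column and test it, giving one boolean per row.
mask = my_2d[:, 1] > 1

# Step 2: the mask length must match the number of rows it filters.
print(mask.shape[0] == my_2d.shape[0])  # True

# Step 3: put the mask in the row position and keep all columns.
rows = my_2d[mask, :]
print(rows)
```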

Let's see this again, but this time finding all rows with EVEN values in the first column (index 0).

In [46]:
my_2d[:, 0]

array([0, 1, 2, 3])

In [47]:
my_2d[:, 0] % 2 == 0

array([ True, False,  True, False])

In [48]:
my_2d[ my_2d[:, 0] % 2 == 0, :]

array([[0, 0, 0],
       [2, 2, 2]])

KEEPING or STORING the result of a SLICE or CONDITIONAL SUBSET requires assigning the result to a new variable.

Keep the rows with ODD values in the first column, and just the first 2 columns!

In [49]:
odd_array = my_2d[ my_2d[:, 0] % 2 != 0, :2]

In [50]:
odd_array

array([[1, 1],
       [3, 3]])

In [51]:
my_2d

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

Arrays are mutable so we can change values!!

In [52]:
odd_array[0,0] = 101

In [53]:
odd_array

array([[101,   1],
       [  3,   3]])

In [54]:
odd_array[-1, -1] = -101

In [55]:
my_2d

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

But what if we had SLICED rather than applied a MASK to conditionally subset the larger array?

In [57]:
odd_b = my_2d[1:4:2, :2]

In [59]:
odd_b

array([[1, 1],
       [3, 3]])

In [60]:
odd_b[0, 0] = 301

In [61]:
my_2d

array([[  0,   0,   0],
       [301,   1,   1],
       [  2,   2,   2],
       [  3,   3,   3]])

In [62]:
odd_b[-1, -1] = -301

In [63]:
odd_b

array([[ 301,    1],
       [   3, -301]])

In [64]:
odd_array

array([[ 101,    1],
       [   3, -101]])

In [65]:
my_2d

array([[   0,    0,    0],
       [ 301,    1,    1],
       [   2,    2,    2],
       [   3, -301,    3]])

Slicing arrays into smaller objects can create a SIDE EFFECT!!!

The sliced array points back to the larger array it originally came from!!!

Modifying the SLICED smaller array will therefore impact the original larger array it points to!!!

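Subsetting with a MASK (CONDITIONAL subset) does not produce this SIDE EFFECT, because NumPy returns a copy rather than a view. That is why modifying `odd_array` left `my_2d` unchanged!!!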

We can REMOVE the POINTING SIDE EFFECT by forcing a HARD COPY!!!

In [66]:
my_2d

array([[   0,    0,    0],
       [ 301,    1,    1],
       [   2,    2,    2],
       [   3, -301,    3]])

In [67]:
odd_copy = my_2d[1:4:2, :2].copy()

In [68]:
odd_copy[-1, -1] = 3

In [72]:
odd_copy[0, 0] = 1

In [73]:
odd_copy

array([[1, 1],
       [3, 3]])

In [74]:
my_2d

array([[   0,    0,    0],
       [ 301,    1,    1],
       [   2,    2,    2],
       [   3, -301,    3]])
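One way to check whether a smaller array still points back to the larger one is `np.shares_memory`. The sketch below starts from a fresh `my_2d` so it runs on its own. The names `sliced`, `masked`, and `copied` are illustrative and do not appear in the notebook.

```python
import numpy as np

my_2d = np.array([[0, 0, 0],
                  [1, 1, 1],
                  [2, 2, 2],
                  [3, 3, 3]])

sliced = my_2d[1:4:2, :2]                 # basic slicing: a view
masked = my_2d[my_2d[:, 0] % 2 != 0, :2]  # boolean mask: a copy
copied = my_2d[1:4:2, :2].copy()          # explicit hard copy

print(np.shares_memory(my_2d, sliced))  # True
print(np.shares_memory(my_2d, masked))  # False
print(np.shares_memory(my_2d, copied))  # False

# Only the write through the view reaches the original array.
sliced[0, 0] = 301
masked[0, 0] = 101
copied[0, 0] = 201
print(my_2d[1, 0])  # 301
```

When in doubt, calling `.copy()` makes the intent explicit and is safe regardless of how the subset was created.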

## Sorting

In [75]:
my_2d

array([[   0,    0,    0],
       [ 301,    1,    1],
       [   2,    2,    2],
       [   3, -301,    3]])

When we sort with the `.sort()` method, we need to remember it modifies the array IN PLACE, and we should consider the DIMENSION or AXIS we are sorting over!!!

In [76]:
my_2d.sort(axis=0)

In [77]:
my_2d

array([[   0, -301,    0],
       [   2,    0,    1],
       [   3,    1,    2],
       [ 301,    2,    3]])

In [78]:
my_2d.sort(axis=1)

In [79]:
my_2d

array([[-301,    0,    0],
       [   0,    1,    2],
       [   1,    2,    3],
       [   2,    3,  301]])
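If the original ordering needs to be kept, one alternative is the function `np.sort`, which returns a sorted copy instead of modifying in place. A minimal sketch, using a hypothetical array named `original` with the same values `my_2d` had before sorting:

```python
import numpy as np

original = np.array([[0, 0, 0],
                     [301, 1, 1],
                     [2, 2, 2],
                     [3, -301, 3]])

# np.sort returns a new array; axis=0 sorts the values within each column.
sorted_cols = np.sort(original, axis=0)
print(sorted_cols)
print(original[1, 0])  # 301, the original is untouched

# The method form sorts in place and returns None.
result = original.sort(axis=0)
print(result)          # None
print(original[1, 0])  # 2, the original has now changed
```

Note that sorting along an axis sorts each column (or each row) independently, so values that started in the same row do not stay together.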

## Summarize

Let's review 1D summaries.

In [80]:
my_1d

array([0, 1, 2, 3, 4, 5])

In [81]:
my_1d.min()

0

In [82]:
my_1d.max()

5

In [83]:
my_1d.mean()

2.5

In [86]:
my_1d.std(ddof=1)

1.8708286933869707

In [85]:
my_1d.var(ddof=1)

3.5
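The `ddof=1` argument used above asks for the SAMPLE standard deviation and variance, which divide by `n - 1` rather than `n`. A small sketch that reproduces the calculation by hand (the names `squared_dev`, `sample_var`, and `population_var` are introduced here for illustration):

```python
import numpy as np

my_1d = np.arange(6)
n = my_1d.size

# Squared deviations from the mean.
squared_dev = (my_1d - my_1d.mean()) ** 2

sample_var = squared_dev.sum() / (n - 1)  # matches ddof=1
population_var = squared_dev.sum() / n    # matches the default ddof=0

print(np.isclose(sample_var, my_1d.var(ddof=1)))           # True
print(np.isclose(population_var, my_1d.var()))             # True
print(np.isclose(np.sqrt(sample_var), my_1d.std(ddof=1)))  # True
```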

What happens if we summarize a 2D array??

In [87]:
my_2d

array([[-301,    0,    0],
       [   0,    1,    2],
       [   1,    2,    3],
       [   2,    3,  301]])

In [88]:
my_2d.min()

-301

In [89]:
my_2d.max()

301

In [90]:
my_2d.mean()

1.1666666666666667

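But part of using 2D arrays is that we often want to summarize EACH COLUMN separately from the other columns (or EACH ROW separately from the other rows)!!!

Meaning, we do not want the OVERALL statistic! The `axis` argument controls this: `axis=0` gives one result per COLUMN and `axis=1` gives one result per ROW.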

In [91]:
my_2d.shape

(4, 3)

In [92]:
my_2d.min(axis=1)

array([-301,    0,    1,    2])

In [93]:
my_2d.max(axis=1)

array([  0,   2,   3, 301])

In [94]:
my_2d.max(axis=0)

array([  2,   3, 301])
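A handy way to remember the `axis` argument is that it names the dimension that gets COLLAPSED. The sketch below rebuilds the sorted array under the hypothetical name `arr` and compares the result shapes against `arr.shape`.

```python
import numpy as np

arr = np.array([[-301, 0, 0],
                [0, 1, 2],
                [1, 2, 3],
                [2, 3, 301]])

print(arr.shape)  # (4, 3): 4 rows, 3 columns

# axis=0 collapses the rows, leaving one value per column.
per_column_max = arr.max(axis=0)
print(per_column_max.shape)  # (3,)

# axis=1 collapses the columns, leaving one value per row.
per_row_max = arr.max(axis=1)
print(per_row_max.shape)     # (4,)

# The same argument works for other summaries, such as the mean.
per_column_mean = arr.mean(axis=0)
print(per_column_mean)
```

So to summarize EACH COLUMN, use `axis=0`. The result has as many entries as there are columns.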