# Tutorial 2.2 - Selecting Data in NumPy Arrays: Part I
Python for Data Analytics | Module 2  
Professor James Ng

In [None]:
# SETUP CODE - DON'T MODIFY THIS
# JUST EXECUTE IT
import numpy as np
import pandas as pd
pd.set_option('display.max_colwidth', None)

In module one, you learned how to pull elements out of `list` objects using *index notation* and *slice notation*.

In this tutorial, we will be covering how to use the same techniques to access data that is held inside of single or multidimensional `ndarray` objects.

## Accessing Single Array Elements w/ Index Notation

In [None]:
# As you know, you access a single element of a list
# by passing its index value inside of square brackets [].
example_list = ['lists', 'are', 'cool']
example_list[2]

In [None]:
# The same thing applies to ndarray objects
example_array = np.array(['arrays', 'are', 'super cool'])
example_array[2]

**Pythonista Note:** Just remember that the index of the first element of `list` and `ndarray` objects is `0`, not `1`. Forget this and you'll have all sorts of problems.


### Negative Indexing

In [None]:
numeric_array = np.array(range(1, 11))
numeric_array

This array has the numbers 1-10 in it. Just like with a `list`, you can use a negative number to specify element positions from the right-side (or end) of the array instead of the beginning; starting with -1 for the last element in the array.

In [None]:
numeric_array[-1]

You can retrieve any element from the array in the same fashion. For instance:

In [None]:
# Retrieve the 3rd from the last element
numeric_array[-3]

<div class="alert alert-block alert-info">
Remember when using negative indexing that you cannot use `-0` as the index value because the value of `-0` is the same as `0`, and Python will give you the first element in the array. When you are using negative indexing, you start with `-1`.
</div> 
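
As a quick check of the rule above, here is a minimal sketch. It rebuilds the same 1-10 array so it can run on its own; the variable names `last_via_negative`, `last_via_length`, and `first` are only illustrative.

```python
import numpy as np

numeric_array = np.array(range(1, 11))  # same values as the tutorial array

# A negative index -k refers to the same element as len(array) - k
last_via_negative = numeric_array[-1]
last_via_length = numeric_array[len(numeric_array) - 1]
print(last_via_negative, last_via_length)  # 10 10

# -0 is just 0, so this is the FIRST element, not the last
first = numeric_array[-0]
print(first)  # 1
```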

### What About Multidimensional Arrays? 
Conceptually, to access elements inside of a multi-dimensional array you would use the same syntax as getting elements out of a list that is nested inside of another list.

To get down to a single element, you simply have to provide an index value for each dimension. There are a couple of different allowable syntaxes for doing this.

Let's illustrate with a two dimensional "grid" array.

In [None]:
# two dimensional array
two_dim_array = np.random.randint(10, size=(2, 3))  
two_dim_array

In [None]:
# You can retrieve elements specifying a 
# comma separated list of index values

# First you specify the row you want
# then the column.
two_dim_array[0, 1]

In [None]:
# Or you can specify each of the values as a separate index.
# This syntax also works with nested Python lists, whereas [0, 1] wouldn't.
two_dim_array[0][1]

In [None]:
# Don't forget you can use negative indexes in one, or both
# of the values for a two dimensional array.
two_dim_array[0, -2]
two_dim_array[0][-2]

**Note:** As you increase the number of dimensions past two in an array, take care that you specify the dimensions in the correct order.
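
To make the ordering concrete, here is a small sketch with a hypothetical three dimensional array (`three_dim_array` is not part of this tutorial's data). The index order follows the array's shape, so the same three numbers in a different order point at a different element.

```python
import numpy as np

# Hypothetical 3-D array: 2 blocks, each with 3 rows and 4 columns
three_dim_array = np.arange(24).reshape(2, 3, 4)
print(three_dim_array.shape)  # (2, 3, 4)

# Index order follows the shape: [block, row, column]
first_choice = three_dim_array[0, 1, 2]
print(first_choice)  # 6

# Same numbers, different order: a different element comes back
second_choice = three_dim_array[1, 0, 2]
print(second_choice)  # 14
```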

When working with multi-dimensional arrays, you are not required to specify an index for each dimension.

In [None]:
# For example, if you only specify one level of indexing
# for `two_dim_array` it will return the entire row specified
# by the first index value
two_dim_array[0]

### Remember Mutability?
In module 1, we demonstrated that `list` objects were **mutable**.  That is, you could add/change/delete elements inside of them.

Well, `ndarray` objects also share this quality to some extent: you can change the value of existing elements. Let's demonstrate how this works for both our `numeric_array` and `two_dim_array` objects.

In [None]:
# Refresh our memory of what our objects look like
print("Numeric Array", numeric_array, '\n', sep='\n')
print("Two Dimensional Array", two_dim_array, sep='\n')

In [None]:
# Change the 5th element of `numeric_array` to 15
numeric_array[4] = 15
numeric_array

In [None]:
# Change the top-right element of `two_dim_array` to 450
two_dim_array[0][-1] = 450
two_dim_array

In [None]:
# But notice that if you try to assign a value to an index
# that does not exist yet, it will fail with an IndexError.
numeric_array[10] = 10
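
The assignment fails because an `ndarray` has a fixed size once it is created. The sketch below catches the error and then shows one possible way to get a longer array, using `np.append`, which builds a new array rather than growing the existing one. The names `error_raised` and `longer_array` are illustrative only.

```python
import numpy as np

numeric_array = np.array(range(1, 11))

try:
    numeric_array[10] = 10  # valid indexes are 0 through 9
    error_raised = False
except IndexError as err:
    error_raised = True
    print("IndexError:", err)

# np.append returns a NEW, longer array; the original is unchanged
longer_array = np.append(numeric_array, 11)
print(len(numeric_array), len(longer_array))  # 10 11
```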

## Array Slicing: Accessing Array "Views"

Just as we can use square brackets to access individual array elements, we can also use them to access "views" with the *slice* notation. 

The NumPy slicing syntax follows that of the standard Python `list`; to access a slice of an array `x`, use the `start:stop:step` notation inside of brackets:
``` python
x[start:stop:step]
```

If any of these are unspecified, they default to the values: 
* `start=0`
* `stop=size of the array dimension` (one past the last element)
* `step=1`.

#### Sample Slice Notations
Here are some sample slice notations and their meanings:
* `1:5:1`: Return the 2nd through 5th elements (indexes 1 through 4) in normal order. (Remember that with 0-based indexing, 1 is the second element, and the `stop` index itself is not included.)
* `:8:1`: Return the first 8 elements (indexes 0 through 7) in normal order. Since the `start` parameter is left out, it assumes its default value.
* `::-1`: Return all elements (the first two parameters are default values) in reverse order.
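
Here is a minimal sketch of those three notations applied to a hypothetical array `x` holding 0 through 9, so that each value equals its own index:

```python
import numpy as np

x = np.arange(10)  # hypothetical array: 0, 1, ..., 9

print(x[1:5:1])   # [1 2 3 4]
print(x[:8:1])    # [0 1 2 3 4 5 6 7]
print(x[::-1])    # [9 8 7 6 5 4 3 2 1 0]

# Omitted parts fall back to their defaults
same_as_full = np.array_equal(x[:], x[0:len(x):1])
print(same_as_full)  # True
```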

Let's try slice notations out with our `numeric_array` and `two_dim_array` objects.

### Single Dimensional Array: `numeric_array`

In [None]:
# Print the whole array for reference.
print(numeric_array)

# Return the 4th through 8th (inclusive) elements.
numeric_array[3:8]

**Pythonista Note:** The output here might seem confusing. Just remember that the `3` index refers to the fourth element in the array, and that the `stop` index `8` (the ninth element) is not included in the result, so the slice ends with the eighth element.

This can be quite confusing at first.

In [None]:
# Return everything up to and including the 4th element (indexes 0-3)
numeric_array[:4]

In [None]:
# Return everything from the sixth element on
numeric_array[5:]

In [None]:
# Return every third element
numeric_array[::3]

In [None]:
# Return every other element, starting with the second element
numeric_array[1::2]

#### Special Case: Negative Step Values
You can use a negative number for the `step` condition of a slice. 
If you do, the default values of `start` and `stop` are swapped around.

This is often used in practice to reverse the order of a data set:

In [None]:
numeric_array[::-1]

Here's an example of how to use it to get every other element in reverse order, starting with the 8th element (index 7).

In [None]:
numeric_array[7::-2]

## Multidimensional Array Views

We previously demonstrated how you could retrieve a single element out of a multidimensional array by specifying an index value for each dimension. Similarly, you can specify a slice notation for each dimension of an array.

In [None]:
# Let's start by resetting our two dimensional array.
two_dim_array = np.random.randint(10, size=(5, 3))  
two_dim_array

In [None]:
# Slice the first 4 rows and 1st column of each row
two_dim_array[:4, :1]

### Syntax Restrictions on Array Slicing
When we were selecting individual elements of arrays, we demonstrated that there were two ways of getting to a specific element of a multidimensional array.
*   `two_dim_array[1, 1]`
*   `two_dim_array[1][1]`

You can **not** use the second form when slicing an array.
`two_dim_array[:4, :1]` will give very different results than `two_dim_array[:4][:1]`
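
The difference is easiest to see by comparing shapes. This sketch uses a hypothetical 5x3 array named `grid` with fixed values instead of random ones:

```python
import numpy as np

grid = np.arange(15).reshape(5, 3)  # hypothetical 5x3 array

# One bracket, two slices: rows 0-3, then column 0 of those rows
comma_form = grid[:4, :1]
print(comma_form.shape)  # (4, 1)

# Two brackets: [:4] keeps rows 0-3, then [:1] slices the ROWS again
chained_form = grid[:4][:1]
print(chained_form.shape)  # (1, 3)
```

In the chained form, each pair of brackets operates on the result of the previous one, so the second slice never reaches the columns.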

In [None]:
# Slice all rows and every other column
two_dim_array[:, ::2]

### Reversing multidimensional arrays
As we saw with single dimensional arrays, you can use a negative `step` value in a slice notation to reverse elements. 

If you want to fully reverse a multi-dimension array, you must reverse each dimension.

In [None]:
# Reverse `two_dim_array`
two_dim_array[::-1, ::-1]
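
As an aside, NumPy also provides `np.flip`, which reverses every axis when no `axis` argument is given. The sketch below, using a hypothetical array `grid`, suggests that it produces the same result as reversing each dimension with slices:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # hypothetical 2x3 array

reversed_by_slicing = grid[::-1, ::-1]
reversed_by_flip = np.flip(grid)  # with no axis given, flips every axis

print(reversed_by_slicing.tolist())  # [[5, 4, 3], [2, 1, 0]]
print(np.array_equal(reversed_by_slicing, reversed_by_flip))  # True
```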

In [None]:
# You can also reverse just one of the dimensions

# Array w/o Transformation for Comparison
print("Regular Array", two_dim_array, sep='\n')

In [None]:
# Only reverse rows
# Remember that this just means the order of the rows is reversed,
# NOT the elements within the rows. Important distinction.
print("Reversed Rows", two_dim_array[::-1], sep='\n') 

In [None]:
# Only reverse columns
# This is what will reverse the order of elements within a given row.
print("Reversed Columns", two_dim_array[:, ::-1], sep='\n') 

### What if I only want one column or one row?

Well, turns out that is a pretty common use case. If you want to get a row, you just supply the row index as we've seen before:

In [None]:
# Just the first row please
two_dim_array[0]

It's a little bit more complicated if you want to retrieve the values of a particular column. First you have to specify an open (or completely default) slice notation to get all the rows, then specify the index of the column you want.

It sounds more complicated than it actually is. Here's how to do it:

In [None]:
# This will return all the elements in the 3rd column of the array.
two_dim_array[:, 2]

**Pythonista Note:** Why does that output look like a row and not a column?

Up to this point, when NumPy has displayed <code>two_dim_array</code> it has shown columns vertically. So, it may look weird to you that this time it looks like a row.

The reason for this is that NumPy only represents things as a grid when there are two dimensions to display. In this case, since we are only selecting the elements of a single column, there is no second dimension to display.
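
The `shape` attribute makes this visible. In the sketch below (using a hypothetical array `grid`), an integer column index drops that dimension, while a one-column slice keeps it, which is one way to get a result that still displays vertically:

```python
import numpy as np

grid = np.arange(15).reshape(5, 3)  # hypothetical 5x3 array

column_1d = grid[:, 2]       # an integer index drops the column dimension
print(column_1d.shape)       # (5,)
print(column_1d.tolist())    # [2, 5, 8, 11, 14]

column_2d = grid[:, 2:3]     # a slice keeps the dimension
print(column_2d.shape)       # (5, 1)
```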

## `ndarray` Slices are Views, not Copies
When we use slice notation on an `ndarray`, it returns a view of the original, as opposed to a copy. This means that if you modify the elements of a slice, you will also change them in the original array.

<div class="alert alert-block alert-danger">
It is very important for experienced Python programmers to note this, since it is the opposite of what happens when you slice a `list` object.
</div> 

While this may seem like a bad thing at first, particularly for those who are used to slicing `list` objects, this can actually help us when we want to process little chunks of data at a time.

Let's demonstrate:

In [None]:
# Let's create a slice of the first row in `two_dim_array`
row_one = two_dim_array[0]
row_one

In [None]:
# Ok, here is `two_dim_array` before we change the slice
two_dim_array

In [None]:
# Now let's change the elements in our `row_one` slice to be 1, 2, 3
row_one[:] = [1, 2, 3]

# And re-output `two_dim_array`
two_dim_array

See how the first row of `two_dim_array` was changed to 1, 2, 3?
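
For contrast, here is a minimal sketch comparing a `list` slice with an `ndarray` slice, using small hypothetical objects. It also shows the array's `.copy()` method, which you can use when you want an independent copy instead of a view, and `np.shares_memory`, which reports whether two arrays overlap in memory.

```python
import numpy as np

# Slicing a list gives an independent copy
original_list = [1, 2, 3, 4]
list_slice = original_list[:2]
list_slice[0] = 99
print(original_list)  # [1, 2, 3, 4]

# Slicing an ndarray gives a view onto the same data
original_array = np.array([1, 2, 3, 4])
array_slice = original_array[:2]
array_slice[0] = 99
print(original_array.tolist())  # [99, 2, 3, 4]

# .copy() gives an independent array when you need one
safe_slice = original_array[:2].copy()
safe_slice[0] = -1
print(original_array.tolist())  # [99, 2, 3, 4]

print(np.shares_memory(original_array, array_slice))  # True
print(np.shares_memory(original_array, safe_slice))   # False
```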

<div class="alert alert-block alert-info">
<p><strong>Note:</strong> You might be wondering how <code>row_one[:] = [1, 2, 3]</code> works.</p>

<p>Think of it as saying, "take all the elements of row_one and assign them these values over here on the right side of the equals sign [1, 2, 3]".
</p>

<p>This won't work if you specify a number of values on the right side of the equation that doesn't match the number of elements in the array on the left side, unless you specify just one value on the right, in which case all elements will take on that value.</p>

</div> 
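
The three cases described in the note can be sketched as follows, using a small hypothetical array named `row`:

```python
import numpy as np

row = np.array([7, 8, 9])

row[:] = [1, 2, 3]   # same number of values: assigned element by element
print(row.tolist())  # [1, 2, 3]

row[:] = 0           # a single value is applied to every element
print(row.tolist())  # [0, 0, 0]

try:
    row[:] = [1, 2]  # 2 values cannot fill 3 slots
    mismatch_failed = False
except ValueError as err:
    mismatch_failed = True
    print("ValueError:", err)
```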