# Accessing Elements in NumPy Arrays

This lesson covers:

* Accessing specific elements in NumPy arrays

Accessing elements in an array or a DataFrame is a common task. To begin this lesson, clear the
workspace and set up some vectors and a $5\times5$ array. These vectors and the array make it easy
to determine which elements a command selects.


Use `arange` and `reshape` to create 3 arrays:

* 5-by-5 array `x` containing the values 0,1,...,24 
* 5-element, 1-dimensional array `y` containing 0,1,...,4
* 5-by-1 array `z` containing 0,1,...,4


In [1]:
import numpy as np

x = np.arange(25).reshape((5,5)) 
x

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [2]:
y = np.arange(5)
y

array([0, 1, 2, 3, 4])

In [3]:
# The -1 tells numpy to automatically compute the size of
# the dimension using the remaining elements, in this case, 5
z = np.arange(5).reshape((-1, 1))
z

array([[0],
       [1],
       [2],
       [3],
       [4]])
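
The `-1` placeholder works for any dimension, not only the first. The following minimal sketch uses a hypothetical 12-element array `a` (not part of the lesson's setup) to show how NumPy infers the missing size from the total number of elements.

```python
import numpy as np

# Hypothetical array used only for this illustration
a = np.arange(12)

# NumPy infers the dimension marked -1 from the total size (12 here)
col = a.reshape((-1, 1))   # 12 rows, 1 column
wide = a.reshape((3, -1))  # 3 rows, 4 columns

print(col.shape)   # (12, 1)
print(wide.shape)  # (3, 4)
```

Only one dimension can be `-1`, since NumPy needs every other size to solve for it.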

## Zero-based indexing
Python indexing is zero-based: the first element has position `0`, the second has position `1`,
and so on, until the last element, which has position `n-1` in an array that contains `n`
elements in total.
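
As a small illustration of the rule above, the sketch below reads the first and last elements of a 5-element array. The use of a negative index is an addition not covered in the text: Python also lets `-1` refer to the last element.

```python
import numpy as np

y = np.arange(5)
n = y.shape[0]  # number of elements, 5

print(y[0])      # first element: 0
print(y[n - 1])  # last element: 4
print(y[-1])     # negative indices count from the end: 4
```

Asking for `y[n]` raises an `IndexError`, because the valid positions stop at `n-1`.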

## Problem: Picking elements out of arrays
1. Select the number 2 in all three arrays, `x`, `y`, and `z`.
2. Select the number 11 in `x`.
3. Using double index notation, select the (0,2) and the (2,0) element of `x`.

**Issues to ponder**

* Which index is rows and which index is columns?
* Does NumPy count across first then down or down first then across? 

In [4]:
print(x[0, 2])
print(y[2])
print(z[2, 0])

2
2
2


In [5]:
# Incorrect for x and z: a single index selects a whole row, not one element
print(x[2])
print(y[2])
print(z[2])


[10 11 12 13 14]
2
[2]


In [6]:
print(x[2, 1])  # 11 is in row 2, col 1

11


In [7]:
print(x[0, 2])
print(x[2, 0])

2
10
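
One way to check the two questions under "Issues to ponder" is to flatten `x` and see the order in which elements appear. This is a sketch, assuming the default (row-major) layout that `arange` followed by `reshape` produces. `np.unravel_index` is used here only as a convenience to map a flat position back to a (row, column) pair.

```python
import numpy as np

x = np.arange(25).reshape((5, 5))

# Flattening walks across each row before moving down to the next one
flat = x.ravel()
print(flat[:7])  # [0 1 2 3 4 5 6]

# unravel_index maps a flat position back to (row, column)
row, col = np.unravel_index(11, x.shape)
print(row, col)     # 2 1
print(x[row, col])  # 11
```

The first index is the row and the second is the column, and elements are counted across a row first, then down.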


## Problem: Selecting Single Rows
1. Select the 2nd row of `x` and `z`.
2. Select the 2nd element of `y`.

**Issues to ponder**

* What happens to the dimension in each case? **Hint:** Use `np.ndim` on the
output of each.


In [8]:
print(x[1, :])
print(x[1])
print(f"Dim: {np.ndim(x[1])}")

[5 6 7 8 9]
[5 6 7 8 9]
Dim: 1


In [9]:
print(y[1])
print(f"Dim: {np.ndim(y[1])}")

1
Dim: 0


In [10]:
print(z[1, :])
print(z[1])
print(f"Dim: {np.ndim(z[1])}")

[1]
[1]
Dim: 1


## Problem: Preserving Dimensions 

Repeat the previous selection using:

* A slice
* A list

In [11]:
print(x[1:2, :])
print(f"Dim: {np.ndim(x[1:2])}")
print(x[[1]])
print(f"Dim: {np.ndim(x[[1]])}")

[[5 6 7 8 9]]
Dim: 2
[[5 6 7 8 9]]
Dim: 2


In [12]:
print(y[1:2])
print(f"Dim: {np.ndim(y[1:2])}")

[1]
Dim: 1


In [13]:
print(z[[1], :])
print(z[[1]])
print(f"Dim: {np.ndim(z[[1], :])}")


[[1]]
[[1]]
Dim: 2
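
The three selection styles can be compared side by side by looking at `shape` rather than only `ndim`. The sketch below uses the same `x` as the lesson. The variable names `scalar_sel`, `slice_sel`, and `list_sel` are introduced here for illustration only.

```python
import numpy as np

x = np.arange(25).reshape((5, 5))

scalar_sel = x[1]   # scalar index: the row dimension is dropped
slice_sel = x[1:2]  # slice: the row dimension is kept
list_sel = x[[1]]   # list: the row dimension is kept

print(scalar_sel.shape)  # (5,)
print(slice_sel.shape)   # (1, 5)
print(list_sel.shape)    # (1, 5)
```

A scalar index removes the dimension it is applied to, while a slice or a list keeps it, even when only one row is selected.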


## Problem: Selecting a Single Column
Select the 2nd column of `x` using the colon (`:`) operator.

In [14]:
print(x[:, 1])

[ 1  6 11 16 21]


In [15]:
print(x[:, [1]])
print(x[:, 1:2])


[[ 1]
 [ 6]
 [11]
 [16]
 [21]]


[[ 1]
 [ 6]
 [11]
 [16]
 [21]]


## Problem: Selecting Specific Rows or Columns
1. Select the 2nd and 3rd columns of `x` using a slice.
2. Select the 2nd and 4th rows of `x` using both a slice and a list.
3. Combine these to select columns 2 and 3 and rows 2 and 4.

In [16]:
print(x[:, 1:3])

[[ 1  2]
 [ 6  7]
 [11 12]
 [16 17]
 [21 22]]


In [17]:
print(x[[1, 3], :])
print(x[1:4:2, :])

[[ 5  6  7  8  9]
 [15 16 17 18 19]]
[[ 5  6  7  8  9]
 [15 16 17 18 19]]


In [18]:
print(x[1:4:2, 1:3])

# Right
print(x[[1,3], 1:3])

# Also Right
print(x[1:4:2, [1, 2]])

# Wrong: two index lists are paired element by element
print("Looks right, but wrong!!")
print(x[[1, 3],[1, 2]])

[[ 6  7]
 [16 17]]
[[ 6  7]
 [16 17]]
[[ 6  7]
 [16 17]]
Looks right, but wrong!!
[ 6 17]
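
The last result has two elements rather than four because NumPy pairs the two lists position by position. The sketch below reproduces it with an explicit loop over the pairs. The names `rows`, `cols`, `paired`, and `manual` are introduced here for illustration.

```python
import numpy as np

x = np.arange(25).reshape((5, 5))

rows = [1, 3]
cols = [1, 2]

# Two lists are matched up element by element: (1, 1) and (3, 2)
paired = x[rows, cols]
manual = np.array([x[r, c] for r, c in zip(rows, cols)])

print(paired)  # [ 6 17]
print(manual)  # [ 6 17]
```

So `x[[1, 3], [1, 2]]` returns `x[1, 1]` and `x[3, 2]`, not the 2-by-2 block formed by crossing the rows with the columns.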


## Problem: Use `ix_` to select rows and columns using lists
Use `ix_` to select the 2nd and 4th rows and 2nd and 3rd columns of `x`.

In [19]:
# Must use ix_ when both selectors are "fancy" to get blocks
x[np.ix_([1, 3],[1, 2])]

array([[ 6,  7],
       [16, 17]])

In [20]:
# Also correct, but hard to get right
x[[[1,1],[3,3]],[[1,2],[1,2]]]

array([[ 6,  7],
       [16, 17]])
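
To see why `ix_` produces a block, it can help to inspect what it returns. In this sketch, `row_idx` and `col_idx` are names chosen for illustration. The chained selection at the end is an alternative shown for comparison, not a method taken from the lesson.

```python
import numpy as np

x = np.arange(25).reshape((5, 5))

# ix_ returns one index array per axis, shaped so that they broadcast
row_idx, col_idx = np.ix_([1, 3], [1, 2])
print(row_idx.shape)  # (2, 1)
print(col_idx.shape)  # (1, 2)

# The two index arrays broadcast to a 2-by-2 grid of (row, column) pairs
block = x[row_idx, col_idx]
print(block)

# Selecting rows first, then columns, gives the same block
chained = x[[1, 3]][:, [1, 2]]
print(np.array_equal(block, chained))  # True
```

The column-shaped row index and row-shaped column index are what turn pairwise matching into a full cross of rows and columns.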

## Problem: Convert a DataFrame to a NumPy array

Use `.to_numpy` to convert a DataFrame to a NumPy array.

In [21]:
# Setup: Create a DataFrame
import pandas as pd
import numpy as np

names = ["a", "b", "c", "d", "e"]
x = np.arange(25).reshape((5,5))
x_df = pd.DataFrame(x, index=names, columns=names)
print(x_df)


    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
d  15  16  17  18  19
e  20  21  22  23  24


In [22]:
x_np = x_df.to_numpy()
print(x_np)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
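
The conversion drops the row and column labels and returns a single array with one dtype. The sketch below uses a hypothetical DataFrame `df` with made-up column names `ints` and `floats` (not part of the lesson) to show what happens when columns have different numeric types.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one integer and one float column
df = pd.DataFrame({"ints": [1, 2, 3], "floats": [0.5, 1.5, 2.5]})

arr = df.to_numpy()
print(arr.dtype)  # float64: the integers are upcast to match the float column
print(arr.shape)  # (3, 2)
```

When all columns share one dtype, as in `x_df`, the resulting array keeps that dtype.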


## Problem: Use `np.asarray` to convert to an array

Use `np.asarray` to convert a DataFrame to a NumPy array.

In [23]:
x_np = np.asarray(x_df)
print(x_np)


[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
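
`np.asarray` also accepts inputs that are already NumPy arrays, in which case it hands back the same object instead of copying. The sketch below contrasts this with `np.array`, which copies by default. The array `a` is a hypothetical example, not part of the lesson's setup.

```python
import numpy as np

# Hypothetical array used only for this illustration
a = np.arange(5)

same = np.asarray(a)  # already an ndarray, so no copy is made
copy = np.array(a)    # np.array copies by default

print(same is a)  # True
print(copy is a)  # False
```

This makes `np.asarray` a convenient way to accept either a DataFrame or an array as input without copying arrays unnecessarily.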
