# What we cover

* Attributes of arrays: Determining the size, shape, memory consumption, and data types of arrays
* Indexing of arrays: Getting and setting the value of individual array elements
* Slicing of arrays: Getting and setting smaller subarrays within a larger array
* Reshaping of arrays: Changing the shape of a given array
* Joining and splitting of arrays: Combining multiple arrays into one, and splitting one array into many

## Attributes of Arrays

In [3]:
import numpy as np
np.random.seed(123)

x1 = np.random.randint(10, size=6)
x2 = np.random.randint(10, size=(3, 4))
x3 = np.random.randint(10, size=(3, 4, 5))

Each array has attributes such as `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total number of elements in the array):

In [4]:
print("x3 ndim",  x3.ndim)
print("x3 shape",  x3.shape)
print("x3 size",  x3.size)

x3 ndim 3
x3 shape (3, 4, 5)
x3 size 60


There are also attributes describing the data type and memory footprint of an array: `dtype` (the data type of the elements), `itemsize` (the size in bytes of each array element), and `nbytes` (the total size in bytes of the array):

In [5]:
print("dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

dtype: int64
itemsize: 8 bytes
nbytes: 480 bytes
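The relationship between these attributes can be checked directly: `nbytes` is `size * itemsize`. The sketch below is only an illustration; the array names `a` and `b` and the explicit `dtype` arguments are assumptions added here so that the numbers do not depend on the platform's default integer type.

```python
import numpy as np

# Illustrative array with an explicit dtype so that itemsize is predictable
a = np.zeros((3, 4, 5), dtype=np.int32)

print(a.ndim)      # 3  (number of dimensions)
print(a.size)      # 60 (3 * 4 * 5 elements)
print(a.itemsize)  # 4  (bytes per int32 element)
print(a.nbytes)    # 240 (size * itemsize)

# The same values stored as int64 take twice as many bytes
b = a.astype(np.int64)
print(b.nbytes)    # 480
```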


## Indexing of arrays

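Indexing NumPy arrays works much like standard Python list indexing: indices start at zero, and negative indices count from the end. In a multi-dimensional array, elements are accessed with a comma-separated tuple of indices.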

In [6]:
x1

x1[0]
x1[4]
x1[-2]

x2

x2[0, 0]
x2[2, 0]
x2[2, -2]

array([2, 2, 6, 1, 3, 9])

2

3

3

array([[6, 1, 0, 1],
       [9, 0, 0, 9],
       [3, 4, 0, 0]])

6

3

0
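As a small sketch of the indexing rules used above (the arrays `x` and `m` are re-created here with the same values so the block runs on its own), a negative index `i` refers to position `len(x) + i`, and an index outside the array raises `IndexError`:

```python
import numpy as np

x = np.array([2, 2, 6, 1, 3, 9])
m = np.array([[6, 1, 0, 1],
              [9, 0, 0, 9],
              [3, 4, 0, 0]])

# A negative index counts from the end of the axis
same_1d = x[-2] == x[len(x) - 2]
same_2d = m[2, -2] == m[2, m.shape[1] - 2]
print(same_1d, same_2d)

# An out-of-range index raises IndexError
try:
    x[6]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)
```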

Values can also be modified using the same index notation. Keep in mind that, unlike Python lists, NumPy arrays have a fixed type: a value assigned into an array is converted to the array's `dtype`. For example, a floating-point value assigned into an integer array is silently truncated.

In [7]:
x2[0, 0] = 12
x2

x1[0] = 3.14159  # truncated to 3 because x1 has an integer dtype
x1

array([[12,  1,  0,  1],
       [ 9,  0,  0,  9],
       [ 3,  4,  0,  0]])

array([3, 2, 6, 1, 3, 9])
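A minimal sketch of this behavior, and of one possible way around it: converting the array to a floating-point type with `astype` before assigning. The names `ints` and `floats` are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])   # integer dtype
ints[0] = 3.99               # the fractional part is discarded, not rounded
print(ints[0])               # 3

# Convert to a float array first if the fractional part matters
floats = ints.astype(float)
floats[0] = 3.99
print(floats[0])             # 3.99
```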

## Slicing of Arrays

We can access subarrays with the *slice* notation, marked by the colon (`:`) character. The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array `x`, use this:

```
x[start:stop:step]
```
where the default values are `start=0`, `stop=<size of the dimension>`, and `step=1`.

### Single dimension array

In [8]:
x = np.arange(10)
x

x[:5]
x[5:]
x[4:7]

x[::2]
x[1::2]

# A negative step traverses the array in reverse
x[::-1] # all elements, reversed
x[5::-2] # every other element in reverse, starting from index 5

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

array([0, 1, 2, 3, 4])

array([5, 6, 7, 8, 9])

array([4, 5, 6])

array([0, 2, 4, 6, 8])

array([1, 3, 5, 7, 9])

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

array([5, 3, 1])
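When `step` is negative, the defaults for `start` and `stop` are effectively swapped: an omitted `start` means the last element and an omitted `stop` means "run past the first element". The following sketch illustrates this; the variable names are illustrative only.

```python
import numpy as np

x = np.arange(10)

a = x[::-1]    # whole array, reversed
b = x[5::-2]   # start at index 5 and step back by 2 -> 5, 3, 1
c = x[:5:-2]   # start at the end, stop before index 5 -> 9, 7

# The built-in slice object expresses the same selection explicitly
d = x[slice(5, None, -2)]

print(b, c, d)
```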

### Multi-dimension array

In [10]:
x2

x2[:2, :3] # first two rows, first three columns

x2[:3, ::2]  # all rows, every other column

x2[::-1, ::-1] # rows and columns both reversed

array([[12,  1,  0,  1],
       [ 9,  0,  0,  9],
       [ 3,  4,  0,  0]])

array([[12,  1,  0],
       [ 9,  0,  0]])

array([[12,  0],
       [ 9,  0],
       [ 3,  0]])

array([[ 0,  0,  4,  3],
       [ 9,  0,  0,  9],
       [ 1,  0,  1, 12]])

A single row or column can be selected by combining an index with an empty slice (`:`). For row access, the empty slice can be omitted for a more compact syntax:

In [11]:
x2[:, 0]
x2[0, :]
x2[0] # equivalent to x2[0, :]

array([12,  9,  3])

array([12,  1,  0,  1])

array([12,  1,  0,  1])
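One detail worth noting about the selections above: an integer index removes that axis from the result, while a length-one slice keeps it. A short sketch, using an illustrative array `m`:

```python
import numpy as np

m = np.arange(12).reshape((3, 4))

col_1d = m[:, 0]     # integer index drops the axis -> shape (3,)
col_2d = m[:, 0:1]   # length-1 slice keeps the axis -> shape (3, 1)

print(col_1d.shape)  # (3,)
print(col_2d.shape)  # (3, 1)
```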

#### Subarrays as no-copy views

One important, and extremely useful, thing to know about array slices is that they return *views* rather than *copies* of the array data. This is one area in which NumPy array slicing differs from Python list slicing: in lists, slices are copies. Let's compare a list and an array:

In [37]:
x_array = np.random.randint(10, size=(3, 4))
x_list = x_array.tolist()

x_array
x_list

array([[4, 3, 3, 7],
       [6, 8, 6, 4],
       [4, 7, 0, 0]])

[[4, 3, 3, 7], [6, 8, 6, 4], [4, 7, 0, 0]]

In [39]:
x_array_sub = x_array[0, 0:2]
x_array_sub

x_list_sub = x_list[0][0:2]
x_list_sub

array([4, 3])

[4, 3]

Now let's modify the first element of each subarray. The change shows up in the original array, but not in the original list:

In [40]:
x_array_sub[0] = 12
x_list_sub[0] = 12

x_array 
x_list

array([[12,  3,  3,  7],
       [ 6,  8,  6,  4],
       [ 4,  7,  0,  0]])

[[4, 3, 3, 7], [6, 8, 6, 4], [4, 7, 0, 0]]

Note that indexing a nested list with a single index (rather than a slice) returns a reference to the inner list itself, not a copy, so modifying it changes the original:

In [42]:
x_list_sub2 = x_list[0]
x_list_sub2
x_list_sub2[0] = 10
x_list

[4, 3, 3, 7]

[[10, 3, 3, 7], [6, 8, 6, 4], [4, 7, 0, 0]]
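The list behavior can be summarized with plain Python, as sketched below: indexing returns the same inner list object, slicing the outer list makes only a shallow copy, and `copy.deepcopy` from the standard library gives a fully independent copy. The variable names are illustrative.

```python
import copy

nested = [[4, 3, 3, 7], [6, 8, 6, 4]]

row = nested[0]               # the same inner list object, not a copy
print(row is nested[0])       # True

shallow = nested[:]           # new outer list, but the inner lists are shared
shallow[0][0] = 10
print(nested[0][0])           # 10: the change is visible through `nested`

deep = copy.deepcopy(nested)  # fully independent copy
deep[0][0] = 99
print(nested[0][0])           # 10: unchanged by the edit to `deep`
```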

#### Creating copies of arrays

Despite the nice features of array views, it is sometimes useful to explicitly copy the data within an array or a subarray instead. This is most easily done with the `copy()` method:

In [44]:
x2_sub_copy = x2[:2, :2].copy()
x2_sub_copy

x2_sub_copy[0, 0] = 42
x2_sub_copy

x2

array([[12,  1],
       [ 9,  0]])

array([[42,  1],
       [ 9,  0]])

array([[12,  1,  0,  1],
       [ 9,  0,  0,  9],
       [ 3,  4,  0,  0]])
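To check whether two arrays refer to the same underlying data, one option is `np.shares_memory`. The sketch below uses an illustrative array `a` and contrasts a slice with an explicit copy:

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

view = a[:2, :2]            # slice: a view onto a's data
copied = a[:2, :2].copy()   # explicit copy: independent data

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, copied))  # False

view[0, 0] = 42     # writes through to a
copied[0, 1] = -1   # does not affect a
print(a[0, 0], a[0, 1])  # 42 1
```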

## Reshaping of Arrays

Reshaping can be done with the `reshape()` method. The size of the new shape must match the size of the initial array. Where possible, `reshape` returns a *no-copy view* of the initial array, but with non-contiguous memory buffers this is not always the case.

What does *where possible* mean? If the new shape can be described over the existing memory buffer, as is the case for a contiguous array, no copy is needed; otherwise NumPy has to copy the data. For more detail, see https://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays
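The difference can be observed with `np.shares_memory`, as in the following sketch. It assumes a small illustrative array `a`; reshaping the contiguous array gives a view, while flattening its transpose (which is not C-contiguous) forces a copy.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))   # C-contiguous

r1 = a.reshape((3, 2))             # contiguous input: the result can be a view
print(np.shares_memory(a, r1))     # True

t = a.T                            # transposed view, not C-contiguous
print(t.flags['C_CONTIGUOUS'])     # False

r2 = t.reshape(6)                  # this element order needs a copy
print(np.shares_memory(a, r2))     # False
print(r2)                          # [0 3 1 4 2 5]
```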

Another common reshaping pattern is the conversion of a 1-dimensional array into a 2-dimensional row or column matrix. This can be done with the `reshape` method, or more easily by using `np.newaxis` within a slice operation:

In [47]:
grid = np.arange(1, 10).reshape((3, 3))
grid

x = np.array([1,2,3])

# row vector via reshape & np.newaxis
x.reshape((1,3))
x[np.newaxis, :]

# column vector via reshape & np.newaxis
x.reshape((3,1))
x[:, np.newaxis]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

array([[1, 2, 3]])

array([[1, 2, 3]])

array([[1],
       [2],
       [3]])

array([[1],
       [2],
       [3]])
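As a brief sketch of what is happening here: `np.newaxis` is simply an alias for `None`, and `reshape` can also infer one dimension when it is given as `-1`. The names `row`, `col`, and `col2` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])

print(np.newaxis is None)     # True

row = x[np.newaxis, :]        # shape (1, 3)
col = x[:, np.newaxis]        # shape (3, 1)
print(row.shape, col.shape)

# reshape infers the dimension given as -1 from the array's size
col2 = x.reshape((-1, 1))
print(np.array_equal(col, col2))  # True
```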

## Joining and Splitting of Arrays

### Concatenation of arrays
Concatenation, or joining of two arrays in NumPy, is primarily accomplished using the routines `np.concatenate`, `np.vstack`, and `np.hstack`. `np.concatenate` takes a tuple or list of arrays as its first argument, as we can see here:

In [49]:
x = np.array([1, 2, 3])
y = np.array([3, 2 ,1])
np.concatenate([x, y])

array([1, 2, 3, 3, 2, 1])

It can also be used for 2-dimensional arrays; the `axis` argument selects the dimension along which to join (the default is `axis=0`):

In [51]:
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
grid.shape

z = np.concatenate([grid, grid])
z.shape
h = np.concatenate([grid, grid], axis=1)
h.shape

(2, 3)

(4, 3)

(2, 6)

The `np.vstack`, `np.hstack`, and `np.dstack` functions can make the direction of concatenation more explicit. For arrays with three or more dimensions, as in the example here, they concatenate along `axis=0`, `axis=1`, and `axis=2` respectively:

In [52]:
grid = np.random.randint(10, size=(1,2,3))
grid.shape

# axis 0 concat
_vstack = np.vstack([grid, grid])
_vstack.shape

# axis 1 concat
_hstack = np.hstack([grid, grid])
_hstack.shape

# axis 2 concat
_dstack = np.dstack([grid, grid])
_dstack.shape

(1, 2, 3)

(2, 2, 3)

(1, 4, 3)

(1, 2, 6)
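The axis correspondence above holds for the 3-dimensional input used here; lower-dimensional inputs are handled a little differently. The sketch below shows what the three functions do with 1-dimensional arrays (the variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])

v = np.vstack([x, y])   # 1-D inputs become rows
h = np.hstack([x, y])   # 1-D inputs are joined end to end
d = np.dstack([x, y])   # 1-D inputs are promoted to shape (1, N, 1) first

print(v.shape)  # (2, 3)
print(h.shape)  # (6,)
print(d.shape)  # (1, 3, 2)
```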

### Splitting of arrays

The opposite of concatenation is splitting, which is implemented by the functions `np.split`, `np.hsplit`, and `np.vsplit`. For each of these, we can pass a list of indices giving the split points:

In [53]:
x = np.arange(10)
x1, x2, x3 = np.split(x, [3, 5])
x1, x2, x3

(array([0, 1, 2]), array([3, 4]), array([5, 6, 7, 8, 9]))

Notice that *N* split points lead to *N+1* subarrays.
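Besides a list of split points, `np.split` also accepts an integer giving the number of equal sections. A minimal sketch, assuming an illustrative 10-element array: an uneven division raises `ValueError`, while `np.array_split` allows sections of different lengths.

```python
import numpy as np

x = np.arange(10)

halves = np.split(x, 2)          # two equal parts of length 5
print([h.tolist() for h in halves])

try:
    np.split(x, 3)               # 10 elements cannot be divided evenly by 3
    raised = False
except ValueError:
    raised = True
print(raised)                    # True

parts = np.array_split(x, 3)     # uneven division is allowed here
print([len(p) for p in parts])   # [4, 3, 3]
```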

In [59]:
grid = np.arange(16).reshape((4, 4))
grid

print('vsplit is to split at the axis=0')
up, down = np.vsplit(grid, [1])
print(up, '\n', up.shape)  
print(down, '\n', down.shape)

print('hsplit is to split at the axis=1')
left, right = np.hsplit(grid, [1])
print(left, '\n', left.shape)  
print(right, '\n', right.shape)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

vsplit is to split at the axis=0
[[0 1 2 3]] 
 (1, 4)
[[ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]] 
 (3, 4)
hsplit is to split at the axis=1
[[ 0]
 [ 4]
 [ 8]
 [12]] 
 (4, 1)
[[ 1  2  3]
 [ 5  6  7]
 [ 9 10 11]
 [13 14 15]] 
 (4, 3)
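Splitting and stacking are inverse operations, which the following sketch checks by reassembling the pieces. It also illustrates that the pieces share memory with the original array, so they behave like the views discussed earlier.

```python
import numpy as np

grid = np.arange(16).reshape((4, 4))

up, down = np.vsplit(grid, [1])
left, right = np.hsplit(grid, [1])

# Stacking the pieces back together recovers the original array
print(np.array_equal(np.vstack([up, down]), grid))     # True
print(np.array_equal(np.hstack([left, right]), grid))  # True

# The pieces refer to grid's data rather than to independent copies
print(np.shares_memory(up, grid))                      # True
```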

