## Array manipulation

In addition to the array indexing methods, NumPy also provides functions for appending, inserting, and deleting elements of an array. All of these operations require special care with the dimensions of the array.


***
### Axis in an array

Some operations on an array are performed along a given axis of the array. In an n-dimensional array the last dimension corresponds to the columns. For example, in a two-dimensional array the first dimension is the rows (axis=0) and the second dimension is the columns (axis=1). In a three-dimensional array, the first dimension is the depth (axis=0), the second dimension is the rows (axis=1) and the last dimension is the columns (axis=2).

For example for a 2D array:
- Performing an operation across the 0 axis will imply performing the operation across the rows, i.e. for each column. 
- Performing an operation across axis 1 will mean performing the operation across the columns, i.e. for each row. 

![](data/arr_axis.jpg)

Specifying **axis=None** means that the operation is performed on the flattened, **one-dimensional** version of the array.
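
As a small illustration of the axis rules described above, the following sketch shows how the shape of the result changes when a reduction runs along each axis. The array name `demo` and its values are assumptions chosen for this example and are not part of the notebook.

```python
import numpy as np

# Hypothetical example array with shape (depth=2, rows=3, columns=4)
demo = np.arange(24).reshape(2, 3, 4)

# Reducing along an axis removes that axis from the shape of the result
shape_axis0 = demo.sum(axis=0).shape
shape_axis1 = demo.sum(axis=1).shape
shape_axis2 = demo.sum(axis=2).shape
print(shape_axis0, shape_axis1, shape_axis2)

# axis=None reduces over the flattened (one-dimensional) array
total = demo.sum(axis=None)
total_flat = demo.ravel().sum()
print(total, total_flat)
```

The axis that is reduced disappears from the shape, which is a quick way to check that an operation ran along the intended axis.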

In [2]:
# Imports 
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [3]:
# Create an array of (3, 3) with values of 1, 2 and 3 for all elements in rows 1, 2 and 3.
# Print the resulting array
lista = [[n+1]*3 for n in range(3)]
arr = np.array(lista)
print (arr)

[[1 1 1]
 [2 2 2]
 [3 3 3]]


In [4]:
# Averages all the elements of the array
# Print the result
media = arr.mean()
print (media)

2.0


In [5]:
# Performs the average for each column (across rows)
# Print the result
media = arr.mean(axis = 0)
print (media)

[2. 2. 2.]


In [6]:
# Perform the average for each row (across the columns)
media = arr.mean(axis = 1)
print (media)

[1. 2. 3.]


In [7]:
# Creates an array of (2, 3, 4) with random integer values between [0, 10)
# Print the resulting array
np.random.seed(111) # We use one seed so that we all get the same values.
arr = np.random.randint(0, 10, (2, 3, 4))
print (arr)

[[[4 4 4 6]
  [3 9 2 6]
  [2 8 7 9]]

 [[7 1 0 8]
  [2 5 6 8]
  [5 6 5 8]]]


In [10]:
# Sums all elements of the array
# Print the result
suma = arr.sum()
print (suma)

125


In [11]:
# Sum the elements of the array across its first axis (axis=0)
# Sum the elements of the array across its second axis (axis=1)
# Sum the elements of the array across its third axis (axis=2)
# Print the resulting arrays
suma0 = arr.sum(axis=0)
suma1 = arr.sum(axis=1)
suma2 = arr.sum(axis=2)
print (suma0)
print ("-"*15)
print (suma1)
print ("-"*15)
print (suma2)

[[11  5  4 14]
 [ 5 14  8 14]
 [ 7 14 12 17]]
---------------
[[ 9 21 13 21]
 [14 12 11 24]]
---------------
[[18 20 26]
 [16 21 24]]


***
### Array dimensions
Operations such as concatenating, inserting, or appending values require the arrays involved to have the same number of dimensions and compatible shapes.
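
The sketch below illustrates what "compatible" means in practice. It assumes two small example arrays (`a` and `b`, names chosen only for this illustration) and shows that joining a 2D array with a 1D array fails until the 1D array is given an explicit second dimension.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # shape (3, 2)
b = np.array([10, 20, 30])       # shape (3,)

# Joining along axis 1 fails: the inputs have different numbers of dimensions
try:
    np.concatenate((a, b), axis=1)
    failed = False
except ValueError:
    failed = True

# Reshaping b to a column makes the shapes compatible
b_col = b.reshape(3, 1)                        # shape (3, 1)
joined = np.concatenate((a, b_col), axis=1)    # shape (3, 3)
print(failed, joined.shape)
```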

In [12]:
# Create a one-dimensional array with 10 consecutive values of [0, 9].
# From the first array, create a second array of (10, 1)
# From the first array, create a third array of (1, 10)
# Print the three arrays, their number of dimensions and their dimensions
arr1 = np.arange(10)
arr2 = arr1.reshape((10, 1))
arr3 = arr1.reshape((1, 10))

for idx, arr in enumerate((arr1, arr2, arr3)):
    print ("Array {0}".format(idx))
    print (arr)
    print (arr.ndim)
    print (arr.shape)
    print ("-" * 20)

Array 0
[0 1 2 3 4 5 6 7 8 9]
1
(10,)
--------------------
Array 1
[[0]
 [1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]
2
(10, 1)
--------------------
Array 2
[[0 1 2 3 4 5 6 7 8 9]]
2
(1, 10)
--------------------


***
### Exchanging array dimensions (array transpose)

We have two main options for transposing an array:
1. The **array.T** property returns an array with the dimensions swapped (its transpose).
2. The **array.transpose()** method returns an array with the dimensions swapped (its transpose).

Both results are "views" of the original array *(they share memory with the initial array)*.
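
The following sketch illustrates the memory sharing mentioned above. The array names (`base`, `t`, `cube`) are assumptions for this example only. It also shows that, for arrays with more than two dimensions, `transpose()` accepts an explicit axis order.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)
t = base.T                        # shape (3, 2), no data is copied

shares = np.shares_memory(base, t)
t[0, 1] = 99                      # writes through to base[1, 0]
print(shares, base[1, 0])

# For 3D arrays, .T reverses all axes; transpose() can take a custom order
cube = np.zeros((2, 3, 4))
shape_T = cube.T.shape
shape_custom = cube.transpose(1, 0, 2).shape
print(shape_T, shape_custom)
```

Because the transpose is a view, modifying it also modifies the original array; call `.copy()` on the result if an independent array is needed.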

In [13]:
# Creates two lists x and y with random values between 0 and 10
x = [np.random.randint(0, 10) for n in range(10)]
y = [np.random.randint(0, 10) for n in range(10)]

# Creates an array of (2, 10) with the created values
# Print the resulting array
arr = np.array((x, y))
print (arr)

[[6 2 6 4 3 7 9 3 1 6]
 [8 5 4 1 1 7 8 7 9 6]]


In [16]:
# Create an array by transposing the previous array.
# Reshape the previous array to (10, 2) using the reshape method.
# Print the two arrays and note the differences
print ("\nTRANSPOSING THE ARRAY\n" + "-"*22)
arr1 = arr.T
print (arr1)

print ("\nRESHAPE METHOD\n" + "-"*22)
print (arr.reshape((10,2)))


TRANSPOSING THE ARRAY
----------------------
[[6 8]
 [2 5]
 [6 4]
 [4 1]
 [3 1]
 [7 7]
 [9 8]
 [3 7]
 [1 9]
 [6 6]]

RESHAPE METHOD
----------------------
[[6 2]
 [6 4]
 [3 7]
 [9 3]
 [1 6]
 [8 5]
 [4 1]
 [1 7]
 [8 7]
 [9 6]]


***
### Iterating the elements of an array

Iteration over a multidimensional array is performed **along its first axis (axis=0)**. To iterate over all the elements of an array, we can iterate over its one-dimensional version. This 1D version can be obtained with NumPy's ravel function, the flatten method, or by calling reshape(-1).

> `numpy.ravel(arr, order='C')`.

> NumPy function that returns a one-dimensional version of the array 'arr' (a view whenever possible). The order parameter, either 'C' (default) or 'F', determines whether the elements are read row by row ('C', row-major) or column by column ('F', column-major).

***
> `array.flatten(order='C')`.

> Array method that returns a one-dimensional **copy** of the array. The order parameter can be 'C' (default) or 'F', to read the elements row by row ('C', row-major) or column by column ('F', column-major).

***
> `array.reshape(-1)`.

>This method returns a **one-dimensional view** of the array whenever possible (the view shares memory with the original array); otherwise it returns a copy.
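
To make the differences between the three options concrete, here is a minimal sketch using a small example array (`grid`, a name assumed for this illustration). It compares the two `order` values and checks which results share memory with the original.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)    # [[0 1 2], [3 4 5]]

by_rows = grid.ravel(order='C')      # row-major reading
by_cols = grid.ravel(order='F')      # column-major reading
print(by_rows)
print(by_cols)

flat_copy = grid.flatten()           # flatten always returns a copy
flat_view = grid.reshape(-1)         # a view for this contiguous array

copy_shares = np.shares_memory(grid, flat_copy)
view_shares = np.shares_memory(grid, flat_view)
print(copy_shares, view_shares)
```

When the flattened array will be modified and the original must stay intact, `flatten()` is the safer choice; otherwise `ravel()` or `reshape(-1)` avoid an unnecessary copy.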


In [17]:
# Creates a one-dimensional array with 5 consecutive values between [0, 5)
# Iterate the array elements and print each iteration
arr = np.arange(5)
for elemento in arr:
    print (elemento)

0
1
2
3
4


In [18]:
# Create an array of (2, 6) with consecutive values between [10, 21].
# Iterate the elements of this array and print each iteration.
arr = np.arange(10, 22).reshape(2, 6)
for elemento in arr:
    print (elemento)

[10 11 12 13 14 15]
[16 17 18 19 20 21]


In [19]:
# Create an array of (2, 3, 7) with random numbers between [0, 10)
# Print the resulting array
# Iterate over the 3-D array and print each iteration
# (iteration is done along its first axis)
np.random.seed(1111)
arr = np.random.randint(0,10,(2,3,7))
print (arr)
print ("=" * 25)
for elemento in arr:
    print (elemento)
    print ("-" * 25)

[[[7 5 1 2 4 8 6]
  [4 8 6 2 6 8 9]
  [3 4 8 9 9 7 2]]

 [[0 4 8 5 5 1 2]
  [0 7 5 6 2 2 8]
  [0 7 9 0 8 4 5]]]
=========================
[[7 5 1 2 4 8 6]
 [4 8 6 2 6 8 9]
 [3 4 8 9 9 7 2]]
-------------------------
[[0 4 8 5 5 1 2]
 [0 7 5 6 2 2 8]
 [0 7 9 0 8 4 5]]
-------------------------


***
### numpy.unravel_index
> `numpy.unravel_index(index, shape, order='C')`

> NumPy function that converts an index of the one-dimensional (flattened) version of an array into the corresponding tuple of indices for an array of the given shape.
>- *index*: Index (or list of indices) of the one-dimensional version of the array
>- *shape*: Tuple with the shape of the array
>- *order*: Specifies whether the flattening was done by rows ("C", row-major) or by columns ("F", column-major)
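
A minimal worked example may help here. The shape `(2, 3, 4)` and the flat index `7` are values assumed for this sketch. It also uses `numpy.ravel_multi_index`, the inverse function, which is not covered in this notebook.

```python
import numpy as np

shape = (2, 3, 4)

# Flat index 7 in row-major ('C') order: 7 = 0*12 + 1*4 + 3
pos_c = np.unravel_index(7, shape)
# The same flat index in column-major ('F') order: 7 = 1 + 0*2 + 1*6
pos_f = np.unravel_index(7, shape, order='F')
print(pos_c, pos_f)

# ravel_multi_index converts the index tuple back to the flat index
back = np.ravel_multi_index(pos_c, shape)
print(back)
```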

In [20]:
# Iterate all elements of the above array
# Get the positions of the even values greater than or equal to 6
# Print the values and positions
posiciones = []
ind = 0
for elemento in arr.ravel():
    if elemento >= 6 and elemento%2==0:
        posiciones.append(ind)
    ind += 1
pos = np.unravel_index(posiciones, arr.shape)
print (arr[pos])
print (pos)

[8 6 8 6 6 8 8 8 6 8 8]
(array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]), array([0, 0, 1, 1, 1, 1, 2, 0, 1, 1, 2]), array([5, 6, 1, 2, 4, 5, 2, 2, 3, 6, 4]))


In [21]:
# Repeat the above operation
# But now change the even values greater than or equal to 6 to 0.
ind = 0
for elemento in arr.ravel():
    if elemento % 2 == 0 and elemento >= 6:
        pos = np.unravel_index(ind, arr.shape)
        arr[pos] = 0
    ind += 1    

print (arr)

[[[7 5 1 2 4 0 0]
  [4 0 0 2 0 0 9]
  [3 4 0 9 9 7 2]]

 [[0 4 0 5 5 1 2]
  [0 7 5 0 2 2 0]
  [0 7 9 0 0 4 5]]]
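
The two loops above can also be written without explicit iteration. The sketch below is one possible vectorized alternative using a boolean mask and `numpy.nonzero`; it is not the approach used in the notebook. It recreates the array with the same seed, and the names `values`, `mask` and `cleaned` are assumptions for this example.

```python
import numpy as np

np.random.seed(1111)
values = np.random.randint(0, 10, (2, 3, 7))

# Boolean mask: True where the element is even and >= 6
mask = (values >= 6) & (values % 2 == 0)

# Positions as a tuple of index arrays (one per axis), like unravel_index returns
positions = np.nonzero(mask)
selected = values[mask]
print(selected)
print(positions)

# Set the selected elements to 0 on a copy, leaving `values` untouched
cleaned = values.copy()
cleaned[mask] = 0
print(cleaned)
```

For large arrays this form is usually much faster than a Python loop, because the comparison and the assignment run inside NumPy.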


***
### 1. Add elements to an array

#### numpy.append

> `numpy.append(arr, values, axis=None)`

> Append elements to the end of an array along the axis set by axis. If axis is not specified (axis = None), elements are appended to a one-dimensional version of the array. The values to be added must have the correct shape. This function **returns an array** with the added elements.

>- *arr*: Input array
>- *values*: Values to append. Array with the same dimensions as arr (except along the axis where the data is appended).
>- *axis*: Axis along which the values will be appended.
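
As a brief sketch of the two points above (a new array is returned, and the values must have the right shape), consider the following. The arrays `original` and `row` are assumptions for this illustration.

```python
import numpy as np

original = np.arange(6).reshape(2, 3)
row = np.array([[7, 8, 9]])              # shape (1, 3), matches along axis 1

# np.append returns a new array; `original` keeps its shape
extended = np.append(original, row, axis=0)
print(original.shape, extended.shape)

# A 1D array has the wrong number of dimensions when an axis is given
try:
    np.append(original, np.array([7, 8, 9]), axis=0)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Since every call allocates a new array, appending inside a long loop can be slow; collecting the pieces in a list and joining them once is a common alternative.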

In [22]:
# Creates an array of (4, 3) with consecutive numbers
# Print the array and its dimensions
arr = np.arange(12).reshape((4,3))
print (arr)
print (arr.shape)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
(4, 3)


In [23]:
# Add the values [5, 5, 5] to the end of the array without specifying the axis.
# Print the resulting array
valores = np.array([5, 5, 5])
res = np.append(arr, valores)
print (res)

[ 0  1  2  3  4  5  6  7  8  9 10 11  5  5  5]


In [24]:
# Add the values [5, 5, 5] to the end of the array specifying axis 0
# Note that the dimensions must be compatible.
# Print the resulting array
valores = np.array([5, 5, 5]).reshape(1, 3)
res = np.append(arr, valores, axis=0)
print (res)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [ 5  5  5]]


In [25]:
# Add the values [5, 5, 5, 5] to the end of the array specifying axis 1.
# Note that the dimensions must be compatible.
# Print the resulting array
valores = np.array([5, 5, 5, 5]).reshape(4, 1)
res = np.append(arr, valores, axis=1)
print (res)

[[ 0  1  2  5]
 [ 3  4  5  5]
 [ 6  7  8  5]
 [ 9 10 11  5]]


***
### 2. Insert elements into an array
#### numpy.insert
>`numpy.insert(arr, obj, values, axis=None)`

>Inserts elements at a particular position in an array (specified by obj) along the axis set by axis. If axis is not specified (axis = None), elements are inserted into a one-dimensional version of the array. The values to be added must have the correct shape. This function **returns an array** with the inserted elements.
>- *arr*: Input array
>- *obj*: Index (or indices) before which the values are inserted
>- *values*: Values to be inserted. Their shape must be compatible with the shape of arr along the chosen axis.
>- *axis*: Axis along which the values will be inserted.
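
The sketch below shows the simplest cases of `numpy.insert` on small example arrays (`line` and `table`, names assumed for this illustration): a single index, several indices, and the flattening that happens when no axis is given.

```python
import numpy as np

line = np.arange(5)                          # [0 1 2 3 4]

# Insert one value before index 2
one = np.insert(line, 2, 99)
# Insert one value before index 1 and another before index 3 (original indices)
many = np.insert(line, [1, 3], [-1, -2])
print(one)
print(many)

# Without an axis, a 2D array is flattened before inserting
table = np.arange(6).reshape(2, 3)
flat = np.insert(table, 1, 99)
print(flat)
```

Note that the indices always refer to positions in the original array, not in the partially built result.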

In [26]:
# Creates an array of (2, 7) with consecutive numbers
# Print the array and its dimensions
arr = np.arange(14).reshape((2, 7))
print (arr)
print (arr.shape)

[[ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]]
(2, 7)


In [27]:
# Inserts two columns with values of 88 and 99 after the 2nd column (before index 2)
# Print the result
valores = np.array([88, 88, 99, 99]).reshape(2,2)
res = np.insert(arr, 2, valores, axis=1)
print (res)

[[ 0  1 88 99  2  3  4  5  6]
 [ 7  8 88 99  9 10 11 12 13]]


In [28]:
# Inserts a row with values of 99 in the first position of the array
# Print the result
valores = np.array([99]*7).reshape(1,7)
res = np.insert(arr, 0, valores, axis=0)
print (res)

[[99 99 99 99 99 99 99]
 [ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]]


In [29]:
# Inserts three columns with values of 88, before columns 0, 2 and 3 of the original array
# Print the result
valores = np.array([88, 88]).reshape(2, 1)
res = np.insert(arr, (0, 2, 3), valores, axis = 1)
print (res)

[[88  0  1 88  2 88  3  4  5  6]
 [88  7  8 88  9 88 10 11 12 13]]


***
### 3. Delete elements in an array
#### numpy.delete
>`numpy.delete(arr, obj, axis=None)`

>Deletes the elements at the positions specified by obj along the axis set by axis. If axis is not specified (axis = None), elements are deleted from a one-dimensional version of the array. This function **returns a new array** without the deleted elements; the input array is not modified.

>- *arr*: Input array
>- *obj*: Index, or sequence of indices, of the elements to delete
>- *axis*: Axis along which the elements are deleted
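
As a short sketch of the behaviour of `numpy.delete`, the example below uses a small array (`table`, a name assumed here) to show deletion with and without an axis, together with boolean-mask indexing as one possible alternative that is not covered in the notebook.

```python
import numpy as np

table = np.arange(12).reshape(3, 4)

# Without an axis, the array is flattened and flat positions 0 and 5 are removed
flat = np.delete(table, [0, 5])
print(flat)

# With axis=1, columns 0 and 2 are removed
no_cols = np.delete(table, [0, 2], axis=1)
print(no_cols)

# Alternative: keep the wanted columns with a boolean mask
keep = np.ones(table.shape[1], dtype=bool)
keep[[0, 2]] = False
same = table[:, keep]
print(np.array_equal(no_cols, same))
```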

In [30]:
# Creates an array of (4, 4) with random integers between [0, 10)
np.random.seed(1111)
arr = np.random.randint(0, 10, (4,4))
print (arr)

[[7 5 1 2]
 [4 8 6 4]
 [8 6 2 6]
 [8 9 3 4]]


In [31]:
# Deletes the second column
# Prints the original array and the array with the deleted column
res = np.delete(arr, 1, axis = 1)
print (res)
print (arr)

[[7 1 2]
 [4 6 4]
 [8 2 6]
 [8 3 4]]
[[7 5 1 2]
 [4 8 6 4]
 [8 6 2 6]
 [8 9 3 4]]


In [32]:
# Delete the third row
# Print the result
res = np.delete(arr, 2, axis = 0)
print (res)

[[7 5 1 2]
 [4 8 6 4]
 [8 9 3 4]]


In [33]:
# Delete rows 0 and 3
# Print the result
res = np.delete(arr, (0, 3), axis = 0)
print (res)

[[4 8 6 4]
 [8 6 2 6]]


***
### 4. Concatenate arrays
#### numpy.concatenate
>`numpy.concatenate((a1, a2, a3,...), axis=0)`

>Concatenates a sequence of arrays a1, a2, a3 ... along the existing axis specified by axis. If no axis is specified, the default (axis = 0) is used. With axis = None, the arrays are flattened and their 1D versions are concatenated.
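
The effect of the axis argument of `numpy.concatenate` can be seen in the following sketch, which uses two small example arrays (`a` and `b`, names assumed for this illustration).

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# Default axis (0): rows of b are placed below the rows of a
default = np.concatenate((a, b))
print(default.shape)

# axis=None: both arrays are flattened before joining
flat = np.concatenate((a, b), axis=None)
print(flat)

# axis=1: b must first become a column of shape (2, 1)
side = np.concatenate((a, b.T), axis=1)
print(side)
```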

In [34]:
# Create an array of type integer with 10 random x and y coordinates between 0 and 100.
# Create a second array of type integer with 10 elevation values between 650 and 800.
# Concatenate both arrays to have an array of (10, 3)
np.random.seed(12345)
xy_values = np.random.randint(0, 100, (10,2))
z_values = np.random.randint(650, 800, (10,1))
datos = np.concatenate((xy_values, z_values), axis = 1)
print (datos)

[[ 98  29 693]
 [  1  36 673]
 [ 41  34 679]
 [ 29   1 681]
 [ 59  14 747]
 [ 91  80 741]
 [ 73  11 737]
 [ 77  10 686]
 [ 81  82 714]
 [ 38   7 781]]


In [35]:
# Add the following two pieces of data to the above array (use concatenate)
# data --> [30, 40, 790]
# data --> [70, 20, 640]
# Print the results
dato1 = np.array([30, 40, 790]).reshape(1, 3)
dato2 = np.array([70, 20, 640]).reshape(1, 3)
res = np.concatenate((datos, dato1, dato2), axis = 0)
print (res)

[[ 98  29 693]
 [  1  36 673]
 [ 41  34 679]
 [ 29   1 681]
 [ 59  14 747]
 [ 91  80 741]
 [ 73  11 737]
 [ 77  10 686]
 [ 81  82 714]
 [ 38   7 781]
 [ 30  40 790]
 [ 70  20 640]]


***
### 5. Join arrays

To join arrays we have several functions:

>`numpy.stack((a1, a2, a3,...), axis=0)`

> Join different arrays along a new axis.

***
>`numpy.vstack((a1, a2, a3,...))`

>Join different arrays vertically. 

***
>`numpy.hstack((a1, a2, a3,...))`

> Join different arrays horizontally.
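
To clarify how the three functions differ, here is a minimal sketch with two example arrays (`a` and `b`, names assumed for this illustration). It shows how the `axis` argument of `stack` decides where the new axis appears, and that for 2D inputs `vstack` and `hstack` give the same result as `concatenate` along axis 0 and axis 1.

```python
import numpy as np

a = np.zeros((4, 4))
b = np.ones((4, 4))

# stack creates a NEW axis; its position is set by `axis`
shape_first = np.stack((a, b), axis=0).shape
shape_middle = np.stack((a, b), axis=1).shape
shape_last = np.stack((a, b), axis=-1).shape
print(shape_first, shape_middle, shape_last)

# For 2D arrays, vstack and hstack join along EXISTING axes
v_equal = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
h_equal = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))
print(v_equal, h_equal)
```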

In [36]:
# Create three arrays of (4, 4) with values of 1, 2 and 3.
# Join the three arrays into a three-dimensional array
# Print the result and its dimensions
arr1 = np.ones((4,4))
arr2 = np.ones((4,4))
arr3 = np.ones((4,4))
arr1.fill(1)
arr2.fill(2)
arr3.fill(3)

res = np.stack((arr1, arr2, arr3), axis=0)
print (res)
print (res.shape)

[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]]]
(3, 4, 4)


In [37]:
# Join the three arrays vertically
# Print the result
res = np.vstack((arr1, arr2, arr3))
print (res)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]


In [38]:
# Join the three arrays horizontally
# Print the result
res = np.hstack((arr1, arr2, arr3))
print (res)

[[1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
 [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
 [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
 [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]]
