### Stacking

Stacking arrays means placing them one on top of the other or side by side, so we can stack vertically or horizontally.

Of course, the arrays being stacked must have compatible shapes.

For vertical stacking, each array must have the same number of columns.

For horizontal stacking (side by side), each array must have the same number of rows.

In [1]:
import numpy as np

In [2]:
a1 = np.arange(1, 6)
a1

array([1, 2, 3, 4, 5])

In [3]:
a2 = np.arange(1, 11).reshape(2, 5)
a2

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [4]:
s1 = np.vstack((a1, a2))
s1

array([[ 1,  2,  3,  4,  5],
       [ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])
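
It may help to look at the shapes involved in a stack like this one. The sketch below uses the illustrative names `row`, `block` and `stacked` (they are not part of the notebook above) to show that `vstack` treats a 1-D array of length 5 as a single row with 5 columns before stacking it:

```python
import numpy as np

row = np.arange(1, 6)                    # 1-D array, shape (5,)
block = np.arange(1, 11).reshape(2, 5)   # 2-D array, shape (2, 5)

# vstack treats the 1-D array as one row of 5 columns
stacked = np.vstack((row, block))

print(row.shape)      # (5,)
print(block.shape)    # (2, 5)
print(stacked.shape)  # (3, 5)
```

So the "same number of columns" rule is satisfied here because the length of the 1-D array matches the number of columns of the 2-D array.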

If we try to vstack two arrays with a different number of columns, we'll get an exception:

In [5]:
try:
    np.vstack((np.arange(3), a2))
except ValueError as ex:
    print(ex)

all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 5


When we stack arrays, what happens to the data type?

An array has a single data type, so NumPy picks one for the result that is broad enough to store the elements of every array being stacked.

For example, if we stack an array of integers and an array of floats, we'll end up with an array of floats:

In [6]:
a1 = np.array([1, 2, 3, 4])
a2 = np.array([0.1, 0.2, 0.3, 0.4])
result = np.vstack((a1, a2))
result

array([[1. , 2. , 3. , 4. ],
       [0.1, 0.2, 0.3, 0.4]])

In [7]:
result.dtype

dtype('float64')

We could also have two arrays with different integer types:

In [8]:
a1 = np.array([1, 2, 3, 4], dtype=np.uint8)
a2 = np.array([1000, 2000, 3000, 4000], dtype=np.uint16)
result = np.vstack((a1, a2))
result

array([[   1,    2,    3,    4],
       [1000, 2000, 3000, 4000]], dtype=uint16)

As you can see, NumPy picked the broadest data type needed for this particular stack.
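
If we want to know the resulting type before stacking, one option is `np.result_type`, which reports the type NumPy would promote to. This is a small sketch with illustrative variable names (`small`, `large`, `predicted`), not part of the original notebook:

```python
import numpy as np

small = np.array([1, 2, 3, 4], dtype=np.uint8)
large = np.array([1000, 2000, 3000, 4000], dtype=np.uint16)

# Ask NumPy which dtype it would promote to, without stacking anything
predicted = np.result_type(small, large)
print(predicted)                              # uint16

# The stacked array ends up with the same dtype
print(np.vstack((small, large)).dtype)        # uint16

# The same function explains the integer/float example
print(np.result_type(np.int64, np.float64))   # float64
```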

We can stack more than two arrays, and NumPy will again adjust the data type if needed:

In [9]:
np.vstack(
    (
        np.arange(5), 
        np.linspace(0, 1, 5),
        np.eye(5)
    )
)

array([[0.  , 1.  , 2.  , 3.  , 4.  ],
       [0.  , 0.25, 0.5 , 0.75, 1.  ],
       [1.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 1.  , 0.  , 0.  ],
       [0.  , 0.  , 0.  , 1.  , 0.  ],
       [0.  , 0.  , 0.  , 0.  , 1.  ]])

Note that the resulting type is not necessarily one of the types of the arrays being stacked:

In [10]:
a1 = np.array([1, 2], dtype=np.uint8)
a2 = np.array([2, 3], dtype=np.uint64)
a3 = np.array([4, 5], dtype=np.int32)

result = np.vstack([a1, a2, a3])
result

array([[1., 2.],
       [2., 3.],
       [4., 5.]])

In [11]:
result.dtype

dtype('float64')

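As you can see, NumPy decided to use `float64`: no integer type can hold every value of both `uint64` and `int32`, so NumPy falls back to a floating-point type.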
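
To see where the `float64` comes from, we can look at the promotion pair by pair with `np.result_type`. This is only a sketch of the promotion rules for these three types:

```python
import numpy as np

# Two unsigned types: the wider unsigned type is enough
print(np.result_type(np.uint8, np.uint64))             # uint64

# Unsigned 64-bit and a signed type: no integer type covers both ranges
print(np.result_type(np.uint64, np.int32))             # float64

# All three together, as in the stack above
combined = np.result_type(np.uint8, np.uint64, np.int32)
print(combined)                                        # float64
```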

If you really want to control that, you will need to use arrays that have the same `dtype`.

We can easily convert an array from one type to another, by using the `astype` method:

In [12]:
a1

array([1, 2], dtype=uint8)

In [13]:
a1.astype(np.float32)

array([1., 2.], dtype=float32)

So we can use this when we stack the arrays. Of course, we have to be careful to choose a data type that will not result in data loss.

In [14]:
result = np.vstack(
    (
        a1.astype(np.int64), 
        a2.astype(np.int64),
        a3.astype(np.int64)
    )
)
result

array([[1, 2],
       [2, 3],
       [4, 5]])

In [15]:
result.dtype

dtype('int64')
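
Regarding the warning about data loss: one way to check a conversion before using `astype` is `np.can_cast`, which by default only accepts casts that preserve every possible value. The example below is a sketch; the `values` and `wrapped` arrays are made up for illustration:

```python
import numpy as np

# Safe: every uint8 and int32 value fits in an int64
print(np.can_cast(np.uint8, np.int64))    # True
print(np.can_cast(np.int32, np.int64))    # True

# Not safe: large uint64 values do not fit in an int64
print(np.can_cast(np.uint64, np.int64))   # False

# astype does not check values, so an unsafe cast silently changes data
values = np.array([300], dtype=np.int64)
wrapped = values.astype(np.uint8)
print(wrapped)                            # [44], since 300 does not fit in 8 bits
```

In the stack above, casting `a2` from `uint64` to `int64` works because its values are small, but `np.can_cast` reports the cast as unsafe in general.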

**IMPORTANT**

Unlike reshaped arrays, stacked arrays do not "share" their elements with the original arrays: stacking copies the data into a new array.

In [16]:
a1 = np.array([1, 2, 3, 4])
a2 = np.array([10, 20, 30, 40])
result = np.vstack((a1, a2))
result

array([[ 1,  2,  3,  4],
       [10, 20, 30, 40]])

In [17]:
a1[0] = 100
a1

array([100,   2,   3,   4])

In [18]:
result

array([[ 1,  2,  3,  4],
       [10, 20, 30, 40]])

In [19]:
result[1, 0] = -10
result

array([[  1,   2,   3,   4],
       [-10,  20,  30,  40]])

In [20]:
a2

array([10, 20, 30, 40])

As we can see, modifying the original array does **not** affect the stacked array, and modifying the stacked array does **not** affect the original array.
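
We can also check this directly instead of modifying elements. The sketch below uses `np.shares_memory` to compare a reshaped array with a stacked one (the names `base`, `reshaped` and `stacked` are illustrative):

```python
import numpy as np

base = np.arange(6)

reshaped = base.reshape(2, 3)       # a view on the same data
stacked = np.vstack((base, base))   # a new array holding copies

print(np.shares_memory(base, reshaped))  # True
print(np.shares_memory(base, stacked))   # False
```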

The `hstack` function works similarly, except that it stacks arrays side by side.

In [21]:
a1 = np.linspace(0, 5, 10).reshape(5, 2)
a1

array([[0.        , 0.55555556],
       [1.11111111, 1.66666667],
       [2.22222222, 2.77777778],
       [3.33333333, 3.88888889],
       [4.44444444, 5.        ]])

In [22]:
np.random.seed(0)
a2 = np.random.randint(0, 10, 10).reshape(5, 2)
a2

array([[5, 0],
       [3, 3],
       [7, 9],
       [3, 5],
       [2, 4]])

In [23]:
result = np.hstack((a1, a2))
result

array([[0.        , 0.55555556, 5.        , 0.        ],
       [1.11111111, 1.66666667, 3.        , 3.        ],
       [2.22222222, 2.77777778, 7.        , 9.        ],
       [3.33333333, 3.88888889, 3.        , 5.        ],
       [4.44444444, 5.        , 2.        , 4.        ]])
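
As with `vstack`, the shapes have to be compatible. The following sketch, using made-up arrays, shows that `hstack` raises a `ValueError` when the 2-D arrays have a different number of rows, and that 1-D arrays are simply joined end to end:

```python
import numpy as np

left = np.zeros((5, 2))
right = np.ones((3, 2))   # 3 rows instead of 5

mismatch_raised = False
try:
    np.hstack((left, right))
except ValueError:
    # Row counts differ, so the arrays cannot sit side by side
    mismatch_raised = True
print(mismatch_raised)    # True

# 1-D arrays have no rows to match: they are joined end to end
flat = np.hstack((np.array([1, 2, 3]), np.array([4, 5])))
print(flat)               # [1 2 3 4 5]
```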