In [48]:
import torch

# **Merging Tensors: 5 functions you should be aware of**

"*PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.*" 

 - from the PyTorch documentation

In machine learning we often work with pandas DataFrames and the various ways to merge them. We can think of that as merging 2-dimensional data structures. But since tensors can have 3 or more dimensions, it is essential to know the ways in which they can be merged.

Here we look at 5 functions you should know to understand the different ways of merging tensors.


**Function 1 - torch.cat**

**Function 2 - torch.stack**

**Function 3 - torch.unsqueeze**

**Function 4 - torch.hstack**

**Function 5 - torch.vstack**



**N.B:** 

Before we dive into merging, it's best to remember the following: dimensions are numbered starting from 0, similar to Python indexing. Merging two 3-D tensors can be tricky to picture, so it helps to think of each dimension visually.

Merging along 
> **dimension 0** is like merging two tensors **channel-wise**  visually

> **dimension 1** is like merging two tensors **row-wise** visually

> **dimension 2** is like merging two tensors **column-wise** visually



Hence a tensor with shape (2,3,4) would look like it has 2 channels, each containing 3 rows and 4 columns (tensor t displayed below for reference), and a tensor with shape (4,1,3) looks like it has 4 channels, each containing 1 row and 3 columns.
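To make the channel/row/column picture concrete, here is a minimal sketch that merges two tensors of shape (2,3,4) along each dimension in turn and records the resulting shape. It uses torch.cat, which is introduced as Function 1 below; the names `a`, `b` and `shapes` are just illustrative.

```python
import torch

# Two tensors with the same shape: 2 "channels", 3 rows, 4 columns.
a = torch.zeros(2, 3, 4)
b = torch.ones(2, 3, 4)

# Merging along a dimension grows only that dimension.
shapes = {dim: tuple(torch.cat((a, b), dim=dim).shape) for dim in range(a.dim())}

for dim, shape in shapes.items():
    print(f"dim={dim}: {shape}")
# dim=0: (4, 3, 4)  -> more channels
# dim=1: (2, 6, 4)  -> more rows
# dim=2: (2, 3, 8)  -> more columns
```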



In [49]:
t =torch.tensor([[[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
print(t)
print(t.size())

tensor([[[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
torch.Size([2, 3, 4])


# **Function 1 - torch.cat**

*torch.cat(tensors, dim=0, *, out=None) → Tensor*

**Concatenates the given sequence of seq tensors in the given dimension.** All tensors must either have the same shape (except in the concatenating dimension) or be empty.


This is one of the most common ways in which two tensors can be merged. 

**Note:** In order to make the size of tensors apparent, I'll be using ***torch.full*** to create tensors. 

In [50]:
import torch

#Example 1.1 -(working)

t1 =torch.full((2,2,2),4)
print('t1:\n',t1)

t2 =torch.full((2,2,2),7)
print('t2:\n',t2)

t3 =torch.cat((t1,t2))
print('t3:\n',t3)
print('t3 size:\n',t3.size())

t1:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]]])
t2:
 tensor([[[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t3:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]],

        [[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t3 size:
 torch.Size([4, 2, 2])


**In simple words, we specify the tensors we want to concatenate and mention the dimension in which we want the concatenation to occur.**

If no dimension is specified, it concatenates along dimension 0. 

In [51]:
#The above is same as torch.cat((t1,t2), dim=0) 
t4 =torch.cat((t1,t2), dim=0)
print('t4:\n',t4)
print('t4 size:\n',t4.size())

t4:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]],

        [[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t4 size:
 torch.Size([4, 2, 2])


**While using torch.cat, the tensors can have different sizes, but only along the dimension in which they are being merged. All other dimensions must match.**

In [52]:
#Example 1.2 -(working)
t5 =torch.full((2,1,1),6)
print('t5:\n',t5)

t6 =torch.full((2,1,3),8)
print('t6:\n',t6)

t7 =torch.cat((t5,t6), dim=2) #merging along dimension 2 (third dimension)
print('t7:\n',t7)
print('t7 size:\n',t7.size())

t5:
 tensor([[[6]],

        [[6]]])
t6:
 tensor([[[8, 8, 8]],

        [[8, 8, 8]]])
t7:
 tensor([[[6, 8, 8, 8]],

        [[6, 8, 8, 8]]])
t7 size:
 torch.Size([2, 1, 4])


**ERROR:** While using torch.cat, an error typically occurs in one of two ways:

Case 1: the tensors have different sizes in a dimension other than dimension 0, and no dimension is specified (so torch.cat defaults to dimension 0).

Case 2: the tensors have different sizes in a dimension other than the one specified in torch.cat.

In [53]:
#Example 1.31 -(breaking)
#Case1

t5 =torch.full((2,1,1),6)
print('t5:\n',t5)

t6 =torch.full((2,1,3),8)
print('t6:\n',t6)

t8 =torch.cat((t5,t6)) # dimension not specified
print('t8:\n',t8)
print('t8 size:\n',t8.size())

t5:
 tensor([[[6]],

        [[6]]])
t6:
 tensor([[[8, 8, 8]],

        [[8, 8, 8]]])


RuntimeError: ignored

In [None]:
#Example 1.32 -(breaking)
#Case2

t5 =torch.full((2,1,1),6)
print('t5:\n',t5)

t6 =torch.full((2,1,3),8)
print('t6:\n',t6)

t9 =torch.cat((t5,t6), dim=1) # merging along dimension 1 (Second dimension)
print('t9:\n',t9)
print('t9 size:\n',t9.size())

## **SUMMARY**

*torch.cat* **can be used when we need to merge two tensors along a dimension specified by us, as long as the sizes of the dimensions** (other than the one along which we are merging) **remain the same.**
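The shape rule in this summary can be written down as a small check. The sketch below is only an illustration: `can_cat` is a hypothetical helper (not part of PyTorch), and it deliberately ignores the special case of empty tensors mentioned in the documentation.

```python
import torch

def can_cat(tensors, dim=0):
    """Hypothetical helper: True if the shapes allow torch.cat along `dim`."""
    first = tensors[0]
    dim = dim % first.dim()  # turn a negative dim into its positive equivalent
    for t in tensors[1:]:
        if t.dim() != first.dim():
            return False
        for d in range(first.dim()):
            # Every dimension except the merging one must match.
            if d != dim and t.size(d) != first.size(d):
                return False
    return True

t5 = torch.full((2, 1, 1), 6)
t6 = torch.full((2, 1, 3), 8)

# Only dimension 2 differs between t5 and t6, so only dim=2 is allowed.
checks = {dim: can_cat((t5, t6), dim=dim) for dim in range(3)}
print(checks)  # {0: False, 1: False, 2: True}
```

This matches the examples above: dim=2 works (Example 1.2), while the default dim=0 and dim=1 both fail (Examples 1.31 and 1.32).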

# **Function 2 - torch.stack**

*torch.stack(tensors, dim=0, *, out=None) → Tensor*

**Concatenates a sequence of tensors along a new dimension.** All tensors need to be of the same size.


To see the difference between **torch.cat** and **torch.stack**, let's take example 1.1 used for torch.cat and use torch.stack instead. (Remember that in example 1.1, the dimension was zero.)

In [54]:
#Example 2.1 -(working) (same example used in torch.cat (example 1.1))

t1 =torch.full((2,2,2),4)
print('t1:\n',t1)

t2 =torch.full((2,2,2),7)
print('t2:\n',t2)

t10 =torch.stack((t1,t2))
print('t10:\n',t10)
print('t10 size:\n',t10.size())

t1:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]]])
t2:
 tensor([[[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t10:
 tensor([[[[4, 4],
          [4, 4]],

         [[4, 4],
          [4, 4]]],


        [[[7, 7],
          [7, 7]],

         [[7, 7],
          [7, 7]]]])
t10 size:
 torch.Size([2, 2, 2, 2])


**By looking at the output size, it is clear that the output has 4 dimensions. So what actually happened? In order to explain that, we need to introduce a new function,** *torch.unsqueeze*.

 
# **Function 3 - torch.unsqueeze**

*torch.unsqueeze(input, dim) → Tensor*

**Returns a new tensor with a dimension of size one inserted at the specified position.** The returned tensor shares the same underlying data with this tensor.
    


A dim value within the range [-input.dim() - 1, input.dim() + 1) can be used. Negative dim will correspond to unsqueeze() applied at dim = dim + input.dim() + 1.
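As a rough illustration of the two statements above, the sketch below maps each negative dim to its positive equivalent using the quoted rule, and then checks that the unsqueezed tensor really shares data with the original. The variable names (`x`, `mapping`, `same`, `view`) are made up for this example.

```python
import torch

x = torch.full((2, 1, 3), 8)

# Map every negative dim to its positive equivalent: dim + x.dim() + 1.
mapping = {dim: dim + x.dim() + 1 for dim in range(-x.dim() - 1, 0)}

# Both spellings should produce the same shape.
same = {
    neg: torch.unsqueeze(x, neg).shape == torch.unsqueeze(x, pos).shape
    for neg, pos in mapping.items()
}
print(mapping)  # {-4: 0, -3: 1, -2: 2, -1: 3}
print(same)

# The result is a view: writing through it changes the original tensor too.
view = torch.unsqueeze(x, 0)
view[0, 0, 0, 0] = -1
print(x[0, 0, 0])
```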

In [55]:
# Example 3.1 - (working)

t1 =torch.full((2,2,2),4)
print('t1:\n',t1)

t1_unsqueezed =torch.unsqueeze(t1, dim=0) #unsqueezed along dim 0
print('t1_unsqueezed:\n',t1_unsqueezed)
print('t1_unsqueezed size:\n',t1_unsqueezed.size())


t1:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]]])
t1_unsqueezed:
 tensor([[[[4, 4],
          [4, 4]],

         [[4, 4],
          [4, 4]]]])
t1_unsqueezed size:
 torch.Size([1, 2, 2, 2])


We can see that torch.unsqueeze **returned a tensor with a new dimension of size 1 inserted at the specified position (dimension 0), pushing the original size at dimension 0 to dimension 1, the original size at dimension 1 to dimension 2 and so on, so that the result has one more dimension than the initial tensor.**

The best way to understand this is by considering an analogy. 

Let's picture a 3-dimensional tensor as 3 people standing in a straight line.

When we unsqueeze that tensor along dimension 0, we introduce a new person of size 1 at the leftmost position of the line. The line now has 4 people, just like a tensor with 4 dimensions.

When we unsqueeze that tensor along dimension 1, we introduce a new person of size 1 at the second position from the left. The person in 2nd position moves to 3rd and the person in 3rd position moves to 4th, while the person in 1st position stays put. Again the line has 4 people, just like a tensor with 4 dimensions.

In the same way, unsqueeze can introduce a new dimension of size 1 at any position we specify.
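The line analogy can be checked directly. In this small sketch (names chosen only for illustration), the three people have sizes 2, 3 and 4, and the new person of size 1 is inserted at every possible position.

```python
import torch

line = torch.zeros(2, 3, 4)  # three "people" with sizes 2, 3 and 4

# Insert the new size-1 "person" at each possible position in the line.
new_shapes = [tuple(line.unsqueeze(pos).shape) for pos in range(line.dim() + 1)]

for pos, shape in enumerate(new_shapes):
    print(f"dim={pos}: {shape}")
# dim=0: (1, 2, 3, 4)
# dim=1: (2, 1, 3, 4)
# dim=2: (2, 3, 1, 4)
# dim=3: (2, 3, 4, 1)
```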

In [56]:
# Example 3.2 - (working)
t6 =torch.full((2,1,3),8)
print('t6:\n',t6)

t6_unsqueezed =torch.unsqueeze(t6, dim=-1) 
print('t6_unsqueezed:\n',t6_unsqueezed)
print('t6_unsqueezed size:\n',t6_unsqueezed.size())

t6:
 tensor([[[8, 8, 8]],

        [[8, 8, 8]]])
t6_unsqueezed:
 tensor([[[[8],
          [8],
          [8]]],


        [[[8],
          [8],
          [8]]]])
t6_unsqueezed size:
 torch.Size([2, 1, 3, 1])


**Similar to how -1 in a Python list means the last entry, here dim=-1 means the new dimension is inserted last, so this 3-D tensor is unsqueezed along dim 3.**

**ERROR:** 

Unsqueeze can introduce an additional dimension of size 1, but dim must lie in the range [-input.dim() - 1, input.dim() + 1).

This means that for a 3-dimensional tensor the valid dim values are the integers from -4 to 3 inclusive. If we specify a dimension outside this range, an error is raised.
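A quick way to probe the boundaries of that range is to try the values just inside and just outside it and catch the exception. This is only a sketch; the `errors` dictionary is an illustrative name, and the exact wording of the error message may differ between PyTorch versions.

```python
import torch

x = torch.full((2, 2, 2), 7)

errors = {}
for dim in (-5, -4, 3, 4):
    try:
        torch.unsqueeze(x, dim)
        errors[dim] = None            # dim was accepted
    except IndexError as exc:
        errors[dim] = str(exc)        # dim was out of range

for dim, message in errors.items():
    print(dim, "ok" if message is None else f"IndexError: {message}")
```

For a 3-dimensional tensor, -4 and 3 should be accepted while -5 and 4 should be rejected.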

In [57]:
# Example 3.3 - (breaking)

t2 =torch.full((2,2,2),7)
print('t2:\n',t2)

t2_unsqueezed =torch.unsqueeze(t2, dim= 4) 
print('t2_unsqueezed:\n',t2_unsqueezed)
print('t2_unsqueezed size:\n',t2_unsqueezed.size())

t2:
 tensor([[[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])


IndexError: ignored

## **SUMMARY**

*torch.unsqueeze* **can be used when we need to create a new tensor with one additional dimension of size 1, at a desired dimension, from an already existing tensor.**

**NOW LET'S GET BACK TO** torch.stack()

In [58]:
#unsqueezing t2 along dimension 0

t2 =torch.full((2,2,2),7)
print('t2:\n',t2)

t2_unsqueezed =torch.unsqueeze(t2, dim= 0) 
print('t2_unsqueezed:\n',t2_unsqueezed)
print('t2_unsqueezed size:\n',t2_unsqueezed.size())



t2:
 tensor([[[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t2_unsqueezed:
 tensor([[[[7, 7],
          [7, 7]],

         [[7, 7],
          [7, 7]]]])
t2_unsqueezed size:
 torch.Size([1, 2, 2, 2])


In [59]:

#Revisiting Example 2.1

t1 =torch.full((2,2,2),4)
print('t1:\n',t1)

t2 =torch.full((2,2,2),7)
print('t2:\n',t2)

print('t1_unsqueezed:\n', t1_unsqueezed)
print('t2_unsqueezed:\n', t2_unsqueezed)

#comparing torch.cat and torch.stack

t10 =torch.stack((t1,t2))
print('t10:\n',t10)

t3_new = torch.cat((t1_unsqueezed,t2_unsqueezed),dim=0)
print('t3_new:\n',t3_new)


print('t10 size:\n',t10.size())
print('t3_new size:\n',t3_new.size())

t1:
 tensor([[[4, 4],
         [4, 4]],

        [[4, 4],
         [4, 4]]])
t2:
 tensor([[[7, 7],
         [7, 7]],

        [[7, 7],
         [7, 7]]])
t1_unsqueezed:
 tensor([[[[4, 4],
          [4, 4]],

         [[4, 4],
          [4, 4]]]])
t2_unsqueezed:
 tensor([[[[7, 7],
          [7, 7]],

         [[7, 7],
          [7, 7]]]])
t10:
 tensor([[[[4, 4],
          [4, 4]],

         [[4, 4],
          [4, 4]]],


        [[[7, 7],
          [7, 7]],

         [[7, 7],
          [7, 7]]]])
t3_new:
 tensor([[[[4, 4],
          [4, 4]],

         [[4, 4],
          [4, 4]]],


        [[[7, 7],
          [7, 7]],

         [[7, 7],
          [7, 7]]]])
t10 size:
 torch.Size([2, 2, 2, 2])
t3_new size:
 torch.Size([2, 2, 2, 2])


**From this comparison it is clear that when we call torch.stack((t1,t2)) along dim=0, what we are essentially doing is unsqueezing t1 and t2 along dim 0 and then concatenating them with torch.cat along dim 0.**
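The same relationship appears to hold for every valid dim, not only dim 0. The sketch below, which uses illustrative tensors `a` and `b` of shape (2,3,4), compares torch.stack with "unsqueeze then torch.cat" for each position.

```python
import torch

a = torch.full((2, 3, 4), 4)
b = torch.full((2, 3, 4), 7)

results = {}
for dim in range(a.dim() + 1):
    stacked = torch.stack((a, b), dim=dim)
    # Same thing by hand: add a size-1 dimension at `dim`, then concatenate there.
    via_cat = torch.cat((a.unsqueeze(dim), b.unsqueeze(dim)), dim=dim)
    results[dim] = (tuple(stacked.shape), torch.equal(stacked, via_cat))

for dim, (shape, equal) in results.items():
    print(f"dim={dim}: shape={shape}, equal={equal}")
# dim=0: shape=(2, 2, 3, 4), equal=True
# dim=1: shape=(2, 2, 3, 4), equal=True
# dim=2: shape=(2, 3, 2, 4), equal=True
# dim=3: shape=(2, 3, 4, 2), equal=True
```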

In [60]:

# Example 2.2 - (working)

t11 =torch.full((2,6,7,5),4)
#print('t11:\n',t11)          # commenting this out due to the lengthy result.Uncomment this to get the result

t12 =torch.full((2,6,7,5),7)
#print('t12:\n',t12)          # commenting this out due to the lengthy result.Uncomment this to get the result

t13 =torch.stack((t11,t12), dim=-1)
#print('t13:\n',t13)          # commenting this out due to the lengthy result.Uncomment this to get the result
print('t13 size:\n',t13.size())

t13 size:
 torch.Size([2, 6, 7, 5, 2])


**Just like torch.unsqueeze, dim can be in the range [-input.dim() - 1, input.dim() + 1), where input is any one of the tensors being stacked.**

In [61]:
# Example 2.3 - (breaking)

t11 =torch.full((2,6,7,5),4)
#print('t11:\n',t11)          # commenting this out due to the lengthy result.Uncomment this to get the result

t14 =torch.full((2,6,7,1),7)
#print('t14:\n',t14)          # commenting this out due to the lengthy result.Uncomment this to get the result

t15 =torch.stack((t11,t14), dim= 0)
#print('t15:\n',t15)          # commenting this out due to the lengthy result.Uncomment this to get the result
print('t15 size:\n',t15.size())

RuntimeError: ignored

**For torch.stack to work, both tensors must have the same size in every dimension. If the sizes differ in even one dimension, an error is raised.**
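To contrast this with torch.cat, here is a minimal sketch using the same two shapes as Example 2.3. It catches the error from torch.stack and then shows that torch.cat can still merge the tensors along the one dimension where they differ. The names `stack_failed` and `merged` are illustrative only.

```python
import torch

a = torch.full((2, 6, 7, 5), 4)
b = torch.full((2, 6, 7, 1), 7)

# torch.stack needs identical shapes, so this is expected to fail.
try:
    torch.stack((a, b), dim=0)
    stack_failed = False
except RuntimeError as exc:
    stack_failed = True
    print("torch.stack failed:", exc)

# torch.cat still works along dimension 3, the only one where the sizes differ.
merged = torch.cat((a, b), dim=3)
print(tuple(merged.shape))  # (2, 6, 7, 6)
```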

## **SUMMARY**

*torch.stack* **can be used when we need to concatenate two tensors along a new dimension.**

 
# **Function 4 - torch.hstack**

*torch.hstack(tensors, *, out=None) → Tensor*

**Stack tensors in sequence horizontally (column wise).** This is equivalent to concatenation along the first axis for 1-D tensors, and along the second axis for all other tensors.

torch.hstack is a **special case of torch.cat** where merging column-wise is the only acceptable merging.

In [62]:
#Example 4.1 -(working)

t16 =torch.full((1,2),4)
print('t16:\n',t16)         

t17 =torch.full((1,3),7)
print('t17:\n',t17)          

t18 =torch.hstack((t16,t17))
print('t18:\n',t18)         
print('t18 size:\n',t18.size())

t16:
 tensor([[4, 4]])
t17:
 tensor([[7, 7, 7]])
t18:
 tensor([[4, 4, 7, 7, 7]])
t18 size:
 torch.Size([1, 5])


**Since the above two tensors are 2-D, merging column-wise means merging along dimension 1. The sizes of the tensors can differ only in dimension 1 here; otherwise an error is raised.**

In [63]:
#Example 4.2 -(working)

t19 =torch.full((2,1,4),4)
print('t19:\n',t19)         

t20 =torch.full((2,2,4),7)
print('t20:\n',t20)          

t21 =torch.hstack((t19,t20))
print('t21:\n',t21)         
print('t21 size:\n',t21.size())

t19:
 tensor([[[4, 4, 4, 4]],

        [[4, 4, 4, 4]]])
t20:
 tensor([[[7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7]]])
t21:
 tensor([[[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[4, 4, 4, 4],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
t21 size:
 torch.Size([2, 3, 4])


**Although we know that hstack merges two tensors column-wise, the result looks quite different when we compare the 3-D input and output tensors.** 

**If you look carefully, you can see that both input tensors have 2 channels: each channel of the first tensor (t19) contains 1 row and 4 columns, while each channel of the second tensor (t20) contains 2 rows and 4 columns.**

**The output looks as if the first channel of t19 was combined row-wise with the first channel of t20, and likewise for the second channels. This is because hstack always merges along dimension 1 for tensors with 2 or more dimensions, and in a 3-D tensor dimension 1 is the row dimension.** 

In [64]:
#Example 4.3 -(breaking)

t11 =torch.full((2,6,7,5),4)
#print('t11:\n',t11)          # commenting this out due to the lengthy result.Uncomment this to get the result

t14 =torch.full((2,6,7,1),7)
#print('t14:\n',t14)          # commenting this out due to the lengthy result.Uncomment this to get the result

t22 =torch.hstack((t11,t14))
#print('t22:\n',t22)          # commenting this out due to the lengthy result.Uncomment this to get the result
print('t22 size:\n',t22.size())

RuntimeError: ignored

**torch.hstack only merges column-wise for 2-D tensors, or along dimension 1 for tensors with 3 or more dimensions. Here the two tensors differ in size along dimension 3, which is not dimension 1, so an error is raised.**

## **SUMMARY**

*torch.hstack* **can be used when we need to concatenate tensors along dimension 1 (or along dimension 0 if the tensors are 1-D).**
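The equivalence with torch.cat described in this section can be sketched as follows, once for 1-D inputs and once for 3-D inputs. The tensors `v1`, `v2`, `m1` and `m2` are illustrative values, not taken from the examples above.

```python
import torch

# 1-D inputs: hstack concatenates along dimension 0.
v1 = torch.tensor([1, 2])
v2 = torch.tensor([3, 4, 5])
same_1d = torch.equal(torch.hstack((v1, v2)), torch.cat((v1, v2), dim=0))

# 2-D and higher: hstack concatenates along dimension 1.
m1 = torch.full((2, 1, 4), 4)
m2 = torch.full((2, 2, 4), 7)
same_3d = torch.equal(torch.hstack((m1, m2)), torch.cat((m1, m2), dim=1))

print(same_1d, same_3d)                      # True True
print(tuple(torch.hstack((m1, m2)).shape))   # (2, 3, 4)
```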

 
# **Function 5 - torch.vstack**

*torch.vstack(tensors, *, out=None) → Tensor*

**Stack tensors in sequence vertically (row wise).** This is equivalent to concatenation along the first axis after all 1-D tensors have been reshaped by torch.atleast_2d().

torch.vstack is a **special case of torch.cat** where merging row-wise is the only acceptable merging.

In [65]:
#Example 5.1 -(working)

t16 =torch.full((1,2),4)
print('t16:\n',t16)         

t17 =torch.full((3,2),7)
print('t17:\n',t17)          

t23 =torch.vstack((t16,t17))
print('t23:\n',t23)         
print('t23 size:\n',t23.size())

t16:
 tensor([[4, 4]])
t17:
 tensor([[7, 7],
        [7, 7],
        [7, 7]])
t23:
 tensor([[4, 4],
        [7, 7],
        [7, 7],
        [7, 7]])
t23 size:
 torch.Size([4, 2])


**This is similar to hstack, except that it only accepts row-wise merging.**

In [66]:
#Example 5.2 -(working)

t19 =torch.full((2,3,4),4)
print('t19:\n',t19)         

t20 =torch.full((3,3,4),7)
print('t20:\n',t20)          

t24 =torch.vstack((t19,t20))
print('t24:\n',t24)         
print('t24 size:\n',t24.size())

t19:
 tensor([[[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]]])
t20:
 tensor([[[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
t24:
 tensor([[[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
t24 size:
 torch.Size([5, 3, 4])


**In torch.vstack, merging occurs only along dimension 0, hence for tensors with 2 or more dimensions we could say that torch.vstack is the same as the default torch.cat.**

In [67]:
#Comparing torch.vstack and default torch.cat (meaning dim=0)

t24 =torch.vstack((t19,t20)) #from the earlier example
print('t24:\n',t24)         
print('t24 size:\n',t24.size()) 

t25 =torch.cat((t19,t20)) 
print('t25:\n',t25)         
print('t25 size:\n',t25.size()) 

t24:
 tensor([[[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
t24 size:
 torch.Size([5, 3, 4])
t25:
 tensor([[[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[4, 4, 4, 4],
         [4, 4, 4, 4],
         [4, 4, 4, 4]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]],

        [[7, 7, 7, 7],
         [7, 7, 7, 7],
         [7, 7, 7, 7]]])
t25 size:
 torch.Size([5, 3, 4])


**HENCE PROVED**

In [68]:
#Example 5.3 -(breaking)

t11 =torch.full((2,6,7,5),4)
#print('t11:\n',t11)          # commenting this out due to the lengthy result.Uncomment this to get the result

t14 =torch.full((2,6,7,1),7)
#print('t14:\n',t14)          # commenting this out due to the lengthy result.Uncomment this to get the result

t26 =torch.vstack((t11,t14))
#print('t26:\n',t26)          # commenting this out due to the lengthy result.Uncomment this to get the result
print('t26 size:\n',t26.size())

RuntimeError: ignored

**torch.vstack only merges row-wise for 2-D tensors, or along dimension 0 for tensors with 3 or more dimensions. Here the two tensors differ in size along dimension 3, which is not dimension 0, so an error is raised.**

## **SUMMARY**

*torch.vstack* **can be used when we need to concatenate tensors along dimension 0 (1-D tensors are first reshaped into rows).**
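One case where torch.vstack and the default torch.cat differ is 1-D input, because vstack first turns each 1-D tensor into a row. The sketch below shows both the matching case and the differing case; all variable names are illustrative.

```python
import torch

# 2-D and higher: vstack matches torch.cat along dimension 0.
m1 = torch.full((2, 3, 4), 4)
m2 = torch.full((3, 3, 4), 7)
same_nd = torch.equal(torch.vstack((m1, m2)), torch.cat((m1, m2), dim=0))

# 1-D inputs: vstack makes each tensor a row, torch.cat joins them end to end.
v1 = torch.tensor([1, 2, 3])
v2 = torch.tensor([4, 5, 6])
stacked_rows = torch.vstack((v1, v2))
flat = torch.cat((v1, v2), dim=0)

print(same_nd)                     # True
print(tuple(stacked_rows.shape))   # (2, 3)
print(tuple(flat.shape))           # (6,)
```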