![cover](cover/12.%20Concatenation.png)

#### Outline

* Concat Function
* Ignore Index
* Axis
* Join
  * Inner
  * Outer
* Keys
* Names
* Sort

### Concatenation

The `concat()` function lets us stack multiple pandas objects, such as Series and DataFrames, into a single object. The diagram below shows how two objects are stacked depending on the `axis` parameter:

**Figure 1**

*concat() function demonstration*

<img src="diagrams/concat_axis.png" alt="concat axis diagram" width="700">

*A diagram depicting the concat() function applied to two objects along axis 0 and axis 1; the results on the right show the objects stacked vertically (axis 0) and horizontally (axis 1)*

##### Import Pandas

In [1]:
import pandas

##### Import Datasets

In [2]:
# dataframe 1
df1 = pandas.DataFrame({
    'num': [1, 2, 3, 4, 5],
    'roman': ['I', 'II', 'III', 'IV', 'V']
})
# dataframe 2
df2 = pandas.DataFrame({
    'num': [6, 7, 8, 9, 10],
    'roman': ['VI', 'VII', 'VIII', 'IX', 'X']
})

### concat()

`pandas.concat([a,b,...,n])`

This function concatenates multiple pandas objects, passed as a list, into a single object. The examples below cover both kinds of object:

* Series
* DataFrame

#### Series

In [3]:
s1 = pandas.Series([1,2,3,4])
s2 = pandas.Series([5,6,7,8])
pandas.concat([s1, s2]) # concatenate two series

0    1
1    2
2    3
3    4
0    5
1    6
2    7
3    8
dtype: int64

#### Dataframes

In [4]:
pandas.concat([df1, df2])   # concatenate two dataframes

|   | num | roman |
|---|-----|-------|
| 0 | 1 | I |
| 1 | 2 | II |
| 2 | 3 | III |
| 3 | 4 | IV |
| 4 | 5 | V |
| 0 | 6 | VI |
| 1 | 7 | VII |
| 2 | 8 | VIII |
| 3 | 9 | IX |
| 4 | 10 | X |
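
The signature above takes a list, so it is not limited to two objects. The sketch below assumes three small, made-up dataframes (they are not the tutorial's `df1` and `df2`) and shows that the pieces are stacked in list order, each keeping its own index:

```python
import pandas

# Three illustrative dataframes with the same columns (hypothetical data).
frames = [
    pandas.DataFrame({'num': [1, 2], 'roman': ['I', 'II']}),
    pandas.DataFrame({'num': [3, 4], 'roman': ['III', 'IV']}),
    pandas.DataFrame({'num': [5, 6], 'roman': ['V', 'VI']}),
]

# Any number of objects can be passed in the list.
stacked = pandas.concat(frames)

print(stacked['num'].tolist())  # [1, 2, 3, 4, 5, 6]
print(stacked.index.tolist())   # [0, 1, 0, 1, 0, 1]
```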


#### Change Positions

In [5]:
pandas.concat([s2, s1]) # swap the positions

0    5
1    6
2    7
3    8
0    1
1    2
2    3
3    4
dtype: int64

In [6]:
pandas.concat([df2, df1]) # swap the positions

|   | num | roman |
|---|-----|-------|
| 0 | 6 | VI |
| 1 | 7 | VII |
| 2 | 8 | VIII |
| 3 | 9 | IX |
| 4 | 10 | X |
| 0 | 1 | I |
| 1 | 2 | II |
| 2 | 3 | III |
| 3 | 4 | IV |
| 4 | 5 | V |


### ignore_index

`pandas.concat(ignore_index = True)`

Ignores the index labels of the original objects and gives the new object a fresh index running from `0` to `n-1`, where `n` is the total number of rows.

In [7]:
pandas.concat([s1, s2], ignore_index=True)  # ignore the original index of each object, use a new index

0    1
1    2
2    3
3    4
4    5
5    6
6    7
7    8
dtype: int64

In [8]:
pandas.concat([df1, df2], ignore_index=True)    # ignore the original index of each object, use a new index

|   | num | roman |
|---|-----|-------|
| 0 | 1 | I |
| 1 | 2 | II |
| 2 | 3 | III |
| 3 | 4 | IV |
| 4 | 5 | V |
| 5 | 6 | VI |
| 6 | 7 | VII |
| 7 | 8 | VIII |
| 8 | 9 | IX |
| 9 | 10 | X |
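
One practical reason to reach for `ignore_index` is that the default result carries duplicate labels, so a label lookup can return more than one row. A minimal sketch, reusing the two series from the earlier cells:

```python
import pandas

s1 = pandas.Series([1, 2, 3, 4])
s2 = pandas.Series([5, 6, 7, 8])

kept = pandas.concat([s1, s2])                      # original labels are kept
fresh = pandas.concat([s1, s2], ignore_index=True)  # labels are renumbered 0..7

print(kept.index.is_unique)   # False
print(kept.loc[0].tolist())   # [1, 5]  (label 0 appears twice)
print(fresh.index.is_unique)  # True
print(fresh.loc[0])           # 1
```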


### axis
`pandas.concat(axis = n)`, where `n` can be `0` or `1`

Sets the axis along which we concatenate: `0` stacks the objects vertically (the default) and `1` places them side by side.

In [9]:
pandas.concat([s1, s2], axis=1) # change the concatenation axis to horizontal

|   | 0 | 1 |
|---|---|---|
| 0 | 1 | 5 |
| 1 | 2 | 6 |
| 2 | 3 | 7 |
| 3 | 4 | 8 |


In [10]:
pandas.concat([df1, df2], axis=1)   # change the concatenation axis to horizontal

|   | num | roman | num | roman |
|---|-----|-------|-----|-------|
| 0 | 1 | I | 6 | VI |
| 1 | 2 | II | 7 | VII |
| 2 | 3 | III | 8 | VIII |
| 3 | 4 | IV | 9 | IX |
| 4 | 5 | V | 10 | X |
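
Concatenating along `axis=1` keeps the column names of every input, so the result can contain repeated names. The sketch below uses shortened, made-up versions of the two dataframes to show the repeated names, and one possible way to tell them apart by combining `axis=1` with the `keys` parameter covered later in this section (the labels `'first'` and `'second'` are arbitrary):

```python
import pandas

df1 = pandas.DataFrame({'num': [1, 2], 'roman': ['I', 'II']})
df2 = pandas.DataFrame({'num': [6, 7], 'roman': ['VI', 'VII']})

# Plain horizontal concatenation: both 'num' and 'roman' appear twice.
side_by_side = pandas.concat([df1, df2], axis=1)
print(side_by_side.columns.tolist())  # ['num', 'roman', 'num', 'roman']

# With keys, each source gets its own top-level column label.
labelled = pandas.concat([df1, df2], axis=1, keys=['first', 'second'])
print(labelled['second']['num'].tolist())  # [6, 7]
```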


### join
`pandas.concat(join = '')`

Determines whether to take the union (`outer`, the default) or the intersection (`inner`) of the labels on the other axis: the columns when `axis=0`, or the rows when `axis=1`.

In [11]:
# dataframe 1
df1 = pandas.DataFrame({
    'A' : [1, 2],
    'B' : [3, 4]
})

# dataframe 2
df2 = pandas.DataFrame({
    'B' : [5, 6],
    'C' : [7, 8]
})

#### Inner

In [12]:
pandas.concat([df1, df2], join='inner') # inner join keeps only the columns shared by both dataframes ('B')

|   | B |
|---|---|
| 0 | 3 |
| 1 | 4 |
| 0 | 5 |
| 1 | 6 |


#### Outer

In [13]:
pandas.concat([df1, df2], join='outer') # outer join keeps every column from both dataframes ('A', 'B' and 'C'), filling the gaps with NaN

|   | A | B | C |
|---|---|---|---|
| 0 | 1.0 | 3 | NaN |
| 1 | 2.0 | 4 | NaN |
| 0 | NaN | 5 | 7.0 |
| 1 | NaN | 6 | 8.0 |
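
The examples above join on columns because they concatenate along `axis=0`. As a sketch of the other case, the snippet below concatenates along `axis=1`, where `join` applies to the row labels instead. The dataframes `left` and `right` and their string indexes are made up for illustration:

```python
import pandas

# Two hypothetical dataframes whose row labels only partly overlap.
left = pandas.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])
right = pandas.DataFrame({'B': [4, 5, 6]}, index=['y', 'z', 'w'])

inner = pandas.concat([left, right], axis=1, join='inner')  # shared rows only
outer = pandas.concat([left, right], axis=1, join='outer')  # every row

print(sorted(inner.index))  # ['y', 'z']
print(sorted(outer.index))  # ['w', 'x', 'y', 'z']
print(inner.loc['y'].tolist())          # [2, 4]
print(pandas.isna(outer.loc['x', 'B'])) # True  ('x' has no value in right)
```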


### Keys
`pandas.concat(keys=['', '',...,''])`

Labels each of the concatenated objects; the labels become the outermost level of a MultiIndex on the result.

In [14]:
pandas.concat([df1, df2], keys=['dataframe 1', 'dataframe 2'])  # label each dataframe

|   |   | A | B | C |
|---|---|---|---|---|
| dataframe 1 | 0 | 1.0 | 3 | NaN |
| dataframe 1 | 1 | 2.0 | 4 | NaN |
| dataframe 2 | 0 | NaN | 5 | 7.0 |
| dataframe 2 | 1 | NaN | 6 | 8.0 |
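
Because the keys form the outer index level, they can be used afterwards to pull an individual piece back out of the result. A minimal sketch using the same two dataframes:

```python
import pandas

df1 = pandas.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pandas.DataFrame({'B': [5, 6], 'C': [7, 8]})

labelled = pandas.concat([df1, df2], keys=['dataframe 1', 'dataframe 2'])

# Select every row that came from the first dataframe.
first = labelled.loc['dataframe 1']
print(first.index.tolist())  # [0, 1]
print(first['B'].tolist())   # [3, 4]

# Select a single cell by (key, original index) and column.
print(labelled.loc[('dataframe 2', 0), 'C'])  # 7.0
```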


### Names
`pandas.concat(names = ['', '',...,''])`

Names the levels of the resulting MultiIndex.

In [15]:
pandas.concat([df1, df2], keys=['dataframe 1', 'dataframe 2'], names=['dataframe', 'index'])    # name the two index levels

| dataframe | index | A | B | C |
|---|---|---|---|---|
| dataframe 1 | 0 | 1.0 | 3 | NaN |
| dataframe 1 | 1 | 2.0 | 4 | NaN |
| dataframe 2 | 0 | NaN | 5 | 7.0 |
| dataframe 2 | 1 | NaN | 6 | 8.0 |
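
Once the levels have names, those names can be used in later operations instead of level positions. The sketch below shows two such uses, `xs()` and `reset_index()`; which one is appropriate depends on what you want to do with the result:

```python
import pandas

df1 = pandas.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pandas.DataFrame({'B': [5, 6], 'C': [7, 8]})

labelled = pandas.concat(
    [df1, df2],
    keys=['dataframe 1', 'dataframe 2'],
    names=['dataframe', 'index'],
)
print(list(labelled.index.names))  # ['dataframe', 'index']

# Take a cross-section by level name rather than by position.
second = labelled.xs('dataframe 2', level='dataframe')
print(second['B'].tolist())        # [5, 6]

# Turn the named levels into ordinary columns.
flat = labelled.reset_index()
print(flat.columns.tolist())       # ['dataframe', 'index', 'A', 'B', 'C']
```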


### Sort
`pandas.concat(sort = True)`

Decides whether to sort the columns of the result lexicographically (`sort=True`) or to keep them in their order of appearance (`sort=False`, the default).

In [16]:
# dataframe 1
df1 = pandas.DataFrame({
    'A' : [1, 2],
    'C' : [3, 4]
})

# dataframe 2
df2 = pandas.DataFrame({
    'B' : [5, 6]
})

In [17]:
pandas.concat([df1, df2])   # default, unsorted

|   | A | C | B |
|---|---|---|---|
| 0 | 1.0 | 3.0 | NaN |
| 1 | 2.0 | 4.0 | NaN |
| 0 | NaN | NaN | 5.0 |
| 1 | NaN | NaN | 6.0 |


In [18]:
pandas.concat([df1, df2], sort=True)    # now in alphabetical order

|   | A | B | C |
|---|---|---|---|
| 0 | 1.0 | NaN | 3.0 |
| 1 | 2.0 | NaN | 4.0 |
| 0 | NaN | 5.0 | NaN |
| 1 | NaN | 6.0 | NaN |
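
To compare the two behaviours without reading the full tables, it can help to look only at the column order. The sketch below passes `sort=False` explicitly rather than relying on the default, since the default is assumed here to match the pandas version used in this notebook:

```python
import pandas

df1 = pandas.DataFrame({'A': [1, 2], 'C': [3, 4]})
df2 = pandas.DataFrame({'B': [5, 6]})

# Columns in order of appearance: df1's columns first, then df2's.
as_given = pandas.concat([df1, df2], sort=False)
print(as_given.columns.tolist())      # ['A', 'C', 'B']

# Columns sorted lexicographically.
alphabetical = pandas.concat([df1, df2], sort=True)
print(alphabetical.columns.tolist())  # ['A', 'B', 'C']
```

Only the column order differs; the rows and their values are the same in both results.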


### For Source code:
https://sites.google.com/view/aorbtech/programming/

#### @Aorb Tech