---   
 <img align="left" width="75" height="75"  src="https://upload.wikimedia.org/wikipedia/en/c/c8/University_of_the_Punjab_logo.png"> 

<h1 align="center">Department of Data Science</h1>
<h1 align="center">Course: Tools and Techniques for Data Science</h1>

---
<h3><div align="right">Instructor: Muhammad Arif Butt, Ph.D.</div></h3>    

### _Merging, Joining, Concatenating and Appending Dataframes_
<img align="right" width="400" height="400"  src="images/pandas-apps.png"  >

## Learning agenda of this notebook

**PART-I: (Merging and Joining)**
1. Merging DataFrames using `pd.merge()` method
   - Perform **Inner Join** (which is default)
   - Perform **Outer**/**Full Outer Join**
   - Perform **Left Outer Join**
   - Perform **Right Outer Join**<br><br>
2. Additional Parameters to `pd.merge()` Method  
   - Use of `indicator` parameter to show which dataframe each row comes from
   - Use of `suffixes` parameter to differentiate between common column labels
   - Use of `validate` parameter to check for duplicate keys
   
**PART-II: (Concatenating and Appending)**    

3. Row wise Concatenation using `pd.concat()`

4. Column wise Concatenation using `pd.concat()`

5. Adding a Single Row/Column in a Dataframe using `pd.concat()`

6. Appending Dataframes using `df.append()`

## 1. Merging DataFrames using `pd.merge()` Method
Pandas `pd.merge()` is a versatile method to perform all standard database join operations between DataFrame or named Series objects.

```
pd.merge(left, right, how="inner", indicator=False, on=None, suffixes=("_x", "_y"), validate=None)
```
Where,
- **`left`:** A DataFrame or named Series object.
- **`right`:** Another DataFrame or named Series object.
- **`how`:** specifies the type of join {`inner`, `outer`, `left`, `right`} (default is `inner`)
- **`on`:** Column or index level names to join on. Must be found in both the left and right DataFrame and/or Series objects. 
- **`indicator`:** If set to True, adds a column called **`_merge`** to the output DataFrame with information on the source of each row: `left_only` means the key is present only in the left dataframe, `right_only` means it is present only in the right dataframe, and `both` means it is present in both.
- **`suffixes`:** A tuple of string suffixes to apply to overlapping columns. Defaults to ('_x', '_y').
- **`validate`:** If specified, checks for uniqueness of keys. This parameter can take following four values (default is None):
    - “one_to_one” or “1:1”: checks if merge keys are unique in both left and right datasets.
    - “one_to_many” or “1:m”: checks if merge keys are unique in left dataset.
    - “many_to_one” or “m:1”: checks if merge keys are unique in right dataset.
    - “many_to_many” or “m:m”: allowed, but does not result in checks.

### a. Inner  Join:

It is the most common type of join you’ll be working with. It returns a dataframe with only those rows whose join-key values appear in both dataframes.
An inner join requires each row in the two joined dataframes to have matching column values. This is similar to the intersection of two sets.
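
The set-intersection analogy can be checked directly. The sketch below uses two small made-up frames (`left` and `right` are illustrative names, not part of this notebook's data) and assumes the join keys are unique in each frame:

```python
import pandas as pd

# Hypothetical mini-frames, used only for this illustration
left = pd.DataFrame({"city": ["Lagos", "Abuja", "Kano"], "temperature": [39, 23, 34]})
right = pd.DataFrame({"city": ["Lagos", "Abuja", "Kwara"], "humidity": [79, 93, 88]})

# Keys present in both frames, computed as a plain Python set intersection
common = set(left["city"]) & set(right["city"])

# The inner join keeps exactly the rows whose key is in that intersection
inner = pd.merge(left, right, on="city", how="inner")

print(sorted(common))
print(inner)
```

With duplicate keys the row count of the join can exceed the size of the intersection, so the comparison is made on the set of keys, not on the number of rows.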

<img align="center" width="900" height="600"  src="images/join-inner.png"  >

In [14]:
# Let's create a simple data frame
import pandas as pd

# This dataframe has no entries for Kwara and Jigawa

df_temp = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Ebonyi', 'Edo', 'Kano', 'Delta'],
    'temperature': [39, 23, 40, 32, 33, 28, 34, 26]
})
df_temp

Unnamed: 0,city,temperature
0,Lagos,39
1,Abuja,23
2,Abia,40
3,Kaduna,32
4,Ebonyi,33
5,Edo,28
6,Kano,34
7,Delta,26


In [12]:

df_hum = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Kwara', 'Delta', 'Edo', 'Jigawa'],
    'humidity': [79, 93, 80, 63, 88, 98, 70, 77]
})
df_hum

Unnamed: 0,city,humidity
0,Lagos,79
1,Abuja,93
2,Abia,80
3,Kaduna,63
4,Kwara,88
5,Delta,98
6,Edo,70
7,Jigawa,77


**Note that the two dataframes have only six cities in common in the `city` column, on which we want to perform the inner join. So the resulting dataframe will have only those six rows.**

In [15]:
d1 = pd.merge(df_temp, df_hum, how='inner')
d1

Unnamed: 0,city,temperature,humidity
0,Lagos,39,79
1,Abuja,23,93
2,Abia,40,80
3,Kaduna,32,63
4,Edo,28,70
5,Delta,26,98


In [19]:
# Naming the key with on='city' gives the same result; only cities common to both dataframes are kept (inner join is the default)
d1 = pd.merge(df_temp, df_hum, on='city', how='inner', indicator=True)
d1

Unnamed: 0,city,temperature,humidity,_merge
0,Lagos,39,79,both
1,Abuja,23,93,both
2,Abia,40,80,both
3,Kaduna,32,63,both
4,Edo,28,70,both
5,Delta,26,98,both


In [21]:
# Swapping the two dataframes changes only the row and column order of the output in case of an inner join
d1 = pd.merge(df_hum, df_temp, on='city', how='inner', indicator=True)
d1

Unnamed: 0,city,humidity,temperature,_merge
0,Lagos,79,39,both
1,Abuja,93,23,both
2,Abia,80,40,both
3,Kaduna,63,32,both
4,Delta,98,26,both
5,Edo,70,28,both


### b. Full Join:
Also known as Full Outer Join. It returns all records from both dataframes: rows with matching keys are combined, and rows without a match in the other dataframe are kept with NaN in the missing columns. This is similar to the union of two sets.

<img align="center" width="900" height="600"  src="images/join-fullouter.png"  >

In [24]:
d2 = pd.merge(df_temp, df_hum, on='city', how='outer', indicator=True)
d2

Unnamed: 0,city,temperature,humidity,_merge
0,Lagos,39.0,79.0,both
1,Abuja,23.0,93.0,both
2,Abia,40.0,80.0,both
3,Kaduna,32.0,63.0,both
4,Ebonyi,33.0,,left_only
5,Edo,28.0,70.0,both
6,Kano,34.0,,left_only
7,Delta,26.0,98.0,both
8,Kwara,,88.0,right_only
9,Jigawa,,77.0,right_only


In [25]:
# Swapping the two dataframes changes only the row/column order and the `_merge` labels in case of an outer join
d3 = pd.merge(df_hum, df_temp, on='city', how='outer', indicator=True)
d3

Unnamed: 0,city,humidity,temperature,_merge
0,Lagos,79.0,39.0,both
1,Abuja,93.0,23.0,both
2,Abia,80.0,40.0,both
3,Kaduna,63.0,32.0,both
4,Kwara,88.0,,left_only
5,Delta,98.0,26.0,both
6,Edo,70.0,28.0,both
7,Jigawa,77.0,,left_only
8,Ebonyi,,33.0,right_only
9,Kano,,34.0,right_only
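
As a quick check on the outer join above, the `_merge` column can be summarised with `value_counts()`. This is a minimal sketch that rebuilds the same two dataframes so that it runs on its own; the names `outer` and `counts` are introduced here only for illustration:

```python
import pandas as pd

df_temp = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Ebonyi', 'Edo', 'Kano', 'Delta'],
    'temperature': [39, 23, 40, 32, 33, 28, 34, 26]
})
df_hum = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Kwara', 'Delta', 'Edo', 'Jigawa'],
    'humidity': [79, 93, 80, 63, 88, 98, 70, 77]
})

outer = pd.merge(df_temp, df_hum, on='city', how='outer', indicator=True)

# How many rows came from both frames, only the left one, or only the right one
counts = outer['_merge'].value_counts()
print(counts)
```

The three counts add up to the number of distinct cities across both dataframes, which is the size of the set union.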


###  c. Left Join
Also known as Left Outer Join. It performs an inner join and additionally keeps all the non-matching rows of the left dataframe as they are, filling the columns of the right dataframe with NaN.

<img align="center" width="900" height="600"  src="images/join-leftouter.png"  >

In [26]:
# In left outer join, it takes all the rows from left dataframe and only common rows from right dataframe
d3 = pd.merge(df_temp, df_hum, on='city', how='left', indicator=True)
d3

Unnamed: 0,city,temperature,humidity,_merge
0,Lagos,39,79.0,both
1,Abuja,23,93.0,both
2,Abia,40,80.0,both
3,Kaduna,32,63.0,both
4,Ebonyi,33,,left_only
5,Edo,28,70.0,both
6,Kano,34,,left_only
7,Delta,26,98.0,both


In [27]:
d4=pd.merge(df_hum, df_temp, on='city', how='left', indicator=True)
d4

Unnamed: 0,city,humidity,temperature,_merge
0,Lagos,79,39.0,both
1,Abuja,93,23.0,both
2,Abia,80,40.0,both
3,Kaduna,63,32.0,both
4,Kwara,88,,left_only
5,Delta,98,26.0,both
6,Edo,70,28.0,both
7,Jigawa,77,,left_only
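
One common use of a left join with `indicator=True` is to list the rows of the left dataframe that have no partner in the right one (sometimes called an anti-join). The following is a sketch of that idea on the same data; `left_join` and `only_in_temp` are names chosen here for illustration:

```python
import pandas as pd

df_temp = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Ebonyi', 'Edo', 'Kano', 'Delta'],
    'temperature': [39, 23, 40, 32, 33, 28, 34, 26]
})
df_hum = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Kwara', 'Delta', 'Edo', 'Jigawa'],
    'humidity': [79, 93, 80, 63, 88, 98, 70, 77]
})

left_join = pd.merge(df_temp, df_hum, on='city', how='left', indicator=True)

# Keep only the rows that exist in df_temp but found no match in df_hum
only_in_temp = left_join.loc[left_join['_merge'] == 'left_only', ['city', 'temperature']]
print(only_in_temp)
```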


### d. Right  Join
Also known as Right Outer Join. It performs an inner join and additionally keeps all the non-matching rows of the right dataframe as they are, filling the columns of the left dataframe with NaN.

<img align="center" width="900" height="600"  src="images/join-rightouter.png"  >

In [28]:
# In Right outer join, it takes all the rows from Right dataframe and only common rows from left dataframe
d5 = pd.merge(df_temp, df_hum, on='city', how='right', indicator=True)
d5

Unnamed: 0,city,temperature,humidity,_merge
0,Lagos,39.0,79,both
1,Abuja,23.0,93,both
2,Abia,40.0,80,both
3,Kaduna,32.0,63,both
4,Kwara,,88,right_only
5,Delta,26.0,98,both
6,Edo,28.0,70,both
7,Jigawa,,77,right_only


In [29]:
d6 = pd.merge(df_hum, df_temp, on='city', how='right', indicator=True)
d6

Unnamed: 0,city,humidity,temperature,_merge
0,Lagos,79.0,39,both
1,Abuja,93.0,23,both
2,Abia,80.0,40,both
3,Kaduna,63.0,32,both
4,Ebonyi,,33,right_only
5,Edo,70.0,28,both
6,Kano,,34,right_only
7,Delta,98.0,26,both


### 2. Additional Parameters to `pd.merge()` Method

####  Use of `suffixes` Parameter
- When you merge dataframes that share column labels other than the one you are joining on (`city`), the resulting dataframe gets suffixes (`_x`, `_y`) appended to those labels to differentiate b/w the columns of the two dataframes
- For more readable labels, you can pass your own pair of suffixes through the `suffixes` parameter
- Let us understand this by example

In [32]:
df1 = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Ebonyi', 'Edo', 'Kano', 'Delta'],
    'temperature': [39, 23, 40, 32, 33, 28, 34, 26],
    'humidity' : [67,99,86,90,76,86,88,92]
})
df1

Unnamed: 0,city,temperature,humidity
0,Lagos,39,67
1,Abuja,23,99
2,Abia,40,86
3,Kaduna,32,90
4,Ebonyi,33,76
5,Edo,28,86
6,Kano,34,88
7,Delta,26,92


In [36]:
df2 = pd.DataFrame({
    'city': ['Lagos', 'Abuja', 'Abia', 'Kaduna', 'Kwara', 'Delta', 'Edo', 'Jigawa'],
    'temperature' : [32,21,22,44,33,35,37,40],
    'humidity': [79, 93, 80, 63, 88, 98, 70, 77]
})
df2

Unnamed: 0,city,temperature,humidity
0,Lagos,32,79
1,Abuja,21,93
2,Abia,22,80
3,Kaduna,44,63
4,Kwara,33,88
5,Delta,35,98
6,Edo,37,70
7,Jigawa,40,77


In [37]:
df3 = pd.merge(df1, df1, on='city', how='inner')
df3

Unnamed: 0,city,temperature_x,humidity_x,temperature_y,humidity_y
0,Lagos,39,67,39,67
1,Abuja,23,99,23,99
2,Abia,40,86,40,86
3,Kaduna,32,90,32,90
4,Ebonyi,33,76,33,76
5,Edo,28,86,28,86
6,Kano,34,88,34,88
7,Delta,26,92,26,92


- **Note that `merge` has automatically appended suffixes with column labels to differentiate b/w columns of both dataframes**
- **You can use the `suffixes` parameter of the `pd.merge()` method to replace `_x` and `_y` with something more meaningful.**

In [38]:
d3 = pd.merge(df1, df2, on='city', how='inner', suffixes=('_df1', '_df2'))
d3

Unnamed: 0,city,temperature_df1,humidity_df1,temperature_df2,humidity_df2
0,Lagos,39,67,32,79
1,Abuja,23,99,21,93
2,Abia,40,86,22,80
3,Kaduna,32,90,44,63
4,Edo,28,86,37,70
5,Delta,26,92,35,98


####  Use `validate` Parameter to Check for Duplicate Keys
- We can use the `validate` parameter to the `pd.merge()` method to check for uniqueness of keys. This parameter can take following four values (default is None):
    - `one_to_one` or `1:1`: checks if merge keys are unique in both left and right datasets.
    - `one_to_many` or `1:m`: checks if merge keys are unique in left dataset.
    - `many_to_one` or `m:1`: checks if merge keys are unique in right dataset.
    - `many_to_many` or `m:m`: allowed, but does not result in checks.

In [41]:
df_temp

Unnamed: 0,city,temperature
0,Lagos,39
1,Abuja,23
2,Abia,40
3,Kaduna,32
4,Ebonyi,33
5,Edo,28
6,Kano,34
7,Delta,26


In [42]:
df_hum

Unnamed: 0,city,humidity
0,Lagos,79
1,Abuja,93
2,Abia,80
3,Kaduna,63
4,Kwara,88
5,Delta,98
6,Edo,70
7,Jigawa,77


In [43]:
df_1 = pd.concat([df_temp, df_hum], ignore_index=True)
df_1

Unnamed: 0,city,temperature,humidity
0,Lagos,39.0,
1,Abuja,23.0,
2,Abia,40.0,
3,Kaduna,32.0,
4,Ebonyi,33.0,
5,Edo,28.0,
6,Kano,34.0,
7,Delta,26.0,
8,Lagos,,79.0
9,Abuja,,93.0


>**`one_to_one` or `1:1`: checks if merge keys are unique in both left and right dataframes; if not, a `MergeError` is raised**

In [44]:
df_temp, df_hum

(     city  temperature
 0   Lagos           39
 1   Abuja           23
 2    Abia           40
 3  Kaduna           32
 4  Ebonyi           33
 5     Edo           28
 6    Kano           34
 7   Delta           26,
      city  humidity
 0   Lagos        79
 1   Abuja        93
 2    Abia        80
 3  Kaduna        63
 4   Kwara        88
 5   Delta        98
 6     Edo        70
 7  Jigawa        77)

In [45]:
pd.merge(df_temp, df_hum, on='city', how='outer', validate='one_to_one')

Unnamed: 0,city,temperature,humidity
0,Lagos,39.0,79.0
1,Abuja,23.0,93.0
2,Abia,40.0,80.0
3,Kaduna,32.0,63.0
4,Ebonyi,33.0,
5,Edo,28.0,70.0
6,Kano,34.0,
7,Delta,26.0,98.0
8,Kwara,,88.0
9,Jigawa,,77.0


>**`one_to_many` or `1:m`: checks if merge keys are unique in left dataframe; if not, a `MergeError` is raised**

In [46]:
df_temp, df_hum

(     city  temperature
 0   Lagos           39
 1   Abuja           23
 2    Abia           40
 3  Kaduna           32
 4  Ebonyi           33
 5     Edo           28
 6    Kano           34
 7   Delta           26,
      city  humidity
 0   Lagos        79
 1   Abuja        93
 2    Abia        80
 3  Kaduna        63
 4   Kwara        88
 5   Delta        98
 6     Edo        70
 7  Jigawa        77)

In [47]:
pd.merge(df_temp, df_hum, on='city', how='outer', validate='one_to_many')

Unnamed: 0,city,temperature,humidity
0,Lagos,39.0,79.0
1,Abuja,23.0,93.0
2,Abia,40.0,80.0
3,Kaduna,32.0,63.0
4,Ebonyi,33.0,
5,Edo,28.0,70.0
6,Kano,34.0,
7,Delta,26.0,98.0
8,Kwara,,88.0
9,Jigawa,,77.0


>**`many_to_one` or `m:1`: checks if merge keys are unique in right dataframe; if not, a `MergeError` is raised**

In [48]:
pd.merge(df_temp, df_hum,on='city', how='outer', validate='many_to_one')

Unnamed: 0,city,temperature,humidity
0,Lagos,39.0,79.0
1,Abuja,23.0,93.0
2,Abia,40.0,80.0
3,Kaduna,32.0,63.0
4,Ebonyi,33.0,
5,Edo,28.0,70.0
6,Kano,34.0,
7,Delta,26.0,98.0
8,Kwara,,88.0
9,Jigawa,,77.0


>**`many_to_many` or `m:m`: No checks are performed on keys uniqueness**

In [49]:
pd.merge(df_temp, df_hum, on='city', how='outer', validate='many_to_many')

Unnamed: 0,city,temperature,humidity
0,Lagos,39.0,79.0
1,Abuja,23.0,93.0
2,Abia,40.0,80.0
3,Kaduna,32.0,63.0
4,Ebonyi,33.0,
5,Edo,28.0,70.0
6,Kano,34.0,
7,Delta,26.0,98.0
8,Kwara,,88.0
9,Jigawa,,77.0


In [50]:
import pandas as pd

# Left Dataframe (df_temp)
df_temp = pd.DataFrame({
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Los Angeles'],
    'temperature': [25, 30, 22, 28, 32]
})

# Right Dataframe (df_hum)
df_hum = pd.DataFrame({
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Los Angeles', 'Chicago'],
    'humidity': [50, 60, 45, 55, 65, 40]
})

# Merging with Many-to-Many Validation
result_many_to_many = pd.merge(df_temp, df_hum, on='city', validate='many_to_many')

# Print the Merged Result
print(result_many_to_many)


          city  temperature  humidity
0     New York           25        50
1  Los Angeles           30        60
2  Los Angeles           30        65
3  Los Angeles           32        60
4  Los Angeles           32        65
5      Chicago           22        45
6      Chicago           22        40
7      Houston           28        55
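
None of the `validate` calls above fails, because each check they request is satisfied by the data. The sketch below shows a failing case on a small made-up pair of frames with a duplicated key on the left side; it assumes the exception class is importable as `pandas.errors.MergeError`, and the flag `raised` is introduced only for this example:

```python
import pandas as pd
from pandas.errors import MergeError

# 'Los Angeles' appears twice in the left frame, once in the right frame
df_temp = pd.DataFrame({
    'city': ['New York', 'Los Angeles', 'Los Angeles'],
    'temperature': [25, 30, 32]
})
df_hum = pd.DataFrame({
    'city': ['New York', 'Los Angeles'],
    'humidity': [50, 60]
})

# one_to_one requires unique keys on both sides, so this merge is rejected
try:
    pd.merge(df_temp, df_hum, on='city', validate='one_to_one')
    raised = False
except MergeError as exc:
    raised = True
    print('one_to_one rejected:', exc)

# many_to_one only requires unique keys on the right side, so this one passes
ok = pd.merge(df_temp, df_hum, on='city', validate='many_to_one')
print(ok)
```

Choosing the strictest relationship you expect the data to satisfy turns silent row duplication into an explicit error.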


# Part-II (Concatenating and Appending)


## Concatenation of  DataFrames (Row Wise + Column Wise)

<img align="left" width="350" height="90"  src="images/row.png"  >
<img align="right" width="490" height="100"  src="images/concat_2.png" >

<br><br><br><br><br><br><br><br><br><br>


- The `pd.concat()` method is used to concatenate pandas objects along a particular axis, with optional set logic (union or intersection) along the other axes.
```
pd.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, verify_integrity=False)
```

Where,
- `objs`: a sequence or mapping of Series or DataFrame objects
- `axis`: The axis to concatenate along. {0/’index’, 1/’columns’}, default 0
- `join`: How to handle labels on the other axis, {‘inner’, ‘outer’}. Default is `outer` (union); `inner` means intersection
- `ignore_index`: If True, the resulting axis will be labeled 0, …, n - 1. This is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information. (default is False)
- `keys`: sequence, default None (Construct hierarchical index using the passed keys as the outermost level.)
- `verify_integrity` : boolean, default False. Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation.
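
To illustrate `verify_integrity`, here is a small sketch on two made-up frames (`a` and `b` are illustrative names). Both carry the default index labels 0 and 1, so stacking them row-wise produces duplicate labels unless `ignore_index=True` is used:

```python
import pandas as pd

a = pd.DataFrame({'city': ['Lagos', 'Kano'], 'temperature': [35, 39]})
b = pd.DataFrame({'city': ['Dubai', 'Ajman'], 'temperature': [41, 47]})

# Index labels 0 and 1 occur in both frames, so the integrity check fails
try:
    pd.concat([a, b], verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print('duplicate index labels:', exc)

# With ignore_index=True the result gets fresh labels 0..3 and the check passes
stacked = pd.concat([a, b], ignore_index=True, verify_integrity=True)
print(stacked)
```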


## 3. Row-Wise Concatenation
<img align="left" width="350" height="90"  src="images/row.png"  >

In [51]:
nig_weather = pd.DataFrame({
    'city': [ 'Lagos', 'Kano', 'Plateau', 'Ilorin', 'Minna'],
    'temperature' : [35, 39, 33, 29, 15],
    'humidity' : [76, 95, 72, 81, 70],
})
nig_weather

Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70


In [52]:
UAE_Weather = pd.DataFrame({
    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],
    'temperature' : [41, 44, 47, 45],
    'humidity' : [88, 99, 79, 86],
})
UAE_Weather

Unnamed: 0,city,temperature,humidity
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


In [53]:
df1 = pd.concat([nig_weather, UAE_Weather], join='outer')
df1

Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


In [55]:
df2 = pd.concat([nig_weather, UAE_Weather], axis=0)
df2

Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


In [59]:
df2a = pd.concat([nig_weather, UAE_Weather], axis=0, ignore_index=True)
df2a

Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70
5,Dubai,41,88
6,Sharja,44,99
7,Ajman,47,79
8,Abu Dhabi,45,86


- Other than the numeric index, if you want to have an additional index for your sub groups, you can use the `keys` argument to `pd.concat()` method
- It provides multi-indexing
- Remember this will work only if the `ignore_index` argument is `False` which is the default

In [60]:
df3 = pd.concat([nig_weather, UAE_Weather], axis=0, keys=["city", ])
df3

Unnamed: 0,Unnamed: 1,city,temperature,humidity
city,0,Lagos,35,76
city,1,Kano,39,95
city,2,Plateau,33,72
city,3,Ilorin,29,81
city,4,Minna,15,70


In [64]:
df3 = pd.concat([nig_weather, UAE_Weather], axis=0, keys=["Nig", "UAE"])
df3

Unnamed: 0,Unnamed: 1,city,temperature,humidity
Nig,0,Lagos,35,76
Nig,1,Kano,39,95
Nig,2,Plateau,33,72
Nig,3,Ilorin,29,81
Nig,4,Minna,15,70
UAE,0,Dubai,41,88
UAE,1,Sharja,44,99
UAE,2,Ajman,47,79
UAE,3,Abu Dhabi,45,86


- The advantage of doing this is that you can use `df.loc` to get a subset of your dataframe
- So, after building a big dataframe, the `keys` argument lets you get back any of the dataframes from which it was created

In [69]:
df3.loc['Nig', :]


Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70


In [70]:
df3.loc['UAE', :]

Unnamed: 0,city,temperature,humidity
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


#### What will Happen if One of the Dataframes has an Additional Column
- If you combine two Dataframe objects which do not have all the same columns, then the columns outside the intersection will be filled with NaN values.

In [71]:
# Both dataframes have the same columns here, so no NaN values appear
df = pd.concat([nig_weather,UAE_Weather], axis=0, ignore_index=True)
df

Unnamed: 0,city,temperature,humidity
0,Lagos,35,76
1,Kano,39,95
2,Plateau,33,72
3,Ilorin,29,81
4,Minna,15,70
5,Dubai,41,88
6,Sharja,44,99
7,Ajman,47,79
8,Abu Dhabi,45,86


In [72]:
UAE_Weather = pd.DataFrame({
    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],
    'temperature' : [41, 44, 47, 45],
    'humidity' : [88, 99, 79, 86],
})
UAE_Weather

Unnamed: 0,city,temperature,humidity
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


In [73]:
Pak_Weather = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
    
})
Pak_Weather

Unnamed: 0,city,temperature
0,Lahore,35
1,Karachi,39
2,Peshawer,33
3,Islamabad,29
4,Muree,15


In [75]:
df = pd.concat([Pak_Weather, UAE_Weather], axis=0, ignore_index=False)
df

Unnamed: 0,city,temperature,humidity
0,Lahore,35,
1,Karachi,39,
2,Peshawer,33,
3,Islamabad,29,
4,Muree,15,
0,Dubai,41,88.0
1,Sharja,44,99.0
2,Ajman,47,79.0
3,Abu Dhabi,45,86.0
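
If the NaN-filled column is not wanted, `join='inner'` keeps only the columns that all dataframes share. The sketch below uses shortened, made-up versions of the two frames (`pak` and `uae` are illustrative names) to compare the two settings:

```python
import pandas as pd

pak = pd.DataFrame({'city': ['Lahore', 'Karachi'], 'temperature': [35, 39]})
uae = pd.DataFrame({
    'city': ['Dubai', 'Ajman'],
    'temperature': [41, 47],
    'humidity': [88, 79]
})

# Default join='outer': union of columns, humidity is NaN for the pak rows
outer = pd.concat([pak, uae], ignore_index=True)

# join='inner': intersection of columns, humidity is dropped altogether
inner = pd.concat([pak, uae], join='inner', ignore_index=True)

print(outer)
print(inner)
```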


## 4. Column Wise Concatenation
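- It is generally not advised to concatenate dataframes column wise, because rows are matched by index label, not by the values in a column such as `city`. If you want to, then you need to take care of some checks like:
    - the number of rows must be the same in both dataframes, and
    - the indexes of both dataframes must be aligned, i.e. the same index label refers to the same entity in both
- If you are done with all the checks then you can simply use `axis=1` to do the job.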

<img align="left" width="490" height="100"  src="images/concat_2.png"  >

### a. Creating Two Simple Dataframes

In [76]:
temp_df = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
})
temp_df

Unnamed: 0,city,temperature
0,Lahore,35
1,Karachi,39
2,Peshawer,33
3,Islamabad,29
4,Muree,15


In [77]:
wind_df = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'wind speed' : [9, 12, 7, 13, 18],
})
wind_df

Unnamed: 0,city,wind speed
0,Lahore,9
1,Karachi,12
2,Peshawer,7
3,Islamabad,13
4,Muree,18


### b. Concatenate Dataframes (column-wise)

In [78]:
df = pd.concat([temp_df, wind_df], axis=1)

In [79]:
df

Unnamed: 0,city,temperature,city.1,wind speed
0,Lahore,35,Lahore,9
1,Karachi,39,Karachi,12
2,Peshawer,33,Peshawer,7
3,Islamabad,29,Islamabad,13
4,Muree,15,Muree,18


### c. What will happen if we have missing data in our dataframes

In [80]:
# This dataframe does not have the temperature for Lahore
temp_df = pd.DataFrame({
    'city': [ 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [39, 33, 29, 15],
})
temp_df

Unnamed: 0,city,temperature
0,Karachi,39
1,Peshawer,33
2,Islamabad,29
3,Muree,15


In [81]:
# This dataframe does not have the wind speed for Islamabad
wind_df = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Muree'],
    'wind speed' : [9, 12, 7, 18],
})
wind_df

Unnamed: 0,city,wind speed
0,Lahore,9
1,Karachi,12
2,Peshawer,7
3,Muree,18


In [82]:
df1 = pd.concat([temp_df, wind_df], axis=1)
df1

Unnamed: 0,city,temperature,city.1,wind speed
0,Karachi,39,Lahore,9
1,Peshawer,33,Karachi,12
2,Islamabad,29,Peshawer,7
3,Muree,15,Muree,18


In [83]:
# This dataframe does not have the temperature for Lahore
temp_df = pd.DataFrame({
    'city': [ 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [39, 33, 29, 15],
},index=[0,1,2,3])
temp_df

Unnamed: 0,city,temperature
0,Karachi,39
1,Peshawer,33
2,Islamabad,29
3,Muree,15


In [84]:
# This dataframe does not have the wind speed for Islamabad
# Note that the index labels in wind_df match the labels of the same cities in temp_df
wind_df = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Muree'],
    'wind speed' : [9, 12, 7, 18],
}, index=[4,0,1,3])
wind_df

Unnamed: 0,city,wind speed
4,Lahore,9
0,Karachi,12
1,Peshawer,7
3,Muree,18


In [85]:
df = pd.concat([temp_df,wind_df], axis=1)
df

Unnamed: 0,city,temperature,city.1,wind speed
0,Karachi,39.0,Karachi,12.0
1,Peshawer,33.0,Peshawer,7.0
2,Islamabad,29.0,,
3,Muree,15.0,Muree,18.0
4,,,Lahore,9.0


>- Concatenating Dataframes along axis = 1 places one Dataframe beside the other, aligning rows on the index. It is like a full outer join on the index, placing NaN for non-matching rows in the left as well as right Dataframes.
>- By default, a concatenation results in a set union, where all data is preserved.
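
Since the column-wise concatenation above aligns rows on index labels, two possible ways to align on the city name instead are sketched here: merging on `city`, or moving `city` into the index before concatenating. The names `merged` and `aligned` are introduced only for this illustration:

```python
import pandas as pd

temp_df = pd.DataFrame({
    'city': ['Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature': [39, 33, 29, 15],
})
wind_df = pd.DataFrame({
    'city': ['Lahore', 'Karachi', 'Peshawer', 'Muree'],
    'wind speed': [9, 12, 7, 18],
})

# Option 1: a full outer merge on the city column (one 'city' column in the result)
merged = pd.merge(temp_df, wind_df, on='city', how='outer')

# Option 2: make city the index, so that axis=1 concatenation aligns on city names
aligned = pd.concat([temp_df.set_index('city'), wind_df.set_index('city')], axis=1)

print(merged)
print(aligned)
```

Both variants avoid the duplicated `city` column and do not depend on the order of the rows in either dataframe.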

## 5. Adding a Single Row/Column in a Dataframe
- Now let us see how we can concatenate a single row or a single column to a dataframe using the `pd.concat()` method.

### a. Adding a Row in a Dataframe

In [86]:
df1 = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
    'humidity' : [76, 95, 72, 81, 70],
})
df1

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95
2,Peshawer,33,72
3,Islamabad,29,81
4,Muree,15,70


In [87]:
df2 = pd.DataFrame({"city": "Multan", "temperature": 45, "humidity": 75}, index=[5])
df2

Unnamed: 0,city,temperature,humidity
5,Multan,45,75


>**You can place the new row at your desired location using slicing operator, as shown below**

In [90]:
df1[:2]

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95


In [91]:
df1[2:]

Unnamed: 0,city,temperature,humidity
2,Peshawer,33,72
3,Islamabad,29,81
4,Muree,15,70


In [92]:
df2

Unnamed: 0,city,temperature,humidity
5,Multan,45,75


In [89]:
df3 = pd.concat([df1[:2], df2, df1[2:]], ignore_index=True)
df3

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95
2,Multan,45,75
3,Peshawer,33,72
4,Islamabad,29,81
5,Muree,15,70


### b. Adding a Column in a Dataframe

In [93]:
Pak_Weather = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
    'humidity' : [76, 95, 72, 81, 70],
})
Pak_Weather

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95
2,Peshawer,33,72
3,Islamabad,29,81
4,Muree,15,70


In [94]:
s = pd.Series(["Humid", 'Dry', 'Rainy', 'Humid', 'Rainy'], name="event")
s

0    Humid
1      Dry
2    Rainy
3    Humid
4    Rainy
Name: event, dtype: object

In [95]:
df=pd.concat([Pak_Weather, s], axis=1)

In [96]:
df

Unnamed: 0,city,temperature,humidity,event
0,Lahore,35,76,Humid
1,Karachi,39,95,Dry
2,Peshawer,33,72,Rainy
3,Islamabad,29,81,Humid
4,Muree,15,70,Rainy


<img align="right" width="310" height="100"  src="images/append.png"  >

## 6. Appending DataFrames
- The `df1.append(df2)` method concatenates the records of the second dataframe at the end of the first dataframe (along axis=0). Columns not present in the first DataFrame are added as new columns.
- It treats the calling dataframe as the main object and adds to it the rows of the dataframe(s) passed as argument.
- It returns a new dataframe object consisting of the rows of the caller followed by the rows of `other`. The dataframe that called the `append()` method remains unchanged.
- **Note:** `DataFrame.append()` was deprecated in pandas 1.4 and removed in pandas 2.0. On pandas 2.0 or later, the `append()` calls in this section raise `AttributeError`, as the outputs below show; use `pd.concat()` instead.
```
df.append(other, ignore_index=False, verify_integrity=False, sort=False)
```

Where,
- `other`: DataFrame or Series/dict-like object, or list of these (The data to append.)
- `ignore_index`: If True, the resulting axis will be labeled 0, 1, …, n - 1 (default is False)
- `verify_integrity`: If True, raise ValueError on creating index with duplicates (default is False)
- `sort`: Sort columns if the columns of `self` and `other` are not aligned (default is False)

### a. Append Two DataFrames

In [100]:
Pak_Weather = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
    'humidity' : [76, 95, 72, 81, 70],
})
Pak_Weather

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95
2,Peshawer,33,72
3,Islamabad,29,81
4,Muree,15,70


In [101]:
UAE_Weather = pd.DataFrame({
    'city': [ 'Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],
    'temperature' : [41, 44, 47, 45],
    'humidity' : [88, 99, 79, 86],
})
UAE_Weather

Unnamed: 0,city,temperature,humidity
0,Dubai,41,88
1,Sharja,44,99
2,Ajman,47,79
3,Abu Dhabi,45,86


In [104]:
# append Dataframe
df2 =  Pak_Weather.append(UAE_Weather, ignore_index=True)
df2

AttributeError: 'DataFrame' object has no attribute 'append'
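
On pandas 2.0 or later, where `DataFrame.append()` no longer exists, the same result can be obtained with `pd.concat()`. A minimal sketch of the equivalent call, rebuilding the two dataframes so that it runs on its own:

```python
import pandas as pd

Pak_Weather = pd.DataFrame({
    'city': ['Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature': [35, 39, 33, 29, 15],
    'humidity': [76, 95, 72, 81, 70],
})
UAE_Weather = pd.DataFrame({
    'city': ['Dubai', 'Sharja', 'Ajman', 'Abu Dhabi'],
    'temperature': [41, 44, 47, 45],
    'humidity': [88, 99, 79, 86],
})

# Equivalent of Pak_Weather.append(UAE_Weather, ignore_index=True)
df2 = pd.concat([Pak_Weather, UAE_Weather], ignore_index=True)
print(df2)
```

Like `append()` did, `pd.concat()` returns a new dataframe and leaves `Pak_Weather` and `UAE_Weather` unchanged.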

### b. Append a Row in DataFrame

In [105]:
Pak_Weather = pd.DataFrame({
    'city': [ 'Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature' : [35, 39, 33, 29, 15],
    'humidity' : [76, 95, 72, 81, 70],
})
Pak_Weather

Unnamed: 0,city,temperature,humidity
0,Lahore,35,76
1,Karachi,39,95
2,Peshawer,33,72
3,Islamabad,29,81
4,Muree,15,70


In [106]:
# Creating a row to be appended
d1 = pd.DataFrame({"city": "Multan", "temperature": 45, "humidity": 75}, index=[5])
d1

Unnamed: 0,city,temperature,humidity
5,Multan,45,75


In [107]:
# Append this dataframe having single row to Pak_Weather dataframe
df3 =  Pak_Weather.append(d1)
df3

AttributeError: 'DataFrame' object has no attribute 'append'

In [108]:
d1 = pd.DataFrame({"city": "Sialkot", "temperature": 45, "humidity": 75, "newcol": 66}, index=[5])
d1

Unnamed: 0,city,temperature,humidity,newcol
5,Sialkot,45,75,66


In [109]:
df3 =  Pak_Weather.append(d1)
df3

AttributeError: 'DataFrame' object has no attribute 'append'
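
The single-row case with an extra column can be handled by `pd.concat()` as well. The sketch below is an assumed replacement for the failing `Pak_Weather.append(d1)` call above, not the notebook's original output; the column that exists only in `d1` is filled with NaN for the existing rows:

```python
import pandas as pd

Pak_Weather = pd.DataFrame({
    'city': ['Lahore', 'Karachi', 'Peshawer', 'Islamabad', 'Muree'],
    'temperature': [35, 39, 33, 29, 15],
    'humidity': [76, 95, 72, 81, 70],
})

# A one-row dataframe with a column ('newcol') that Pak_Weather does not have
d1 = pd.DataFrame({"city": "Sialkot", "temperature": 45, "humidity": 75, "newcol": 66}, index=[5])

# Equivalent of Pak_Weather.append(d1): rows are stacked, columns are unioned
df3 = pd.concat([Pak_Weather, d1])
print(df3)
```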