<center><h1>Merging DataFrames</h1></center>

In [1]:
# Import pandas, the library we'll use for all the data manipulation in this notebook.

import pandas as pd

In [2]:
# Create two small DataFrames, df1 and df2, each with a 'letter' column and a 'number' column.

# df1 holds the letters A, B, C, and D paired with the numbers 1, 2, 3, and 4.

# df2 holds the letters C, D, E, and F paired with the numbers 3, 4, 5, and 6.
# The rows (C, 3) and (D, 4) appear in both DataFrames, which is what the joins below rely on.

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'],
                    'number': [1, 2, 3, 4]})

df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'],
                    'number': [3, 4, 5, 6]})

## Left Join

In [3]:
# Merge df1 with df2 on the 'number' column using a left join.

# how='left' keeps every row from df1. Where df2 has a row with the same 'number', its columns are
# filled in; where it doesn't (numbers 1 and 2 here), the df2 columns are NaN.

# on='number' names the key column, which must exist in both DataFrames.

# Both DataFrames also have a 'letter' column that is not part of the key, so pandas adds its
# default suffixes: 'letter_x' comes from df1 and 'letter_y' from df2.

merged_df = df1.merge(df2, how='left', on='number')
merged_df

Unnamed: 0,letter_x,number,letter_y
0,A,1,
1,B,2,
2,C,3,C
3,D,4,D
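
To check which rows of a left join actually found a partner, `merge` accepts an `indicator` argument. The cell below is a minimal sketch that re-creates df1 and df2 so it runs on its own; the names `checked` and `unmatched` are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})

# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for df1 rows that have no partner in df2.
checked = df1.merge(df2, how='left', on='number', indicator=True)
print(checked)

# Keep only the df1 rows that found no match in df2.
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['letter_x', 'number']])
```

Filtering on `_merge` like this is one way to spot keys that are missing from the right-hand DataFrame before relying on the merged columns.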


## Inner Join

In [4]:
# Merge df1 with df2 again, this time with an inner join.

# how='inner' keeps only the rows whose 'number' appears in both DataFrames (3 and 4 here).

# left_on and right_on name the key column in each DataFrame separately, which is useful when the
# key has a different name on each side. Here both are 'number', so this call is equivalent to on='number'.

merged_df = df1.merge(df2, how='inner', left_on='number', right_on='number')
merged_df

Unnamed: 0,letter_x,number,letter_y
0,C,3,C
1,D,4,D
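
`left_on` and `right_on` matter most when the key column is named differently on each side. The sketch below assumes a hypothetical variant of df2, `df2_renamed`, whose key column is called `value`; it is not part of the original notebook.

```python
import pandas as pd

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})

# Hypothetical variant of df2 whose key column is named 'value' instead of 'number'.
df2_renamed = df2.rename(columns={'number': 'value'})

# The key names differ, so both key columns are kept in the result.
merged_keys = df1.merge(df2_renamed, how='inner', left_on='number', right_on='value')
print(merged_keys)

# When the key has the same name on both sides, on= and left_on=/right_on= give the same result.
same_a = df1.merge(df2, how='inner', on='number')
same_b = df1.merge(df2, how='inner', left_on='number', right_on='number')
print(same_a.equals(same_b))
```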


## Right Join

In [5]:
# Merge df1 with df2 using a right join.

# how='right' keeps every row from df2. Rows with no match in df1 (numbers 5 and 6 here) get NaN
# in the df1 columns.

# on='number' is again the key column.

# suffixes=('', '_right') controls how the overlapping 'letter' column is renamed: the empty string
# leaves df1's column as 'letter', and df2's column becomes 'letter_right'.

merged_df = df1.merge(df2, how='right', on='number', suffixes=('', '_right'))
merged_df

Unnamed: 0,letter,number,letter_right
0,C,3,C
1,D,4,D
2,,5,E
3,,6,F
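
An empty left suffix keeps the column name short, but it can make it less obvious which side a column came from. As a small sketch of the alternative, the cell below labels both sides explicitly; the suffix strings and the name `labelled` are arbitrary choices for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})

# Label the overlapping 'letter' column on both sides of the right join.
labelled = df1.merge(df2, how='right', on='number', suffixes=('_left', '_right'))
print(labelled)

# df2 rows with no partner in df1 have NaN in the left-hand column.
print(labelled['letter_left'].isna().sum())
```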


## Union with pd.concat

In [7]:
# Stack df1 and df2 vertically with pd.concat() to get a single DataFrame, df3.

# concat() keeps every row from both inputs, so (C, 3) and (D, 4) appear twice.

# Each input keeps its own index (0-3), so the stacked result would have repeated index labels.
# reset_index(drop=True) replaces them with a fresh 0-7 index and discards the old one.

df3 = pd.concat([df1, df2]).reset_index(drop=True)
df3

Unnamed: 0,letter,number
0,A,1
1,B,2
2,C,3
3,D,4
4,C,3
5,D,4
6,E,5
7,F,6
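
The effect of the index reset is easier to see next to the raw result. This sketch prints the index before resetting and shows that passing `ignore_index=True` to `pd.concat` gives the same renumbering in one step; `stacked` and `df3_alt` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})

# Without a reset, each input keeps its own labels, so 0-3 appears twice.
stacked = pd.concat([df1, df2])
print(stacked.index.tolist())

# ignore_index=True renumbers the rows while concatenating.
df3_alt = pd.concat([df1, df2], ignore_index=True)
print(df3_alt.index.tolist())
```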


In [8]:
# Stack df1 and df2 again, but this time remove the repeated rows to get a true union.

# drop_duplicates() removes rows that are identical across all columns, keeping the first occurrence.

# The surviving rows keep their original index labels, so reset_index(drop=True) renumbers them from 0.

df3 = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)
df3

Unnamed: 0,letter,number
0,A,1
1,B,2
2,C,3
3,D,4
4,E,5
5,F,6
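
By default `drop_duplicates()` only removes rows that match in every column. The sketch below uses made-up data (the frames `left` and `right` are hypothetical) to contrast that with deduplicating on a single column through the `subset` argument.

```python
import pandas as pd

left = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
# Hypothetical second frame: (D, 4) is an exact repeat, (X, 3) repeats only the number.
right = pd.DataFrame({'letter': ['X', 'D', 'E'], 'number': [3, 4, 5]})

stacked = pd.concat([left, right], ignore_index=True)

# Exact duplicates only: the second (D, 4) is dropped, (X, 3) stays.
exact = stacked.drop_duplicates().reset_index(drop=True)
print(exact)

# Deduplicate on 'number' alone, keeping the first row seen for each number.
by_number = stacked.drop_duplicates(subset='number', keep='first').reset_index(drop=True)
print(by_number)
```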


## Concatenate dataframes horizontally

In [10]:
# Combine df1 and df2 with pd.concat() again, this time along the columns axis.

# axis=1 places the two DataFrames side by side. Rows are aligned on their index labels, not on any
# column values, so row 0 of df1 sits next to row 0 of df2 even though their letters and numbers differ.

# The column names are not changed, so df4 ends up with two 'letter' and two 'number' columns.

df4 = pd.concat([df1, df2], axis=1)
df4


Unnamed: 0,letter,number,letter,number
0,A,1,C,3
1,B,2,D,4
2,C,3,E,5
3,D,4,F,6
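
Duplicate column names make it awkward to select one side of a horizontal concat. One option, sketched below, is the `keys` argument of `pd.concat`, which adds an outer column level; the labels `'df1'` and `'df2'` and the frame `df2_shifted` are illustrative assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D'], 'number': [1, 2, 3, 4]})
df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})

# keys= adds an outer column level, so each side can be addressed unambiguously.
df4_keyed = pd.concat([df1, df2], axis=1, keys=['df1', 'df2'])
print(df4_keyed.columns.tolist())
print(df4_keyed[('df2', 'letter')].tolist())

# Rows are aligned on index labels: shifting df2's index leaves gaps filled with NaN.
df2_shifted = df2.copy()
df2_shifted.index = [2, 3, 4, 5]
aligned = pd.concat([df1, df2_shifted], axis=1)
print(aligned)
```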


## Append new row to your dataframe

In [14]:
# Add a new row to df3. The row is built as a one-row DataFrame holding 'Z' and 26,
# using the column labels from df3 so the values line up with the right columns.

new_row = pd.DataFrame([['Z', 26]], columns=df3.columns)

# pd.concat() is the supported way to add rows (DataFrame.append was removed in pandas 2.0,
# and _append is a private method). ignore_index=True renumbers the result so the index stays continuous.

df3 = pd.concat([df3, new_row], ignore_index=True)
df3

Unnamed: 0,letter,number
0,A,1
1,B,2
2,C,3
3,D,4
4,E,5
5,F,6
6,Z,26
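
When several rows need to be added, it is usually simpler to collect them in one DataFrame and concatenate once, rather than growing the frame row by row. A minimal sketch, with df3 re-created and hypothetical extra rows (`extra_rows`) for illustration:

```python
import pandas as pd

df3 = pd.DataFrame({'letter': ['A', 'B', 'C', 'D', 'E', 'F'],
                    'number': [1, 2, 3, 4, 5, 6]})

# Hypothetical extra rows, gathered into a single DataFrame with matching columns.
extra_rows = pd.DataFrame({'letter': ['Y', 'Z'], 'number': [25, 26]})

# One concat call adds all rows and renumbers the index.
df3_extended = pd.concat([df3, extra_rows], ignore_index=True)
print(df3_extended)
```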


## Join along your index

In [15]:
# Create another DataFrame, join_df, to demonstrate index-based joins.
# It has the same 'letter' and 'number' columns, holding F, G, H, I and 6, 7, 8, 9.

join_df = pd.DataFrame({'letter': ['F', 'G', 'H', 'I'],
                        'number': [6, 7, 8, 9]})
join_df

Unnamed: 0,letter,number
0,F,6
1,G,7
2,H,8
3,I,9


In [16]:
# Join df2 with join_df using DataFrame.join().

# Unlike merge(), join() matches rows on the index by default and performs a left join,
# so row 0 of df2 is paired with row 0 of join_df, and so on.

# Both DataFrames have 'letter' and 'number' columns, so a suffix is required: rsuffix='_right' is
# appended to the overlapping column names from join_df. Without lsuffix or rsuffix, pandas raises a ValueError.

df2.join(join_df, rsuffix='_right')

Unnamed: 0,letter,number,letter_right,number_right
0,C,3,F,6
1,D,4,G,7
2,E,5,H,8
3,F,6,I,9
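
Because `join()` pairs rows by index label, the result above lines up rows by position rather than by value. To join on the numbers themselves, one approach is to move `number` into the index first. This is a sketch under that assumption; `by_number` is an illustrative name.

```python
import pandas as pd

df2 = pd.DataFrame({'letter': ['C', 'D', 'E', 'F'], 'number': [3, 4, 5, 6]})
join_df = pd.DataFrame({'letter': ['F', 'G', 'H', 'I'], 'number': [6, 7, 8, 9]})

# Use 'number' as the index on both sides, then keep only the labels present in both.
by_number = df2.set_index('number').join(
    join_df.set_index('number'), rsuffix='_right', how='inner')
print(by_number)
```

Only the number 6 appears in both DataFrames, so the inner join returns a single row.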
