In [75]:
import pandas as pd

# First dataframe
data1 = {'C1': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
         'C2': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']}
df = pd.DataFrame(data1, index=range(1, 11))

# Second dataframe
data2 = {'C3': ['a', 'e']}
df2 = pd.DataFrame(data2, index=[1, 5])
df2_saved_index = df2.index


## The Amazing Index in Pandas

Here we have two DataFrames, `df` and `df2`. How do we add the C1 and C2 values from `df` to `df2`, matching rows by index?

1. `merge`: explicitly set `left_index` and `right_index` to `True`
2. `concat`: rows are matched by index automatically
    * What if I reorder df2, would it mess things up? 🤪
    * No, it would be fine 💃🏻
3. Just assign the values directly; the assignment is aligned by index! 😎

### 1. Merge

In [76]:
# Merge the two dataframes, explicitly based on the index
merged_df = df2.merge(df, left_index=True, right_index=True)

# Display the merged dataframe
print(merged_df)

  C3  C1 C2
1  a  10  A
5  e  50  E


### 2. Concat

In [78]:
# We didn't specify a key, but notice that the rows are matched by index automatically
concatenated_df = pd.concat([df2, df], axis=1, join='inner')
print(concatenated_df)

  C3  C1 C2
1  a  10  A
5  e  50  E


In [79]:
# Reorder df2 by C3; notice that each row keeps its original index label
df2.sort_values('C3', ascending = False, inplace = True)
print(df2)


  C3
5  e
1  a


In [80]:
pd.concat([df2, df], axis = 1, join = 'inner')

  C3  C1 C2
5  e  50  E
1  a  10  A
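
As a quick sanity check on the claim that row order does not matter, here is a minimal sketch that compares the concat result before and after reordering. It rebuilds smaller stand-in versions of `df` and `df2` so it runs on its own; the names `original`, `shuffled`, and `same_pairs` are made up for this illustration.

```python
import pandas as pd

# Smaller stand-ins for the notebook's frames (assumed for illustration)
df = pd.DataFrame({'C1': [10, 20, 30, 40, 50]}, index=range(1, 6))
df2 = pd.DataFrame({'C3': ['a', 'e']}, index=[1, 5])

# Concat with df2 in its original row order
original = pd.concat([df2, df], axis=1, join='inner')

# Concat with df2 reordered; sort_values returns a copy, so df2 is untouched
shuffled = pd.concat([df2.sort_values('C3', ascending=False), df],
                     axis=1, join='inner')

# After sorting both results by index, the rows should pair up identically
same_pairs = original.sort_index().equals(shuffled.sort_index())
print(same_pairs)
```

If the matching were positional, reordering `df2` would pair 'e' with a different C1 value and the comparison would fail.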


### 3. Assign directly

In [81]:
# Even if we simply assign the values like below, the assignment is aligned by index
df2['C1'] = df['C1']
df2['C2'] = df['C2']

print(df2)


  C3  C1 C2
5  e  50  E
1  a  10  A
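
To make the index alignment in direct assignment easier to see, here is a small sketch with a label that has no match. The frames `source` and `target` are hypothetical and separate from `df` and `df2` above.

```python
import pandas as pd

# Hypothetical frames for illustration; not part of the notebook above
source = pd.DataFrame({'C1': [10, 20, 30]}, index=[1, 2, 3])
target = pd.DataFrame({'C3': ['x', 'y', 'z']}, index=[3, 1, 99])

# Column assignment aligns the right-hand Series to target's index labels,
# not to row positions
target['C1'] = source['C1']
print(target)
```

The row labelled 3 receives 30 and the row labelled 1 receives 10, regardless of where they sit in `target`. The row labelled 99 has no counterpart in `source`, so it gets NaN, and the column becomes a float column as a result.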


## Mess Things Up

So even if we reorder the DataFrame, the index labels stay with their rows and keep the matching correct. What could we do to mess things up?

Remember that df2 was reordered by C3 above, but its rows still carry their original index labels. What if we use `set_index()` and feed it the index we saved before the reordering?

In [82]:
df2.set_index(df2_saved_index)

  C3  C1 C2
1  e  50  E
5  a  10  A


Now the index labels no longer belong to their original rows: they have been reassigned by position. If we run concat again, it matches on the newly assigned index, so the result is no longer consistent with the original data, as shown below.

In [83]:
pd.concat([df2.set_index(df2_saved_index), df], axis = 1, join = 'inner')

  C3  C1 C2  C1 C2
1  e  50  E  10  A
5  a  10  A  50  E
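
If the goal is to restore the original row order rather than to relabel the rows, one option is `reindex`, which looks rows up by label. The sketch below contrasts it with `set_index` on small stand-in frames; the names `saved_index`, `relabelled`, and `reordered` are chosen for this illustration only.

```python
import pandas as pd

# Stand-ins for the notebook's frames (assumed for illustration)
df = pd.DataFrame({'C1': [10, 50]}, index=[1, 5])
df2 = pd.DataFrame({'C3': ['e', 'a']}, index=[5, 1])  # already reordered
saved_index = pd.Index([1, 5])                        # index saved before reordering

# set_index attaches the labels by position: the first row ('e') becomes label 1
relabelled = df2.set_index(saved_index)

# reindex selects rows by label: the row labelled 1 ('a') moves to the top
reordered = df2.reindex(saved_index)

print(pd.concat([relabelled, df], axis=1, join='inner'))  # 'e' paired with 10: wrong
print(pd.concat([reordered, df], axis=1, join='inner'))   # 'a' paired with 10: right
```

The difference is that `set_index` overwrites the labels, while `reindex` keeps each label attached to its row and only changes the order.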
