# 🐼 Pandas Handbook

## 06 - Data Modifying

Check out the official [Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/)  

This notebook uses the [Titanic - Machine Learning from Disaster dataset](https://www.kaggle.com/competitions/titanic/data) from Kaggle to demonstrate how to modify the data with pandas.

## 📚 Table of Contents

---

➕ **Inserting or Dropping Columns**  
🔪 **Splitting & Extracting Values**  
🔗 **Merging Columns**  
✏️ **Renaming and Formatting**  
🧵 **Modifying Column Strings**  
🧠 **Applying Custom Functions**  
🎯 **Conditional Modifications**  
✍️ **Modifying by Index or Label**  
👉 **Next Topic: Data Combining**  

---

In [1]:
import pandas as pd
import numpy as np
import os

In [2]:
data_processed = "../data/processed/"
csv_file = "clean_titanic.csv"
import_path = os.path.join(data_processed, csv_file)
df = pd.read_csv(import_path, index_col="PassengerId")

In [3]:
mrs_df = df.copy()
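# regex=False matches the literal text "Mrs."; as a regex, "." would match any character
mrs_filter = mrs_df['Name'].str.contains('Mrs.', regex=False)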
mrs_df = mrs_df[mrs_filter]

### ➕ Inserting or Dropping Columns

```df.insert(POSITION, 'NEW_COLUMN', df['COLUMN1'] + df['COLUMN2'])``` – Inserts a new column at a specified position by summing two existing columns.  
```df.insert(POSITION, 'NEW_COLUMN', df['COLUMN'] < VALUE)``` – Inserts a new column based on a condition.  
```df.drop(columns='COLUMN', inplace=True)``` – Drops a single specified column in place.  
```df.drop(columns=['COLUMN_1', 'COLUMN_2', ...], inplace=True)``` – Drops multiple specified columns in place.

In [4]:
mrs_df.insert(5, 'Family Size', mrs_df['SibSp'] + mrs_df['Parch'])
mrs_df.head()

PassengerId,Survived,Pclass,Name,Sex,Age,Family Size,SibSp,Parch,Ticket,Fare,Cabin,Embarked
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,1,0,PC 17599,71.2833,C85,Cherbourg
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,1,0,113803,53.1,C123,Southampton
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,2,0,2,347742,11.1333,Unknown,Southampton
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,1,0,237736,30.0708,Unknown,Cherbourg
16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,0,0,0,248706,16.0,Unknown,Southampton


In [5]:
mrs_df.insert(5, 'Minor', mrs_df['Age'] < 18)
mrs_df.head()

PassengerId,Survived,Pclass,Name,Sex,Age,Minor,Family Size,SibSp,Parch,Ticket,Fare,Cabin,Embarked
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,False,1,1,0,PC 17599,71.2833,C85,Cherbourg
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,False,1,1,0,113803,53.1,C123,Southampton
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,False,2,0,2,347742,11.1333,Unknown,Southampton
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,True,1,1,0,237736,30.0708,Unknown,Cherbourg
16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,False,0,0,0,248706,16.0,Unknown,Southampton


In [6]:
mrs_df.drop(columns='Minor', inplace=True)
mrs_df.head()

PassengerId,Survived,Pclass,Name,Sex,Age,Family Size,SibSp,Parch,Ticket,Fare,Cabin,Embarked
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,1,0,PC 17599,71.2833,C85,Cherbourg
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,1,0,113803,53.1,C123,Southampton
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,2,0,2,347742,11.1333,Unknown,Southampton
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,1,0,237736,30.0708,Unknown,Cherbourg
16,1,2,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,0,0,0,248706,16.0,Unknown,Southampton


In [7]:
mrs_df.drop(columns=['Survived', 'Pclass', 'Family Size', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'], inplace=True)
mrs_df.head()

PassengerId,Name,Sex,Age
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0
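
The cells above modify `mrs_df` in place. As one possible alternative, the sketch below uses `assign` and `drop` without `inplace=True`; both return new DataFrames and leave the original untouched. The toy frame, its values, and the `family_size` column name are assumptions made for illustration, not part of the notebook's data.

```python
import pandas as pd

# Toy frame with hypothetical values; column names follow the Titanic dataset.
toy = pd.DataFrame(
    {"Age": [38.0, 14.0], "SibSp": [1, 1], "Parch": [0, 2]},
    index=pd.Index([2, 10], name="PassengerId"),
)

# assign() returns a new DataFrame with the column appended at the end.
with_family = toy.assign(family_size=toy["SibSp"] + toy["Parch"])

# drop() without inplace=True also returns a new DataFrame.
trimmed = with_family.drop(columns=["SibSp", "Parch"])

print(trimmed)
print(list(toy.columns))  # the original frame still has its three columns
```

Unlike `insert`, `assign` cannot choose the column position, so it fits best when column order does not matter.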


### 🔪 Splitting & Extracting Values

```df['NEW_COLUMN'] = df['COLUMN'].str.extract(r'(PATTERN)')``` – Extracts the text captured by the regex group into a new column; the pattern needs a capture group, and rows without a match get `NaN`.  
```df[['NEW_COLUMN_1', 'NEW_COLUMN_2']] = df['COLUMN'].str.split('DELIMITER', n=1, expand=True)``` – Splits a column at the first occurrence of a delimiter into two new columns.  

In [8]:
maiden_df = df.copy()
maiden_df['Maiden Name'] = maiden_df['Name'].str.extract(r'\(([^)]+)\)')
maiden_df[['Name', 'Maiden Name']].head()

PassengerId,Name,Maiden Name
1,"Braund, Mr. Owen Harris",
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",Florence Briggs Thayer
3,"Heikkinen, Miss. Laina",
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",Lily May Peel
5,"Allen, Mr. William Henry",


In [9]:
# n=1 splits at the first comma only, so the result always has exactly two columns
mrs_df[['Surname', 'Rest']] = mrs_df['Name'].str.split(',', n=1, expand=True)
mrs_df[['Title', 'Rest']] = mrs_df['Rest'].str.split('.', n=1, expand=True)
mrs_df[['Husband', 'Rest']] = mrs_df['Rest'].str.split('(', n=1, expand=True)
mrs_df.head()

PassengerId,Name,Sex,Age,Surname,Rest,Title,Husband
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer),Mrs,John Bradley
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel),Mrs,Jacques Heath
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg),Mrs,Oscar W
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem),Mrs,Nicholas
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome),Mrs,


In [10]:
# regex=False treats ")" as a literal character on every pandas version
mrs_df['Rest'] = mrs_df['Rest'].str.replace(')', '', regex=False)
mrs_df['Husband'] = mrs_df['Husband'].replace(r'^\s*$', 'Unknown', regex=True)
mrs_df.head()

PassengerId,Name,Sex,Age,Surname,Rest,Title,Husband
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown
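
The chained `split` calls above can also be collapsed into a single `str.extract` call with named capture groups. The following is a rough sketch of that idea; the regex and the group names (`surname`, `title`, `given`, `maiden`) are assumptions written for the two sample names shown and may need adjusting for unusual entries in the full dataset.

```python
import pandas as pd

names = pd.Series(
    [
        "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
        "Braund, Mr. Owen Harris",
    ]
)

# Hypothetical pattern: surname, title, given names, optional name in parentheses.
pattern = (
    r"^(?P<surname>[^,]+),\s*"      # text before the first comma
    r"(?P<title>[^.]+)\.\s*"        # text before the first period
    r"(?P<given>[^(]*?)\s*"         # given names, without trailing whitespace
    r"(?:\((?P<maiden>[^)]+)\))?$"  # optional parenthesised name
)

# With named groups, extract() returns one column per group.
parts = names.str.extract(pattern)
print(parts)
```

Rows that lack the parenthesised part keep their other fields and get a missing value in `maiden`.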


### 🔗 Merging Columns

```df['NEW_COLUMN'] = df['COLUMN_1'] + 'SEPARATOR' + df['COLUMN_2']``` – Creates a new column by concatenating two columns with a separator.  

In [11]:
mrs_df['Husband Fullname'] = mrs_df['Surname'] + ',' + mrs_df['Husband']
mrs_df.head()

PassengerId,Name,Sex,Age,Surname,Rest,Title,Husband,Husband Fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
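
Concatenating with `+` keeps any stray whitespace and propagates missing values. As a hedged alternative, the sketch below strips the pieces first and joins them with `Series.str.cat`, whose `na_rep` argument fills in missing parts. The toy values are made up for illustration.

```python
import pandas as pd

# Toy frame: one value padded with spaces, one missing.
toy = pd.DataFrame(
    {"Surname": ["Cumings", "Hewlett"], "Husband": [" John Bradley ", None]}
)

# Remove leading/trailing whitespace before joining.
husband = toy["Husband"].str.strip()

# str.cat joins element-wise; na_rep replaces missing values instead of yielding NaN.
toy["Husband Fullname"] = toy["Surname"].str.cat(husband, sep=", ", na_rep="Unknown")
print(toy["Husband Fullname"].tolist())
```

This gives every row the same `", "` separator, regardless of how the source strings were padded.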


### ✏️ Renaming and Formatting

```df.rename(columns={'OLD_NAME': 'NEW_NAME'}, inplace=True)``` – Renames a column.  
```df.columns = ['COLUMN_1', 'COLUMN_2', 'COLUMN_3', ...]``` – Sets new column names explicitly.

In [12]:
mrs_df.rename(columns={'Rest': 'Maiden Name'}, inplace=True)
mrs_df.head()

PassengerId,Name,Sex,Age,Surname,Maiden Name,Title,Husband,Husband Fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [13]:
mrs_df.columns = ['Name', 'Gender', 'Age', 'Surname', 'Maiden Name', 'Title', 'Husband', 'Husband Fullname']
mrs_df.head()

PassengerId,Name,Gender,Age,Surname,Maiden Name,Title,Husband,Husband Fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
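
One detail worth knowing about `rename`: by default it silently ignores keys that match no column, so a typo goes unnoticed. The sketch below, using a made-up two-column frame, shows that behavior and how `errors="raise"` surfaces the problem.

```python
import pandas as pd

toy = pd.DataFrame({"Sex": ["female"], "Rest": ["Lily May Peel"]})

# Several columns can be renamed in one call; a new DataFrame is returned.
renamed = toy.rename(columns={"Sex": "Gender", "Rest": "Maiden Name"})

# A key that matches no column ("Gendr" is a deliberate typo) is ignored by default...
unchanged = toy.rename(columns={"Gendr": "Gender"})

# ...unless errors="raise" is passed, which turns the typo into a KeyError.
try:
    toy.rename(columns={"Gendr": "Gender"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True

print(list(renamed.columns), list(unchanged.columns), typo_detected)
```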


### 🧵 Modifying Column Strings

```df.columns = [x.lower() for x in df.columns]``` – Converts all column names to lowercase.  
```df.columns = df.columns.str.replace(" ", "_")``` – Replaces spaces with underscores in column names.  
```df['COLUMN'] = df['COLUMN'].str.upper()``` – Converts all values in a column to uppercase.  

In [14]:
mrs_df.columns = [x.lower() for x in mrs_df.columns]
mrs_df.head()

PassengerId,name,gender,age,surname,maiden name,title,husband,husband fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [15]:
mrs_df.columns = mrs_df.columns.str.replace(" ", "_")
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [16]:
mrs_df['gender'] = mrs_df['gender'].str.upper()
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",FEMALE,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",FEMALE,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",FEMALE,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",FEMALE,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",FEMALE,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
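
The lowercase and underscore steps above can be chained into a single expression on `df.columns.str`. The sketch below is one way to do it; the messy column names are invented for illustration, and the extra `strip()` step is an addition that the notebook itself does not use.

```python
import pandas as pd

# Empty toy frame with hypothetical, inconsistently formatted column names.
toy = pd.DataFrame(columns=[" Maiden Name", "Husband Fullname ", "Age"])

# Chain the string methods: trim, lowercase, then replace spaces with underscores.
toy.columns = (
    toy.columns.str.strip()
    .str.lower()
    .str.replace(" ", "_", regex=False)
)
print(list(toy.columns))
```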


### 🧠 Applying Custom Functions

```df['COLUMN'] = df['COLUMN'].apply(str.lower)``` – Applies the lowercase function to all entries in a column.   
```df['COLUMN'] = df['COLUMN'].apply(custom_function)``` – Applies a custom function to a column.  
```df['COLUMN'] = df['COLUMN'].apply(lambda x: x.lower())``` – Applies a lambda function to convert strings to lowercase.  
```df['COLUMN'] = df['COLUMN'].replace({'OLD_VALUE': 'NEW_VALUE'})``` – Replaces the listed values in a column and leaves all other values unchanged.  
```df['COLUMN'] = df['COLUMN'].map({'OLD_VALUE': 'NEW_VALUE'})``` – Maps each value through the dictionary; values missing from the dictionary become `NaN`.

In [17]:
mrs_df['gender'] = mrs_df['gender'].apply(str.lower)
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [18]:
def function_name(variable):
    return variable.title()

mrs_df['gender'] = mrs_df['gender'].apply(function_name)
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",Female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",Female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",Female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",Female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",Female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [19]:
mrs_df['gender'] = mrs_df['gender'].apply(lambda x: x.lower())
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,Cumings,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,Futrelle,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [20]:
mrs_df['surname'] = mrs_df['surname'].replace({'Cumings': 'C.', 'Futrelle': 'F.'})
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,C.,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",female,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [21]:
mrs_df['gender'] = mrs_df['gender'].map({'female': 'F'})
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,38.0,C.,Florence Briggs Thayer,Mrs,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
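
Because `mrs_df` contains only female passengers, the `map` call above happens to cover every value. On mixed data, `replace` and `map` behave differently, as the small sketch below illustrates with a made-up Series; the `fillna` fallback at the end is just one possible way to keep unmapped values.

```python
import pandas as pd

gender = pd.Series(["female", "male", "female"])

# replace() changes only the listed values; "male" stays as it is.
replaced = gender.replace({"female": "F"})

# map() looks every value up in the dict; "male" has no entry and becomes NaN.
mapped = gender.map({"female": "F"})

# One option for keeping unmapped values: fall back to the original Series.
mapped_with_fallback = gender.map({"female": "F"}).fillna(gender)

print(replaced.tolist())
print(mapped.isna().tolist())
print(mapped_with_fallback.tolist())
```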


### 🎯 Conditional Modifications

```mask = df['COLUMN'] == 'VALUE'``` – Creates a boolean mask for rows where a column equals a value (named `mask` to avoid shadowing Python's built-in `filter`).  
```df.loc[mask, 'COLUMN_TO_MODIFY'] = 'NEW_VALUE'``` – Updates a column only in the rows selected by the mask.  
```df['COLUMN'] = df['COLUMN'].where(df['COLUMN'] < VALUE, other=np.nan)``` – Keeps values where the condition holds and sets the rest to `NaN`.  
```df['COLUMN'] = df['COLUMN'].clip(lower=LOWER_BOUND, upper=UPPER_BOUND)``` – Restricts values in a column to a specified range.

In [22]:
a_filter = mrs_df['surname'] == 'C.'
mrs_df.loc[a_filter, 'title'] = 'Lady'
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,38.0,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,55.0,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [23]:
mrs_df['age'] = mrs_df['age'].where(mrs_df['age'] < 37, other=np.nan)
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,14.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [24]:
mrs_df['age'] = mrs_df['age'].clip(lower=25, upper=35)
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,25.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
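
To make the direction of `where` easier to remember, the sketch below contrasts it with its counterpart `mask` and with `clip` on a small, made-up Series. `where` keeps values where the condition is true, `mask` replaces them, and `clip` leaves missing values alone.

```python
import numpy as np
import pandas as pd

age = pd.Series([38.0, 35.0, 14.0, np.nan])

# where(): keep values where the condition is True, NaN elsewhere.
kept = age.where(age < 37)

# mask(): the inverse, replace values where the condition is True.
masked = age.mask(age < 37)

# clip(): pull values into [25, 35]; NaN stays NaN.
clipped = age.clip(lower=25, upper=35)

print(pd.DataFrame({"age": age, "where": kept, "mask": masked, "clip": clipped}))
```

Comparisons against `NaN` evaluate to `False`, so a missing age stays missing under all three operations.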


### ✍️ Modifying by Index or Label

```df.loc[INDEX] = ['VALUE_1', 'VALUE_2', ...]``` – Replaces all column values in the row with that index label, or appends a new row if the label does not exist.  
```df.loc[INDEX, 'COLUMN'] = VALUE``` – Updates a single cell value by index and column label.  
```df.loc[INDEX, ['COLUMN_1', 'COLUMN_2']] = [VALUE_1, VALUE_2]``` – Updates multiple columns in a row by index.  
```df.at[INDEX, 'COLUMN'] = VALUE``` – Assigns a single value to a specific cell identified by label-based index and column.

In [25]:
# PassengerId 1 is not in mrs_df, so this appends a new row instead of overwriting one
mrs_df.loc[1] = ['Elisabth', 'female', 27, 'Queens', 'Mary Stone', 'Mrs', 'Jake', 'Jake Queen']
mrs_df.tail()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
875,"Abelson, Mrs. Samuel (Hannah Wizosky)",F,28.0,Abelson,Hannah Wizosky,Mrs,Samuel,"Abelson, Samuel"
880,"Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)",F,,Potter,Lily Alexenia Wilson,Mrs,Thomas Jr,"Potter, Thomas Jr"
881,"Shelley, Mrs. William (Imanita Parrish Hall)",F,25.0,Shelley,Imanita Parrish Hall,Mrs,William,"Shelley, William"
886,"Rice, Mrs. William (Margaret Norton)",F,,Rice,Margaret Norton,Mrs,William,"Rice, William"
1,Elisabth,female,27.0,Queens,Mary Stone,Mrs,Jake,Jake Queen


In [26]:
mrs_df.loc[2, 'age'] = 24
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,24.0,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,35.0,F.,Lily May Peel,Mrs,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,25.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [27]:
mrs_df.loc[4, ['age', 'title']] = [37, 'Lady']
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,24.0,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,37.0,F.,Lily May Peel,Lady,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,Oscar W,"Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,25.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"


In [28]:
mrs_df.at[9, 'husband'] = ['Oscar', 'W', 'ayne']
mrs_df.head()

PassengerId,name,gender,age,surname,maiden_name,title,husband,husband_fullname
2,"Cumings, Mrs. John Bradley (Florence Briggs Th...",F,24.0,C.,Florence Briggs Thayer,Lady,John Bradley,"Cumings, John Bradley"
4,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",F,37.0,F.,Lily May Peel,Lady,Jacques Heath,"Futrelle, Jacques Heath"
9,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",F,27.0,Johnson,Elisabeth Vilhelmina Berg,Mrs,"[Oscar, W, ayne]","Johnson, Oscar W"
10,"Nasser, Mrs. Nicholas (Adele Achem)",F,25.0,Nasser,Adele Achem,Mrs,Nicholas,"Nasser, Nicholas"
16,"Hewlett, Mrs. (Mary D Kingcome)",F,,Hewlett,Mary D Kingcome,Mrs,Unknown,"Hewlett,Unknown"
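
Whether a `loc` assignment updates or appends depends only on whether the label already exists in the index. The sketch below shows both cases on a made-up two-row frame, together with `at` for a single scalar cell.

```python
import pandas as pd

# Toy frame with hypothetical values, indexed by PassengerId.
toy = pd.DataFrame(
    {"age": [38.0, 35.0], "title": ["Mrs", "Mrs"]},
    index=pd.Index([2, 4], name="PassengerId"),
)

toy.loc[2, "age"] = 24.0     # label 2 exists: the cell is updated
toy.at[4, "title"] = "Lady"  # at targets exactly one cell by label

rows_before = len(toy)
toy.loc[1] = [27.0, "Mrs"]   # label 1 is absent: a new row is appended at the end

print(toy)
print(len(toy) - rows_before)  # number of rows added
```

Checking `label in df.index` before assigning is a simple guard when an accidental append would be a bug.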


### 👉 Next Topic: [Data Combining](./07-data-combining.ipynb)

Learn how to combine data with pandas through concatenation, merging, and joining.