# Session 7: DataFrame Manipulations with Pandas  

## Overview  
This session covers common DataFrame manipulations with Python's pandas library: creating new columns, combining DataFrames with `join` and `merge`, and extracting details from text columns. The exercises build data preprocessing and transformation skills.  

---

## Exercises  

1. **Creating Professor Initials**  
   - Extract the initials of each professor's first and last names and store them in a new column called `professor_initials`.  

2. **Join DataFrames on Professor Column**  
   - Combine the original DataFrame with a new DataFrame (`df_courses`) using the `.join` method, with the `professor` column as the key.  

3. **Merge DataFrames to Include Course Information**  
   - Merge the original DataFrame and `df_courses` with `pd.merge` on the common `professor` column to add course information for each professor.  

4. **Extract Professor Last Names**  
   - Use string operations to extract the last name of each professor from the `professor` column and store it in a new column called `professor_last_name`.  

---

In [1]:
# Importing pandas and data

import pandas as pd

data = {
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    'department': ['Computer Science', 'Computer Vision', 'AI & Robotics', 'Autonomous Systems'],
    'age': [45, 50, 47, 38]
}

df = pd.DataFrame(data)

#### Exercise 1:

In [2]:
# Creating a new column for professor initials using a lambda function and the apply method

df['professor_initials'] = df['professor'].apply(lambda name: ''.join([part[0].upper() for part in name.split()]))

print(df)

          professor          department  age professor_initials
0  Ludmila Kuncheva    Computer Science   45                 LK
1  Antonio Torralba     Computer Vision   50                 AT
2   Manuel Gonzalez       AI & Robotics   47                 MG
3     Bastian Leibe  Autonomous Systems   38                 BL
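
The same initials can also be built with the vectorized `.str` accessor instead of `apply`. The sketch below is one possible alternative, not part of the original solution; it assumes every name has at least a first and a last part, and it takes only the first and last parts, so a middle name would be skipped (unlike the `apply` version above, which uses every part). The names `names`, `parts`, and `initials` are illustrative.

```python
import pandas as pd

# Same professor names as in the session data
names = pd.Series(
    ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    name='professor',
)

# Split each name on whitespace into a list of parts
parts = names.str.split()

# First character of the first part + first character of the last part
initials = (parts.str[0].str[0] + parts.str[-1].str[0]).str.upper()

print(initials.tolist())  # ['LK', 'AT', 'MG', 'BL']
```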


#### Exercise 2:

In [3]:
courses_data = {
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    'courses': ['Machine Learning', 'Computer Vision', 'AI Programming', 'Self-Driving Cars']
}
df_courses = pd.DataFrame(courses_data)

# Using .join by aligning on the 'professor' column

df_combined = df.join(df_courses.set_index('professor'), on='professor')

print(df_combined)

          professor          department  age professor_initials  \
0  Ludmila Kuncheva    Computer Science   45                 LK   
1  Antonio Torralba     Computer Vision   50                 AT   
2   Manuel Gonzalez       AI & Robotics   47                 MG   
3     Bastian Leibe  Autonomous Systems   38                 BL   

             courses  
0   Machine Learning  
1    Computer Vision  
2     AI Programming  
3  Self-Driving Cars  
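
In the example above every professor has a matching course, so the join type does not matter. As a hedged illustration of what `.join` does when a key is missing, the sketch below uses a hypothetical `df_courses_partial` that deliberately omits one professor. `DataFrame.join` defaults to `how='left'`, so unmatched rows are kept with `NaN`; `how='inner'` drops them.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    'age': [45, 50, 47, 38],
})

# Hypothetical course table that is missing 'Bastian Leibe' (for illustration only)
df_courses_partial = pd.DataFrame({
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez'],
    'courses': ['Machine Learning', 'Computer Vision', 'AI Programming'],
})

courses_indexed = df_courses_partial.set_index('professor')

# Default left join: all four professors are kept, the missing course becomes NaN
left_joined = df_demo.join(courses_indexed, on='professor')

# Inner join: only professors present in both tables are kept
inner_joined = df_demo.join(courses_indexed, on='professor', how='inner')

print(left_joined)
print(inner_joined)
```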


#### Exercise 3:

In [4]:
# Recreate the original df to get rid of the initials column
data = {
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    'department': ['Computer Science', 'Computer Vision', 'AI & Robotics', 'Autonomous Systems'],
    'age': [45, 50, 47, 38]
}
df_original = pd.DataFrame(data)


# Courses DataFrame

courses_data = {
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
    'courses': ['Machine Learning', 'Computer Vision', 'AI Programming', 'Self-Driving Cars']
}
df_courses = pd.DataFrame(courses_data)

# Merging the original df and the new courses DataFrame with the .merge method

combined_df = pd.merge(df_original, df_courses, on='professor')

print(combined_df)

          professor          department  age            courses
0  Ludmila Kuncheva    Computer Science   45   Machine Learning
1  Antonio Torralba     Computer Vision   50    Computer Vision
2   Manuel Gonzalez       AI & Robotics   47     AI Programming
3     Bastian Leibe  Autonomous Systems   38  Self-Driving Cars
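
Unlike `.join`, `pd.merge` defaults to an inner join, which silently drops rows without a match. The sketch below, using hypothetical frames `df_left` and `df_right` whose keys only partly overlap, shows how the `how`, `indicator`, and `validate` parameters can make the merge behavior explicit. It is meant as an illustration, not as part of the exercise solution.

```python
import pandas as pd

# Hypothetical frames with partially overlapping professors (for illustration only)
df_left = pd.DataFrame({
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Bastian Leibe'],
    'age': [45, 50, 38],
})
df_right = pd.DataFrame({
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez'],
    'courses': ['Machine Learning', 'Computer Vision', 'AI Programming'],
})

# Default inner merge: only professors found in both frames survive
merged_inner = pd.merge(df_left, df_right, on='professor')

# Outer merge keeps every professor; indicator=True adds a '_merge' column
# showing where each row came from, and validate checks that keys are unique
merged_outer = pd.merge(
    df_left, df_right,
    on='professor',
    how='outer',
    indicator=True,
    validate='one_to_one',
)

print(merged_inner)
print(merged_outer)
```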


#### Exercise 4:

In [5]:
# Creating a new column for professor last names with a lambda function and the apply method

df['professor_last_name'] = df['professor'].apply(lambda name: name.split()[-1])

print(df)

          professor          department  age professor_initials  \
0  Ludmila Kuncheva    Computer Science   45                 LK   
1  Antonio Torralba     Computer Vision   50                 AT   
2   Manuel Gonzalez       AI & Robotics   47                 MG   
3     Bastian Leibe  Autonomous Systems   38                 BL   

  professor_last_name  
0            Kuncheva  
1            Torralba  
2            Gonzalez  
3               Leibe  
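
Since the exercise asks for string operations, the last name can also be extracted with pandas' `.str` accessor rather than `apply`. This is a minimal sketch under the assumption that the last whitespace-separated token is the surname, which would not hold for multi-word surnames. The names `df_names` and `last_names` are illustrative.

```python
import pandas as pd

df_names = pd.DataFrame({
    'professor': ['Ludmila Kuncheva', 'Antonio Torralba', 'Manuel Gonzalez', 'Bastian Leibe'],
})

# Split on whitespace, then take the last element of each resulting list
last_names = df_names['professor'].str.split().str[-1]
df_names['professor_last_name'] = last_names

print(df_names['professor_last_name'].tolist())
# ['Kuncheva', 'Torralba', 'Gonzalez', 'Leibe']
```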
