
In [1]:
people = {
    'first': ['Corey', 'Jane', 'Joe'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JoeDoe@email.com']
}

In [2]:
import pandas as pd

In [3]:
df = pd.DataFrame(people)
df

Unnamed: 0,first,last,email
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com


In [4]:
# UPDATING COLUMNS
# Check the columns
df.columns

Index(['first', 'last', 'email'], dtype='object')

In [5]:
df.columns = ['first_name', 'last_name', 'email']
df

Unnamed: 0,first_name,last_name,email
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com
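
One caveat with assigning a list to df.columns as in the cell above: the list has to contain exactly one label per column. The sketch below uses a small stand-in DataFrame (the name `demo` is hypothetical, not part of the notebook) to show what happens when the list is too short.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame
demo = pd.DataFrame({
    'first': ['Corey', 'Jane'],
    'last': ['Schafer', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com'],
})

try:
    # Only two labels for three columns
    demo.columns = ['first_name', 'last_name']
    mismatch_raised = False
except ValueError:
    # pandas rejects the assignment because the lengths differ
    mismatch_raised = True

# The rejected assignment leaves the original labels in place
columns_after = list(demo.columns)
```

So this form is only practical when you really do intend to supply a name for every column.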


Assigning a new list to df.columns is rarely used, because it only makes sense when we are passing in new names for all of our columns. Usually we only need to rename a few of them. Something much more common is the need to change one specific thing about every column name in the DataFrame: maybe the names are all uppercase and we want them lowercase (or vice versa), or maybe they contain spaces that we want to replace with underscores. In those cases we can use a list comprehension.

For example, let's say that we wanted to uppercase all of the column names here: 

In [6]:
df.columns = [x.upper() for x in df.columns]
df

Unnamed: 0,FIRST_NAME,LAST_NAME,EMAIL
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com
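
The same change can probably also be written with the `.str` accessor on the column Index, which is the style this notebook uses for `replace`. A minimal sketch, using a hypothetical one-row DataFrame called `demo`:

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame
demo = pd.DataFrame({
    'first_name': ['Corey'],
    'last_name': ['Schafer'],
    'email': ['CoreyMSchafer@gmail.com'],
})

# List comprehension, as in the cell above
via_comprehension = [x.upper() for x in demo.columns]

# Vectorized string method on the Index
via_str_accessor = demo.columns.str.upper()

# Both approaches produce the same labels
same = list(via_str_accessor) == via_comprehension

demo.columns = via_str_accessor
```

Which one to use is mostly a matter of taste; the list comprehension accepts any Python expression, while the `.str` methods read a little shorter for simple cases.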


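Another thing you might want to do is replace spaces in column names with underscores, especially if you like using dot notation to access a column. Our column names don't contain any spaces, so to have something to work with, let's first swap the underscores for spaces: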

In [7]:
df.columns = df.columns.str.replace('_', ' ')
df

Unnamed: 0,FIRST NAME,LAST NAME,EMAIL
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com


In [10]:
# Replace the spaces with underscores
df.columns = df.columns.str.replace(' ', '_')
df

Unnamed: 0,FIRST_NAME,LAST_NAME,EMAIL
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com
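
To illustrate why underscores matter for dot notation, here is a small sketch. The DataFrame `demo` is a hypothetical stand-in, and `regex=False` is passed explicitly so the space is treated as a literal string regardless of pandas version.

```python
import pandas as pd

# Hypothetical stand-in with a space in one column name
demo = pd.DataFrame({
    'FIRST NAME': ['Corey', 'Jane'],
    'EMAIL': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com'],
})

# A label containing a space is not a valid Python identifier,
# so only bracket access works here
with_space = demo['FIRST NAME']

# Replace the spaces with underscores
demo.columns = demo.columns.str.replace(' ', '_', regex=False)

# Now dot notation and bracket access return the same column
dot_access = demo.FIRST_NAME
bracket_access = demo['FIRST_NAME']
same = dot_access.equals(bracket_access)
```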


In [11]:
# Lowercase the columns
df.columns = [x.lower() for x in df.columns]
df

Unnamed: 0,first_name,last_name,email
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com


Everything we have seen so far applies to every one of our columns. But what if we only want to change some of them? In that case we can use the ***rename*** method and pass a dictionary that maps each old column name to its new name. So if we want to set first_name and last_name back to what they were before, we could say:

In [15]:
df.rename(columns={'first_name': 'first', 'last_name': 'last'}, inplace=True)
df

Unnamed: 0,first,last,email
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com
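
Two details of rename are worth a quick sketch: without inplace=True it returns a new DataFrame rather than modifying the original, and dictionary keys that match no column are ignored unless `errors='raise'` is passed. The DataFrame `demo` and the key `'surname'` below are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame
demo = pd.DataFrame({
    'first_name': ['Corey'],
    'last_name': ['Schafer'],
    'email': ['CoreyMSchafer@gmail.com'],
})

# Without inplace=True, rename returns a new DataFrame
# and leaves demo untouched
renamed = demo.rename(columns={'first_name': 'first', 'last_name': 'last'})

# A key that matches no column is ignored by default
ignored = demo.rename(columns={'surname': 'last'})

# With errors='raise', the same call raises a KeyError instead
try:
    demo.rename(columns={'surname': 'last'}, errors='raise')
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```

Passing `errors='raise'` can help catch a misspelled column name that would otherwise pass unnoticed.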
