# 05 - Updating Rows and Columns - Modifying Data Within DataFrames

https://youtu.be/DCDe29sIKcE?si=d4DUqMqXt7kSc3o-

In [1]:
import pandas as pd

# Set up a small sample DataFrame for learning

people = {
    "first": ["Corey", 'Jane', 'John'], 
    "last": ["Schafer", 'Doe', 'Doe'], 
    "email": ["CoreyMSchafer@gmail.com", 'JaneDoe@email.com', 'JohnDoe@email.com']
}
dft = pd.DataFrame(people)

---

#Setup for Real world data

In [None]:
df = pd.read_csv('data/survey_results_public.csv', index_col='Respondent')
schema_df = pd.read_csv('data/survey_results_schema.csv', index_col='Column')
pd.set_option('display.max_columns', 85)


In [2]:
# pd.set_option('display.max_rows', 85)

---

#### Update columns/column names


How to update data within rows and columns

In [3]:
dft.columns

Index(['first', 'last', 'email'], dtype='object')

In [20]:
dft.columns = ['f', 'l', 'em']
dft

Unnamed: 0,f,l,em
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com


Assigning a list to `dft.columns` renames every column in the DataFrame at once.

In [9]:
# dft.columns = [ 'f', 'l']

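Assigning to `dft.columns` only works when the list supplies a name for every column. A shorter list, like the commented-out line above, raises a `ValueError` (length mismatch), so this approach is not suitable when only some columns need new names.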
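A minimal sketch of that failure, assuming a hypothetical three-column frame named `demo` that stands in for `dft`. The variable `mismatch_error` is only there to capture the message for inspection.

```python
import pandas as pd

# Hypothetical stand-in for dft with three columns
demo = pd.DataFrame({
    "first": ["Corey", "Jane"],
    "last": ["Schafer", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com"],
})

try:
    # Two names for three columns: pandas rejects the assignment
    demo.columns = ["f", "l"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(mismatch_error)
# The failed assignment leaves the original column names in place
print(list(demo.columns))
```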

To rename specific columns, pass a dictionary of `'old name': 'new name'` pairs to `.rename(columns={})`. Columns not listed in the dictionary keep their names.

In [21]:
dft.rename(columns={'f': 'first name', 'l': 'last name'})

Unnamed: 0,first name,last name,em
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com


`rename` returns a new DataFrame and leaves `dft` itself unchanged. To keep the new names, pass `inplace=True` (or assign the result back to `dft`).

In [22]:
dft.rename(columns={'f': 'first name', 'l': 'last name'}, inplace=True)
dft

Unnamed: 0,first name,last name,em
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com
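As an alternative to `inplace=True`, the result of `rename` can be assigned to a variable. The sketch below uses a hypothetical one-row frame named `demo` (not the notebook's `dft`) to show that the original is untouched until the result is kept.

```python
import pandas as pd

# Hypothetical stand-in for dft after the first rename to f / l / em
demo = pd.DataFrame({
    "f": ["Corey"],
    "l": ["Schafer"],
    "em": ["CoreyMSchafer@gmail.com"],
})

# rename returns a new DataFrame; demo keeps its old column names
renamed = demo.rename(columns={"f": "first name", "l": "last name"})

print(list(demo.columns))     # still the short names
print(list(renamed.columns))  # the renamed copy

# Assigning the result back has the same effect as inplace=True
demo = renamed
```

Assigning back keeps each step explicit and allows method chaining, which `inplace=True` does not, since it returns `None`.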


Column names can be uppercased by applying a string method to each name in a list comprehension.

In [23]:
dft.columns = [x.upper() for x in dft.columns]
dft

Unnamed: 0,FIRST NAME,LAST NAME,EM
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com
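`rename` also accepts a function for `columns`, which is applied to each column name. A minimal sketch, assuming a hypothetical frame named `demo` with the same column names as above:

```python
import pandas as pd

# Hypothetical stand-in for dft before uppercasing
demo = pd.DataFrame({
    "first name": ["Corey"],
    "last name": ["Schafer"],
    "em": ["CoreyMSchafer@gmail.com"],
})

# str.upper is called once per column name; a new DataFrame is returned
upper = demo.rename(columns=str.upper)

print(list(upper.columns))
```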


Spaces in column names can be replaced with underscores, which makes the names easier to type.

In [26]:
dft.columns = dft.columns.str.replace(' ', '_')
dft

Unnamed: 0,FIRST_NAME,LAST_NAME,EM
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com


The `.str` accessor can also change case, as an alternative to the list comprehension.

In [28]:
dft.columns = dft.columns.str.lower()
dft

Unnamed: 0,first_name,last_name,em
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,John,Doe,JohnDoe@email.com
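The case change and the space replacement can also be chained into a single assignment, since each `.str` method on the column index returns a new index. A minimal sketch on a hypothetical frame named `demo`:

```python
import pandas as pd

# Hypothetical frame with mixed-case names that contain spaces
demo = pd.DataFrame({
    "First Name": ["Corey"],
    "Last Name": ["Schafer"],
    "EM": ["CoreyMSchafer@gmail.com"],
})

# Lowercase every name, then replace literal spaces with underscores
demo.columns = demo.columns.str.lower().str.replace(" ", "_", regex=False)

print(list(demo.columns))
```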
