# Python Pandas Tutorial 5: Updating Rows and Columns
Welcome to the fifth module of the Pandas tutorial! In this module, we will learn how to modify data within DataFrames.

In [113]:
import pandas as pd
people = { "first": ["Ben", "Han", "Luke", "Anakin", "Leia", "Chew", "Master"], 
"last": ["Kenobi", "Solo", "Skywalker", "Skywalker", "Organa", "Bacca", "Yoda"], 
"email": ["Benkenobi@email.com", "Hansolo@email.com", "Lukeskywalker@email.com", "Anakinskywalker@vader.com", "Leiaorgana@email.com", "Chewbacca@email.com", "Masteryoda@force.com"],
"salary": [10000, 7000, 200000, 1000000, 8000, 5000, 2000000],
"planet": ["Stewjon", "Corellia", "Tatooine", "Tatooine", "Alderaan", "Kashyyyk", "Dagobah"]}

df = pd.DataFrame(people)
df

,first,last,email,salary,planet
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Skywalker,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah


In [114]:
#Let's take a look at the columns
df.columns

Index(['first', 'last', 'email', 'salary', 'planet'], dtype='object')

1. Say we want to change the column names. There are a few different ways to do this. The first is to assign a new list to df.columns, which renames every column at once; the list must contain exactly one name per column.

In [115]:
df.columns = ["first_name", "last_name", "email_address", "salary", "planet"]
df.columns

Index(['first_name', 'last_name', 'email_address', 'salary', 'planet'], dtype='object')
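
As a small illustration of the one-name-per-column requirement, the sketch below assigns a list of the wrong length. The `demo` frame is a hypothetical stand-in, not part of the tutorial data.

```python
import pandas as pd

# Hypothetical two-column frame, used only for this illustration.
demo = pd.DataFrame({"first": ["Ben", "Han"], "last": ["Kenobi", "Solo"]})

try:
    # Three names for two columns: pandas rejects the assignment.
    demo.columns = ["first_name", "last_name", "email_address"]
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
print(list(demo.columns))  # the original names are still in place
```

If the lengths do not match, the assignment fails and the existing column names are left untouched.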

2. Maybe you need to change all of your column names to uppercase. You can do this with a list comprehension.

In [116]:
df.columns = [x.upper() for x in df.columns]
df

,FIRST_NAME,LAST_NAME,EMAIL_ADDRESS,SALARY,PLANET
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Skywalker,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah


In [117]:
#changing underscores to spaces in column names
df.columns = df.columns.str.replace('_', ' ')
df

,FIRST NAME,LAST NAME,EMAIL ADDRESS,SALARY,PLANET
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Skywalker,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah
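
The two previous steps can also be written with the `.str` accessor alone, since string methods on `df.columns` can be chained. This is a minimal sketch on a hypothetical `demo` frame, not a replacement for the cells above.

```python
import pandas as pd

# Hypothetical frame with snake_case column names.
demo = pd.DataFrame({"first_name": ["Ben"], "last_name": ["Kenobi"]})

# Uppercase every column name, then swap underscores for spaces.
demo.columns = demo.columns.str.upper().str.replace("_", " ")

print(list(demo.columns))
```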


3. If you only want to change some columns, you can use the rename method.

In [118]:
df.rename(columns={"FIRST NAME": "first", "LAST NAME": "last"}, inplace=True)
df
# Only 'first' and 'last' are back to their original names; the other columns keep their uppercase names.

,first,last,EMAIL ADDRESS,SALARY,PLANET
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Skywalker,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah
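
For comparison, here is a sketch of `rename` without `inplace=True`, plus a variant that passes a function instead of a dictionary. The `demo` frame and the variable names are assumptions made for this example.

```python
import pandas as pd

# Hypothetical frame that mimics the uppercase column names used above.
demo = pd.DataFrame(
    {"FIRST NAME": ["Ben"], "LAST NAME": ["Kenobi"], "PLANET": ["Stewjon"]}
)

# Without inplace=True, rename returns a new DataFrame and leaves demo alone.
renamed = demo.rename(columns={"FIRST NAME": "first", "LAST NAME": "last"})

# A function can be passed instead of a dict; it is applied to every column name.
lowered = demo.rename(columns=str.lower)

print(list(demo.columns))
print(list(renamed.columns))
print(list(lowered.columns))
```

Returning a new DataFrame makes it easy to keep the original around while you check the result.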


4. To update a single value in a row, we can use loc (label-based) or iloc (position-based) as our starting point.

In [119]:
# Select the row by its index label and the column by its name, then assign the new value
df.loc[2, 'last'] = 'Smith'
df

,first,last,EMAIL ADDRESS,SALARY,PLANET
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Smith,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah


In [120]:
#You can also use .at to change a single value
df.at[2, 'last'] = 'Skywalker'
df

,first,last,EMAIL ADDRESS,SALARY,PLANET
0,Ben,Kenobi,Benkenobi@email.com,10000,Stewjon
1,Han,Solo,Hansolo@email.com,7000,Corellia
2,Luke,Skywalker,Lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,Anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,Leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,Chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,Masteryoda@force.com,2000000,Dagobah
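
The same indexers can update more than one value at a time. The sketch below, on a hypothetical three-row `demo` frame, shows a multi-column update with loc, a positional update with iloc, and a conditional update with a boolean mask.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration.
demo = pd.DataFrame({
    "first": ["Ben", "Luke", "Anakin"],
    "last": ["Kenobi", "Skywalker", "Skywalker"],
    "planet": ["Stewjon", "Tatooine", "Tatooine"],
})

# Update two columns of one row by label.
demo.loc[0, ["first", "last"]] = ["Obi-Wan", "Kenobi"]

# Update one cell by integer position (row 1, column 2 is Luke's planet).
demo.iloc[1, 2] = "Dagobah"

# Update every row that matches a condition.
demo.loc[demo["last"] == "Skywalker", "last"] = "Smith"

print(demo)
```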


If you want to change the values in one of your columns from uppercase to lowercase or vice versa, you can use the str.lower() and str.upper() methods.

In [121]:
df['EMAIL ADDRESS'].str.lower()

0          benkenobi@email.com
1            hansolo@email.com
2      lukeskywalker@email.com
3    anakinskywalker@vader.com
4         leiaorgana@email.com
5          chewbacca@email.com
6         masteryoda@force.com
Name: EMAIL ADDRESS, dtype: object

In [122]:
# str.lower() returns a new Series, so if you run df again you'll notice the changes don't stay. To make them permanent, assign the result back to the column.
df['EMAIL ADDRESS'] = df['EMAIL ADDRESS'].str.lower()
df

,first,last,EMAIL ADDRESS,SALARY,PLANET
0,Ben,Kenobi,benkenobi@email.com,10000,Stewjon
1,Han,Solo,hansolo@email.com,7000,Corellia
2,Luke,Skywalker,lukeskywalker@email.com,200000,Tatooine
3,Anakin,Skywalker,anakinskywalker@vader.com,1000000,Tatooine
4,Leia,Organa,leiaorgana@email.com,8000,Alderaan
5,Chew,Bacca,chewbacca@email.com,5000,Kashyyyk
6,Master,Yoda,masteryoda@force.com,2000000,Dagobah
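
To see the difference between calling a string method and assigning its result back, here is a minimal sketch. It uses a hypothetical two-row `demo` frame that borrows the tutorial's column name.

```python
import pandas as pd

# Hypothetical two-row frame that reuses the tutorial's column name.
demo = pd.DataFrame({"EMAIL ADDRESS": ["Benkenobi@email.com", "Hansolo@email.com"]})

# str.upper() returns a new Series; demo itself is not modified.
shouted = demo["EMAIL ADDRESS"].str.upper()
print(demo["EMAIL ADDRESS"].tolist())  # still mixed case

# Assigning the result back is what makes the change stick.
demo["EMAIL ADDRESS"] = demo["EMAIL ADDRESS"].str.lower()
print(demo["EMAIL ADDRESS"].tolist())
```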


If you want to do something a little more advanced, there are four popular methods you can use: apply, map, applymap, and replace.

In [123]:
# apply - calls a function on every value in a Series
def update_email(email):  # Returns the email address in uppercase
    return email.upper()

df['EMAIL ADDRESS'].apply(update_email)


0          BENKENOBI@EMAIL.COM
1            HANSOLO@EMAIL.COM
2      LUKESKYWALKER@EMAIL.COM
3    ANAKINSKYWALKER@VADER.COM
4         LEIAORGANA@EMAIL.COM
5          CHEWBACCA@EMAIL.COM
6         MASTERYODA@FORCE.COM
Name: EMAIL ADDRESS, dtype: object
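
For short transformations, a lambda or a built-in function can be passed to apply instead of a named function. This is a sketch on a hypothetical two-value Series, not the tutorial's DataFrame.

```python
import pandas as pd

# Hypothetical Series holding two of the tutorial's email addresses.
emails = pd.Series(
    ["benkenobi@email.com", "hansolo@email.com"], name="EMAIL ADDRESS"
)

# A lambda does the same job as update_email without a separate def.
upper_emails = emails.apply(lambda email: email.upper())

# Built-in functions work too: len is called once per value.
lengths = emails.apply(len)

print(upper_emails.tolist())
print(lengths.tolist())
```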

In [124]:
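# You can also use apply on a DataFrame. With axis='columns', each row is passed to the function, so len returns the number of values in each row.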
df.apply(len, axis='columns')

0    5
1    5
2    5
3    5
4    5
5    5
6    5
dtype: int64

In [125]:
df.apply(lambda x: x.min())  # x is a Series (one column at a time), not a single value

first                               Anakin
last                                 Bacca
EMAIL ADDRESS    anakinskywalker@vader.com
SALARY                                5000
PLANET                            Alderaan
dtype: object
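
The effect of the axis argument is easier to see on numbers. The sketch below uses a hypothetical `scores` frame to compare the default (one call per column) with axis='columns' (one call per row).

```python
import pandas as pd

# Hypothetical numeric frame, used only to contrast the two directions.
scores = pd.DataFrame({"a": [1, 5, 3], "b": [4, 2, 6]})

# Default axis: the function receives each column, so we get one result per column.
col_min = scores.apply(lambda col: col.min())

# axis="columns": the function receives each row, so we get one result per row.
row_min = scores.apply(lambda row: row.min(), axis="columns")

print(col_min.to_dict())
print(row_min.tolist())
```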

In [131]:
# applymap(len) in the next cell needs strings, because len() does not work on integers
def salary_to_string(salary):  # Returns the salary as a string
    return str(salary)

df['SALARY'] = df['SALARY'].apply(salary_to_string)

In [132]:
# applymap - applies a function to every individual element in the DataFrame
# (pandas 2.1 deprecated applymap in favor of DataFrame.map, which is used the same way)
df.applymap(len)  # Applies len to each value in our DataFrame

,first,last,EMAIL ADDRESS,SALARY,PLANET
0,3,6,19,5,7
1,3,4,17,4,8
2,4,9,23,6,8
3,6,9,25,7,8
4,4,6,20,4,8
5,4,5,19,4,8
6,6,4,20,7,7
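
Because pandas 2.1 added DataFrame.map as the replacement for applymap, code that must run on both older and newer versions can check which method is available. This is one possible sketch; the `names` frame is hypothetical.

```python
import pandas as pd

# Hypothetical frame of strings, so len works on every element.
names = pd.DataFrame({"first": ["Ben", "Han"], "last": ["Kenobi", "Solo"]})

# DataFrame.map exists from pandas 2.1 onward; fall back to applymap before that.
if hasattr(pd.DataFrame, "map"):
    lengths = names.map(len)
else:
    lengths = names.applymap(len)

print(lengths)
```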


In [None]:
def salary_to_int(salary):  # Returns the salary as an integer
    return int(salary)

df['SALARY'] = df['SALARY'].apply(salary_to_int)  # Convert the salaries back to integers
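
When the only goal is a type conversion, astype is a common alternative to apply with a helper function. The sketch below runs on a hypothetical two-value Series rather than the tutorial's DataFrame.

```python
import pandas as pd

# Hypothetical Series with two of the tutorial's salaries.
salary = pd.Series([10000, 7000], name="SALARY")

# astype(str) converts every value to a string in one call.
as_text = salary.astype(str)

# astype(int) converts the strings back to integers.
as_numbers = as_text.astype(int)

print(as_text.tolist())
print(as_numbers.tolist())
```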

In [128]:
# map - Series.map substitutes each value in a Series with another value.
# In this example we're going to substitute some of our first names.
df['first'].map({'Master': 'Baby', 'Ben': 'Obi-Wan'})
# The values that are not in the dictionary are converted to NaN (not a number).

0    Obi-Wan
1        NaN
2        NaN
3        NaN
4        NaN
5        NaN
6       Baby
Name: first, dtype: object
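
If you want to keep using map but hold on to the unmatched names, one option is to fill the resulting NaN values from the original Series. This is a sketch under that assumption; the `first` Series and `mapping` dictionary are illustrative names.

```python
import pandas as pd

# Hypothetical Series with three of the tutorial's first names.
first = pd.Series(["Ben", "Han", "Master"], name="first")
mapping = {"Master": "Baby", "Ben": "Obi-Wan"}

# Names missing from the mapping become NaN.
mapped = first.map(mapping)

# fillna(first) puts the original name back wherever map produced NaN.
kept = first.map(mapping).fillna(first)

print(mapped.isna().tolist())
print(kept.tolist())
```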

In [133]:
# In this example, we don't want to lose the other names. Instead of map, we can use the replace method.
df['first'].replace({'Master': 'Baby', 'Ben': 'Obi-Wan'})
# Notice how it doesn't change the other names.
# To make this change permanent, assign the result back: df['first'] = df['first'].replace({'Master': 'Baby', 'Ben': 'Obi-Wan'})

0    Obi-Wan
1        Han
2       Luke
3     Anakin
4       Leia
5       Chew
6       Baby
Name: first, dtype: object
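
To make the replacement permanent, the result is assigned back to the column. Here is a minimal sketch on a hypothetical one-column `demo` frame.

```python
import pandas as pd

# Hypothetical one-column frame with three of the tutorial's first names.
demo = pd.DataFrame({"first": ["Ben", "Han", "Master"]})

# replace returns a new Series; assigning it back updates the DataFrame.
demo["first"] = demo["first"].replace({"Master": "Baby", "Ben": "Obi-Wan"})

print(demo["first"].tolist())
```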