# Pandas Tutorial. Modifying Data within Pandas

### Part 05: Modify data in a Pandas dataframe

Corey Schafer - Python tutorial (Pandas)
Python Pandas Tutorial (part 5): Updating Rows and Columns - Modifying Data Within DataFrames
YouTube video: https://www.youtube.com/watch?v=DCDe29sIKcE&t=150s


## 1. Load required libraries

In [14]:
import pandas as pd

## 2. Create a DataFrame from a Dictionary

Create a new **dictionary** of data to turn it into a DataFrame

In [25]:
people = {
    "first": ["Corey","Jane","John"],
    "last": ["Schafer",'Doe','Doe'],
    "email": ["CoreyMSchafer@gmail.com","JaneDoe@email.com","JohnDoe@email.com"]
    
}

In [26]:
people

{'first': ['Corey', 'Jane', 'John'],
 'last': ['Schafer', 'Doe', 'Doe'],
 'email': ['CoreyMSchafer@gmail.com',
  'JaneDoe@email.com',
  'JohnDoe@email.com']}

We can check the object type with the built-in type() function.
Called with a single argument, type(object) returns the type of that object. (The three-argument form, type(name, bases, dict), creates a new class and is not needed here.)

In [27]:
type(people)

dict

Then we can turn the dictionary into a **DataFrame** using the DataFrame() constructor in Pandas

In [28]:
df = pd.DataFrame(people)

In [29]:
df.head()

|   | first | last | email |
|---|-------|------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |
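
As a quick check of the conversion, the sketch below rebuilds the same DataFrame and inspects it. Each dictionary key becomes a column and each list supplies that column's values, so the lists need to have the same length. The `shape` check is an addition for illustration and is not part of the original notebook cells.

```python
import pandas as pd

# Same data as in the tutorial
people = {
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["CoreyMSchafer@gmail.com", "JaneDoe@email.com", "JohnDoe@email.com"],
}

df = pd.DataFrame(people)

# The result is a DataFrame, no longer a dict
print(type(df))

# (number of rows, number of columns)
print(df.shape)

# The dictionary keys became the column names
print(list(df.columns))
```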


## 3. Updating cols and rows

### 3.1 Updating cols 

First, we will update the column names of our DataFrame. This is the DataFrame we are starting from:

In [32]:
df

|   | first | last | email |
|---|-------|------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |


The columns attribute returns the current column names:

In [33]:
df.columns

Index(['first', 'last', 'email'], dtype='object')

#### 3.1.1 Updating cols using an assignment

We can assign to the df.columns attribute shown above using "=", passing a **list** containing the **new** column names. The list must contain one name for each existing column.

In [34]:
df.columns = ['first_name','last_name','email']

If we run df.columns again after assigning the list of new names, we can see that the column names have changed

In [35]:
df.columns

Index(['first_name', 'last_name', 'email'], dtype='object')

Now these column names have changed and the same applies when we look into the dataframe

In [36]:
df

|   | first_name | last_name | email |
|---|------------|-----------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |
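
One thing to keep in mind with this kind of assignment is that the list needs exactly one name per column. A minimal sketch, using a small hypothetical DataFrame called `demo` (not part of the tutorial data), shows what happens when the lengths do not match:

```python
import pandas as pd

# Hypothetical two-column frame used only for this illustration
demo = pd.DataFrame({"first": ["Corey", "Jane"], "last": ["Schafer", "Doe"]})

# Same number of names as columns: the assignment succeeds
demo.columns = ["first_name", "last_name"]
print(list(demo.columns))

# Too few names: pandas raises a ValueError
length_error = None
try:
    demo.columns = ["only_one_name"]
except ValueError as err:
    length_error = err
    print("ValueError:", err)

# The earlier, valid names are still in place
print(list(demo.columns))
```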


#### 3.1.2 Using list comprehension to update columns 

First, a short reminder of what a list comprehension is in Python

- List comprehension definition: a concise way to create a new list from the values in a list (or other iterable) you already have.

- A list comprehension is a **single expression** that is written inside **square brackets**. It has three components:
- 1. Output expression: the value added to the new list for each item
- 2. For loop: iterates over the existing list
- 3. Condition (optional): keeps only the items that satisfy it
- Template: new_list = [expression for x in old_list if condition]

In [37]:
lst = [1,2,3,4,5,6,7,8,9,10]
# simple list comprehension
a = [x for x in lst]
print(a)


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
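
The simple comprehension above only copies the list. As a small sketch of the other components, the example below adds an output expression and a condition; the variable names `squares`, `evens` and `even_squares` are made up for this illustration.

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Output expression: square every element
squares = [x ** 2 for x in lst]
print(squares)

# Condition: keep only the even elements
evens = [x for x in lst if x % 2 == 0]
print(evens)

# Expression and condition together: squares of the even elements
even_squares = [x ** 2 for x in lst if x % 2 == 0]
print(even_squares)
```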


- a. Turn column names uppercase using list comprehensions

We can use *list comprehensions* to update column names. For example, we can use a list comprehension to turn our **column names** **uppercase**

In [40]:
# This is the dataset we had previously 
df

|   | first_name | last_name | email |
|---|------------|-----------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |


Then I can use the **list comprehension below** to update my column names to **uppercase**

- Standard list comprehension: my_list = [x for x in my_prev_list]
- This time I call the string method .upper() on each name, and I iterate over the DataFrame's columns *attribute*

In [43]:
df.columns = [x.upper() for x in df.columns]
df

|   | FIRST_NAME | LAST_NAME | EMAIL |
|---|------------|-----------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |


I obtain my previous DataFrame with all its column names in **uppercase**
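
The same result can also be reached without a loop, through the `.str` accessor of the column index. The sketch below compares both approaches on a hypothetical one-row DataFrame named `demo`; it is an illustration, not a cell from the original notebook.

```python
import pandas as pd

# Hypothetical one-row frame used only for this comparison
demo = pd.DataFrame({
    "first_name": ["Corey"],
    "last_name": ["Schafer"],
    "email": ["CoreyMSchafer@gmail.com"],
})

# List comprehension: call .upper() on each column name
upper_by_comprehension = [x.upper() for x in demo.columns]

# .str accessor: applies upper() to every name in the Index at once
upper_by_accessor = list(demo.columns.str.upper())

print(upper_by_comprehension)
print(upper_by_accessor)
```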

- b. Remove spaces from column names using the str accessor

We can replace spaces with underscores without writing a list comprehension: the **str** accessor of the columns attribute exposes string methods such as **replace** and applies them to every column name at once.

We specify that we want to replace *spaces* with *underscores*. First we create a DataFrame whose column names contain spaces:

In [46]:
office_people = {
    "first name": ["Corey","Jane","John"],
    "last name": ["Schafer",'Doe','Doe'],
    "email": ["CoreyMSchafer@gmail.com","JaneDoe@email.com","JohnDoe@email.com"]
    
}

In [47]:
office_df = pd.DataFrame(office_people)
office_df

|   | first name | last name | email |
|---|------------|-----------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |


We can now chain **str** and **replace** to replace those spaces with underscores in every column name (note that this is a vectorized string method, not a list comprehension)

In [48]:
office_df.columns = office_df.columns.str.replace(' ','_')
office_df

|   | first_name | last_name | email |
|---|------------|-----------|-------|
| 0 | Corey | Schafer | CoreyMSchafer@gmail.com |
| 1 | Jane | Doe | JaneDoe@email.com |
| 2 | John | Doe | JohnDoe@email.com |
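
For comparison, the same renaming can be written as a list comprehension that calls the plain string method `replace` on each name. This is a sketch on a hypothetical one-row DataFrame named `demo`, with `regex=False` passed explicitly so the space is treated as a literal character:

```python
import pandas as pd

# Hypothetical one-row frame whose column names contain spaces
demo = pd.DataFrame({
    "first name": ["Corey"],
    "last name": ["Schafer"],
    "email": ["CoreyMSchafer@gmail.com"],
})

# Vectorized version, as in the cell above
via_accessor = list(demo.columns.str.replace(" ", "_", regex=False))

# List comprehension version: str.replace on each column name
via_comprehension = [name.replace(" ", "_") for name in demo.columns]

# Either list can be assigned back to the columns attribute
demo.columns = via_comprehension
print(list(demo.columns))
```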


List comprehension online resources: 
- List Comprehension in Python Explained for Beginners:
https://www.freecodecamp.org/news/list-comprehension-in-python/

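We can also use a **list comprehension** for other updates to the column names of a DataFrame. For example, to turn the uppercase column names of df back into lowercase:

In [None]:
df.columns = [x.lower() for x in df.columns]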