
In [23]:
import pandas as pd

In [24]:
people = {
    'first': ['Corey', 'Jane', 'Joe'],
    'last': ['Schafer', 'Doe', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com', 'JoeDoe@email.com']
}

In [25]:
df = pd.DataFrame(people)
df

Unnamed: 0,first,last,email
0,Corey,Schafer,CoreyMSchafer@gmail.com
1,Jane,Doe,JaneDoe@email.com
2,Joe,Doe,JoeDoe@email.com


# Adding columns

Adding a column works much like updating values: we create the column by name and assign it a Series holding the values we want that column to have.

In [26]:
# Combine our first and last name column into a single column called full name

df['full_name'] = df['first'] + ' ' + df['last']
df

Unnamed: 0,first,last,email,full_name
0,Corey,Schafer,CoreyMSchafer@gmail.com,Corey Schafer
1,Jane,Doe,JaneDoe@email.com,Jane Doe
2,Joe,Doe,JoeDoe@email.com,Joe Doe


We can't use dot notation when assigning a new column like this; we have to use brackets. With dot notation, Python treats the statement as setting an attribute on the DataFrame object rather than creating a column.

```python
df['full_name'] = df['first'] + ' ' + df['last']
```
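
A minimal sketch of the difference, using a small stand-in DataFrame named `demo` (an illustrative name, not part of the notebook). pandas may emit a warning on the attribute assignment, so it is suppressed here to keep the output quiet.

```python
import warnings

import pandas as pd

demo = pd.DataFrame({
    'first': ['Corey', 'Jane', 'Joe'],
    'last': ['Schafer', 'Doe', 'Doe'],
})

# Dot notation: this sets a plain Python attribute, not a column.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas may warn about this pattern
    demo.full_name = demo['first'] + ' ' + demo['last']

has_column_after_dot = 'full_name' in demo.columns
print(has_column_after_dot)  # False

# Bracket notation: this creates the column.
demo['full_name'] = demo['first'] + ' ' + demo['last']
has_column_after_brackets = 'full_name' in demo.columns
print(has_column_after_brackets)  # True
```

Reading an existing column with dot notation (`demo.first_name` style access) is a separate matter; the restriction discussed here only concerns creating new columns.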

# Removing columns

In [27]:
# Delete first and last (name) columns 
# Method: drop

df.drop(columns= ['first', 'last'])

Unnamed: 0,email,full_name
0,CoreyMSchafer@gmail.com,Corey Schafer
1,JaneDoe@email.com,Jane Doe
2,JoeDoe@email.com,Joe Doe


In [28]:
# Use the inplace argument to keep the changes
df.drop(columns=['first', 'last'], inplace=True)
df

Unnamed: 0,email,full_name
0,CoreyMSchafer@gmail.com,Corey Schafer
1,JaneDoe@email.com,Jane Doe
2,JoeDoe@email.com,Joe Doe
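
As a sketch of the two styles, the example below rebuilds a small DataFrame (named `demo` here, which is not part of the notebook). It shows that `drop` without `inplace=True` leaves the original untouched, and that reassigning the result is an alternative way to keep the change.

```python
import pandas as pd

demo = pd.DataFrame({
    'first': ['Corey', 'Jane'],
    'last': ['Schafer', 'Doe'],
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com'],
})

# drop returns a new DataFrame; demo itself still has all three columns.
trimmed = demo.drop(columns=['first', 'last'])
columns_before = list(demo.columns)
print(columns_before)         # ['first', 'last', 'email']
print(list(trimmed.columns))  # ['email']

# Reassigning keeps the change without using inplace=True.
demo = demo.drop(columns=['first', 'last'])
print(list(demo.columns))     # ['email']
```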


To reverse that process and split the full_name column back into two separate columns:

In [29]:
df['full_name'].str.split(' ')

0    [Corey, Schafer]
1         [Jane, Doe]
2          [Joe, Doe]
Name: full_name, dtype: object

In [31]:
# Each row is now a list holding the first name and the last name. To assign these to two different columns
# we need to expand the list into separate columns, which we can do in pandas with the expand argument.

# df['full_name'].str.split(' ', expand=True)

df[['first', 'last']] = df['full_name'].str.split(' ', expand=True)
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
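
One caveat worth a sketch: if a name contains more than one space, a plain split yields more than two pieces and no longer fits a two-column assignment. Assuming that everything after the first space should count as the last name (an assumption, not something the notebook states), the `n=1` argument limits the split to a single cut. The `names` DataFrame below is hypothetical.

```python
import pandas as pd

names = pd.DataFrame({'full_name': ['Corey Schafer', 'Mary Ann Doe']})

# n=1 splits on the first space only, so there are always two columns.
names[['first', 'last']] = names['full_name'].str.split(' ', n=1, expand=True)
print(names)
```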


# Adding rows

There are two common ways to add rows to a DataFrame. First, we might want to add a single row of new data. Second, we might want to combine two DataFrames into one by appending the rows of one to the other.

# Adding a single row of data.

In [32]:
df.append({'first': 'Tony'})

TypeError: ignored

In [33]:
df.append({'first': 'Tony'}, ignore_index= True)

# With ignore_index=True we no longer get an error, and the new name is appended at the bottom.
# We only gave this row a first name (Tony), so every other column is set to NaN (Not a Number), which pandas uses to mark missing values.
# We can also pass in an entire Series or a dict with every column filled in to add a complete row of data.

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,,,Tony,


In [37]:
columns = df.columns
new_user = ['TonyMontana@email.com', 'Tony Montana', 'Tony', 'Montana']
new_user_dict = dict(zip(columns, new_user))

df.append(new_user_dict, ignore_index=True)

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,TonyMontana@email.com,Tony Montana,Tony,Montana
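
`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the cells above only run on older versions. A rough equivalent on newer versions wraps the new row in a one-row DataFrame and uses `pd.concat`. The sketch below rebuilds a shortened version of the data so that it runs on its own; `base`, `new_row`, `result`, and `partial` are illustrative names.

```python
import pandas as pd

base = pd.DataFrame({
    'email': ['CoreyMSchafer@gmail.com', 'JaneDoe@email.com'],
    'full_name': ['Corey Schafer', 'Jane Doe'],
    'first': ['Corey', 'Jane'],
    'last': ['Schafer', 'Doe'],
})

new_user = ['TonyMontana@email.com', 'Tony Montana', 'Tony', 'Montana']
new_user_dict = dict(zip(base.columns, new_user))

# Wrap the dict in a list to get a one-row DataFrame, then concatenate.
new_row = pd.DataFrame([new_user_dict])
result = pd.concat([base, new_row], ignore_index=True)
print(result)

# A partial row works the same way; the columns it lacks become NaN.
partial = pd.concat([base, pd.DataFrame([{'first': 'Tony'}])], ignore_index=True)
print(partial)
```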


If we have a second DataFrame that we'd like to append to our existing DataFrame, we can do that as well:

In [38]:
# Creating a new DataFrame
people = {
    'first': ['Tony', 'Steve'],
    'last': ['Stark', 'Rogers'],
    'email': ['IronMan@avenge.com', 'Cap@avenge.com']
}
df2 = pd.DataFrame(people)
df2

Unnamed: 0,first,last,email
0,Tony,Stark,IronMan@avenge.com
1,Steve,Rogers,Cap@avenge.com


In [39]:
# Add df2 to df
df.append(df2, ignore_index=True)

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,,Tony,Stark
4,Cap@avenge.com,,Steve,Rogers


In [40]:
# append has no inplace argument, so we need to assign the result back to our DataFrame.

df = df.append(df2, ignore_index=True)
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,,Tony,Stark
4,Cap@avenge.com,,Steve,Rogers
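
Since `DataFrame.append` is unavailable from pandas 2.0 onward, the same combination can be sketched with `pd.concat`, which also returns a new DataFrame rather than modifying one in place. The `left` and `right` DataFrames are cut-down stand-ins for `df` and `df2`.

```python
import pandas as pd

left = pd.DataFrame({
    'email': ['CoreyMSchafer@gmail.com'],
    'full_name': ['Corey Schafer'],
    'first': ['Corey'],
    'last': ['Schafer'],
})
right = pd.DataFrame({
    'first': ['Tony', 'Steve'],
    'last': ['Stark', 'Rogers'],
    'email': ['IronMan@avenge.com', 'Cap@avenge.com'],
})

# Columns are matched by name, not position.
# full_name does not exist in `right`, so it is NaN for those rows.
combined = pd.concat([left, right], ignore_index=True)
print(combined)
```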


In [41]:
# Personal note: we could instead filter for the rows that don't have a full name and fill in only those.
# Something like:

'''
filt = df['full_name'].isna()
df.loc[filt, 'full_name'] = df['first'] + ' ' + df['last']
'''

df['full_name'] = df['first'] + ' ' + df['last']
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,Tony Stark,Tony,Stark
4,Cap@avenge.com,Steve Rogers,Steve,Rogers


# Removing rows

In [42]:
# Remove Steve Rogers from our DataFrame.
df.drop(index=4)

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,Tony Stark,Tony,Stark


In [43]:
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,Tony Stark,Tony,Stark
4,Cap@avenge.com,Steve Rogers,Steve,Rogers


In [44]:
# Applying changes with inplace argument

df.drop(index=4, inplace=True)
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
1,JaneDoe@email.com,Jane Doe,Jane,Doe
2,JoeDoe@email.com,Joe Doe,Joe,Doe
3,IronMan@avenge.com,Tony Stark,Tony,Stark


In [45]:
# Delete people with Doe as last name
filt = df['last'] == 'Doe'
df.drop(index=df[filt].index, inplace=True)
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
3,IronMan@avenge.com,Tony Stark,Tony,Stark
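
An alternative sketch of the same removal keeps the rows that do not match the filter instead of dropping the ones that do. The `~` operator inverts a boolean mask. `demo` is a stand-in DataFrame, not part of the notebook.

```python
import pandas as pd

demo = pd.DataFrame({
    'first': ['Corey', 'Jane', 'Joe', 'Tony'],
    'last': ['Schafer', 'Doe', 'Doe', 'Stark'],
})

filt = demo['last'] == 'Doe'

# Option 1: drop the index labels of the matching rows.
dropped = demo.drop(index=demo[filt].index)

# Option 2: keep only the rows where the filter is False.
kept = demo[~filt]

print(dropped.equals(kept))  # True
```

Both options return a new DataFrame and keep the original index labels, so the surviving rows are still labelled 0 and 3.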


# My exercise:

In [46]:
people = {
    'first': ['Elizabeth', 'Jonathan'],
    'last': ['Mendez', 'Sosa'],
    'email': ['EliMendez@email.com', 'JonathanSosa@email.com']
}

df3 = pd.DataFrame(people)
df3

Unnamed: 0,first,last,email
0,Elizabeth,Mendez,EliMendez@email.com
1,Jonathan,Sosa,JonathanSosa@email.com


In [49]:
# Adding df3 to df. Remember: append has no inplace argument, so we reassign the result.

df = df.append(df3, ignore_index=False)
df

Unnamed: 0,email,full_name,first,last
0,CoreyMSchafer@gmail.com,Corey Schafer,Corey,Schafer
3,IronMan@avenge.com,Tony Stark,Tony,Stark
0,EliMendez@email.com,,Elizabeth,Mendez
1,JonathanSosa@email.com,,Jonathan,Sosa
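
Because `ignore_index=False` keeps each DataFrame's original labels, the result above has two rows labelled `0`. The sketch below shows what that means for label lookups and how `reset_index(drop=True)` renumbers the rows. The data is a cut-down stand-in (`a`, `b`, `stacked`, and `renumbered` are illustrative names), and it uses `pd.concat` so that it also runs on pandas versions without `append`.

```python
import pandas as pd

a = pd.DataFrame({'first': ['Corey', 'Tony']}, index=[0, 3])
b = pd.DataFrame({'first': ['Elizabeth', 'Jonathan']})

# Original labels are kept: 0, 3, 0, 1
stacked = pd.concat([a, b])
print(stacked.index.is_unique)  # False
print(stacked.loc[0])           # two rows share the label 0

# Discard the old labels and renumber from 0.
renumbered = stacked.reset_index(drop=True)
print(renumbered.index.is_unique)  # True
```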


In [58]:
filt = df['full_name'].isnull()
df.loc[filt]

Unnamed: 0,email,full_name,first,last
0,EliMendez@email.com,,Elizabeth,Mendez
1,JonathanSosa@email.com,,Jonathan,Sosa


In [60]:
df['full_name'].fillna(df['first'] + ' ' + df['last'])

0       Corey Schafer
3          Tony Stark
0    Elizabeth Mendez
1       Jonathan Sosa
Name: full_name, dtype: object
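
Note that `fillna` returns a new Series, so the cell above displays the filled values without storing them in `df`. A minimal sketch of writing them back, on a stand-in DataFrame (`demo`, an illustrative name) with a unique index:

```python
import pandas as pd

demo = pd.DataFrame({
    'full_name': ['Corey Schafer', None],
    'first': ['Corey', 'Elizabeth'],
    'last': ['Schafer', 'Mendez'],
})

# fillna only touches missing entries; existing names are left alone.
demo['full_name'] = demo['full_name'].fillna(demo['first'] + ' ' + demo['last'])
print(demo)
```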