---
# Add/Remove Rows and Columns
Adding and removing rows and columns from DataFrames

---

In [None]:
import pandas as pd
import numpy as np
from IPython.display import display

In [None]:
people = {
    "first": ["Lorem", "John", "Jane"],
    "last": ["Ipsum", "Doe", "Doe"],
    "email": ["lorem@yahoo.com", "john@gmail.com", "jane@outlook.com"],
}
df = pd.DataFrame(people)
display(df)

---
## Adding New Columns
You can add a new column to an existing DataFrame by indexing the DataFrame with the new column's name and assigning a Series to it.

---

In [None]:
# Add a new column that combines the first and last name

display(df)
full_name_series = df["first"] + " " + df["last"]
display(full_name_series)

# Create a new column named "full_name"
df["full_name"] = full_name_series
display(df)
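
---
Assignment by indexing modifies the DataFrame in place and appends the column at the end. As a hedged sketch of two alternatives that pandas also provides: `assign()` returns a new DataFrame, and `insert()` places the column at a chosen position. The names `people_df`, `with_full_name`, and `positioned` are illustrative and not part of the notebook above.

```python
import pandas as pd

# Illustrative data, a trimmed copy of the notebook's example
people_df = pd.DataFrame(
    {
        "first": ["Lorem", "John", "Jane"],
        "last": ["Ipsum", "Doe", "Doe"],
    }
)

full_name = people_df["first"] + " " + people_df["last"]

# assign() returns a new DataFrame; people_df itself is left unchanged
with_full_name = people_df.assign(full_name=full_name)

# insert() modifies the DataFrame in place and puts the column at position 0
positioned = people_df.copy()
positioned.insert(0, "full_name", full_name)
```

`assign()` is convenient in method chains, while `insert()` is useful when column order matters.

---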

---
## Removing Columns
Columns are removed with the `drop()` method.  
`drop()` removes a Series (row or column, determined by the **axis** parameter) whose label(s) are passed to the **labels** parameter.  

**axis** can be set to 0, "rows", or "index" to remove a row Series, and to 1 or "columns" to remove a column Series. **axis** defaults to 0 (row Series).  

Columns can also be dropped directly by passing the column label(s) in an iterable to the **columns** parameter. This removes the need to specify the **axis** parameter.  

**Note:** `drop()` returns a new DataFrame and does not modify the original df. To modify the original, set the **inplace** parameter to True or assign the result back to df.  

---

In [None]:
# Removing last name using labels and axis parameters.
# Note that we are not applying this change (inplace=False)

display(df)
a = df.drop(labels="last", axis=1)
display(a)


In [None]:
# Removing index 0 and 1 (Lorem and John) using labels
# Note that we are not applying this change (inplace=False)

display(df)
a = df.drop(labels=[0, 1])
display(a)

In [None]:
# Removing first and last names using columns parameter
# We are applying this change (inplace=True)

display(df)
df.drop(columns=["first", "last"], inplace=True)
display(df)
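
---
The sketch below compares the different ways of calling `drop()` on a throwaway DataFrame. It also uses the **index** parameter and `errors="ignore"`, which exist in pandas but are not covered above. The names `demo`, `by_axis`, `by_columns`, `by_index`, and `unchanged` are illustrative.

```python
import pandas as pd

demo = pd.DataFrame(
    {
        "first": ["Lorem", "John", "Jane"],
        "last": ["Ipsum", "Doe", "Doe"],
        "email": ["lorem@yahoo.com", "john@gmail.com", "jane@outlook.com"],
    }
)

# labels + axis and the columns parameter give the same result
by_axis = demo.drop(labels="last", axis="columns")
by_columns = demo.drop(columns=["last"])
same_result = by_axis.equals(by_columns)

# The index parameter is the row counterpart of the columns parameter
by_index = demo.drop(index=[0, 1])
remaining_rows = by_index.index.tolist()

# errors="ignore" skips labels that do not exist instead of raising KeyError
unchanged = demo.drop(columns=["missing"], errors="ignore")
```

None of these calls modify `demo`, because **inplace** is left at its default of False.

---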

---
**Adding the first and last name columns back**  

We are going to restore the first and last name columns in the original df by splitting the full_name column with the `str.split()` method.  

If the **expand** parameter is set to True, `str.split()` places the split values into their own columns of a DataFrame. If **expand** is False (the default), the split values are stored as a list in each element of a Series.

---

In [None]:
# Create a DataFrame of 2 columns from the full_name column

# With expand=False (default)

display(df)
split_df = df["full_name"].str.split(" ")
display(split_df)

# We do not want this behavior. We want a DataFrame.

In [None]:
display(df)

# With expand=True
split_df = df["full_name"].str.split(" ", expand=True)
display(split_df)

# Assign the resulting df to the original df
df[["first", "last"]] = split_df
display(df)
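
---
Assigning the split result to two columns works above because every name has exactly two parts. As a hedged sketch of what happens otherwise, the example below splits a name with three parts and then uses the **n** parameter of `str.split()` to cap the number of splits. The Series `names` and its values are made up for illustration.

```python
import pandas as pd

names = pd.Series(["Lorem Ipsum", "Mary Ann Smith"], name="full_name")

# Without a limit, the longest name decides the column count (3 here);
# shorter names are padded with missing values
all_parts = names.str.split(" ", expand=True)

# n=1 splits on the first space only, so there are always at most 2 columns
two_parts = names.str.split(" ", n=1, expand=True)
two_parts.columns = ["first", "last"]
```

With `n=1`, everything after the first space ends up in the second column, which may or may not be the split you want for real names.

---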

---
## Adding Rows
To add rows, or to combine Series and DataFrames in general, we can use the `concat()` function. `concat()` is explained in detail in **pandas_01a**. 

**Note:** `concat()` returns a new object and does not modify the original df. It has no inplace parameter, so assign the result to a variable (or back to df) to keep it.  

---

In [36]:
# Add new row

display(df)
# The new row is a one-row DataFrame whose column labels match those of df
new_row = pd.DataFrame(
    {
        "first": ["Ken"],
        "last": ["Adams"],
        "email": ["adams@gmail.com"],
    }
)
display(new_row)

# full_name is not in new_row, so it is NaN in the added row
df2 = pd.concat([df, new_row], ignore_index=True)
df2

Unnamed: 0,email,full_name,first,last
0,lorem@yahoo.com,Lorem Ipsum,Lorem,Ipsum
1,john@gmail.com,John Doe,John,Doe
2,jane@outlook.com,Jane Doe,Jane,Doe


Unnamed: 0,first,last,email
0,Ken,Adams,adams@gmail.com

Unnamed: 0,email,full_name,first,last
0,lorem@yahoo.com,Lorem Ipsum,Lorem,Ipsum
1,john@gmail.com,John Doe,John,Doe
2,jane@outlook.com,Jane Doe,Jane,Doe
3,adams@gmail.com,,Ken,Adams


In [None]:
# data = [{'col_1': 3},
#         {'col_1': 2, 'col_2': 'b'},
#         {'col_1': 1, 'col_2': 'c'},
#         {'col_1': 0, 'col_2': 'd'}]
# a = pd.DataFrame.from_records(data)
# b = pd.DataFrame(data)
# display(a)
# display(b)

In [43]:
df7 = pd.DataFrame({'a': 1, 'b': 2}, index=[0])
df7

Unnamed: 0,a,b
0,1,2


In [67]:
# Working example: turn the Series into a one-row DataFrame, then concat.
# (This is convoluted. Why not make an append() substitute?)
new_row = pd.Series({'a': 3, 'b': 4})
new_row = new_row.to_frame().T
display(new_row)
xx = pd.concat([df7, new_row], ignore_index=True)
xx

Unnamed: 0,a,b
0,3,4


Unnamed: 0,a,b
0,1,2
1,3,4
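
---
One possible answer to the question in the comment above is a small wrapper around `concat()`. The sketch below assumes the new row arrives as a dict of column label to value. `append_row` is a hypothetical helper, not a pandas function (`DataFrame.append()` was removed in pandas 2.0).

```python
import pandas as pd


def append_row(frame: pd.DataFrame, row: dict) -> pd.DataFrame:
    """Hypothetical helper: return a new DataFrame with `row` added as the last row."""
    # A list holding one dict becomes a one-row DataFrame keyed by column label
    new_row = pd.DataFrame([row])
    # ignore_index=True renumbers the rows 0..n-1 in the result
    return pd.concat([frame, new_row], ignore_index=True)


base = pd.DataFrame({"a": [1], "b": [2]})
extended = append_row(base, {"a": 3, "b": 4})
```

As with `concat()` itself, `base` is left unchanged. Keys missing from the dict produce NaN in the new row.

---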
