# Video: Column Management with Data Frames

This video shows how to add, remove, and otherwise change the columns of a data frame.


## Column Management with Data Frames
* Changing the columns of a data frame works much like changing key/value pairs in a dictionary.
  * Key = column name
  * Value = pandas Series
* Adding new columns will not change index values.

Script:
* Adding and removing columns from a data frame is pretty straightforward.
* If you think of the data frame as a dictionary with series values, then you have most of what you need.
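
The dictionary mental model can be sketched as follows. This is a minimal illustration, not part of the video; the name `df_demo` is a placeholder chosen here so it does not clash with the `df` used in the examples.

```python
import pandas as pd

# Build a data frame from a dictionary whose values are series.
columns = {
    "a": pd.Series([3, 4, 5]),
    "b": pd.Series([1, 2, 3]),
}
df_demo = pd.DataFrame(columns)

# Dictionary-style operations work on the data frame's columns.
print(list(df_demo.keys()))   # column names act like keys
print("a" in df_demo)         # membership tests check column names
print(type(df_demo["a"]))     # looking up a key returns a pandas Series
```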

## Code Example: Adding Data Frame Columns

In [None]:
import pandas as pd

In [None]:
df = pd.DataFrame({"a": [3, 4, 5]})
df

Unnamed: 0,a
0,3
1,4
2,5


Script:
* I will start with a small data frame so you can see the changes easily.

In [None]:
df["b"] = pd.Series([1, 2, 3])
df

Unnamed: 0,a,b
0,3,1
1,4,2
2,5,3


Script:
* You can assign a series to a data frame column, and pandas lines the values up with the existing rows.

In [None]:
df["c"] = pd.Series([2, 4, 6, 8, 10])
df

Unnamed: 0,a,b,c
0,3,1,2
1,4,2,4
2,5,3,6


Script:
* If you assign a series with extra values, the extra ones are dropped.

In [None]:
df["d"] = pd.Series([6, 7, 8, 9, 10], index=[0, 2, 4, 6, 8])
df

Unnamed: 0,a,b,c,d
0,3,1,2,6.0
1,4,2,4,
2,5,3,6,7.0


Script:
* More specifically, only index values that are already in the data frame are kept.
* This time, I used every other index value in the added series, so the data frame has NaN (not a number) wherever an index value was skipped.
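
To make the matching rule concrete, here is a small sketch showing that values are placed by index label rather than by position. The names `df_demo`, `s`, `by_label`, and `by_position` are made up for this illustration. Converting the series to a NumPy array is one way to drop its index when positional assignment is what you want.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [3, 4, 5]})

# The series index is in reverse order, so values are matched by label.
s = pd.Series([10, 20, 30], index=[2, 1, 0])
df_demo["by_label"] = s

# A NumPy array has no index, so its values are placed in order.
# The array length must equal the number of rows in the data frame.
df_demo["by_position"] = s.to_numpy()

print(df_demo)
```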

In [None]:
df2 = pd.DataFrame({"e2": [1, 2, 3, 4, 5]})
df["e"] = df2["e2"]
df

Unnamed: 0,a,b,c,d,e
0,3,1,2,6.0,1
1,4,2,4,,2
2,5,3,6,7.0,3


Script:
* Most of the time, you will be adding columns of one data frame to another data frame.
* The same index value behavior applies.
* Often, you will be working with a lot of data frames that share the same index, so they will all match up.
* But other times, pandas will take care of the matching automatically for you.


## Think About Your Index Values when Adding Columns
* It is convenient for pandas to match up data by index value.
* But the index value may not be meaningful if it was automatically assigned.
* Mixing up filtered data frames and independently created series may not behave as expected.
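
The following sketch shows one way the surprise can happen, assuming a data frame with an automatically assigned index. The names `df_demo`, `evens`, and `labels` are hypothetical. Resetting the index is only one possible fix, and whether it is appropriate depends on whether the original index values carry meaning.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [3, 4, 5, 6]})

# Filtering keeps the original index labels (here 1 and 3).
evens = df_demo[df_demo["a"] % 2 == 0].copy()

# An independently created series gets a fresh index starting at 0.
labels = pd.Series(["first", "second"])

# Matching is by label: label 1 gets "second", label 3 matches nothing.
evens["label"] = labels
print(evens)

# Resetting the index first makes the rows line up by position.
evens_reset = evens.drop(columns="label").reset_index(drop=True)
evens_reset["label"] = labels
print(evens_reset)
```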

## Code Example: Removing Data Frame Columns

Script:
* The easiest way to remove a data frame column is to delete it with the del statement.

In [None]:
del df["b"]
df

Unnamed: 0,a,c,d,e
0,3,2,6.0,1
1,4,4,,2
2,5,6,7.0,3


Script:
* That is the same syntax as deleting a key from a normal Python dictionary.
* There is also a data frame method, drop, which returns a copy of the data frame with rows or columns removed.
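
Continuing the dictionary analogy, data frames also have a pop method. It is not covered in the video, so treat this as a side note: the sketch below uses a placeholder frame named `df_demo` to show that pop removes a column and hands it back as a series.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [3, 4, 5], "b": [1, 2, 3]})

# pop removes the column from the data frame and returns it as a series,
# much like dict.pop removes a key and returns its value.
b = df_demo.pop("b")

print(list(df_demo.columns))
print(b.tolist())
```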

In [None]:
df.drop("c", axis=1)

Unnamed: 0,a,d,e
0,3,6.0,1
1,4,,2
2,5,7.0,3


Script:
* The drop method's default behavior is to return a new data frame and not change the original data frame.

In [None]:
df

Unnamed: 0,a,c,d,e
0,3,2,6.0,1
1,4,4,,2
2,5,6,7.0,3
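
The same copy-returning behavior can be sketched on a separate frame. This example uses the `columns` keyword, an alternative to passing `axis=1`, and removes two columns at once. The names `df_demo` and `trimmed` are placeholders for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [3, 4, 5], "c": [2, 4, 6], "d": [6.0, None, 7.0]})

# columns= is an alternative to axis=1 and accepts a list of names.
trimmed = df_demo.drop(columns=["c", "d"])

print(list(trimmed.columns))   # columns in the returned copy
print(list(df_demo.columns))   # the original data frame is unchanged
```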


Script:
* If you want the drop method to change the data frame, then you can use the inplace option.

In [None]:
df.drop("c", axis=1, inplace=True)
df

Unnamed: 0,a,d,e
0,3,6.0,1
1,4,,2
2,5,7.0,3


Script:
* Many data frame methods have this inplace option if you want to update the original data frame instead of returning a new data frame or series.
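
As a rough comparison, the sketch below shows that drop returns None when inplace=True is used, and that reassigning the returned data frame gives the same result without the option. The names `df_demo`, `df_other`, and `result` are made up for this example.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [3, 4, 5], "c": [2, 4, 6]})

# With inplace=True the method changes df_demo and returns None.
result = df_demo.drop("c", axis=1, inplace=True)
print(result)
print(list(df_demo.columns))

# Without inplace, reassigning the returned data frame has the same effect.
df_other = pd.DataFrame({"a": [3, 4, 5], "c": [2, 4, 6]})
df_other = df_other.drop("c", axis=1)
print(list(df_other.columns))
```

Reassignment keeps each step explicit, which can make longer chains of operations easier to follow.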

## Column Management with Data Frames

TL;DR: pretend the data frame is a dictionary of series.


Script:
* Now you know how to add and remove columns from data frames.
* This should be pretty easy if you keep the mental model that a data frame acts like a dictionary of series, one per column.
