# Setting Values In A DataFrame
## Notebook Outline:

* <a href='#AddNewColumns'>Adding New Columns To A DataFrame</a>
* <a href='#DroppingColumns'>Dropping Columns From DataFrames</a>
* <a href='#UpdatingColumnValues'>Updating Column Values</a>
* <a href='#SettingSpecificValues'>Setting Specific Values</a>

# How to use this Notebook

The best way to use this notebook is to follow along with the lecture and then apply what you learn to your own data files, or (if you do not have any data of your own) to practice using these functions and methods on the provided data. A little practice goes a long way toward understanding and retaining the material! It would be easy to just skim this notebook, but you will learn more by doing!

<a name='AddNewColumns'></a>
# Adding New Columns To A DataFrame
Adding new columns to a DataFrame is straightforward. We just assign values to a column label that does not exist yet, using either bracket notation or `.loc`.
We need a dataset to practice on, so let's load the Philadelphia Airport Weather Dataset.

In [1]:
# In this cell we import pandas and load the datafile.
import pandas as pd
import os

filepath = os.path.join(os.getcwd(), 'data', 'Philadelphia_Pennsylvania_USA/724080-13739-2001')

headers = ['Year', 'Month', 'Day', 'Hour', 'Air Temp', 'Dew Point Temp', 'Sea Level Pressure',
           'Wind Direction', 'Wind Speed Rate',
           'Sky Condition Total Coverage Code',
           'Liquid Precipitation Depth Dimension - 1Hr Duration',
           'Liquid Precipitation Depth Dimension - Six Hour Duration']
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas).
weatherData = pd.read_csv(filepath, sep=r'\s+',
                          names=headers)

In [None]:
weatherData.head()

#### Creating a copy of the Air Temp Column.
To do this, we will use the `.loc` indexer to select a _new_ column label and set it equal to the 'Air Temp' column. Notice the new column at the end of the DataFrame after we complete this operation.

In [None]:
weatherData.loc[:, 'Air Temp Copy'] = weatherData['Air Temp'] 
weatherData.head()
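
The two ways of adding a column can be compared side by side. This is a minimal sketch on a small made-up DataFrame (the values and the 'Air Temp Copy 2' label are illustrative assumptions, not part of the weather file), so it runs without the data file.

```python
import pandas as pd

# Toy data standing in for the weather file (values are made up).
df = pd.DataFrame({'Air Temp': [-6, -11, -17]})

# Bracket notation: assigning to a label that does not exist yet creates the column.
df['Air Temp Copy'] = df['Air Temp']

# .loc with a full row slice also creates the column when the label is new.
df.loc[:, 'Air Temp Copy 2'] = df['Air Temp']

# New columns are appended at the end, in the order they were created.
print(df.columns.tolist())
```

Either form works for a brand-new label; bracket notation is simply shorter.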

<a name=DroppingColumns></a>
# Dropping Columns From a DataFrame
We can use the `.drop` method to drop columns from the DataFrame. We just need to specify that we are dropping a label from axis 1. The `inplace` keyword argument lets us update the DataFrame itself, rather than just outputting a new DataFrame with the column dropped.

In [5]:
weatherData.drop('Air Temp Copy', axis=1, inplace=True)

In [6]:
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration
0,2001,1,1,0,-6,-94,10146,280,57,2,0,-9999
1,2001,1,1,1,-11,-94,10153,280,57,4,0,-9999
2,2001,1,1,2,-17,-106,10161,290,62,2,0,-9999
3,2001,1,1,3,-28,-100,10169,260,57,0,0,-9999
4,2001,1,1,4,-28,-100,10177,260,52,0,0,-9999
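
To see what `inplace` changes, here is a small sketch on a made-up two-column DataFrame (toy values, assumed for illustration). Without `inplace=True`, `.drop` returns a new DataFrame and leaves the original alone; the `columns=` keyword is another way to say "drop this label from axis 1".

```python
import pandas as pd

# Toy data (made up) with a column we want to remove.
df = pd.DataFrame({'Air Temp': [-6, -11], 'Air Temp Copy': [-6, -11]})

# Without inplace=True, drop returns a new DataFrame; df itself is unchanged.
trimmed = df.drop('Air Temp Copy', axis=1)

# columns=... is an equivalent spelling of axis=1.
same = df.drop(columns=['Air Temp Copy'])

print(df.columns.tolist())       # original still has both columns
print(trimmed.columns.tolist())  # the copy has only 'Air Temp'
```

Assigning the result to a name, as with `trimmed` above, keeps the original available in case we drop the wrong column.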


<a name=UpdatingColumnValues></a>
# Updating Values of a DataFrame Column

#### Updating the Air Temp Column
This data actually needs to be divided by 10 to be put in the proper decimal notation. That is, the -6 value is really -0.6 degrees Celsius. This is described in the data documentation, which can be found here: <ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite/isd-lite-format.pdf>

So, let's go ahead and update that column by dividing all the values by 10! We can do this with the same arithmetic operators that we covered in the lecture on mathematical operators.

In [7]:
weatherData.loc[:, 'Air Temp'] = weatherData['Air Temp']/10
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,-9999
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,-9999
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,-9999
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,-9999
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,-9999
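
Dividing an integer column by 10 changes its dtype to float. The sketch below, on made-up toy values rather than the weather file, shows that change using plain bracket assignment, which replaces the whole column. As an assumption worth checking against your installed version, recent pandas releases may warn when `.loc[:, ...]` is used to write floats into an integer column.

```python
import pandas as pd

# Toy data (made up): temperatures stored as integers in tenths of a degree.
df = pd.DataFrame({'Air Temp': [-6, -11, -17]})
was_integer = pd.api.types.is_integer_dtype(df['Air Temp'])

# Bracket assignment replaces the column, so the dtype is free to change.
df['Air Temp'] = df['Air Temp'] / 10

print(was_integer)                                  # integer before the division
print(pd.api.types.is_float_dtype(df['Air Temp']))  # float after the division
print(df['Air Temp'].tolist())
```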


#### Converting the Air Temp from Celsius to Fahrenheit - in a _new_ column
Now we will convert the Celsius air temp values to Fahrenheit values in a new column of data.

In [None]:
weatherData.loc[:, 'Air Temp F'] = weatherData['Air Temp'] * 1.8 + 32
weatherData.head()
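
As a quick check of the conversion formula, here is a sketch that applies it to a few values we can verify by hand. The helper name `celsius_to_fahrenheit` is hypothetical (it is not part of pandas or of this notebook), and the sample temperatures are chosen only for illustration.

```python
import pandas as pd

def celsius_to_fahrenheit(celsius):
    """Hypothetical helper: works on a single number or a whole pandas Series."""
    return celsius * 1.8 + 32

# -0.6 C is the first Air Temp value above; 0 C and 100 C are easy to check by hand.
temps_c = pd.Series([-0.6, 0.0, 100.0])
temps_f = celsius_to_fahrenheit(temps_c)

print(temps_f.round(2).tolist())
```

Because the arithmetic is applied element by element, the same expression works on an entire column at once, with no loop needed.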

<a name=SettingSpecificValues></a>
# Setting Specific Values in a DataFrame

#### Changing the -9999 values to a Null value that Pandas will recognize.
We can use `.loc` with a condition to set specific values in a column. For example, let's change the -9999 values in the 'Liquid Precipitation Depth Dimension - Six Hour Duration' column to None, which pandas stores as a null (NaN) value.

In [8]:
weatherData.loc[weatherData['Liquid Precipitation Depth Dimension - Six Hour Duration'] == -9999,
                'Liquid Precipitation Depth Dimension - Six Hour Duration'] = None
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,
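
The same sentinel-to-null idea can be tried on a small example. This sketch uses a made-up DataFrame and a shortened, hypothetical column name ('Precip 6Hr') so it runs without the data file, and it uses `Series.replace` as an alternative to the boolean-mask assignment, not as the notebook's own method.

```python
import numpy as np
import pandas as pd

# Toy data (made up); 'Precip 6Hr' is a shortened, hypothetical column name.
col = 'Precip 6Hr'
df = pd.DataFrame({col: [-9999, 5, -9999, 0]})

# Replace the -9999 sentinel with NaN, the null value pandas recognizes.
df[col] = df[col].replace(-9999, np.nan)

# The integer column becomes float, because NaN is a floating-point value.
print(df[col].dtype)
print(df[col].isnull().sum())
```

If we know the sentinel value before loading, `pd.read_csv` also accepts an `na_values` argument that marks such values as null at read time.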


#### Run .info() again to see how many non-null values there are now in 'Liquid Precipitation Depth Dimension - Six Hour Duration'

In [9]:
weatherData.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8758 entries, 0 to 8757
Data columns (total 12 columns):
Year                                                        8758 non-null int64
Month                                                       8758 non-null int64
Day                                                         8758 non-null int64
Hour                                                        8758 non-null int64
Air Temp                                                    8758 non-null float64
Dew Point Temp                                              8758 non-null int64
Sea Level Pressure                                          8758 non-null int64
Wind Direction                                              8758 non-null int64
Wind Speed Rate                                             8758 non-null int64
Sky Condition Total Coverage Code                           8758 non-null int64
Liquid Precipitation Depth Dimension - 1Hr Duration         8758 non-null int64
Liquid Prec

In [10]:
weatherData.loc[weatherData['Liquid Precipitation Depth Dimension - Six Hour Duration'].isnull(), :]

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,
5,2001,1,1,5,-4.4,-100,10182,250,52,0,0,
6,2001,1,1,6,-4.4,-100,10188,260,62,0,0,
7,2001,1,1,7,-4.4,-94,10193,250,62,7,0,
8,2001,1,1,8,-5.6,-100,10198,250,46,0,0,
9,2001,1,1,9,-5.6,-94,10197,260,52,0,0,
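
When we only need the counts rather than the rows themselves, the null check can be summed per column. This is a minimal sketch on made-up values, again with the shortened, hypothetical 'Precip 6Hr' column name.

```python
import numpy as np
import pandas as pd

# Toy data (made up) with two nulls in the precipitation column.
df = pd.DataFrame({'Air Temp': [-0.6, -1.1, -1.7],
                   'Precip 6Hr': [np.nan, 3.0, np.nan]})

# isnull() gives True/False per cell; summing counts the True values per column.
null_counts = df.isnull().sum()

# count() reports the non-null values per column, like the numbers shown by .info().
non_null_counts = df.count()

print(null_counts)
print(non_null_counts)
```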


## In Class Exercise
Please create a cell below and practice creating new columns and updating values in existing columns.

# Lesson Summary:
In this lesson you learned:
* How to create new columns in a DataFrame.
* How to drop columns from a DataFrame.
* How to update the values of an entire column.
* How to update specific values in a DataFrame.

## Questions or Comments About This Notebook?
Feel free to contact me via my LinkedIn: https://www.linkedin.com/in/william-j-henry <br>
You can also email me at will@henryanalytics.com <br>