# Setting Values In A DataFrame
## Notebook Outline:

* [Adding New Columns To A DataFrame](#AddNewColumns)
* [Updating Column Values](#UpdatingColumnValues)
* [Setting Specific Values](#SettingSpecificValues)

# How to use this Notebook

The best way to use this notebook is to follow along with the lecture and then apply what you learn to your own data files, or (if you do not have any data of your own) to practice using these functions and methods on the provided data. A little practice goes a long way towards understanding and retaining the material! It would be easy to just skim this notebook, but you will learn more by doing!

<a name='AddNewColumns'></a>
# Adding New Columns To A DataFrame
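Adding new columns to a DataFrame is straightforward: assign values to a column name that does not exist yet, using either plain bracket notation or .loc (.iloc cannot create a new column, because it only selects existing positions).
We need a dataset to practice on, so let's load the Philadelphia Airport Weather Dataset.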

In [1]:
# In this cell we import pandas and load the datafile.
import pandas as pd
filepath = ('/Users/yuzhang/Dropbox/Academia/Lecturer/I&C_SCI_X426.62/724080-13739-2001')
headers = ['Year', 'Month', 'Day', 'Hour', 'Air Temp', 'Dew Point Temp', 'Sea Level Pressure',
           'Wind Direction', 'Wind Speed Rate',
           'Sky Condition Total Coverage Code',
           'Liquid Precipitation Depth Dimension - 1Hr Duration',
           'Liquid Precipitation Depth Dimension - Six Hour Duration']
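# sep=r'\s+' splits fields on runs of whitespace
# (delim_whitespace=True is deprecated in recent pandas releases).
weatherData = pd.read_csv(filepath, sep=r'\s+',
                          names=headers)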

In [2]:
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration
0,2001,1,1,0,-6,-94,10146,280,57,2,0,-9999
1,2001,1,1,1,-11,-94,10153,280,57,4,0,-9999
2,2001,1,1,2,-17,-106,10161,290,62,2,0,-9999
3,2001,1,1,3,-28,-100,10169,260,57,0,0,-9999
4,2001,1,1,4,-28,-100,10177,260,52,0,0,-9999
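
Because the file path above is specific to one machine, here is a minimal sketch of the same loading step on an in-memory sample. The `sample_text` string is a hypothetical stand-in for the data file, reconstructed from the first three rows of the head() output above; `io.StringIO` simply lets `pd.read_csv` treat that string like a file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the data file: three rows taken from the
# head() output above, written as whitespace-separated text.
sample_text = """\
2001 1 1 0 -6 -94 10146 280 57 2 0 -9999
2001 1 1 1 -11 -94 10153 280 57 4 0 -9999
2001 1 1 2 -17 -106 10161 290 62 2 0 -9999
"""

headers = ['Year', 'Month', 'Day', 'Hour', 'Air Temp', 'Dew Point Temp',
           'Sea Level Pressure', 'Wind Direction', 'Wind Speed Rate',
           'Sky Condition Total Coverage Code',
           'Liquid Precipitation Depth Dimension - 1Hr Duration',
           'Liquid Precipitation Depth Dimension - Six Hour Duration']

# sep=r'\s+' splits on runs of whitespace (the same behavior as
# delim_whitespace=True); names= supplies the column labels because
# the file itself has no header row.
sample = pd.read_csv(io.StringIO(sample_text), sep=r'\s+', names=headers)
print(sample.shape)
```

Running this on the sample should give one row per line of text and one column per entry in `headers`.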


#### Creating a copy of the Air Temp Column.
To do this, we will use the .loc indexer to select a _new_ column and set it equal to the 'Air Temp' column. Notice the new column at the end of the DataFrame after we complete this operation.

In [3]:
weatherData.loc[:, 'Air Temp Copy'] = weatherData['Air Temp']
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration,Air Temp Copy
0,2001,1,1,0,-6,-94,10146,280,57,2,0,-9999,-6
1,2001,1,1,1,-11,-94,10153,280,57,4,0,-9999,-11
2,2001,1,1,2,-17,-106,10161,290,62,2,0,-9999,-17
3,2001,1,1,3,-28,-100,10169,260,57,0,0,-9999,-28
4,2001,1,1,4,-28,-100,10177,260,52,0,0,-9999,-28
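
As a small sketch of the two ways to add a column, the example below uses a hypothetical three-row DataFrame `df` (not the real weather data) and creates one copy with plain bracket notation and one with .loc. The column name 'Air Temp Copy 2' is made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of weatherData with a single column.
df = pd.DataFrame({'Air Temp': [-6, -11, -17]})

# Bracket notation: assigning to a column name that does not exist
# yet creates that column.
df['Air Temp Copy'] = df['Air Temp']

# .loc with a full-row slice creates a new column in the same way.
df.loc[:, 'Air Temp Copy 2'] = df['Air Temp']

# The copies are independent: replacing 'Air Temp' afterwards
# leaves them unchanged.
df['Air Temp'] = df['Air Temp'] / 10

print(df.columns.tolist())
print(df['Air Temp Copy'].tolist())
```

Both new columns land at the end of the DataFrame, and both keep the original integer values after 'Air Temp' is rescaled.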


<a name='UpdatingColumnValues'></a>
# Updating Values of a DataFrame Column

#### Updating the Air Temp Column
These values actually need to be divided by 10 to put them in proper decimal notation. That is, the -6 value is really -0.6 degrees Celsius. This is described in the data documentation, which can be found here: <ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-lite/isd-lite-format.pdf>

So, let's go ahead and update that column by dividing all the values by 10! We can do this with the same mathematical operators that we learned in the lecture on mathematical operators.

In [4]:
weatherData.loc[:, 'Air Temp'] = weatherData['Air Temp']/10
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration,Air Temp Copy
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,-9999,-6
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,-9999,-11
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,-9999,-17
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,-9999,-28
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,-9999,-28
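
One thing worth checking before rescaling is whether the column contains the -9999 missing-value code, since dividing it by 10 would turn it into a plausible-looking -999.9. The sketch below uses a hypothetical three-row DataFrame in which the last value is the missing code; whether 'Air Temp' in the real file contains any -9999 values is not shown in this notebook. It uses plain bracket assignment, which replaces the column outright, so the change from integers to floats needs no special handling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the last value is the -9999 missing-value code.
df = pd.DataFrame({'Air Temp': [-6, -11, -9999]})
print(df['Air Temp'].dtype)   # integer column before the update

# Replace the missing-value code with NaN first, then scale.
# NaN divided by 10 stays NaN, so the missing value is preserved.
df['Air Temp'] = df['Air Temp'].replace(-9999, np.nan) / 10

print(df['Air Temp'].dtype)   # float column after the update
print(df['Air Temp'].tolist())
```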


#### Converting the Air Temp from Celsius to Fahrenheit - in a _new_ column
Now we will convert the Celsius air temp values to Fahrenheit values in a new column of data.

In [5]:
weatherData.loc[:, 'Air Temp F'] = weatherData['Air Temp'] * 1.8 + 32
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration,Air Temp Copy,Air Temp F
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,-9999,-6,30.92
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,-9999,-11,30.02
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,-9999,-17,28.94
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,-9999,-28,26.96
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,-9999,-28,26.96
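
The conversion is F = C × 1.8 + 32. As a quick check, the sketch below wraps it in a hypothetical helper, `celsius_to_fahrenheit` (not part of pandas), and applies it to a small made-up DataFrame that includes the freezing and boiling points of water as sanity checks.

```python
import pandas as pd

def celsius_to_fahrenheit(celsius):
    """Convert Celsius to Fahrenheit; works on scalars and on pandas Series."""
    return celsius * 1.8 + 32

# Hypothetical sample: two values from the notebook output plus the
# freezing and boiling points of water.
df = pd.DataFrame({'Air Temp': [-0.6, -1.1, 0.0, 100.0]})

# Arithmetic on a Series is applied element by element.
df['Air Temp F'] = celsius_to_fahrenheit(df['Air Temp'])

print(df['Air Temp F'].round(2).tolist())
```

The first two results should match the 30.92 and 30.02 shown in the 'Air Temp F' column above, and the last two should come out at 32 and 212.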


<a name='SettingSpecificValues'></a>
# Setting Specific Values in a DataFrame

#### Changing the -9999 values to a Null value that Pandas will recognize.
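We can use the .loc notation to set specific values in a column. For example, let's change the -9999 values in the 'Liquid Precipitation Depth Dimension - Six Hour Duration' column to None. In a numeric column, pandas stores None as NaN, its standard missing-value marker, which is why those cells appear empty in the output below.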

In [6]:
weatherData.loc[weatherData['Liquid Precipitation Depth Dimension - Six Hour Duration'] == -9999,
                'Liquid Precipitation Depth Dimension - Six Hour Duration'] = None
weatherData.head()

Unnamed: 0,Year,Month,Day,Hour,Air Temp,Dew Point Temp,Sea Level Pressure,Wind Direction,Wind Speed Rate,Sky Condition Total Coverage Code,Liquid Precipitation Depth Dimension - 1Hr Duration,Liquid Precipitation Depth Dimension - Six Hour Duration,Air Temp Copy,Air Temp F
0,2001,1,1,0,-0.6,-94,10146,280,57,2,0,,-6,30.92
1,2001,1,1,1,-1.1,-94,10153,280,57,4,0,,-11,30.02
2,2001,1,1,2,-1.7,-106,10161,290,62,2,0,,-17,28.94
3,2001,1,1,3,-2.8,-100,10169,260,57,0,0,,-28,26.96
4,2001,1,1,4,-2.8,-100,10177,260,52,0,0,,-28,26.96
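
The same masked assignment can be sketched on a small, hypothetical DataFrame. The short column name 'Precip 6Hr' is made up to keep the example readable. Here the column is cast to float first so that it can hold NaN before the assignment, and `np.nan` is written explicitly instead of None; both choices are assumptions of this sketch, not steps taken in the notebook.

```python
import numpy as np
import pandas as pd

col = 'Precip 6Hr'  # hypothetical short column name
df = pd.DataFrame({col: [-9999, 0, 5, -9999]})

# Cast to float so the column can store NaN.
df[col] = df[col].astype('float64')

# Boolean mask: True wherever the missing-value code appears.
is_missing = df[col] == -9999

# .loc[rows, column] = value sets only the rows where the mask is True.
df.loc[is_missing, col] = np.nan

print(df[col].isna().sum())   # number of values now marked missing
print(df[col].tolist())
```

Rows where the mask is False keep their original values, so only the two sentinel entries change.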


#### Run .info() again to see how many non-null values remain in 'Liquid Precipitation Depth Dimension - Six Hour Duration'

In [7]:
weatherData.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8758 entries, 0 to 8757
Data columns (total 14 columns):
 #   Column                                                    Non-Null Count  Dtype  
---  ------                                                    --------------  -----  
 0   Year                                                      8758 non-null   int64  
 1   Month                                                     8758 non-null   int64  
 2   Day                                                       8758 non-null   int64  
 3   Hour                                                      8758 non-null   int64  
 4   Air Temp                                                  8758 non-null   float64
 5   Dew Point Temp                                            8758 non-null   int64  
 6   Sea Level Pressure                                        8758 non-null   int64  
 7   Wind Direction                                            8758 non-null   int64  
 8   Wind Speed Rate   
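
If only the counts are needed, `count()` and `isna().sum()` report the same information as .info() in a form that is easy to use in later code. The sketch below runs them on a hypothetical four-row DataFrame with made-up short column names, not on the real weather data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two missing precipitation values.
df = pd.DataFrame({'Air Temp': [-0.6, -1.1, -1.7, -2.8],
                   'Precip 6Hr': [np.nan, 0.0, 5.0, np.nan]})

# count() returns the number of non-null values in each column.
non_null = df.count()

# isna().sum() returns the number of missing values in each column.
missing = df.isna().sum()

print(non_null)
print(missing)
```

For every column, the two numbers should add up to the total number of rows.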

![](Success!.png)

# Lesson Summary:
In this lesson you learned:
* How to create new columns in a DataFrame.
* How to update the values of an entire column.
* How to update specific values in a DataFrame.