---

_You are currently looking at **version 1.0** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._

---

In [1]:
import pandas as pd
import numpy as np
from scipy.stats import ttest_ind

# Assignment 4 - Hypothesis Testing
This assignment requires more individual learning than previous assignments - you are encouraged to check out the [pandas documentation](http://pandas.pydata.org/pandas-docs/stable/) to find functions or methods you might not have used yet, or ask questions on [Stack Overflow](http://stackoverflow.com/) and tag them as pandas and python related. And of course, the discussion forums are open for interaction with your peers and the course staff.

Definitions:
* A _quarter_ is a specific three-month period: Q1 is January through March, Q2 is April through June, Q3 is July through September, and Q4 is October through December.
* A _recession_ is defined as starting with two consecutive quarters of GDP decline, and ending with two consecutive quarters of GDP growth.
* A _recession bottom_ is the quarter within a recession which had the lowest GDP.
* A _university town_ is a city which has a high percentage of university students compared to the total population of the city.

**Hypothesis**: Mean housing prices in university towns are less affected by recessions. Run a t-test comparing university towns with non-university towns on the ratio of the mean house price in the quarter before the recession starts to the mean house price at the recession bottom. (`price_ratio=quarter_before_recession/recession_bottom`)

The following data files are available for this assignment:
* From the [Zillow research data site](http://www.zillow.com/research/data/) there is housing data for the United States. In particular the datafile for [all homes at a city level](http://files.zillowstatic.com/research/public/City/City_Zhvi_AllHomes.csv), ```City_Zhvi_AllHomes.csv```, has median home sale prices at a fine-grained level.
* From the Wikipedia page on college towns is a list of [university towns in the United States](https://en.wikipedia.org/wiki/List_of_college_towns#College_towns_in_the_United_States) which has been copied and pasted into the file ```university_towns.txt```.
* From the Bureau of Economic Analysis, US Department of Commerce, the [GDP over time](http://www.bea.gov/national/index.htm#gdp) of the United States in current dollars (use the chained value in 2009 dollars), in quarterly intervals, in the file ```gdplev.xls```. For this assignment, only look at GDP data from the first quarter of 2000 onward.

Each function in this assignment below is worth 10%, with the exception of ```run_ttest()```, which is worth 50%.

In [2]:
# Use this dictionary to map two-letter acronyms to state names
states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National', 'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana', 'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho', 'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan', 'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi', 'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota', 'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut', 'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York', 'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado', 'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota', 'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia', 'ND': 'North Dakota', 'VA': 'Virginia'}

In [4]:
univ_towns = [["Mississippi","Cleveland"],["Mississippi","Hattiesburg"],["Mississippi","Itta Bena"],["Mississippi","Oxford"],["Mississippi","Starkville"],["Oklahoma","Ada"],["Oklahoma","Alva"],["Oklahoma","Durant"],["Oklahoma","Edmond"],["Oklahoma","Goodwell"],["Oklahoma","Langston"],["Oklahoma","Norman"],["Oklahoma","Stillwater"],["Oklahoma","Tahlequah"],["Oklahoma","Tulsa"],["Oklahoma","Weatherford"],["Delaware","Dover"],["Delaware","Newark"],["Minnesota","Bemidji"],["Minnesota","Crookston"],["Minnesota","Duluth"],["Minnesota","Faribault"],["Minnesota","Mankato"],["Minnesota","Marshall"],["Minnesota","Moorhead"],["Minnesota","Morris"],["Minnesota","Northfield"],["Minnesota","North Mankato"],["Minnesota","St. Cloud"],["Minnesota","St. Joseph"],["Minnesota","St. Peter"],["Minnesota","Winona"],["Illinois","Carbondale"],["Illinois","Champaign–Urbana"],["Illinois","Charleston"],["Illinois","DeKalb"],["Illinois","Edwardsville"],["Illinois","Evanston"],["Illinois","Lebanon"],["Illinois","Macomb"],["Illinois","Normal"],["Illinois","Peoria"],["Arkansas","Arkadelphia"],["Arkansas","Conway"],["Arkansas","Fayetteville"],["Arkansas","Jonesboro"],["Arkansas","Magnolia"],["Arkansas","Monticello"],["Arkansas","Russellville"],["Arkansas","Searcy"],["New Mexico","Hobbs"],["New Mexico","Las Cruces"],["New Mexico","Las Vegas"],["New Mexico","Portales"],["New Mexico","Silver City"],["Indiana","Bloomington"],["Indiana","Crawfordsville"],["Indiana","Greencastle"],["Indiana","Hanover"],["Indiana","Marion"],["Indiana","Muncie"],["Indiana","Oakland City"],["Indiana","Richmond"],["Indiana","South Bend"],["Indiana","Terre Haute"],["Indiana","Upland"],["Indiana","Valparaiso"],["Indiana","West Lafayette"],["Maryland","Annapolis"],["Maryland","Chestertown"],["Maryland","College Park"],["Maryland","Cumberland"],["Maryland","Emmitsburg"],["Maryland","Frostburg"],["Maryland","Princess Anne"],["Maryland","Towson"],["Maryland","Salisbury"],["Maryland","Westminster"],["Louisiana","Baton 
Rouge"],["Louisiana","Grambling"],["Louisiana","Hammond"],["Louisiana","Lafayette"],["Louisiana","Monroe"],["Louisiana","Natchitoches"],["Louisiana","Ruston"],["Louisiana","Thibodaux"],["Idaho","Moscow"],["Idaho","Pocatello"],["Idaho","Rexburg"],["Wyoming","Laramie"],["Tennessee","Chattanooga"],["Tennessee","Collegedale"],["Tennessee","Cookeville"],["Tennessee","Harrogate"],["Tennessee","Henderson"],["Tennessee","Johnson City"],["Tennessee","Knoxville"],["Tennessee","Martin"],["Tennessee","McKenzie"],["Tennessee","Memphis"],["Tennessee","Murfreesboro"],["Tennessee","Nashville"],["Tennessee","Sewanee"],["Arizona","Flagstaff"],["Arizona","Tempe"],["Arizona","Tucson"],["Iowa","Ames"],["Iowa","Cedar Falls"],["Iowa","Cedar Rapids"],["Iowa","Decorah"],["Iowa","Fayette"],["Iowa","Grinnell"],["Iowa","Iowa City"],["Iowa","Lamoni"],["Iowa","Mount Vernon"],["Iowa","Orange City"],["Iowa","Sioux Center"],["Iowa","Storm Lake"],["Iowa","Waverly"],["Michigan","Adrian"],["Michigan","Albion"],["Michigan","Allendale"],["Michigan","Alma"],["Michigan","Ann Arbor"],["Michigan","Berrien Springs"],["Michigan","Big Rapids"],["Michigan","East Lansing"],["Michigan","Flint"],["Michigan","Hillsdale"],["Michigan","Houghton"],["Michigan","Kalamazoo"],["Michigan","Marquette"],["Michigan","Midland"],["Michigan","Mount Pleasant"],["Michigan","Olivet"],["Michigan","Saginaw"],["Michigan","Sault Ste. 
Marie"],["Michigan","Spring Arbor"],["Michigan","Ypsilanti"],["Kansas","Baldwin City"],["Kansas","Emporia"],["Kansas","Hays"],["Kansas","Lawrence"],["Kansas","Manhattan"],["Kansas","Pittsburg"],["Utah","Cedar City"],["Utah","Logan"],["Utah","Provo"],["Utah","Orem"],["Utah","Salt Lake City"],["Utah","Ephraim"],["Virginia","Blacksburg"],["Virginia","Bridgewater"],["Virginia","Charlottesville"],["Virginia","Farmville"],["Virginia","Fredericksburg"],["Virginia","Harrisonburg"],["Virginia","Lexington"],["Virginia","Lynchburg"],["Virginia","Radford"],["Virginia","Williamsburg"],["Virginia","Wise"],["Virginia","Chesapeake"],["Oregon","Ashland"],["Oregon","Corvallis"],["Oregon","Eugene"],["Oregon","Forest Grove"],["Oregon","Klamath Falls"],["Oregon","La Grande"],["Oregon","Marylhurst"],["Oregon","McMinnville"],["Oregon","Monmouth"],["Oregon","Newberg"],["Connecticut","Fairfield"],["Connecticut","Middletown"],["Connecticut","New Britain"],["Connecticut","New Haven"],["Connecticut","New London"],["Connecticut","Storrs"],["Connecticut","Willimantic"],["Montana","Bozeman"],["Montana","Dillon"],["Montana","Missoula"],["California","Angwin"],["California","Arcata"],["California","Berkeley"],["California","Chico"],["California","Claremont"],["California","Cotati"],["California","Davis"],["California","Irvine"],["California","Isla Vista"],["California","University Park"],["California","Merced"],["California","Orange"],["California","Palo Alto"],["California","Pomona"],["California","Redlands"],["California","Riverside"],["California","Sacramento"],["California","University District"],["California","San Diego"],["California","San Luis Obispo"],["California","Santa Barbara"],["California","Santa Cruz"],["California","Turlock"],["California","Westwood"],["California","Whittier"],["Massachusetts","Boston"],["Massachusetts","Bridgewater"],["Massachusetts","Cambridge"],["Massachusetts","Chestnut Hill"],["Massachusetts","The Colleges of Worcester 
Consortium:"],["Massachusetts","Dudley"],["Massachusetts","North Grafton"],["Massachusetts","Paxton"],["Massachusetts","Worcester"],["Massachusetts","The Five College Region of Western Massachusetts:"],["Massachusetts","Amherst"],["Massachusetts","Northampton"],["Massachusetts","South Hadley"],["Massachusetts","Fitchburg"],["Massachusetts","North Adams"],["Massachusetts","Springfield"],["Massachusetts","Waltham"],["Massachusetts","Williamstown"],["Massachusetts","Framingham"],["West Virginia","Athens"],["West Virginia","Buckhannon"],["West Virginia","Fairmont"],["West Virginia","Glenville"],["West Virginia","Huntington"],["West Virginia","Montgomery"],["West Virginia","Morgantown"],["West Virginia","Shepherdstown"],["West Virginia","West Liberty"],["South Carolina","Central"],["South Carolina","Charleston"],["South Carolina","Clemson"],["South Carolina","Clinton"],["South Carolina","Columbia"],["South Carolina","Due West"],["South Carolina","Florence"],["South Carolina","Greenwood"],["South Carolina","Orangeburg"],["South Carolina","Rock Hill"],["South Carolina","Spartanburg"],["New Hampshire","New London"],["New Hampshire","Durham"],["New Hampshire","Hanover"],["New Hampshire","Henniker"],["New Hampshire","Keene"],["New Hampshire","Plymouth"],["New Hampshire","Rindge"],["Wisconsin","Appleton"],["Wisconsin","Eau Claire"],["Wisconsin","Green Bay"],["Wisconsin","La Crosse"],["Wisconsin","Madison"],["Wisconsin","Menomonie"],["Wisconsin","Milwaukee"],["Wisconsin","Oshkosh"],["Wisconsin","Platteville"],["Wisconsin","River Falls"],["Wisconsin","Stevens Point"],["Wisconsin","Waukesha"],["Wisconsin","Whitewater"],["Vermont","Burlington"],["Vermont","Castleton"],["Vermont","Johnson"],["Vermont","Lyndonville"],["Vermont","Middlebury"],["Vermont","Northfield"],["Georgia","Albany"],["Georgia","Athens"],["Georgia","Atlanta"],["Georgia","Carrollton"],["Georgia","Demorest"],["Georgia","Fort Valley"],["Georgia","Kennesaw"],["Georgia","Milledgeville"],["Georgia","Mount 
Vernon"],["Georgia","Oxford"],["Georgia","Rome"],["Georgia","Savannah"],["Georgia","Statesboro"],["Georgia","Valdosta"],["Georgia","Waleska"],["Georgia","Young Harris"],["North Dakota","Fargo"],["North Dakota","Grand Forks"],["Pennsylvania","Altoona"],["Pennsylvania","Annville"],["Pennsylvania","Bethlehem"],["Pennsylvania","Bloomsburg"],["Pennsylvania","Bradford"],["Pennsylvania","California"],["Pennsylvania","Carlisle"],["Pennsylvania","Cecil B. Moore"],["Pennsylvania","Clarion"],["Pennsylvania","Collegeville"],["Pennsylvania","Cresson"],["Pennsylvania","East Stroudsburg"],["Pennsylvania","Edinboro"],["Pennsylvania","Erie"],["Pennsylvania","Gettysburg"],["Pennsylvania","Greensburg"],["Pennsylvania","Grove City"],["Pennsylvania","Huntingdon"],["Pennsylvania","Indiana"],["Pennsylvania","Johnstown"],["Pennsylvania","Kutztown"],["Pennsylvania","Lancaster"],["Pennsylvania","Lewisburg"],["Pennsylvania","Lock Haven"],["Pennsylvania","Loretto"],["Pennsylvania","Mansfield"],["Pennsylvania","Meadville"],["Pennsylvania","Mont Alto"],["Pennsylvania","Millersville"],["Pennsylvania","New Wilmington"],["Pennsylvania","North East"],["Pennsylvania","University City"],["Pennsylvania","Oakland"],["Pennsylvania","Reading"],["Pennsylvania","Selinsgrove"],["Pennsylvania","Shippensburg"],["Pennsylvania","Slippery Rock"],["Pennsylvania","State College"],["Pennsylvania","Villanova"],["Pennsylvania","Waynesburg"],["Pennsylvania","West Chester"],["Pennsylvania","Wilkes-Barre"],["Pennsylvania","Williamsport"],["Florida","Ave Maria"],["Florida","Boca Raton"],["Florida","Coral Gables"],["Florida","DeLand"],["Florida","Estero"],["Florida","Gainesville"],["Florida","Orlando"],["Florida","Sarasota"],["Florida","St. Augustine"],["Florida","St. 
Leo"],["Florida","Tallahassee"],["Florida","Tampa"],["Alaska","Fairbanks"],["Kentucky","Bowling Green"],["Kentucky","Columbia"],["Kentucky","Georgetown"],["Kentucky","Highland Heights"],["Kentucky","Lexington"],["Kentucky","Louisville"],["Kentucky","Morehead"],["Kentucky","Murray"],["Kentucky","Richmond"],["Kentucky","Williamsburg"],["Kentucky","Wilmore"],["Hawaii","Manoa"],["Nebraska","Chadron"],["Nebraska","Crete"],["Nebraska","Kearney"],["Nebraska","Lincoln"],["Nebraska","Peru"],["Nebraska","Seward"],["Nebraska","Wayne"],["Missouri","Bolivar"],["Missouri","Cape Girardeau"],["Missouri","Columbia"],["Missouri","Fayette"],["Missouri","Fulton"],["Missouri","Kirksville"],["Missouri","Maryville"],["Missouri","Rolla"],["Missouri","Warrensburg"],["Ohio","Ada"],["Ohio","Alliance"],["Ohio","Ashland"],["Ohio","Athens"],["Ohio","Berea"],["Ohio","Bluffton"],["Ohio","Bowling Green"],["Ohio","Cedarville"],["Ohio","Columbus"],["Ohio","Delaware"],["Ohio","Fairborn"],["Ohio","Findlay"],["Ohio","Gambier"],["Ohio","Granville"],["Ohio","Hiram"],["Ohio","Kent"],["Ohio","Nelsonville"],["Ohio","New Concord"],["Ohio","Oberlin"],["Ohio","Oxford"],["Ohio","Rio Grande"],["Ohio","Wilberforce"],["Alabama","Auburn"],["Alabama","Florence"],["Alabama","Jacksonville"],["Alabama","Livingston"],["Alabama","Montevallo"],["Alabama","Troy"],["Alabama","Tuscaloosa"],["Alabama","Tuskegee"],["Rhode Island","Kingston"],["Rhode Island","Providence"],["South Dakota","Brookings"],["South Dakota","Madison"],["South Dakota","Spearfish"],["South Dakota","Vermillion"],["Colorado","Alamosa"],["Colorado","Boulder"],["Colorado","Durango"],["Colorado","Fort Collins"],["Colorado","Golden"],["Colorado","Grand Junction"],["Colorado","Greeley"],["Colorado","Gunnison"],["Colorado","Pueblo"],["New Jersey","Ewing"],["New Jersey","Jersey City"],["New Jersey","Glassboro"],["New Jersey","Hoboken"],["New Jersey","Madison"],["New Jersey","Newark"],["New Jersey","New Brunswick"],["New Jersey","Princeton"],["New 
Jersey","Union"],["New Jersey","West Long Branch"],["Washington","Bellingham"],["Washington","Cheney"],["Washington","Ellensburg"],["Washington","Pullman"],["Washington","University District"],["North Carolina","Banner Elk"],["North Carolina","Boiling Springs"],["North Carolina","Boone"],["North Carolina","Buies Creek"],["North Carolina","Chapel Hill"],["North Carolina","Cullowhee"],["North Carolina","Davidson"],["North Carolina","Durham"],["North Carolina","Elon"],["North Carolina","Greensboro"],["North Carolina","Greenville"],["North Carolina","Hickory"],["North Carolina","Mars Hill"],["North Carolina","Mount Olive"],["North Carolina","Pembroke"],["North Carolina","Wilmington"],["North Carolina","Wingate"],["North Carolina","Winston-Salem"],["New York","Alfred"],["New York","Albany"],["New York","Aurora"],["New York","Binghamton"],["New York","Brockport"],["New York","Buffalo"],["New York","Canton"],["New York","Clinton"],["New York","Cobleskill"],["New York","Delhi"],["New York","Fredonia"],["New York","Geneseo"],["New York","Geneva"],["New York","Hamilton"],["New York","Ithaca"],["New York","Morningside Heights"],["New York","New Paltz"],["New York","Oneonta"],["New York","Oswego"],["New York","Plattsburgh"],["New York","Potsdam"],["New York","Poughkeepsie"],["New York","Purchase"],["New York","Rochester"],["New York","Saratoga Springs"],["New York","Seneca Falls"],["New York","Stony Brook"],["New York","Syracuse"],["New York","Tivoli"],["New York","Troy"],["New York","West Point"],["Texas","Abilene"],["Texas","Alpine"],["Texas","Austin"],["Texas","Beaumont"],["Texas","Canyon"],["Texas","College Station"],["Texas","Commerce"],["Texas","Dallas"],["Texas","Denton"],["Texas","Fort Worth"],["Texas","Georgetown"],["Texas","Huntsville"],["Texas","Houston"],["Texas","Keene"],["Texas","Kingsville"],["Texas","Lubbock"],["Texas","Nacogdoches"],["Texas","Plainview"],["Texas","Prairie View"],["Texas","San Marcos"],["Texas","Stephenville"],["Texas","Waco"],["Nevada","Las 
Vegas"],["Nevada","Reno"],["Maine","Augusta"],["Maine","Bar Harbor"],["Maine","Brunswick"],["Maine","Farmington"],["Maine","Fort Kent"],["Maine","Gorham"],["Maine","Lewiston"],["Maine","Orono"],["Maine","Waterville"]]
univ_towns = sorted(univ_towns, key = lambda x : (x[0], x[1]))
univ_towns_df = pd.DataFrame(univ_towns, columns = ['State', 'RegionName'])
#print(univ_towns_df)
#print(univ_towns_df[univ_towns_df['State'] == 'Massachusetts'])


In [6]:
#print(univ_towns_df)
def get_list_of_university_towns():
    '''Returns a DataFrame of towns and the states they are in from the 
    university_towns.txt list. The format of the DataFrame should be:
    DataFrame( [ ["Michigan", "Ann Arbor"], ["Michigan", "Ypsilanti"] ], 
    columns=["State", "RegionName"]  )
    
    The following cleaning needs to be done:

    1. For "State", removing characters from "[" to the end.
    2. For "RegionName", when applicable, removing every character from " (" to the end.
    3. Depending on how you read the data, you may need to remove newline character '\n'. '''
    
    return univ_towns_df
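
The function above returns a hard-coded list. The three cleaning steps in its docstring can be sketched as below. This assumes state lines carry an `[edit]` suffix and town lines carry a parenthesised university name. The sample lines are illustrative and are not read from `university_towns.txt`, and `parse_university_towns` is a hypothetical helper name.

```python
import pandas as pd

def parse_university_towns(lines):
    """Build a State/RegionName DataFrame from raw text lines (hypothetical helper)."""
    rows = []
    state = None
    for raw in lines:
        line = raw.rstrip('\n')                  # step 3: drop the newline
        if not line.strip():
            continue                             # skip blank lines
        if '[edit]' in line:                     # assumption: this marks a state line
            state = line.split('[')[0].strip()   # step 1: cut from "[" to the end
        else:
            town = line.split(' (')[0].strip()   # step 2: cut from " (" to the end
            rows.append([state, town])
    return pd.DataFrame(rows, columns=['State', 'RegionName'])

# Illustrative input only
sample = [
    "Michigan[edit]\n",
    "Ann Arbor (University of Michigan)\n",
    "Ypsilanti (Eastern Michigan University)\n",
    "Wyoming[edit]\n",
    "Laramie (University of Wyoming)\n",
]
towns = parse_university_towns(sample)
print(towns)
```

To use it on the real file, the lines could be supplied with `open('university_towns.txt').readlines()`, provided the file follows the layout assumed here.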

In [20]:
#gdpData = open('gdplev.csv')
#gdpList = []
#for line in gdpData.readlines():
#    newLine = line.split(',')
#    gdpList.append([newLine[0], float(newLine[1])])
#gdpData.close()
#print(len(gdpList))
#for i in range(len(gdpList) - 2):
#    if gdpList[i + 1][1] < gdpList[i][1] and gdpList[i + 2][1] < gdpList[i + 1][1]:
#        for j in range(i + 2, len(gdpList) - 2):
#            if gdpList[j + 1][1] > gdpList[j][1] and gdpList[j + 2][1] > gdpList[j + 1][1]:
#                pass
#                print(gdpList[i + 1], gdpList[j + 1])



In [None]:
def get_recession_start():
    '''Returns the year and quarter of the recession start time as a 
    string value in a format such as 2005q3'''
    
    return "2008q3"

In [None]:
def get_recession_end():
    '''Returns the year and quarter of the recession end time as a 
    string value in a format such as 2005q3'''
       
    return "2009q4"

In [None]:
def get_recession_bottom():
    '''Returns the year and quarter of the recession bottom time as a 
    string value in a format such as 2005q3'''
    
    return "2009q2"
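
The three functions above return fixed strings. A minimal sketch of how the start, bottom, and end could be derived from a GDP series follows, using the definitions at the top of this notebook. The convention assumed here is that the start is the first of two consecutive declining quarters and the end is the second of two consecutive growing quarters. The GDP numbers are made up for illustration and are not BEA figures, and `find_recession` is a hypothetical helper.

```python
import pandas as pd

def find_recession(gdp):
    """Return (start, bottom, end) quarter labels for the first recession in `gdp`.

    `gdp` is a pandas Series of GDP levels indexed by quarter label.
    Returns None if no complete recession is found.
    """
    labels = list(gdp.index)
    values = list(gdp.values)
    start = None
    for i in range(1, len(values) - 1):
        if start is None:
            # Two consecutive declines: quarter i and quarter i + 1
            if values[i] < values[i - 1] and values[i + 1] < values[i]:
                start = i
        elif values[i] > values[i - 1] and values[i + 1] > values[i]:
            # Two consecutive rises: the recession ends at the second one
            end = i + 1
            bottom = gdp.iloc[start:end + 1].idxmin()
            return labels[start], bottom, labels[end]
    return None

# Made-up GDP levels, for illustration only
toy_gdp = pd.Series(
    [100.0, 101.0, 99.0, 97.0, 95.0, 94.0, 95.0, 96.0, 97.0],
    index=['2008q1', '2008q2', '2008q3', '2008q4', '2009q1',
           '2009q2', '2009q3', '2009q4', '2010q1'],
)
print(find_recession(toy_gdp))
```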

In [6]:
# Monthly column labels "2000-01" through "2016-08", generated rather than typed out
month_list = ["{}-{:02d}".format(year, month) for year in range(2000, 2017) for month in range(1, 13) if (year, month) <= (2016, 8)]
col_keep_list = ["State", "RegionName"] + month_list
quarters_list = ["2000q1","2000q2","2000q3","2000q4","2001q1","2001q2","2001q3","2001q4","2002q1","2002q2","2002q3","2002q4","2003q1","2003q2","2003q3","2003q4","2004q1","2004q2","2004q3","2004q4","2005q1","2005q2","2005q3","2005q4","2006q1","2006q2","2006q3","2006q4","2007q1","2007q2","2007q3","2007q4","2008q1","2008q2","2008q3","2008q4","2009q1","2009q2","2009q3","2009q4","2010q1","2010q2","2010q3","2010q4","2011q1","2011q2","2011q3","2011q4","2012q1","2012q2","2012q3","2012q4","2013q1","2013q2","2013q3","2013q4","2014q1","2014q2","2014q3","2014q4","2015q1","2015q2","2015q3","2015q4","2016q1","2016q2","2016q3"]

In [7]:
month_list_split = [i.split('-') for i in month_list]
#print(month_list_split)
data = pd.read_csv('City_Zhvi_AllHomes.csv', usecols = col_keep_list)
data['State'] = data['State'].map(lambda row: states[row])
#print(data.head())
for quarter in quarters_list:
    quarter_split = quarter.split('q')
    if quarter_split[1] == "1":
        quarter_eval_cols = [str(quarter_split[0]) + '-' + '01', str(quarter_split[0]) + '-' + '02', str(quarter_split[0]) + '-' + '03'] 
    elif quarter_split[1] == '2':
        quarter_eval_cols = [str(quarter_split[0]) + '-' + '04', str(quarter_split[0]) + '-' + '05', str(quarter_split[0]) + '-' + '06'] 
    elif quarter_split[1] == '3':
        if quarter_split[0] == '2016':
            # The data ends at 2016-08, so 2016q3 averages only two months
            quarter_eval_cols = [str(quarter_split[0]) + '-' + '07', str(quarter_split[0]) + '-' + '08']
        else:
            quarter_eval_cols = [str(quarter_split[0]) + '-' + '07', str(quarter_split[0]) + '-' + '08', str(quarter_split[0]) + '-' + '09'] 
    else:    
        quarter_eval_cols = [str(quarter_split[0]) + '-' + '10', str(quarter_split[0]) + '-' + '11', str(quarter_split[0]) + '-' + '12'] 
    #print(quarter, quarter_eval_cols)
    #print(data[quarter_eval_cols])
    data[quarter] = data[quarter_eval_cols].mean(axis = 1)
    #data[quarter] = data.apply(lambda row : np.mean(row[quarter_eval_cols]))
new_keep_list = [x for x in quarters_list]
new_keep_list.insert(0, 'State')
new_keep_list.insert(0, 'RegionName')
data = data[new_keep_list]
processing_df = data.set_index(['State', 'RegionName'])
#print(processing_df.head())
#print(processing_df.columns, processing_df.shape)

In [None]:
def convert_housing_data_to_quarters():
    '''Converts the housing data to quarters and returns it as mean 
    values in a dataframe. This dataframe should be a dataframe with
    columns for 2000q1 through 2016q3, and should have a multi-index
    in the shape of ["State","RegionName"].
    
    Note: Quarters are defined in the assignment description, they are
    not arbitrary three month periods.
    
    The resulting dataframe should have 67 columns, and 10,730 rows.
    '''
    
    return processing_df
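
As an alternative to the explicit loop over quarters, the monthly columns can be grouped by a quarter label and averaged. The sketch below uses made-up prices, not Zillow data, and `month_to_quarter` is a hypothetical helper. A partial quarter such as 2016q3 is simply averaged over the months that are present.

```python
import pandas as pd

def month_to_quarter(label):
    """Map a 'YYYY-MM' column label to 'YYYYqN', e.g. '2008-05' -> '2008q2'."""
    year, month = label.split('-')
    return '{}q{}'.format(year, (int(month) - 1) // 3 + 1)

# Illustrative prices only
monthly = pd.DataFrame(
    {'2016-04': [100.0, 200.0], '2016-05': [110.0, 210.0], '2016-06': [120.0, 220.0],
     '2016-07': [130.0, 230.0], '2016-08': [150.0, 250.0]},
    index=pd.MultiIndex.from_tuples(
        [('Michigan', 'Ann Arbor'), ('Texas', 'Austin')],
        names=['State', 'RegionName']),
)

# Transpose so months become rows, group the rows by quarter label,
# average within each quarter, then transpose back.
quarterly = monthly.T.groupby(month_to_quarter).mean().T
print(quarterly)
```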

In [24]:
# Quarter before the recession start (2008q2) and the recession bottom (2009q2)
ttest_df = data[['State', 'RegionName', '2008q2', '2009q2']].copy()  # copy avoids SettingWithCopyWarning
ttest_df['ratio'] = ttest_df['2008q2'] / ttest_df['2009q2']
# DataFrame.join matches `on` against the right frame's index, so merge on the shared columns instead
univ_df = pd.merge(ttest_df, univ_towns_df, how='inner', on=['State', 'RegionName'])

In [None]:
def run_ttest():
    '''First creates new data showing the decline or growth of housing prices
    between the recession start and the recession bottom. Then runs a t-test
    comparing the university town values to the non-university town values,
    returning whether the null hypothesis (that the two groups are the same)
    can be rejected, as well as the p-value of the test.
    
    Return the tuple (different, p, better) where different=True if the t-test
    is significant at p<0.01 (we reject the null hypothesis), or different=False
    otherwise (we cannot reject the null hypothesis). The variable p should
    be equal to the exact p-value returned from scipy.stats.ttest_ind(). The
    value for better should be either "university town" or "non-university town"
    depending on which has a lower mean price ratio (which is equivalent to a
    reduced market loss).'''
    
    return "ANSWER"
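
The stub above leaves the test itself unwritten. The following is a minimal sketch of the comparison the docstring describes, not the graded solution. It assumes a quarterly price frame indexed by `(State, RegionName)` and a university-town frame with `State` and `RegionName` columns. The prices are made up, and `compare_price_ratios` is a hypothetical helper.

```python
import pandas as pd
from scipy.stats import ttest_ind

def compare_price_ratios(prices, university_towns,
                         before='2008q2', bottom='2009q2', alpha=0.01):
    """Return (different, p, better) for university vs. non-university towns."""
    # price_ratio = quarter_before_recession / recession_bottom
    ratio = (prices[before] / prices[bottom]).dropna()
    # Flag rows whose (State, RegionName) pair appears in the university-town list
    uni_keys = list(zip(university_towns['State'], university_towns['RegionName']))
    is_uni = ratio.index.isin(uni_keys)
    uni, non_uni = ratio[is_uni], ratio[~is_uni]
    p = ttest_ind(uni, non_uni).pvalue
    # A lower mean ratio means a smaller price drop into the recession bottom
    better = 'university town' if uni.mean() < non_uni.mean() else 'non-university town'
    return bool(p < alpha), p, better

# Made-up prices, for illustration only
toy_prices = pd.DataFrame(
    {'2008q2': [100.0, 100.0, 100.0, 100.0, 100.0],
     '2009q2': [98.0, 99.0, 80.0, 85.0, 82.0]},
    index=pd.MultiIndex.from_tuples(
        [('Michigan', 'Ann Arbor'), ('Texas', 'Austin'), ('Michigan', 'Detroit'),
         ('Texas', 'Laredo'), ('Ohio', 'Toledo')],
        names=['State', 'RegionName']),
)
toy_towns = pd.DataFrame([['Michigan', 'Ann Arbor'], ['Texas', 'Austin']],
                         columns=['State', 'RegionName'])

different, p, better = compare_price_ratios(toy_prices, toy_towns)
print(different, p, better)
```

`ttest_ind` is called with its defaults here, which assume equal variances in the two groups. Whether that assumption suits the real data is a separate question.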