In [1]:
import pandas as pd
import numpy as np
import requests
from collections import deque
from functools import reduce
import matplotlib.pyplot as plt
#pd.options.display.float_format = '{:,.0f}'.format
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 150)

# This notebook outlines the download and formatting process for the HUD residential building permits dataset for counties and places in the GNRC operating region.

Two figures are needed from this information: a history of all permits and a current snapshot of the diversity of residential permits. The download interface is unreliable, so the data has to be downloaded in two separate batches. The first batch is historical data for single-family and multifamily permits (these can be summed for total permits, and it is useful to see both). The second batch is the current year for single-family, multifamily, and each of the unit-count options.

Go to this page: https://socds.huduser.gov/permits/  
##### Download 1
+ Under "Main Criteria" select "States and Counties"  
+ Under "Periodicity" select "Annual"  
+ Under "Select State(s)" select "Tennessee", then select "Show Counties and Jurisdictions for Selected States"  
+ Under "Select Counties", highlight all counties in the GNRC region, check the box for "county total", and show/select all under "Select Permitting Jurisdictions" 
+ Under "Select Year(s)", select all available years  
+ Under "Select Series", select multifamily and single family  
+ Finally, select the "Get Data" button  
This will populate a page with the information in table format. Scroll to the bottom and download in Excel or CSV format. 

Save as a CSV in the "Data Downloads" folder and import as downloaded.

##### Download 2
+ Select the geographies the same way as above  
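+ Select the most recent year  
+ Under "Select Series", select total units, single family, multifamily, and each of the unit-count series  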
+ "Get Data"  
+ Download and save

In [2]:
data = pd.read_csv('../../Data Downloads/BuildingPermits_HUD_AllYears.csv')
data2 = pd.read_csv('../../Data Downloads/BuildingPermits_HUD_RecentYear.csv')

In [3]:
data.head()

Unnamed: 0,Location,Year,Series,Series Code,Permits
0,ADAMS,1980,Units in Single-Family Structures,2,0
1,ADAMS,1981,Units in Single-Family Structures,2,0
2,ADAMS,1982,Units in Single-Family Structures,2,0
3,ADAMS,1983,Units in Single-Family Structures,2,0
4,ADAMS,1984,Units in Single-Family Structures,2,0


In [4]:
data2.head()

Unnamed: 0,Location,Year,Series,Series Code,Permits
0,Cheatham County,2021,Total Units,1,407
1,Cheatham County,2021,Units in Single-Family Structures,2,260
2,Cheatham County,2021,Units in All Multi-Family Structures,3,147
3,Cheatham County,2021,Units in 2-unit Multi-Family Structures,4,0
4,Cheatham County,2021,Units in 3- and 4-unit Multi-Family Structures,5,3


When you examine the downloaded data in Excel, you can see that the header row repeats every few records. Delete these repeated header records. 

In [5]:
data = data.loc[data['Location'] != 'Location']
data2 = data2.loc[data2['Location'] != 'Location']
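One side effect of repeated header rows is that pandas reads every column as text, which is why the numeric columns are cast later in this notebook. The following is a minimal sketch of the same cleanup on a made-up four-line sample; the names `raw`, `sample`, and `clean` are illustrative and not part of the original workflow.

```python
import io
import pandas as pd

# Hypothetical miniature of the download, with the header row repeated mid-file.
raw = io.StringIO(
    "Location,Year,Series,Series Code,Permits\n"
    "ADAMS,1980,Units in Single-Family Structures,2,0\n"
    "Location,Year,Series,Series Code,Permits\n"
    "ADAMS,1981,Units in Single-Family Structures,2,3\n"
)
sample = pd.read_csv(raw)

# Drop the repeated header rows, then convert the numeric columns explicitly,
# because the stray header text caused them to be read as strings.
clean = sample.loc[sample["Location"] != "Location"].copy()
for col in ["Year", "Series Code", "Permits"]:
    clean[col] = pd.to_numeric(clean[col])
```

Converting right after the header rows are removed is a design choice; the notebook instead casts with `astype(float)` after the pivot, which works equally well.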

In [6]:
#strip extra spaces
data['Location'] = data['Location'].str.strip()
data2['Location'] = data2['Location'].str.strip()

The download contains three different target geographies: counties, unincorporated areas, and places. We split these up and concatenate them later, because the data cleaning process differs for each. For now, we are only taking geographies within the GNRC 14-county operating region. In the download, all county names contain the string "County" and all unincorporated areas contain the string "COUNTY". Because `str.contains` is case-sensitive by default, these two strings are enough to tell the two groups apart.
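The case-sensitive split described above can be sketched as a small classifier. This is an illustration only: `classify_location` is a hypothetical helper, and the three sample names follow the patterns seen in the download.

```python
import pandas as pd

def classify_location(name: str) -> str:
    """Label a HUD location string by geography type (hypothetical helper)."""
    if "County" in name:       # e.g. "Cheatham County"
        return "county"
    if "COUNTY" in name:       # e.g. "CHEATHAM COUNTY UNINCORPORATED AREA"
        return "unincorporated"
    return "place"             # everything else, e.g. "ADAMS"

locations = pd.Series(
    ["Cheatham County", "CHEATHAM COUNTY UNINCORPORATED AREA", "ADAMS"]
)
kinds = locations.map(classify_location)
```

The notebook applies the same logic with boolean filters instead of a helper function; both approaches rely on the match being case-sensitive.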

In [7]:
counties = data[data['Location'].str.contains('County')]
counties2 = data2[data2['Location'].str.contains('County')]

In [8]:
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning on the filtered slice.
counties = counties.assign(Location=counties['Location'] + ", Tennessee")
counties2 = counties2.assign(Location=counties2['Location'] + ", Tennessee")
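Assigning to a column of a DataFrame that was produced by a boolean filter can trigger pandas' `SettingWithCopyWarning`, because pandas cannot tell whether the filtered result is a view or a copy. A minimal sketch of one common remedy, taking an explicit `.copy()` of the filtered rows, is shown here on made-up data; `data_demo` and `counties_demo` are illustrative names.

```python
import pandas as pd

data_demo = pd.DataFrame(
    {"Location": ["Cheatham County", "ADAMS"], "Permits": [407, 0]}
)

# .copy() makes the filtered result an independent DataFrame, so the
# column assignment below is unambiguous and leaves data_demo untouched.
counties_demo = data_demo[data_demo["Location"].str.contains("County")].copy()
counties_demo["Location"] = counties_demo["Location"] + ", Tennessee"
```

Another option with the same effect is `DataFrame.assign`, which returns a new DataFrame rather than modifying one in place.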


Counties are ready to go. Next, the unincorporated areas.

In [9]:
unincorporated = data[data['Location'].str.contains('COUNTY')]
unincorporated2 = data2[data2['Location'].str.contains('COUNTY')]

In [10]:
region = ['CHEATHAM COUNTY UNINCORPORATED AREA', 'DAVIDSON COUNTY UNINCORPORATED AREA', 'DICKSON COUNTY UNINCORPORATED AREA', 
          'HOUSTON COUNTY UNINCORPORATED AREA', 'HUMPHREYS COUNTY UNINCORPORATED AREA', 'MAURY COUNTY UNINCORPORATED AREA', 
          'MONTGOMERY COUNTY UNINCORPORATED AREA', 'ROBERTSON COUNTY UNINCORPORATED AREA', 'RUTHERFORD COUNTY UNINCORPORATED AREA', 
          'STEWART COUNTY UNINCORPORATED AREA', 'SUMNER COUNTY UNINCORPORATED AREA', 'TROUSDALE COUNTY UNINCORPORATED AREA', 
          'WILLIAMSON COUNTY UNINCORPORATED AREA', 'WILSON COUNTY UNINCORPORATED AREA']

#Davidson, Houston, Humphreys, and Stewart have no unincorporated lines

In [11]:
# Shorten the unincorporated-area names; the same mapping applies to both downloads.
unincorporated_names = {'CHEATHAM COUNTY UNINCORPORATED AREA': 'Cheatham Unincorporated', 
                        'DICKSON COUNTY UNINCORPORATED AREA': 'Dickson Unincorporated', 
                        'MAURY COUNTY UNINCORPORATED AREA': 'Maury Unincorporated', 
                        'MONTGOMERY COUNTY UNINCORPORATED AREA': 'Montgomery Unincorporated', 
                        'ROBERTSON COUNTY UNINCORPORATED AREA': 'Robertson Unincorporated', 
                        'RUTHERFORD COUNTY UNINCORPORATED AREA': 'Rutherford Unincorporated',
                        'SUMNER COUNTY UNINCORPORATED AREA': 'Sumner Unincorporated', 
                        'TROUSDALE COUNTY UNINCORPORATED AREA': 'Trousdale Unincorporated', 
                        'WILLIAMSON COUNTY UNINCORPORATED AREA': 'Williamson Unincorporated', 
                        'WILSON COUNTY UNINCORPORATED AREA': 'Wilson Unincorporated'}
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning on the filtered slice.
unincorporated = unincorporated.assign(Location=unincorporated['Location'].replace(unincorporated_names))
unincorporated2 = unincorporated2.assign(Location=unincorporated2['Location'].replace(unincorporated_names))


Places are going to be more difficult. We're looking for the following (place: county): 
Adams city, Tennessee: Robertson  
Ashland City town, Tennessee: Cheatham  
Belle Meade city, Tennessee: Davidson  
Berry Hill city, Tennessee: Davidson  
Brentwood city, Tennessee: Williamson  
Burns town, Tennessee: Dickson  
Cedar Hill city, Tennessee: Robertson  
Charlotte town, Tennessee: Dickson  
Clarksville city, Tennessee: Montgomery  
Columbia city, Tennessee: Maury  
Coopertown town, Tennessee: Robertson  
Cross Plains city, Tennessee: Robertson  
Cumberland City town, Tennessee: Stewart  
Dickson city, Tennessee: Dickson  
Dover city, Tennessee: Stewart  
Eagleville city, Tennessee: Rutherford  
Erin city, Tennessee: Houston  
Fairview city, Tennessee: Williamson  
Forest Hills city, Tennessee: Davidson  
Franklin city, Tennessee: Williamson  
Gallatin city, Tennessee: Sumner  
Goodlettsville city, Tennessee: Davidson/Sumner  
Greenbrier town, Tennessee: Robertson  
Hendersonville city, Tennessee: Sumner  
Kingston Springs town, Tennessee: Cheatham  
La Vergne city, Tennessee: Rutherford  
Lafayette city, Tennessee: Macon  
Lebanon city, Tennessee: Wilson  
McEwen city, Tennessee: Humphreys  
Millersville city, Tennessee: Robertson/Sumner  
Mitchellville city, Tennessee: Sumner  
Mount Juliet city, Tennessee: Wilson  
Mount Pleasant city, Tennessee: Maury  
Murfreesboro city, Tennessee: Rutherford  
Nashville-Davidson metropolitan government (balance): Davidson  
New Johnsonville city, Tennessee: Humphreys  
Nolensville town, Tennessee: Williamson  
Oak Hill city, Tennessee: Davidson  
Pegram town, Tennessee: Cheatham  
Pleasant View city, Tennessee: Cheatham  
Portland city, Tennessee: Robertson/Sumner  
Ridgetop city, Tennessee: Davidson/Robertson  
Slayden town, Tennessee: Dickson  
Smyrna town, Tennessee: Rutherford  
Spring Hill city, Tennessee: Maury/Williamson  
Springfield city, Tennessee: Robertson  
Tennessee Ridge town, Tennessee: Houston/Stewart  
Thompson's Station town, Tennessee: Williamson  
Vanleer town, Tennessee: Dickson  
Watertown city, Tennessee: Wilson  
Waverly city, Tennessee: Humphreys  
Westmoreland town, Tennessee: Sumner  
White Bluff town, Tennessee: Dickson  
White House city, Tennessee: Robertson/Sumner    

The downloaded place names are in all caps and lack both the ", Tennessee" suffix and, in many cases, the "city" or "town" designation. There are 54 places in the list above. We can invert the two filters to keep the geographies that are *not* counties or unincorporated areas, convert the names from all caps to title case, and then filter against a list of the place names as they appear in the download to see how close we can get that way.
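A quick way to see how close the match gets is to compare the expected names with the title-cased names from the download. The sketch below uses made-up sample values; `raw_places` and `expected` are illustrative, and the real lists are much longer.

```python
import pandas as pd

# Hypothetical sample of raw place names as they appear in the download.
raw_places = pd.Series(["ADAMS", "MCEWEN", "ASHLAND CITY TOWN", "SOMEWHERE ELSE"])
expected = ["Adams", "Mcewen", "Ashland City Town", "Burns Town"]

# str.title() capitalizes the first letter of each word, so "MCEWEN"
# becomes "Mcewen" rather than "McEwen".
titled = raw_places.str.title()

matched = sorted(set(expected) & set(titled))
missing = sorted(set(expected) - set(titled))  # expected but absent from the download
```

Any name that ends up in `missing` is either absent from the download or spelled differently there, and is worth checking by hand.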


In [12]:
place = data[~data['Location'].str.contains('County')]
place = place[~place['Location'].str.contains('COUNTY')]
place2 = data2[~data2['Location'].str.contains('County')]
place2 = place2[~place2['Location'].str.contains('COUNTY')]

In [13]:
place['Location'] = place['Location'].str.title()
place2['Location'] = place2['Location'].str.title()

In [14]:
#I've gone through the Excel document to see what these are named. Hopefully the naming is consistent year to year so that this saves time.
places = ["Adams", 
          "Ashland City Town", 
          "Belle Meade", 
          "Berry Hill", 
          "Brentwood", 
          "Burns Town",  
          "Charlotte Town", 
          "Clarksville", 
          "Columbia", 
          "Coopertown Town", 
          "Cross Plains", 
          "Cumberland City Town", 
          "Dickson", 
          "Dover", 
          "Eagleville", 
          "Erin", 
          "Fairview", 
          "Forest Hills", 
          "Franklin", 
          "Gallatin", 
          "Goodlettsville", 
          "Greenbrier Town",
          "Hendersonville", 
          "Kingston Springs Town", 
          "La Vergne", 
          "Lafayette", 
          "Lebanon", 
          "Mcewen", 
          "Millersville", 
          "Mitchellville Town", 
          "Mount Juliet", 
          "Mount Pleasant", 
          "Murfreesboro", 
          "Nashville-Davidson", 
          "New Johnsonville", 
          "Nolensville Town", 
          "Oak Hill", 
          "Pegram Town", 
          "Pleasant View", 
          "Portland", 
          "Ridgetop Town",
          "Smyrna Town", 
          "Spring Hill Town", 
          "Springfield", 
          "Tennessee Ridge Town", 
          "Thompsons Station Town",
          "Watertown", 
          "Waverly", 
          "Westmoreland Town", 
          "White Bluff Town", 
          "White House"]

In [15]:
place = place.loc[place['Location'].isin(places)].reset_index(drop = True)
place2 = place2.loc[place2['Location'].isin(places)].reset_index(drop = True)

In [16]:
transp = place.set_index('Location').transpose()
transp2 = place2.set_index('Location').transpose()

In [17]:
transp = transp.rename(columns = {"Adams": 'Adams city, Tennessee', "Ashland City Town":'Ashland City town, Tennessee',
                                  "Belle Meade": 'Belle Meade city, Tennessee', "Berry Hill": 'Berry Hill city, Tennessee', 
                                  "Brentwood": 'Brentwood city, Tennessee', "Burns Town": 'Burns town, Tennessee',  
                                  "Charlotte Town": 'Charlotte town, Tennessee',  "Clarksville": 'Clarksville city, Tennessee', 
                                  "Columbia": 'Columbia city, Tennessee', "Coopertown Town": 'Coopertown town, Tennessee', 
                                  "Cross Plains": 'Cross Plains city, Tennessee', "Cumberland City Town":'Cumberland City town, Tennessee', 
                                  "Dickson": 'Dickson city, Tennessee', "Dover": 'Dover city, Tennessee', 
                                  "Eagleville": 'Eagleville city, Tennessee', "Erin": 'Erin city, Tennessee', 
                                  "Fairview": 'Fairview city, Tennessee', "Forest Hills": 'Forest Hills city, Tennessee', 
                                  "Franklin": 'Franklin city, Tennessee', "Gallatin": 'Gallatin city, Tennessee', 
                                  "Goodlettsville": 'Goodlettsville city, Tennessee', "Greenbrier Town": 'Greenbrier town, Tennessee',
                                  "Hendersonville": 'Hendersonville city, Tennessee', "Kingston Springs Town": 'Kingston Springs town, Tennessee', 
                                  "La Vergne": 'La Vergne city, Tennessee', "Lafayette": 'Lafayette city, Tennessee', 
                                  "Lebanon": 'Lebanon city, Tennessee', "Mcewen": 'McEwen city, Tennessee', 
                                  "Millersville": 'Millersville city, Tennessee', "Mitchellville Town": 'Mitchellville city, Tennessee', 
                                  "Mount Juliet": 'Mount Juliet city, Tennessee', "Mount Pleasant": 'Mount Pleasant city, Tennessee', 
                                  "Murfreesboro": 'Murfreesboro city, Tennessee', "Nashville-Davidson": "Nashville-Davidson metropolitan government (balance)", 
                                  "New Johnsonville": 'New Johnsonville city, Tennessee', "Nolensville Town": 'Nolensville town, Tennessee', 
                                  "Oak Hill": 'Oak Hill city, Tennessee', "Pegram Town": 'Pegram town, Tennessee', 
                                  "Pleasant View": 'Pleasant View city, Tennessee', "Portland": 'Portland city, Tennessee', 
                                  "Ridgetop Town": 'Ridgetop city, Tennessee',"Smyrna Town": 'Smyrna town, Tennessee', 
                                  "Spring Hill Town": 'Spring Hill city, Tennessee', "Springfield": 'Springfield city, Tennessee', 
                                  "Tennessee Ridge Town": 'Tennessee Ridge town, Tennessee', "Thompsons Station Town": "Thompson's Station town, Tennessee",
                                  "Watertown": 'Watertown city, Tennessee', "Waverly": 'Waverly city, Tennessee', 
                                  "Westmoreland Town": 'Westmoreland town, Tennessee', "White Bluff Town": 'White Bluff town, Tennessee', 
                                  "White House": 'White House city, Tennessee'})
transp2 = transp2.rename(columns = {"Adams": 'Adams city, Tennessee', "Ashland City Town":'Ashland City town, Tennessee',
                                  "Belle Meade": 'Belle Meade city, Tennessee', "Berry Hill": 'Berry Hill city, Tennessee', 
                                  "Brentwood": 'Brentwood city, Tennessee', "Burns Town": 'Burns town, Tennessee',  
                                  "Charlotte Town": 'Charlotte town, Tennessee',  "Clarksville": 'Clarksville city, Tennessee', 
                                  "Columbia": 'Columbia city, Tennessee', "Coopertown Town": 'Coopertown town, Tennessee', 
                                  "Cross Plains": 'Cross Plains city, Tennessee', "Cumberland City Town":'Cumberland City town, Tennessee', 
                                  "Dickson": 'Dickson city, Tennessee', "Dover": 'Dover city, Tennessee', 
                                  "Eagleville": 'Eagleville city, Tennessee', "Erin": 'Erin city, Tennessee', 
                                  "Fairview": 'Fairview city, Tennessee', "Forest Hills": 'Forest Hills city, Tennessee', 
                                  "Franklin": 'Franklin city, Tennessee', "Gallatin": 'Gallatin city, Tennessee', 
                                  "Goodlettsville": 'Goodlettsville city, Tennessee', "Greenbrier Town": 'Greenbrier town, Tennessee',
                                  "Hendersonville": 'Hendersonville city, Tennessee', "Kingston Springs Town": 'Kingston Springs town, Tennessee', 
                                  "La Vergne": 'La Vergne city, Tennessee', "Lafayette": 'Lafayette city, Tennessee', 
                                  "Lebanon": 'Lebanon city, Tennessee', "Mcewen": 'McEwen city, Tennessee', 
                                  "Millersville": 'Millersville city, Tennessee', "Mitchellville Town": 'Mitchellville city, Tennessee', 
                                  "Mount Juliet": 'Mount Juliet city, Tennessee', "Mount Pleasant": 'Mount Pleasant city, Tennessee', 
                                  "Murfreesboro": 'Murfreesboro city, Tennessee', "Nashville-Davidson": "Nashville-Davidson metropolitan government (balance)", 
                                  "New Johnsonville": 'New Johnsonville city, Tennessee', "Nolensville Town": 'Nolensville town, Tennessee', 
                                  "Oak Hill": 'Oak Hill city, Tennessee', "Pegram Town": 'Pegram town, Tennessee', 
                                  "Pleasant View": 'Pleasant View city, Tennessee', "Portland": 'Portland city, Tennessee', 
                                  "Ridgetop Town": 'Ridgetop city, Tennessee',"Smyrna Town": 'Smyrna town, Tennessee', 
                                  "Spring Hill Town": 'Spring Hill city, Tennessee', "Springfield": 'Springfield city, Tennessee', 
                                  "Tennessee Ridge Town": 'Tennessee Ridge town, Tennessee', "Thompsons Station Town": "Thompson's Station town, Tennessee",
                                  "Watertown": 'Watertown city, Tennessee', "Waverly": 'Waverly city, Tennessee', 
                                  "Westmoreland Town": 'Westmoreland town, Tennessee', "White Bluff Town": 'White Bluff town, Tennessee', 
                                  "White House": 'White House city, Tennessee'})

In [18]:
place = transp.transpose()
place2 = transp2.transpose()

In [19]:
place = place.reset_index()
place2 = place2.reset_index()
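The transpose, rename, transpose, reset sequence above changes the values in the `Location` column. As an alternative sketch, not the approach this notebook uses, the same renaming can be done directly on the column with `Series.replace` and a dictionary. The data and the two-entry `name_map` below are illustrative.

```python
import pandas as pd

place_demo = pd.DataFrame({
    "Location": ["Adams", "Adams", "Mcewen"],
    "Year": [1980, 1981, 1980],
    "Permits": [0, 0, 2],
})
name_map = {
    "Adams": "Adams city, Tennessee",
    "Mcewen": "McEwen city, Tennessee",
}

# Series.replace with a dict swaps exact matches and leaves other values alone.
place_demo["Location"] = place_demo["Location"].replace(name_map)
```

Because the mapping is applied to the column in place, the row order and the other columns are untouched, and one dictionary can be reused for both downloads.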

In [20]:
dfs = [counties, unincorporated, place]
df = pd.concat(dfs)
dfs2 = [counties2, unincorporated2, place2]
df2 = pd.concat(dfs2)

In [21]:
df = df.rename(columns = {'Location':'NAME'})
df2 = df2.rename(columns = {'Location':'NAME'})

In [22]:
geos = pd.read_csv('../../Data Downloads/geofips.csv')

In [23]:
df = df.merge(geos, how = 'outer')
df2 = df2.merge(geos, how = 'outer')
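An outer merge keeps rows that appear on only one side, which is why geographies with no permit records show up later with empty values. A hedged sketch of how to list those rows uses the `indicator=True` option of `DataFrame.merge`. It assumes the geography file has `NAME` and `GEO_ID` columns, as the later output suggests; the sample frames and the "Mystery town" row are made up.

```python
import pandas as pd

permits_demo = pd.DataFrame({
    "NAME": ["Adams city, Tennessee", "Mystery town, Tennessee"],
    "Permits": [0, 5],
})
geos_demo = pd.DataFrame({
    "NAME": ["Adams city, Tennessee", "Cedar Hill city, Tennessee"],
    "GEO_ID": ["1600000US4700200", "1600000US4711980"],
})

# indicator=True adds a "_merge" column recording which side each row came from.
merged = permits_demo.merge(geos_demo, on="NAME", how="outer", indicator=True)

no_geo_id = merged.loc[merged["_merge"] == "left_only", "NAME"].tolist()
no_permits = merged.loc[merged["_merge"] == "right_only", "NAME"].tolist()
```

Names in `no_geo_id` usually point to a spelling mismatch between the permit names and the geography file, while names in `no_permits` are geographies that simply have no permit records in the download.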

In [24]:
df = df.drop(columns = 'Series Code')
df2 = df2.drop(columns = 'Series Code')

In [25]:
df = df.pivot(index = ['NAME', 'Year', 'GEO_ID'], columns = 'Series', values = 'Permits')
df2 = df2.pivot(index = ['NAME', 'Year', 'GEO_ID'], columns = 'Series', values = 'Permits')

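The pivot step turns the long table (one row per geography, year, and series) into a wide table with one column per series. A minimal sketch on two rows taken from the Cheatham County figures in this notebook shows the reshaping; `long_demo` and `wide_demo` are illustrative names.

```python
import pandas as pd

long_demo = pd.DataFrame({
    "NAME": ["Cheatham County", "Cheatham County"],
    "Year": [2021, 2021],
    "Series": [
        "Units in Single-Family Structures",
        "Units in All Multi-Family Structures",
    ],
    "Permits": [260, 147],
})

# Each distinct value in "Series" becomes its own column.
wide_demo = long_demo.pivot(
    index=["NAME", "Year"], columns="Series", values="Permits"
)
```

Note that `pivot` raises a `ValueError` if the same index and column combination appears more than once, so duplicated place names in the long table would surface at this step.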


In [26]:
df2

NAME,Year,GEO_ID,NaN,Total Units,Units in 2-unit Multi-Family Structures,Units in 3- and 4-unit Multi-Family Structures,Units in 5+ Unit Multi-Family Structures,Units in All Multi-Family Structures,Units in Single-Family Structures
"Adams city, Tennessee",2021.0,1600000US4700200,,0.0,0.0,0.0,0.0,0.0,0.0
"Allen County, Kentucky",,0500000US21003,,,,,,,
"Ashland City town, Tennessee",2021.0,1600000US4702180,,193.0,0.0,3.0,144.0,147.0,46.0
"Belle Meade city, Tennessee",2021.0,1600000US4704620,,14.0,0.0,0.0,0.0,0.0,14.0
"Berry Hill city, Tennessee",2021.0,1600000US4705140,,0.0,0.0,0.0,0.0,0.0,0.0
"Brentwood city, Tennessee",2021.0,1600000US4708280,,147.0,0.0,0.0,0.0,0.0,147.0
"Burns town, Tennessee",2021.0,1600000US4709880,,51.0,0.0,0.0,0.0,0.0,51.0
"Cedar Hill city, Tennessee",,1600000US4711980,,,,,,,
"Charlotte town, Tennessee",2021.0,1600000US4713080,,0.0,0.0,0.0,0.0,0.0,0.0
"Cheatham County, Tennessee",2021.0,0500000US47021,,407.0,0.0,3.0,144.0,147.0,260.0


In [27]:
df = df.reset_index(drop = False)
cols = ['Year', 'Units in All Multi-Family Structures', 'Units in Single-Family Structures']
df[cols] = df[cols].astype(float)

df2 = df2.reset_index(drop = False)
cols = ['Year', 'Units in All Multi-Family Structures', 'Units in Single-Family Structures', 'Units in 5+ Unit Multi-Family Structures', 
        'Units in 3- and 4-unit Multi-Family Structures', 'Units in 2-unit Multi-Family Structures', 'Total Units']
df2[cols] = df2[cols].astype(float)

In [28]:
df['Units in All Structures'] = df['Units in All Multi-Family Structures'] + df['Units in Single-Family Structures']
df2 = df2.rename(columns = {'Units in 3- and 4-unit Multi-Family Structures': 'Units in 3 and 4 Unit Multi-Family Structures', 
                            'Units in 2-unit Multi-Family Structures': 'Units in 2 Unit Multi-Family Structures', 
                            'Total Units': 'Units in All Structures'})
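For the historical data, the total is computed as multifamily plus single-family, while the recent-year download already includes a total series. A small sanity check, sketched here with the Cheatham County and Ashland City values shown in this notebook, confirms that the two agree; `check`, `computed`, and `mismatch` are illustrative names.

```python
import pandas as pd

# Values copied from the Cheatham County and Ashland City rows in this notebook.
check = pd.DataFrame({
    "NAME": ["Cheatham County, Tennessee", "Ashland City town, Tennessee"],
    "Total Units": [407.0, 193.0],
    "Units in Single-Family Structures": [260.0, 46.0],
    "Units in All Multi-Family Structures": [147.0, 147.0],
})

computed = (
    check["Units in Single-Family Structures"]
    + check["Units in All Multi-Family Structures"]
)

# Names of any geographies where the reported total differs from the sum.
mismatch = check.loc[computed != check["Total Units"], "NAME"].tolist()
```

An empty `mismatch` list means the reported totals match the sums for the rows checked.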

In [29]:
df2

Series,NAME,Year,GEO_ID,NaN,Units in All Structures,Units in 2 Unit Multi-Family Structures,Units in 3 and 4 Unit Multi-Family Structures,Units in 5+ Unit Multi-Family Structures,Units in All Multi-Family Structures,Units in Single-Family Structures
0,"Adams city, Tennessee",2021.0,1600000US4700200,,0.0,0.0,0.0,0.0,0.0,0.0
1,"Allen County, Kentucky",,0500000US21003,,,,,,,
2,"Ashland City town, Tennessee",2021.0,1600000US4702180,,193.0,0.0,3.0,144.0,147.0,46.0
3,"Belle Meade city, Tennessee",2021.0,1600000US4704620,,14.0,0.0,0.0,0.0,0.0,14.0
4,"Berry Hill city, Tennessee",2021.0,1600000US4705140,,0.0,0.0,0.0,0.0,0.0,0.0
5,"Brentwood city, Tennessee",2021.0,1600000US4708280,,147.0,0.0,0.0,0.0,0.0,147.0
6,"Burns town, Tennessee",2021.0,1600000US4709880,,51.0,0.0,0.0,0.0,0.0,51.0
7,"Cedar Hill city, Tennessee",,1600000US4711980,,,,,,,
8,"Charlotte town, Tennessee",2021.0,1600000US4713080,,0.0,0.0,0.0,0.0,0.0,0.0
9,"Cheatham County, Tennessee",2021.0,0500000US47021,,407.0,0.0,3.0,144.0,147.0,260.0


In [30]:
df = df[['NAME', 'Year', 'GEO_ID', 'Units in All Multi-Family Structures', 'Units in Single-Family Structures', 'Units in All Structures']]
df2 = df2[['NAME', 'GEO_ID', 'Units in All Structures', 'Units in Single-Family Structures', 
           'Units in All Multi-Family Structures', 'Units in 2 Unit Multi-Family Structures', 
           'Units in 3 and 4 Unit Multi-Family Structures', 'Units in 5+ Unit Multi-Family Structures']]

In [31]:
df = df.set_index('NAME').transpose()
df = df.rename(columns = {'Allen County, Kentucky': 'Allen County, KY', 'Cheatham County, Tennessee': 'Cheatham County', 
                              'Davidson County, Tennessee': 'Davidson County', 'Dickson County, Tennessee': 'Dickson County', 
                              'Houston County, Tennessee': 'Houston County', 'Humphreys County, Tennessee': 'Humphreys County', 
                              'Maury County, Tennessee': 'Maury County', 'Montgomery County, Tennessee': 'Montgomery County', 
                              'Robertson County, Tennessee': 'Robertson County', 'Rutherford County, Tennessee': 'Rutherford County', 
                              'Simpson County, Kentucky': 'Simpson County, KY', 'Stewart County, Tennessee': 'Stewart County', 
                              'Sumner County, Tennessee': 'Sumner County', 'Trousdale County, Tennessee': 'Trousdale County', 
                              'Williamson County, Tennessee': 'Williamson County', 'Wilson County, Tennessee': 'Wilson County', 
                              'Adams city, Tennessee': 'Adams', 'Ashland City town, Tennessee': 'Ashland City', 'Belle Meade city, Tennessee': 'Belle Meade', 
                              'Berry Hill city, Tennessee': 'Berry Hill', 'Brentwood city, Tennessee': 'Brentwood', 'Burns town, Tennessee': 'Burns', 
                              'Cedar Hill city, Tennessee': 'Cedar Hill', 'Charlotte town, Tennessee': 'Charlotte', 'Clarksville city, Tennessee': 'Clarksville', 
                              'Columbia city, Tennessee': 'Columbia', 'Coopertown town, Tennessee': 'Coopertown', 'Cross Plains city, Tennessee': 'Cross Plains', 
                              'Cumberland City town, Tennessee': 'Cumberland City', 'Dickson city, Tennessee': 'Dickson', 'Dover city, Tennessee': 'Dover', 
                              'Eagleville city, Tennessee': 'Eagleville', 'Erin city, Tennessee': 'Erin', 'Fairview city, Tennessee': 'Fairview', 
                              'Forest Hills city, Tennessee': 'Forest Hills', 'Franklin city, Tennessee': 'Franklin', 'Gallatin city, Tennessee': 'Gallatin', 
                              'Goodlettsville city, Tennessee': 'Goodlettsville', 'Greenbrier town, Tennessee': 'Greenbrier', 
                              'Hendersonville city, Tennessee': 'Hendersonville', 'Kingston Springs town, Tennessee': 'Kingston Springs', 
                              'La Vergne city, Tennessee': 'La Vergne', 'Lafayette city, Tennessee': 'Lafayette', 'Lebanon city, Tennessee': 'Lebanon', 
                              'McEwen city, Tennessee': 'McEwen', 'Millersville city, Tennessee': 'Millersville', 'Mitchellville city, Tennessee': 'Mitchellville', 
                              'Mount Juliet city, Tennessee': 'Mount Juliet', 'Mount Pleasant city, Tennessee': 'Mount Pleasant', 
                              'Murfreesboro city, Tennessee': 'Murfreesboro', 'Nashville-Davidson metropolitan government (balance)': 'Nashville', 
                              'New Johnsonville city, Tennessee': 'New Johnsonville', 'Nolensville town, Tennessee': 'Nolensville', 
                              'Oak Hill city, Tennessee': 'Oak Hill', 'Pegram town, Tennessee': 'Pegram', 'Pleasant View city, Tennessee': 'Pleasant View', 
                              'Portland city, Tennessee': 'Portland', 'Ridgetop city, Tennessee': 'Ridgetop', 'Slayden town, Tennessee': 'Slayden', 
                              'Smyrna town, Tennessee': 'Smyrna', 'Spring Hill city, Tennessee': 'Spring Hill', 'Springfield city, Tennessee': 'Springfield', 
                              'Tennessee Ridge town, Tennessee': 'Tennessee Ridge', "Thompson's Station town, Tennessee": "Thompson's Station", 
                              'Vanleer town, Tennessee': 'Vanleer', 'Watertown city, Tennessee': 'Watertown', 'Waverly city, Tennessee': 'Waverly', 
                              'Westmoreland town, Tennessee': 'Westmoreland', 'White Bluff town, Tennessee': 'White Bluff', 
                              'White House city, Tennessee': 'White House', 'Franklin city, Kentucky': 'Franklin, KY', 
                              'Scottsville city, Kentucky': 'Scottsville, KY', 'United States': 'US'})
df = df.transpose().reset_index(drop = False)

In [32]:
df2 = df2.set_index('NAME').transpose()
df2 = df2.rename(columns = {'Allen County, Kentucky': 'Allen County, KY', 'Cheatham County, Tennessee': 'Cheatham County', 
                              'Davidson County, Tennessee': 'Davidson County', 'Dickson County, Tennessee': 'Dickson County', 
                              'Houston County, Tennessee': 'Houston County', 'Humphreys County, Tennessee': 'Humphreys County', 
                              'Maury County, Tennessee': 'Maury County', 'Montgomery County, Tennessee': 'Montgomery County', 
                              'Robertson County, Tennessee': 'Robertson County', 'Rutherford County, Tennessee': 'Rutherford County', 
                              'Simpson County, Kentucky': 'Simpson County, KY', 'Stewart County, Tennessee': 'Stewart County', 
                              'Sumner County, Tennessee': 'Sumner County', 'Trousdale County, Tennessee': 'Trousdale County', 
                              'Williamson County, Tennessee': 'Williamson County', 'Wilson County, Tennessee': 'Wilson County', 
                              'Adams city, Tennessee': 'Adams', 'Ashland City town, Tennessee': 'Ashland City', 'Belle Meade city, Tennessee': 'Belle Meade', 
                              'Berry Hill city, Tennessee': 'Berry Hill', 'Brentwood city, Tennessee': 'Brentwood', 'Burns town, Tennessee': 'Burns', 
                              'Cedar Hill city, Tennessee': 'Cedar Hill', 'Charlotte town, Tennessee': 'Charlotte', 'Clarksville city, Tennessee': 'Clarksville', 
                              'Columbia city, Tennessee': 'Columbia', 'Coopertown town, Tennessee': 'Coopertown', 'Cross Plains city, Tennessee': 'Cross Plains', 
                              'Cumberland City town, Tennessee': 'Cumberland City', 'Dickson city, Tennessee': 'Dickson', 'Dover city, Tennessee': 'Dover', 
                              'Eagleville city, Tennessee': 'Eagleville', 'Erin city, Tennessee': 'Erin', 'Fairview city, Tennessee': 'Fairview', 
                              'Forest Hills city, Tennessee': 'Forest Hills', 'Franklin city, Tennessee': 'Franklin', 'Gallatin city, Tennessee': 'Gallatin', 
                              'Goodlettsville city, Tennessee': 'Goodlettsville', 'Greenbrier town, Tennessee': 'Greenbrier', 
                              'Hendersonville city, Tennessee': 'Hendersonville', 'Kingston Springs town, Tennessee': 'Kingston Springs', 
                              'La Vergne city, Tennessee': 'La Vergne', 'Lafayette city, Tennessee': 'Lafayette', 'Lebanon city, Tennessee': 'Lebanon', 
                              'McEwen city, Tennessee': 'McEwen', 'Millersville city, Tennessee': 'Millersville', 'Mitchellville city, Tennessee': 'Mitchellville', 
                              'Mount Juliet city, Tennessee': 'Mount Juliet', 'Mount Pleasant city, Tennessee': 'Mount Pleasant', 
                              'Murfreesboro city, Tennessee': 'Murfreesboro', 'Nashville-Davidson metropolitan government (balance)': 'Nashville', 
                              'New Johnsonville city, Tennessee': 'New Johnsonville', 'Nolensville town, Tennessee': 'Nolensville', 
                              'Oak Hill city, Tennessee': 'Oak Hill', 'Pegram town, Tennessee': 'Pegram', 'Pleasant View city, Tennessee': 'Pleasant View', 
                              'Portland city, Tennessee': 'Portland', 'Ridgetop city, Tennessee': 'Ridgetop', 'Slayden town, Tennessee': 'Slayden', 
                              'Smyrna town, Tennessee': 'Smyrna', 'Spring Hill city, Tennessee': 'Spring Hill', 'Springfield city, Tennessee': 'Springfield', 
                              'Tennessee Ridge town, Tennessee': 'Tennessee Ridge', "Thompson's Station town, Tennessee": "Thompson's Station", 
                              'Vanleer town, Tennessee': 'Vanleer', 'Watertown city, Tennessee': 'Watertown', 'Waverly city, Tennessee': 'Waverly', 
                              'Westmoreland town, Tennessee': 'Westmoreland', 'White Bluff town, Tennessee': 'White Bluff', 
                              'White House city, Tennessee': 'White House', 'Franklin city, Kentucky': 'Franklin, KY', 
                              'Scottsville city, Kentucky': 'Scottsville, KY', 'United States': 'US'})
df2 = df2.transpose().reset_index(drop = False)

In [36]:
df.to_csv('../../Outputs/HUDSOC_ResidentialPermits1980_2021.csv', index = False)
df2.to_csv('../../Outputs/HUDSOC_ResidentialPermitscurrent.csv', index = False)

In [35]:
df

Series,NAME,Year,GEO_ID,Units in All Multi-Family Structures,Units in Single-Family Structures,Units in All Structures
0,Adams,1980.0,1600000US4700200,0.0,0.0,0.0
1,Adams,1981.0,1600000US4700200,0.0,0.0,0.0
2,Adams,1982.0,1600000US4700200,0.0,0.0,0.0
3,Adams,1983.0,1600000US4700200,0.0,0.0,0.0
4,Adams,1984.0,1600000US4700200,0.0,0.0,0.0
5,Adams,1985.0,1600000US4700200,0.0,0.0,0.0
6,Adams,1986.0,1600000US4700200,0.0,0.0,0.0
7,Adams,1987.0,1600000US4700200,0.0,0.0,0.0
8,Adams,1988.0,1600000US4700200,0.0,0.0,0.0
9,Adams,1989.0,1600000US4700200,0.0,0.0,0.0
