<a href="https://colab.research.google.com/github/BrianKEverett/County-Line/blob/main/Dissertation2_Everett.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Install the profiling package that provides ProfileReport.
!pip install ydata-profiling
from ydata_profiling import ProfileReport

# Standard library
import time, os, sys, re
import zipfile, json, datetime, string
from statistics import *  # wildcard import: brings mean, median, etc. into the namespace

# Data handling
import numpy as np
import pandas as pd
import pandas_datareader as pdr
from pandas_datareader import wb
from pandas.io.formats.style import Styler

# Plotting and missing-data visualization
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
import missingno as msno

# Colab helpers: file transfer and interactive DataFrame tables
from google.colab import data_table, files
data_table.enable_dataframe_formatter()
data_table.max_columns = 50

# Display the result of every expression in a cell, not only the last one.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

%matplotlib inline
plt.style.use('classic')


In [2]:
permits = pd.read_csv("https://raw.githubusercontent.com/BrianKEverett/County-Line/main/Permits.csv")
# Source: NJ Department of Community Affairs website, https://njdca.maps.arcgis.com/home/item.html?id=c754e8f800424bcbb6ad4e6e85b9f736
# This dataset was chosen mainly to support my dissertation proposal, which explores the behavior of planning boards in New Jersey.
# One hypothesis is that planning board decisions are influenced by the County Line balloting system, which is used in 19 of New Jersey's 21 counties.
# No other state in the US runs primary elections this way.
# More on the County Line, by Julia Sass Rubin: https://www.njpp.org/wp-content/uploads/2021/01/NJPP-Report-Does-the-County-Line-Matter-Update-wiht-Final-Vote-Counts.pdf

taxes = pd.read_csv("https://raw.githubusercontent.com/BrianKEverett/County-Line/main/mediantax.csv")
# Source: NJ Department of Community Affairs, https://njdca.maps.arcgis.com/apps/webappviewer/index.html?id=96ec274c50a34890b23263f101e4ad9b
# Another hypothesis concerns the public narrative planning board members offer when approving controversial permits:
# "this will increase rateables for the township, lowering your taxes." I suspect this does not actually come to fruition.
# This dataset helps test that narrative and check whether the opposite occurs, i.e. whether more development actually yields higher property taxes.

#health = pd.read_csv("https://raw.githubusercontent.com/BrianKEverett/County-Line/main/countyhealth.csv")
# Source: County Health Rankings and Roadmaps, https://www.countyhealthrankings.org/explore-health-rankings/rankings-data-documentation
# The 2022 release was used to match the year of the permits dataset.
# With county health data, we could form hypotheses about rates of development and their effect on well-being and liveability.

# Problem with the health dataset: only 22 observations for New Jersey, too few for a good sample.

municodes = pd.read_csv("https://raw.githubusercontent.com/BrianKEverett/County-Line/main/Municodes.csv")
# Data file of all NJ municipalities and counties, with the corresponding municipality DCA code. This file is the most helpful one for matching and merging.

#crime = pd.read_csv("https://raw.githubusercontent.com/BrianKEverett/County-Line/main/CamdenCrime.csv") # not a good format for reading data!
# Source: NJ Office of the Attorney General's website, https://www.nj.gov/njsp/ucr/uniform-crime-reports.shtml
# Crime data is important to consider when analyzing planning and zoning. Does any specific type of development correlate with increased crime?
# Can liveability theory be applied here to ask whether communities have what they need to prevent crime?

jobs = pd.read_csv('https://raw.githubusercontent.com/BrianKEverett/County-Line/main/jobsdensity.csv')
# The dataset can be built from the table selections on the NJ Department of Community Affairs website: https://njdca.maps.arcgis.com/apps/webappviewer/index.html?id=96ec274c50a34890b23263f101e4ad9b
# Job density is a good variable to consider alongside new large-dollar permits. Are some places growing more than others? Can this be attributed to the County Line?

countysize = pd.read_csv('https://raw.githubusercontent.com/BrianKEverett/County-Line/main/NJCountySize.csv')
# Source: Wikipedia, based on 2020 census data: https://en.wikipedia.org/wiki/List_of_counties_in_New_Jersey

njtowns=pd.read_csv('https://raw.githubusercontent.com/BrianKEverett/County-Line/main/NJMunicipalities.csv')
#Dataset can be found at: https://en.wikipedia.org/wiki/List_of_municipalities_in_New_Jersey#:~:text=The%20largest%20municipality%20by%20population,most%20populous%20being%20South%20Carolina.
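
Since the health file was set aside for having too few observations, it may help to check the size and completeness of every frame right after loading. The sketch below is one hedged way to do that; `summarize_frames` is a hypothetical helper, and the toy frames (including the `Rank` column) are made up so the example runs without network access.

```python
import pandas as pd

def summarize_frames(frames):
    """Return one row per DataFrame with its shape and count of missing cells."""
    rows = []
    for name, df in frames.items():
        rows.append({
            "dataset": name,
            "rows": len(df),
            "columns": df.shape[1],
            "missing_cells": int(df.isna().sum().sum()),
        })
    return pd.DataFrame(rows).set_index("dataset")

# Toy stand-ins (made-up values). In the notebook this would be something like
# {"permits": permits, "taxes": taxes, "municodes": municodes, "jobs": jobs}.
toy_frames = {
    "permits": pd.DataFrame({"DCA MUNI CODE": [101, 101, 102],
                             "TYPE": ["NEW", "ALT", "ALT"]}),
    "health": pd.DataFrame({"County": ["Atlantic", "Bergen"],
                            "Rank": [1.0, None]}),
}

summary = summarize_frames(toy_frames)
print(summary)
```

A table like this makes it easy to spot a dataset that is too small or too sparse before any cleaning is done.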

In [3]:
# Permits: shorten the key column names and normalize municipality capitalization.
permits = permits.rename(columns={'DCA MUNI CODE': 'DCA', 'MUNICIPALITY': 'Municipality'})
permits["Municipality"] = permits["Municipality"].str.title()
# Drop columns not needed for the analysis (parcel identifiers, coordinates, issue date, and so on).
permits = permits.drop(columns=['ID', 'BLOCK NUMBER', 'PAMS PIN', 'USE GROUP', 'YCOORD',
                                'XCOORD', 'MATCH TYPE', 'LOT NUMBER', 'DATE ISSUED', 'TAX CODE'])

# Municodes: drop the alternate name and code columns, keeping the common names and the NJ-1040 code.
municodes = municodes.drop(columns=['MUNICIPALITY_NAME_NJ-1040', 'MUNICIPALITY_CODE_DCA',
                                    'MUNICIPALITY_NAME_DCA', 'MUNICIPALITY_CODE_GNIS',
                                    'MUNICIPALITY_NAME_GNIS', 'MUNICIPALITY_CODE_FIPS'])
# The NJ-1040 code is renamed to 'DCA' so it shares a column name with the permits key.
municodes = municodes.rename(columns={'MUNICIPALITY_NAME_COMMON': 'Municipality',
                                      'MUNICIPALITY_CODE_NJ-1040': 'DCA',
                                      'COUNTY_NAME_COMMON': 'County'})
municodes['County'] = municodes['County'].str.replace(' County', '', regex=False)
municodes = municodes.set_index('Municipality')

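# Strip thousands separators; the values remain strings until converted to a numeric dtype.
countysize['Largest City Population'] = countysize['Largest City Population'].str.replace(',', '', regex=False)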

# NJ towns: drop descriptive and derived columns, then index by municipality.
njtowns = njtowns.drop(columns=['Municipality Type', 'Incorporated[5]', 'Form of government',
                                'Population density', 'Land Area (km^2)', 'Pop. Change'])
njtowns = njtowns.set_index('Municipality')

# Permits: keep only the DCA code and the permit type, indexed by municipality.
permits = permits.rename(columns={'TYPE': 'Permits'})
permits = permits.drop(columns=['Use Group Label', 'WORK VALUE'])
permits = permits.set_index('Municipality')

# Jobs: keep the county and job counts, indexed by municipality.
jobs = jobs.drop(columns=['JobsVintage', 'Blk_Grp_Name', 'JobsDensity'])
jobs = jobs.set_index('Municipality')

# Taxes: keep the county and median real estate taxes, indexed by municipality.
taxes = taxes.drop(columns=['Tract_Name', 'Data_Vintage', ' '])
taxes = taxes.set_index('Municipality')


njtowns
permits
jobs
taxes
municodes

Municipality,County,Population (2020),Population (2010),Land area (mi^2)
Aberdeen Township,Monmouth,19329,18157,5.444
Absecon,Atlantic,9137,8411,5.468
Alexandria Township,Hunterdon,4809,4938,27.534
Allamuchy Township,Warren,5335,4323,19.992
Allendale,Bergen,6848,6505,3.097
...,...,...,...,...
Woodlynne,Camden,2902,2978,0.218
Woodstown,Salem,3678,3505,1.575
Woolwich Township,Gloucester,12577,10200,21.072
Wrightstown,Burlington,720,802,1.850
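
The cleaning step drops `Pop. Change` and `Population density`, but both can be recomputed from the columns that remain if they are needed later. The sketch below uses three rows copied from the output above. The new column names are hypothetical, the density is per square mile (which may not match the units of the dropped column), and the numeric conversion is a precaution in case the CSV stores numbers as comma-formatted strings.

```python
import pandas as pd

# Toy rows copied from the njtowns output above.
towns = pd.DataFrame(
    {
        "County": ["Monmouth", "Atlantic", "Hunterdon"],
        "Population (2020)": [19329, 9137, 4809],
        "Population (2010)": [18157, 8411, 4938],
        "Land area (mi^2)": [5.444, 5.468, 27.534],
    },
    index=pd.Index(["Aberdeen Township", "Absecon", "Alexandria Township"],
                   name="Municipality"),
)

# Precaution: coerce to numbers, stripping any thousands separators first.
for col in ["Population (2020)", "Population (2010)", "Land area (mi^2)"]:
    towns[col] = pd.to_numeric(
        towns[col].astype(str).str.replace(",", "", regex=False)
    )

# Percent change in population from 2010 to 2020 (hypothetical column name).
towns["pop_change_pct"] = (
    (towns["Population (2020)"] - towns["Population (2010)"])
    / towns["Population (2010)"] * 100
)

# 2020 residents per square mile (hypothetical column name).
towns["density_per_mi2"] = towns["Population (2020)"] / towns["Land area (mi^2)"]

print(towns[["pop_change_pct", "density_per_mi2"]].round(2))
```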


Municipality,DCA,Permits
Absecon City,101,NEW
Absecon City,101,ALT
Atlantic City,102,ALT
Atlantic City,102,ALT
Atlantic City,102,ALT
...,...,...
Washington Boro,2121,ALT
Washington Twp,2122,ALT
Washington Twp,2122,ALT
Washington Twp,2122,ALT
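
Each row of `permits` is a single permit, so comparing municipalities requires counting rows per municipality and permit type first. A minimal sketch of that aggregation, using the five leading rows shown above as toy data (the `Total` column is an addition for illustration):

```python
import pandas as pd

# Toy rows copied from the head of the permits output above.
permits_toy = pd.DataFrame(
    {"DCA": [101, 101, 102, 102, 102],
     "Permits": ["NEW", "ALT", "ALT", "ALT", "ALT"]},
    index=pd.Index(
        ["Absecon City", "Absecon City", "Atlantic City",
         "Atlantic City", "Atlantic City"],
        name="Municipality",
    ),
)

# Count permits per DCA code and permit type; types that never occur get 0.
permit_counts = (
    permits_toy.groupby(["DCA", "Permits"]).size().unstack(fill_value=0)
)
permit_counts["Total"] = permit_counts.sum(axis=1)

print(permit_counts)
```

Grouping on the DCA code rather than the municipality name keeps the result ready to join to the other tables, whose name spellings differ.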


Municipality,County,Jobs
West Caldwell Township,Essex,4376
West Caldwell Township,Essex,28
West Caldwell Township,Essex,1939
Monroe Township,Gloucester,226
West Deptford Township,Gloucester,156
...,...,...
Marlboro Township,Monmouth,90
Marlboro Township,Monmouth,243
Marlboro Township,Monmouth,2738
Dumont Borough,Bergen,10
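
The `jobs` table repeats each municipality several times. Judging by the dropped `Blk_Grp_Name` column, each row is probably one block group, though that is an assumption. If so, a municipality-level figure would be the sum of its rows. The sketch below groups on county and municipality together, because names such as Franklin Township occur in more than one New Jersey county.

```python
import pandas as pd

# Toy rows copied from the head of the jobs output above.
jobs_toy = pd.DataFrame(
    {"County": ["Essex", "Essex", "Essex", "Gloucester", "Gloucester"],
     "Jobs": [4376, 28, 1939, 226, 156]},
    index=pd.Index(
        ["West Caldwell Township", "West Caldwell Township",
         "West Caldwell Township", "Monroe Township", "West Deptford Township"],
        name="Municipality",
    ),
)

# Sum jobs over the repeated rows, keyed by (County, Municipality).
jobs_by_muni = (
    jobs_toy.reset_index()
    .groupby(["County", "Municipality"], as_index=False)["Jobs"]
    .sum()
)

print(jobs_by_muni)
```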


Municipality,County,Median_RE_Taxes
Bridgewater Township,Somerset,10000.0
Bernards Township,Somerset,10000.0
Franklin Township,Somerset,9604.0
Woodbridge Township,Middlesex,9041.0
Woodbridge Township,Middlesex,7496.0
...,...,...
Franklin Township,Somerset,10000.0
Franklin Township,Somerset,6600.0
Franklin Township,Somerset,10000.0
Bernards Township,Somerset,10000.0
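
Two things stand out in the `taxes` output: municipalities repeat (the dropped `Tract_Name` column suggests one row per tract), and the value 10000.0 recurs often enough that it may be a ceiling in the source data. Both readings are assumptions. The sketch below collapses the visible rows to one row per municipality and reports how many rows sit at the suspected ceiling, so any comparison with permit counts can take that into account. `CAP` and the output column names are hypothetical.

```python
import pandas as pd

# Toy rows copied from the visible part of the taxes output above.
taxes_toy = pd.DataFrame({
    "Municipality": ["Bridgewater Township", "Bernards Township",
                     "Franklin Township", "Woodbridge Township",
                     "Woodbridge Township", "Franklin Township",
                     "Franklin Township", "Franklin Township",
                     "Bernards Township"],
    "County": ["Somerset", "Somerset", "Somerset", "Middlesex", "Middlesex",
               "Somerset", "Somerset", "Somerset", "Somerset"],
    "Median_RE_Taxes": [10000.0, 10000.0, 9604.0, 9041.0, 7496.0,
                        10000.0, 6600.0, 10000.0, 10000.0],
})

CAP = 10000.0  # assumption: suspected ceiling value in the source data
taxes_toy["at_cap"] = taxes_toy["Median_RE_Taxes"] >= CAP

# One row per (County, Municipality): row count, median of the row-level
# medians, and the share of rows at the suspected ceiling.
tax_summary = taxes_toy.groupby(["County", "Municipality"], as_index=False).agg(
    n_rows=("Median_RE_Taxes", "count"),
    median_tax=("Median_RE_Taxes", "median"),
    share_at_cap=("at_cap", "mean"),
)

print(tax_summary)
```

A median of medians is only a rough summary. If the values really are capped, municipalities with a high `share_at_cap` are understated and should be interpreted with care.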


Municipality,County,DCA
Absecon,Atlantic,101
Atlantic City,Atlantic,102
Brigantine,Atlantic,103
Buena Borough,Atlantic,104
Buena Vista Township,Atlantic,105
...,...,...
Phillipsburg,Warren,2119
Pohatcong Township,Warren,2120
Washington Borough,Warren,2121
Washington Township,Warren,2122
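
The outputs above show why the DCA code is the safer merge key: `permits` spells names as "Absecon City" and "Washington Twp", while `municodes` uses "Absecon" and "Washington Township". The sketch below joins toy versions of the two tables on `DCA` and uses the `indicator` and `validate` options of `DataFrame.merge` to confirm that every permit found a match. In the notebook, both frames would need `reset_index()` first, since `Municipality` is their index. The suffix names are arbitrary choices.

```python
import pandas as pd

# Toy rows based on the permits and municodes outputs above.
permits_toy = pd.DataFrame({
    "Municipality": ["Absecon City", "Atlantic City",
                     "Washington Boro", "Washington Twp"],
    "DCA": [101, 102, 2121, 2122],
    "Permits": ["NEW", "ALT", "ALT", "ALT"],
})
municodes_toy = pd.DataFrame({
    "Municipality": ["Absecon", "Atlantic City", "Brigantine",
                     "Washington Borough", "Washington Township"],
    "County": ["Atlantic", "Atlantic", "Atlantic", "Warren", "Warren"],
    "DCA": [101, 102, 103, 2121, 2122],
})

merged = permits_toy.merge(
    municodes_toy,
    on="DCA",
    how="left",                      # keep every permit row
    suffixes=("_permits", "_municodes"),
    indicator=True,                  # adds a _merge column: both / left_only
    validate="many_to_one",          # fail if a DCA code repeats in municodes
)

# Permit rows whose code found no match in municodes.
unmatched = merged[merged["_merge"] != "both"]
print(merged[["Municipality_permits", "Municipality_municodes", "County", "_merge"]])
print(f"unmatched permit rows: {len(unmatched)}")
```

Merging on names instead would silently drop or mismatch rows like "Washington Twp", so checking the `_merge` column after a code-based join is a cheap safeguard.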
