# Data Extraction and Combination

This notebook extracts, simplifies, combines, and analyzes data from several sources.

## Data Sources

- Land Area figures for New Hampshire: NH GRANIT system; compiled at [NH Office of Strategic Initiatives](https://www.nh.gov/osi/planning/services/gis/maps.htm), May 2021


In [1]:
# Package imports
import os
import numpy as np
import pandas as pd
import geopandas as gpd
import geemap

In [15]:
# Import land area data
land_area = pd.read_csv('./data/NH-land-area-figures-consolidated.csv',
                        header=3)

In [16]:
land_area.head()

Unnamed: 0,County,Municipality,Total acres,Total sq. miles,Land acres,Land sq. miles,Water acres,Water sq. miles
0,Belknap,Alton,53230.5,83.17,40636.1,63.49,12594.4,19.68
1,Belknap,Barnstead,28758.7,44.94,27215.21,42.52,1543.49,2.41
2,Belknap,Belmont,20427.6,31.92,19190.38,29.98,1237.22,1.93
3,Belknap,Center Harbor,10394.4,16.24,8498.05,13.28,1896.35,2.96
4,Belknap,Gilford,34243.7,53.51,24786.22,38.73,9457.48,14.78


In [17]:
land_area.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 259 entries, 0 to 258
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   County           259 non-null    object 
 1   Municipality     259 non-null    object 
 2   Total acres      259 non-null    float64
 3   Total sq. miles  259 non-null    float64
 4   Land acres       259 non-null    float64
 5   Land sq. miles   259 non-null    float64
 6   Water acres      259 non-null    float64
 7   Water sq. miles  259 non-null    float64
dtypes: float64(6), object(2)
memory usage: 16.3+ KB
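
The area columns are redundant with one another: land plus water should equal the total, and acres divided by 640 should equal square miles. The following is a minimal sketch of a consistency check, not part of the original notebook. It re-enters by hand the five rows shown by `land_area.head()`, and the tolerance values are assumptions chosen to absorb rounding in the published figures.

```python
import pandas as pd

# The five rows displayed by land_area.head(), re-entered by hand for illustration.
sample = pd.DataFrame({
    "Municipality": ["Alton", "Barnstead", "Belmont", "Center Harbor", "Gilford"],
    "Total acres": [53230.5, 28758.7, 20427.6, 10394.4, 34243.7],
    "Total sq. miles": [83.17, 44.94, 31.92, 16.24, 53.51],
    "Land acres": [40636.1, 27215.21, 19190.38, 8498.05, 24786.22],
    "Water acres": [12594.4, 1543.49, 1237.22, 1896.35, 9457.48],
})

ACRES_PER_SQ_MILE = 640      # exact by definition
TOLERANCE_ACRES = 0.05       # assumed tolerance for rounding in the source table
TOLERANCE_SQ_MILES = 0.01    # assumed tolerance; sq. miles are published to 2 decimals

# Land + water should reproduce the total acreage.
parts_gap = (sample["Land acres"] + sample["Water acres"]
             - sample["Total acres"]).abs()

# Total acres / 640 should reproduce the published square miles.
sq_mile_gap = (sample["Total acres"] / ACRES_PER_SQ_MILE
               - sample["Total sq. miles"]).abs()

# True for each row that passes both checks.
consistent = (parts_gap < TOLERANCE_ACRES) & (sq_mile_gap < TOLERANCE_SQ_MILES)
print(sample.loc[~consistent, "Municipality"].tolist())  # rows needing a closer look
```

Applied to the full `land_area` frame, the same two comparisons would flag any municipality whose figures disagree by more than rounding.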


In [19]:
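# Sum the area columns by county; numeric_only=True skips the text Municipality column
total_areas = land_area.groupby('County').sum(numeric_only=True)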
total_areas

County,Total acres,Total sq. miles,Land acres,Land sq. miles,Water acres,Water sq. miles
Belknap,300786.19,469.98,256443.63,400.68,44342.57,69.28
Carroll,635817.97,993.47,595805.22,930.95,40012.76,62.52
Cheshire,466514.34,728.94,450805.97,704.39,15708.39,24.54
Coos,1171969.27,1831.21,1148425.31,1794.41,23543.98,36.77
Grafton,1119743.95,1749.59,1093489.36,1708.56,26254.63,41.02
Hillsborough,571152.78,892.46,555882.47,868.58,15270.34,23.85
Merrimack,611148.3,954.92,593770.15,927.73,17378.17,27.12
Rockingham,465182.48,726.86,445326.83,695.83,19855.68,31.01
Strafford,244860.75,382.6,233827.9,365.36,11032.86,17.23
Sullivan,353361.9,552.12,342788.07,535.6,10573.83,16.55


In [20]:
# Import political boundaries (pb) data
pba_path = '../../NH_PoliticalBoundaries/GRANIT_20220220160347/pba.shp'  # Lines
pbp_path = '../../NH_PoliticalBoundaries/GRANIT_20220220160347/pbp.shp'  # Polygons

In [21]:
# The file format is inferred from the .shp extension, so no driver argument is needed
pb_lines = gpd.read_file(pba_path)

In [22]:
pb_lines.head()

Unnamed: 0,FNODE_,TNODE_,LPOLY_,RPOLY_,LENGTH,PB_,PB_ID,TYPE,geometry
0,3,1,1,2,74843.38,1,1,1,"LINESTRING (1063158.250 1002584.875, 1063233.0..."
1,1,2,1,2,83566.1,2,2,1,"LINESTRING (1102437.750 1003597.625, 1102581.7..."
2,4,3,1,2,33244.42,3,3,1,"LINESTRING (1043039.000 996290.125, 1043019.12..."
3,5,4,1,2,46482.84,4,4,1,"LINESTRING (1045191.375 966039.625, 1045218.87..."
4,6,5,1,2,57585.89,5,5,1,"LINESTRING (1027435.875 932679.000, 1027442.00..."


In [23]:
# The file format is inferred from the .shp extension, so no driver argument is needed
pb_poly = gpd.read_file(pbp_path)

In [24]:
pb_poly.head()

Unnamed: 0,FIPS,NAME,RPA,ACRES,COUNTY,geometry
0,7160,Pittsburg,1,186430.5,7,"POLYGON ((1063158.250 1002584.875, 1063233.000..."
1,7040,Clarksville,1,39915.8,7,"POLYGON ((1059756.000 928359.250, 1059898.250 ..."
2,7005,Atkinson & Gilmanton,1,12351.3,7,"POLYGON ((1111451.250 916720.625, 1111518.250 ..."
3,7190,Stewartstown,1,30019.1,7,"POLYGON ((1019728.688 911317.500, 1019769.188 ..."
4,7175,Second College,1,26773.9,7,"POLYGON ((1117369.750 895118.500, 1117506.500 ..."


In [25]:
pb_poly.crs

<Derived Projected CRS: PROJCS["NAD83 / New Hampshire (ftUS)",GEOGCS["NAD8 ...>
Name: NAD83 / New Hampshire (ftUS)
Axis Info [cartesian]:
- [east]: Easting (US survey foot)
- [north]: Northing (US survey foot)
Area of Use:
- undefined
Coordinate Operation:
- name: unnamed
- method: Transverse Mercator
Datum: North American Datum 1983
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich
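
Because the CRS axes are in US survey feet, any area computed from these geometries (for example via the GeoPandas `area` attribute) comes out in square feet, not acres. The sketch below is not from the original notebook. It uses a hypothetical one-mile square and a hand-written shoelace formula to illustrate the conversion needed before comparing against the `ACRES` column. The helper name `ring_area_sq_ft` and the coordinates are assumptions made for illustration.

```python
SQ_FT_PER_ACRE = 43_560  # 1 acre = 43,560 square feet


def ring_area_sq_ft(ring):
    """Planar area of a ring of (easting, northing) pairs via the shoelace formula."""
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical square, one mile (5,280 ft) on a side, placed at coordinates of
# roughly the same magnitude as those in the shapefile.
side = 5280.0
x0, y0 = 1_000_000.0, 900_000.0
square = [(x0, y0), (x0 + side, y0), (x0 + side, y0 + side), (x0, y0 + side)]

area_acres = ring_area_sq_ft(square) / SQ_FT_PER_ACRE
print(area_acres)  # one square mile expressed in acres
```

The difference between the US survey foot and the international foot is only a few parts per million in area, so it should not matter when comparing against figures published to one decimal place.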