# Living Decisions

## Introduction

I'm about to graduate from The Fletcher School, and I want to make sure that IF I return to Boston post-COVID, I find the right place to live. Enter this short, somewhat silly project. The purpose of this analysis is to determine where I should live, based on factors I have identified as particularly important to my own happiness, including:

- **Impervious surface** - The more impervious surfaces there are, the less green space there is. I'd like a lot of green space.
- **Tree canopy** - Trees have previously been directly linked to my happiness.
- **Boston land cover** - The lower the intensity of a developed area, the better - but it does have to be developed.
- **Proximity to farmers markets** - I like to have fresh veggies and fresher baked goods every weekend.
- **Proximity to libraries** - Crucial for recouping the amount of money I expended on books in grad school.
- **Proximity to restaurants** - Having lived in a rural town with little access to restaurant food for the last year, I'd like some good ones close by.
- **Resident age** - Some of my happiness is based on making friends. While that is absolutely not limited to a certain age bracket, it would be nice to have young-ish people to hang out with.

## The Data

If you would like to investigate the data I used for yourself, you'll find it in the `data` folder, which contains the following files:

- `data/raster/NLCD/NLCD_2016_Impervious_Boston.tif` - impervious surface raster for Boston and the surrounding area (GeoTIFF)
- `data/raster/NLCD/NLCD_2016_Land_Cover_Boston.tif` - land cover raster for Boston and the surrounding area (GeoTIFF)
- `data/raster/NLCD/NLCD_Land_Cover_Legend.jpg` - the key for the land cover raster (JPG)
- `data/raster/NLCD/NLCD_2016_Tree_Canopy_Boston.tif` - tree canopy raster for Boston and the surrounding area (GeoTIFF)
- `data/vector/MassGIS/FARMERSMARKETS_PT/FARMERSMARKETS_PT.shp` - farmers markets of Boston (ESRI shapefile)
- `data/vector/MassGIS/LIBRARIES_PT/LIBRARIES_PT.shp` - Boston libraries (ESRI shapefile)
- `data/tabular/ACSST5Y2019.S0101/ACSST5Y2019.S0101_data_with_overlays.csv` - resident age data by zip code for Boston (.csv)
- `data/vector/Census/tl_2010_25_zcta510/tl_2010_25_zcta510.shp` - zip code boundaries shapefile for Boston (ESRI shapefile)

Data from OpenStreetMap will be added during the analysis.

## Analysis Overview

Here is a rough outline of the analysis I will perform:

1. Read in and visualize the raster datasets
2. **Impervious surface area** > reclassify for impervious surface preferences (more impervious = less desirable)
3. **Land cover** > reclassify for land cover preferences
4. **Tree canopy** > reclassify for tree canopy preferences
5. **Farmers markets** > rasterize shapefile > get Euclidean distance raster > reclassify (close = more desirable)
6. **Libraries** > rasterize shapefile > get Euclidean distance raster > reclassify (close = more desirable)
7. **Restaurants** > pull restaurant data from OpenStreetMap > rasterize shapefile > get Euclidean distance raster > reclassify (close = more desirable)
8. **Resident age** > calculate average age by zip code > join to zip code shapefile > rasterize shapefile > reclassify (close to 30 = more desirable)
9. Calculate weighted and unweighted desirability raster from all of the reclassified rasters
10. Mask desirability rasters to the shape of Boston
11. Use zonal stats to determine the average desirability by zip code
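
To make the reclassify, distance, and overlay steps concrete, here is a minimal sketch on tiny made-up arrays. The break points, scores, weights, and the single market location are all assumptions for illustration, not the values used in the analysis, and `np.digitize` is just one way to reclassify. The 30 m cell size matches the NLCD grid.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 5x5 impervious-surface raster (percent impervious, 0-100).
impervious = np.array([
    [0, 10, 20, 30, 40],
    [10, 20, 30, 40, 50],
    [20, 30, 40, 50, 60],
    [30, 40, 50, 60, 70],
    [40, 50, 60, 70, 100],
])


def reclassify(values, breaks, scores):
    """Assign each cell the score of the bin it falls into.

    `breaks` are ascending bin edges; `scores` needs len(breaks) + 1 entries.
    """
    return np.asarray(scores)[np.digitize(values, breaks)]


# More impervious = less desirable (assumed breaks and 4-to-1 scores).
impervious_score = reclassify(impervious, [25, 50, 75], [4, 3, 2, 1])

# Hypothetical rasterized farmers markets: True where a market sits.
markets = np.zeros((5, 5), dtype=bool)
markets[0, 0] = True

# distance_transform_edt measures the distance from each nonzero cell to the
# nearest zero cell, so the mask is inverted to make market cells the zeros.
cell_size = 30.0  # metres per cell
distance_m = ndimage.distance_transform_edt(~markets, sampling=cell_size)

# Close = more desirable (assumed distance breaks in metres).
market_score = reclassify(distance_m, [50, 100, 150], [4, 3, 2, 1])

# Unweighted and weighted desirability from the stacked score layers.
layers = np.stack([impervious_score, market_score])
unweighted = layers.mean(axis=0)
weights = [0.7, 0.3]  # hypothetical weights, one per layer
weighted = np.average(layers, axis=0, weights=weights)
```

The real rasters would come from `rasterio` and `features.rasterize` rather than hand-typed arrays, but the array logic should carry over as long as every layer shares the same grid.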

## Import Dependencies

```python
import rasterio
from rasterio.plot import show
from rasterio import features

import numpy as np
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt

from scipy import ndimage
from rasterstats import zonal_stats
```

We will need to clip the zip codes to make sure we're only looking at the right ones (see geopandas geometry relations: https://geopandas.org/getting_started/introduction.html#Geometry-relations).

The files do not come cleaned for us. pandas will help with that, Uku says.

We're going to need to remove or deal with the descriptive header row and the non-descriptive header row that just has codes. It might be better to rename the columns to something easier.
You don't have to delete the second row by hand; `read_csv` can handle it (pandas.pydata.org/docs/reference/api/pandas.read_csv.html):
- Use `header` to tell it to use the first row as column names. This should be the top row, the non-descriptive one with the codes.
- Use `skiprows` to skip the second row. It's an awful row.
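
A minimal sketch of that `header` and `skiprows` combination, using a tiny made-up table in place of the real ACS file. The column codes are modeled on ACS naming and the values are invented, so check both against the actual CSV. The friendlier names `total_pop` and `total_pop_moe` are my own placeholders.

```python
import io

import pandas as pd

# Made-up stand-in for the ACS export: line 1 holds coded column names,
# line 2 holds long descriptive labels, and the data starts on line 3.
raw = io.StringIO(
    "GEO_ID,NAME,S0101_C01_001E,S0101_C01_001M\n"
    "id,Geographic Area Name,Estimate!!Total,Margin of Error!!Total\n"
    "8600000US02108,ZCTA5 02108,4000,400\n"
    "8600000US02109,ZCTA5 02109,3800,1900\n"
)

# header=0 takes the coded top row as column names;
# skiprows=[1] drops the descriptive second line of the file.
age = pd.read_csv(raw, header=0, skiprows=[1])

# Rename the codes to something easier to read (hypothetical names).
age = age.rename(
    columns={
        "S0101_C01_001E": "total_pop",      # E suffix = estimate
        "S0101_C01_001M": "total_pop_moe",  # M suffix = margin of error
    }
)

# Compare the margin of error to the estimate to spot shaky rows.
age["moe_share"] = age["total_pop_moe"] / age["total_pop"]
```

With the real file, passing the path to `read_csv` in place of `raw` should behave the same way, assuming the two header rows are laid out as described.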

Make sure you're dealing with the estimate, not the margin of error. But look at the margin of error to see if it's too big.

Don't do any cleaning in Excel. If you do, mention in the README that you did pre-processing.

Just bring the file in and deal with it in pandas. It's more reproducible that way.

Coastal features will help us create our own basemap if we want to, but we'd probably have to mask out the water.

Be sure to download something about bike lanes from the mass.gov data layers: https://docs.digital.mass.gov/dataset/massgis-data-layers

Be sure you're using Massachusetts State Plane. The mass.gov data will come in like that, but the census data probably won't, so don't forget to reproject it.
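
A minimal sketch of getting a layer into state plane with geopandas. It assumes the target is Massachusetts State Plane Mainland in metres (EPSG:26986) and that the incoming layer is in geographic NAD83 (EPSG:4269), which is how TIGER/Line shapefiles are distributed. The single point is a made-up stand-in for a real layer.

```python
import geopandas as gpd
from shapely.geometry import Point

# Assumed target CRS: NAD83 / Massachusetts Mainland, units in metres.
STATE_PLANE = "EPSG:26986"

# Hypothetical one-point layer near downtown Boston in lon/lat (EPSG:4269).
points = gpd.GeoDataFrame(
    {"name": ["downtown"]},
    geometry=[Point(-71.0589, 42.3601)],
    crs="EPSG:4269",
)

# to_crs transforms the coordinates; set_crs would only relabel them,
# which is the wrong tool when the data already has a known CRS.
points_sp = points.to_crs(STATE_PLANE)

x = points_sp.geometry.x.iloc[0]
y = points_sp.geometry.y.iloc[0]
```

For the real data, the same `to_crs` call on the zip code GeoDataFrame should line it up with the MassGIS layers, assuming those do arrive in EPSG:26986.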

OpenStreetMap can help bring in restaurants!

The contextily package lets you bring in basemaps.

