--------------------------------------------------------------------------------------

* Team member names: Nathaniel del Rosario
* Team member IDs: A17562063

--------------------------------------------------------------------------------------

# Mini-project 4, DSC 170, Winter 2024

# Suitability Modeling

This project will focus on __suitability analysis__ with raster data. Your tasks will be both conceptual-level and technical. 

There are three parts to this assignment.

1. At the conceptual level, you will define a suitability model of your choice, for an area of your choice (preferably San Diego, because we have already worked with some local data). Consult the lectures on map combination, and also see https://pro.arcgis.com/en/pro-app/latest/help/analysis/spatial-analyst/suitability-modeler/the-general-suitability-modeling-workflow.htm for a brief description of what a suitability model is. For example, you may be looking for the best areas for community gardens. Such areas are often selected from underutilized land in residential land uses, with good soils, good drainage, and accessible terrain (no steep slopes), etc. So you would be looking for areas with a specific type of land use/land cover, with an appropriate range of values of slope, etc. You may build additional criteria based on a range of precipitation values, whether the area is affected by wildfires, or has low levels of soil erosion, etc. Feel free to use the imagery layers we explored or mentioned during raster-focused lectures. Several cells in lecture notebooks contained URLs to imagery layer collections available through AGOL, but feel free to find more. Also, feel free to download additional raster layers from elsewhere (an example in lecture demonstrates how to do this from the USGS image repository, and also how to get Sentinel-2 from ESA), publish them on ArcGIS Enterprise, and use them in your model.

You can use any __two__ of the <i>map combination</i> techniques discussed during lectures. You should identify the ones you use and discuss any uncertainty issues associated with these specific map combination models. 

As the outcome of this part, you will need to: a) describe the suitability model you want to develop; b) identify the raster data layers you will use; and c) describe two of the map combination techniques you will use to derive the two suitability maps, and their pros and cons.

2. The second part will involve implementing your suitability model using arcgis raster functions. Many of these functions are new and experimental! Examples of what works are in the lecture notebooks. Be creative!    

3. The third part will be a brief write-up comparing the two output rasters generated for the suitability models using the two map combination techniques. 

The notebook should include documentation of the steps, as usual.



**Due Date: 03/04/2024 11:59PM (Pacific Time)**

**Total Possible Points: 30 Pts**

## Task 1: Formulate a suitability or risk model (5 points)

Before finding a dataset, we will form our question in the following way:
What are the best areas for _ / best areas that satisfy condition _ ?

### Question:

What are the best locations for new student housing in Berkeley?

### Data
- [Berkeley zoning data](https://data.cityofberkeley.info/browse?q=zoning&sortBy=relevance)
- [Population By Council District](https://geodata.lib.berkeley.edu/catalog/berkeley-s7wq14)
- [GeoData](https://geodata.lib.berkeley.edu/?utf8=%E2%9C%93&search_field=all_fields&q=berkeley)
- [Alameda County Open Data](https://data.acgov.org/search?collection=Dataset&source=AlamedaCounty.CA.US)
- [Fire Hazard Raster](https://ucsdonline.maps.arcgis.com/apps/mapviewer/index.html?layers=cdedfa4417f54ae0a274c78f7c9b8c8e)
- [Census Tracts](https://data.census.gov/all?q=alameda%20&t=Housing:Housing%20Units:Renter%20Costs&g=050XX00US06001$1400000)
- [Berkeley Crime Data](https://ucsdonline.maps.arcgis.com/apps/mapviewer/index.html?webmap=9f67de4e3a0d49c39ae07034f5c32966)
- [Berkeley Bus Routes](https://ucsdonline.maps.arcgis.com/apps/mapviewer/index.html?webmap=761a98de8a1a435aad5f229692c5d229)
- [Berkeley Bike Lanes](https://ucsdonline.maps.arcgis.com/apps/mapviewer/index.html?layers=e1453a81e05f45dd9c44699c5b1a40b7)
- [Berkeley Bike Parking](https://ucsdonline.maps.arcgis.com/apps/mapviewer/index.html?webmap=9fa9fa7139584d359a0542cc4561d427)

### Important Features
- Population density
- Proximity to bike paths / bus routes
- Heat island exposure
- Proximity to areas with frequent crime


## Task 2: Implement the model (20 points)

In [1]:
# Imports, etc.
import pandas as pd
import numpy as np
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point


import arcgis
from arcgis.gis import GIS
from arcgis import geometry
from arcgis.features import GeoAccessor, GeoSeriesAccessor, FeatureLayerCollection, FeatureSet, FeatureCollection, FeatureLayer
from arcgis.features.use_proximity import create_buffers
from arcgis.raster import ImageryLayer
from IPython.display import display
import os

In [2]:
gis = GIS("https://ucsdonline.maps.arcgis.com/home/index.html", "dsc170wi24_8", "NDRCZBH@499!")

In [3]:
berkeley_map = gis.map('Berkeley, CA')
berkeley_map

MapView(layout=Layout(height='400px', width='100%'))

In [4]:
# List imagery layers to be used in your model. 
# This cell should contain layer definitions.

# fire risk raster
fire_risk_raster = gis.content.get('6229ec92b2894b9ca718c9e7163bbd7b')

# AC Transit Route, plus an additional bus route
bus_routes1_fl = gis.content.get('1b6d51b614bd47a99efd60cb64ddcdc5')
bus_routes2_fl = gis.content.get('aba7cd6ad57f4102a8af2264e29a5119')

# crime data
crime_fl = gis.content.get('8f4514aca02f417eb6004f03b832d6a9')

# zoning layer
zoning_fl = gis.content.get('9b6611b77cc047ea813ffe7346b1637a') # this is just here for visualization reference

#item_properties = {
#    "type": "GeoJson",
#    "title": "Berkeley Zoning Districts"
#}
#geojson_item = gis.content.add(item_properties, 'ZoningDistricts.geojson')
#zoning_layer = geojson_item.publish()
#zoning_layer

# Define the output paths
fire_risk_raster_path = "/fire_risk_raster.tif"
bus_routes_raster_path = "/bus_routes_raster.tif"
crime_data_raster_path = "/crime_data_raster.tif"
zoning_raster_path = "/zoning_raster.tif"

In [5]:
# play around until all layers show: there should be bus routes, blue zoning, and yellow-purple heatmap
# Opacity is passed through the add_layer options; setting it on the item has no effect on the map
berkeley_map.add_layer(crime_fl, {"opacity": 0.8})
berkeley_map.add_layer(zoning_fl, {"opacity": 0.2})
berkeley_map.add_layer(bus_routes1_fl, {"opacity": 1})
berkeley_map.add_layer(bus_routes2_fl, {"opacity": 1})
berkeley_map

MapView(layout=Layout(height='400px', width='100%'))

In [None]:
# Convert Feature Layers to Rasters

import rasterio
from rasterio.features import rasterize

bus_routes1_raster_path = "/bus_routes1_raster.tif"
bus_routes2_raster_path = "/bus_routes2_raster.tif"
crime_raster_path = "/crime_raster.tif"

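# Create FeatureLayer objects from their service URLs
bus_routes1_fl = FeatureLayer('https://services1.arcgis.com/IYiCpZoSIq9lAxi8/ArcGIS/rest/services/MajorACTransitRoutes/FeatureServer/0')
bus_routes2_fl = FeatureLayer(bus_routes2_fl.url)
crime_fl = FeatureLayer('https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/ACS_Poverty_by_Age_Boundaries/FeatureServer/2')

# Query the feature layers, requesting WGS84 (EPSG:4326) to match the raster CRS used when writing the GeoTIFFs
bus_features1 = bus_routes1_fl.query(out_sr=4326)
crime_features = crime_fl.query(out_sr=4326)

# .sdf returns a spatially enabled pandas DataFrame, not a GeoDataFrame.
# Convert its SHAPE column to shapely geometries so total_bounds and .geometry are available.
def sdf_to_gdf(sdf, crs="EPSG:4326"):
    geoms = sdf["SHAPE"].apply(lambda g: g.as_shapely)
    return gpd.GeoDataFrame(sdf.drop(columns="SHAPE"), geometry=geoms, crs=crs)

bus_routes1_gdf = sdf_to_gdf(bus_features1.sdf)
crime_gdf = sdf_to_gdf(crime_features.sdf)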

# crime_fl.url # 'https://ucsdonline.maps.arcgis.com/home/item.html?id=8f4514aca02f417eb6004f03b832d6a9&view=service'
# zoning_fl.url : 'https://services1.arcgis.com/eGSDp8lpKe5izqVc/arcgis/rest/services/Berkeley_Zoning_Districts/FeatureServer'

In [None]:
from rasterio.transform import from_bounds

# Use one shared grid (the extent of the bus routes) so both rasters align cell for cell
xmin, ymin, xmax, ymax = bus_routes1_gdf.total_bounds
width = height = 1000
transform = from_bounds(xmin, ymin, xmax, ymax, width, height)

def rasterize_to_tif(gdf, path):
    """Burn every feature in gdf as 1 (background 0) and write a single-band GeoTIFF."""
    # rasterize() returns a NumPy array; it cannot write into an open dataset directly
    burned = rasterize(
        shapes=zip(gdf.geometry, [1] * len(gdf)),  # Convert all features to value 1
        out_shape=(height, width),
        fill=0,
        transform=transform,
        dtype="uint8",
    )
    with rasterio.open(path, "w", driver="GTiff", width=width, height=height, count=1,
                       dtype=rasterio.uint8, crs="EPSG:4326", transform=transform) as dst:
        dst.write(burned, 1)

# Rasterize each GeoDataFrame to create raster layers
rasterize_to_tif(bus_routes1_gdf, "bus_routes1_raster.tif")
rasterize_to_tif(crime_gdf, "crime_raster.tif")

In [None]:
# Derive the area of interest (AOI) and its geometry and extent. 
# The smaller the area the better (so that you don't run into raster size limitations)
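
The AOI extent recorded in this cell is given in Web Mercator (EPSG:3857), whereas the GeoTIFFs in this notebook are written in EPSG:4326. As a rough sketch of how such an extent could be converted by hand, the block below applies the spherical Web Mercator inverse. The helper name `web_mercator_to_lonlat` is hypothetical, and a projection library would normally be used instead.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857

def web_mercator_to_lonlat(x, y):
    """Convert EPSG:3857 metres to (lon, lat) in degrees (hypothetical helper)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat

# Corners of the "Initial Extent" recorded in this cell
lon_min, lat_min = web_mercator_to_lonlat(-13241249.212925, 3952298.35563141)
lon_max, lat_max = web_mercator_to_lonlat(-12992551.4477593, 4104580.15436044)
print(lon_min, lat_min, lon_max, lat_max)
```

Printing the converted corners is a quick way to check that the extent actually covers the intended study area before clipping.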

"""
Crime Data Extent for AOI

Initial Extent:
XMin: -13241249.212925
YMin: 3952298.35563141
XMax: -12992551.4477593
YMax: 4104580.15436044
Spatial Reference: 102100 (3857)
Full Extent:
XMin: -19942592.3656
YMin: 2023651.5978
XMax: 20012846.0377
YMax: 11537091.6297
Spatial Reference: 102100 (3857)
"""

## Typical steps for preparing a raster layer for suitability analysis:
 - retrieve the layer
 - specify area of interest (study area), e.g. by retrieving a named polygon and getting its extent in an explicit CRS
 - clip the layer to the area of interest
 - define categories to be shown in the output (these may be suitability classes) (make sure these classes actually exist)
 - define a colormap for these classes
 - remap the layer clip to these categories in this colormap
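
The clip and remap steps listed above can be sketched on a plain NumPy array standing in for an imagery layer. This is an illustration only: the slope values, the clip window, the class breaks, and the colors are all made up, and on real layers the ArcGIS raster functions would do this work.

```python
import numpy as np

# Hypothetical 4x5 slope raster (degrees); stands in for a retrieved layer
slope = np.array([
    [2, 4,  9, 16, 30],
    [1, 3,  8, 14, 25],
    [0, 5, 11, 18, 28],
    [2, 6, 12, 20, 35],
])

# Clip to the area of interest: rows 1-2, columns 0-3 (assumed window)
clip = slope[1:3, 0:4]

# Remap to suitability classes with assumed breaks:
# < 5 deg -> 3 (good), 5 to < 15 deg -> 2 (fair), >= 15 deg -> 1 (poor)
breaks = [5, 15]
classes = 3 - np.digitize(clip, breaks)

# Check that every class actually occurs before assigning a colormap
present = np.unique(classes)
colormap = {1: "red", 2: "yellow", 3: "green"}  # assumed colors
print(classes)
print({int(c): colormap[int(c)] for c in present})
```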
 

### Name the two map combination techniques you will use to combine the data and describe their pros and cons

Your text here 

In [None]:
# Prepare your input layers for map combination: clip to AOI, remap/normalize, visualize the layers. 
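
One possible way to put layers on a common scale before combining them is min-max normalization, with an optional flip for layers where a high raw value should lower the score. The sketch below uses invented values, and `normalize` is a hypothetical helper rather than part of the ArcGIS API.

```python
import numpy as np

def normalize(layer, invert=False):
    """Scale a raster to [0, 1]; invert=True maps high raw values to low scores."""
    layer = layer.astype(float)
    lo, hi = layer.min(), layer.max()
    if hi == lo:  # constant layer: avoid dividing by zero
        return np.zeros_like(layer)
    scaled = (layer - lo) / (hi - lo)
    return 1.0 - scaled if invert else scaled

# Hypothetical fire risk classes (1 = low, 5 = high)
fire_risk = np.array([[1, 3], [5, 2]])
fire_score = normalize(fire_risk, invert=True)
print(fire_score)
```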

In [None]:
# Generate a composite raster layer for your first map combination technique
# F1 = sum(layers) = fire_risk + crime_data
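
A minimal sketch of the first formula on NumPy arrays, assuming both layers have already been clipped to the same grid and rescaled to comparable ranges. The cell values are invented for illustration.

```python
import numpy as np

# Hypothetical 2x3 layers on the same grid, each already scaled to 0-1
fire_risk = np.array([[0.2, 0.5, 0.9],
                      [0.1, 0.4, 0.7]])
crime_data = np.array([[0.3, 0.3, 0.6],
                       [0.0, 0.8, 0.2]])

# Unweighted additive overlay: stack the layers and sum cell by cell
layers = np.stack([fire_risk, crime_data])
F1 = layers.sum(axis=0)
print(F1)
```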


In [None]:
# Generate a composite raster layer for your second map combination technique
# F2 = fire_risk + crime_data + lambda * bus_routes
# the second function differs as we add another feature, bus_routes which is a 0 or 1 value:
# if there is a bus route nearby, we add 1, else the score stays the same
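
The second formula could be sketched the same way, with a binary bus route mask scaled by a weight. The arrays are invented, and the weight of 1.0 is an assumption chosen to match the "add 1" rule described in the comments; other values would change how strongly bus proximity counts.

```python
import numpy as np

# Hypothetical 2x3 layers on the same grid, each already scaled to 0-1
fire_risk = np.array([[0.2, 0.5, 0.9],
                      [0.1, 0.4, 0.7]])
crime_data = np.array([[0.3, 0.3, 0.6],
                       [0.0, 0.8, 0.2]])
# 1 where a bus route is nearby, 0 elsewhere (hypothetical mask)
bus_routes = np.array([[1, 0, 0],
                       [1, 1, 0]])

lam = 1.0  # assumed weight; `lambda` itself is a reserved word in Python
F2 = fire_risk + crime_data + lam * bus_routes
print(F2)
```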

## Task 3: Compare the results (5 points)
... and describe how different combination techniques resulted in different outputs (or not.) 

Because the two functions are nearly identical, differing only in whether bus route proximity is included, the outputs differ only in areas near bus routes. Those areas score higher under the second function, which adds a bonus for proximity instead of leaving the score unchanged. The second function therefore emphasizes the importance of proximity to a bus route. In the context of planning new student housing, it is useful to have both versions: some students do not take the bus to school, and for housing within a few blocks of campus, bus access matters much less than it does for housing farther away (in this case, farther west toward I-80, and along Telegraph Ave and beyond).
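
One way to check a comparison like this is a per-cell difference of the two output rasters. The sketch below uses small invented arrays in place of the real outputs, so the numbers it prints are illustrative only.

```python
import numpy as np

# Hypothetical outputs of the two combinations on the same grid
F1 = np.array([[0.5, 0.8, 1.5],
               [0.1, 1.2, 0.9]])
F2 = np.array([[1.5, 0.8, 1.5],
               [1.1, 2.2, 0.9]])

diff = F2 - F1                    # per-cell change introduced by the bus term
changed = ~np.isclose(diff, 0.0)  # cells where the two maps disagree
print(changed)
print(f"{changed.mean():.0%} of cells differ; largest change = {diff.max():.2f}")
```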

In [None]:
## Timekeeping
# Please let us know how much time you spent on this project, in hours: 
# (we will only examine distributions and won't look at individual responses)

assignment_timespent = 8 # mostly because I was having a lot of difficulty converting from FL to Raster