In [1]:
# !pip install pyproj

## Using Fiona to manipulate shapefiles and do spatial analysis
Fiona is an excellent tool for spatial manipulation. This session shows you how to use Fiona to read shapefiles and their metadata, use Shapely and Fiona together to do spatial analysis, and write shapefiles. Shapely and Fiona are essentially wrappers for GEOS and OGR, respectively. They provide clean, Pythonic interfaces for the processing while keeping the performance of the underlying libraries.

Fiona is used for reading and writing vector files (here we’re using Shapefiles), while Shapely is used for doing the manipulation and analysis of the geometric objects.

This session covers the following major sections:

1. Read the metadata, attributes, and geometry of features in a shapefile using Fiona
1. Create a shapefile from longitude and latitude coordinates
1. Convert the projection of shapefiles using pyproj
1. Do buffer analysis using Fiona and Shapely
1. Intersect a point feature class with a polygon feature class

**References**:
- Pandas Tutorial, https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python
- Fiona Manual, http://toblerity.org/fiona/manual.html
- Shapely Manual, http://toblerity.org/shapely/manual.html
- AZAVEA Research Blog, Using Shapely and Fiona to Locate High-Risk Traffic Areas, https://www.azavea.com/blog/2016/10/05/philippines-road-safety-using-shapely-fiona-locate-high-risk-traffic-areas/


Let's print the geometry and attributes of the features in the shapefile.
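As a minimal sketch of that step, the cell below writes a one-point shapefile to a temporary folder and reads it back, printing the metadata, geometry, and attributes that Fiona exposes. The file name `demo_points.shp`, the coordinates, and the single `crash_year` field are made up for illustration; with the real data you would open the crash shapefile instead.

```python
import os
import tempfile

import fiona
from fiona.crs import from_epsg

# Hypothetical demo data: one crash point with a single attribute.
schema = {'geometry': 'Point', 'properties': {'crash_year': 'int'}}
record = {
    'geometry': {'type': 'Point', 'coordinates': (-75.1635, 39.9526)},
    'properties': {'crash_year': 2011},
}

# Write the demo shapefile into a temporary folder.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_points.shp')
with fiona.open(demo_path, 'w', driver='ESRI Shapefile',
                crs=from_epsg(4326), schema=schema) as dst:
    dst.write(record)

# Read the file back and collect what Fiona exposes for each feature.
features_read = []
with fiona.open(demo_path) as src:
    print('driver:', src.driver)
    print('schema:', src.schema)
    print('feature count:', len(src))
    for feat in src:
        geom = feat['geometry']
        props = dict(feat['properties'])
        print(geom['type'], geom['coordinates'], props)
        features_read.append((geom['type'], tuple(geom['coordinates']), props))
```

Each feature behaves like a dictionary with a `geometry` part (type and coordinates) and a `properties` part (the attribute table row).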

### Reproject the car crash shapefile to the same projection as the census tracts

In [7]:
import fiona
from fiona.crs import from_epsg
from pyproj import Transformer
from shapely.geometry import mapping, shape

traffic_accident = 'data/crash_data_collision_crash_2007_2017.shp'
# the name of the output reprojected shapefile
trafficAccident_reproj = 'data/crash_data_collision_crash_2007_2017_reproj.shp'

# WGS84 (EPSG:4326) -> NAD83 / Pennsylvania South in feet (EPSG:2272).
# always_xy=True makes the transformer take (longitude, latitude) and
# return (x, y), matching the coordinate order stored in the shapefile.
transformer = Transformer.from_crs(4326, 2272, always_xy=True)

# write the reprojected point features to a new shapefile
with fiona.open(traffic_accident) as source:
    crs = from_epsg(2272)
    schema = source.schema

    with fiona.open(trafficAccident_reproj, 'w', driver=source.driver,
                    crs=crs, schema=schema) as dest:
        for feat in source:
            feat_geom = feat['geometry']
            data = feat['properties']

            # each feature is a single point: (longitude, latitude)
            lon, lat = feat_geom['coordinates'][:2]
            reprojCoords = transformer.transform(lon, lat)

            reproj_geom = {
                'type': feat_geom['type'],      # Preserve original geometry type
                'coordinates': reprojCoords     # Use new reprojected coordinates
            }

            dest.write({'geometry': mapping(shape(reproj_geom)), 'properties': data})
            

You may find this slower than GeoPandas. That is because we use a `for` loop here and transform the points one by one. There are several tricks to increase the efficiency, such as passing whole NumPy arrays of coordinates to the transformer. We are not going to optimize the code in this class. If you are interested, you can check the implementation in GeoPandas: https://github.com/geopandas/geopandas
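As a rough sketch of the array-based idea, pyproj's `Transformer.transform` accepts sequences of coordinates, so many points can be reprojected in one call instead of one call per feature. The three sample points below are made-up locations around Philadelphia, not records from the crash data, and `always_xy=True` is used so that the input order is (longitude, latitude).

```python
from pyproj import Transformer

# always_xy=True fixes the axis order to (longitude, latitude) on input
# and (easting, northing) on output, regardless of how each CRS defines it.
transformer = Transformer.from_crs(4326, 2272, always_xy=True)

# Hypothetical sample points around Philadelphia (not from the crash data).
lons = [-75.1635, -75.1500, -75.2000]
lats = [39.9526, 39.9600, 39.9400]

# One call transforms every coordinate pair.
xs, ys = transformer.transform(lons, lats)
reprojected = list(zip(xs, ys))

for x, y in reprojected:
    print(round(x, 1), round(y, 1))
```

NumPy arrays can be passed the same way as the lists used here. Writing the features to the output shapefile would still need a loop, so the gain is limited to the coordinate transformation itself.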

## R-tree for overlay of two shapefiles

The core idea behind the `R-tree` is to form a tree-like data structure where nearby objects are grouped together, and their geographical extent (minimum bounding box) is inserted into the data structure (i.e. the R-tree). This bounding box then represents the whole group of geometries as one level (typically called a “page” or “node”) in the data structure.
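To make the idea concrete, here is a simplified, single-level sketch in plain Python. It is not how the `rtree` library is implemented: a real R-tree nests pages inside pages and uses better grouping heuristics. The function names (`bbox_of`, `build_pages`, `query`) and the sample points are made up for illustration.

```python
def bbox_of(points):
    """Minimum bounding box (minx, miny, maxx, maxy) of a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_pages(points, page_size=2):
    """Group nearby points into pages and store each page's bounding box.

    Sorting by x then y is a crude stand-in for the grouping heuristics
    a real R-tree uses.
    """
    ordered = sorted(points)
    pages = []
    for start in range(0, len(ordered), page_size):
        members = ordered[start:start + page_size]
        pages.append({'bbox': bbox_of(members), 'points': members})
    return pages


def query(pages, window):
    """Return the points inside the query window, skipping pages whose box misses it."""
    hits = []
    for page in pages:
        if not boxes_intersect(page['bbox'], window):
            continue  # every point in this page is ruled out at once
        for x, y in page['points']:
            if window[0] <= x <= window[2] and window[1] <= y <= window[3]:
                hits.append((x, y))
    return hits


points = [(1, 1), (2, 2), (10, 10), (11, 12), (20, 1), (21, 3)]
pages = build_pages(points)
found = query(pages, (9, 9, 12, 13))
print(found)  # [(10, 10), (11, 12)]
```

Only one of the three pages overlaps the query window, so four of the six points are never examined individually. That saving is what makes the index worthwhile on tens of thousands of crash points.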


#### Build and fill the R-tree
The first step is to build the R-tree on the point features. If you are unsure which shapefile to use as the base for the R-tree, a useful tip is to index the shapefile that has more features.

In [8]:
import rtree
import fiona
import os, os.path
from statistics import median
from shapely.geometry import shape
from shapely.ops import transform
from functools import partial
import pyproj
import time


neighborhood_shp = 'data/philadelphia-census-tract.shp'
trafficAccident_reproj = 'data/crash_data_collision_crash_2007_2017_reproj.shp'
outPolygonShp = 'data/census-traffic-accident.shp'


t0 = time.time()
pnt_lyr = fiona.open(trafficAccident_reproj, 'r')     
# create an empty spatial index object
index = rtree.index.Index()


# populate the spatial index with the point (crash) features
for i, (fid, feature) in enumerate(pnt_lyr.items(), start=1):
    if i % 10000 == 0:
        print(i)
    geometry = shape(feature['geometry'])

    # buffer each point by 10 units of the layer CRS (feet for EPSG:2272)
    # so that its bounding box has a non-zero extent
    geometry_buffered = geometry.buffer(10)

    index.insert(fid, geometry_buffered.bounds)
    

10000
20000
30000
40000
50000
60000
70000


#### Run the overlay using the R-tree
Using the R-tree built above, loop over all polygon features and count the points that fall inside each one.
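The loop in the next cell relies on a two-stage test: the index returns candidates whose bounding boxes overlap the polygon's bounding box, and Shapely then checks the exact geometry. The sketch below shows why the second stage is needed, using a made-up triangle in place of a census tract and three made-up points in place of crashes.

```python
from shapely.geometry import Point, Polygon, box

# Hypothetical triangle standing in for a census tract.
tract = Polygon([(0, 0), (10, 0), (0, 10)])
tract_bbox = box(*tract.bounds)  # the rectangle (0, 0, 10, 10)

# Hypothetical crash locations.
crashes = [Point(2, 2), Point(8, 8), Point(20, 20)]

# Stage 1: cheap bounding-box filter (the question a spatial index answers).
candidates = [p for p in crashes if tract_bbox.intersects(p)]

# Stage 2: exact geometric test, run on the candidates only.
inside = [p for p in candidates if tract.intersects(p)]

print(len(candidates), len(inside))  # 2 1
```

The point (8, 8) lies inside the bounding box but outside the triangle, so it passes the first stage and is rejected by the second. Counting candidates alone would overstate the number of accidents per tract.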

In [16]:
# loop over all polygons and count the accidents in each one
with fiona.open(neighborhood_shp, 'r') as polygon_lyr:
    schema = polygon_lyr.schema.copy()
    schema['properties']['AcciNum'] = 'int'
    input_crs = polygon_lyr.crs

    # write each polygon, with its accident count, to the new shapefile
    with fiona.open(outPolygonShp, 'w', driver='ESRI Shapefile',
                    schema=schema, crs=input_crs) as output:

        # loop over the polygon features
        for idx, featPoly in enumerate(polygon_lyr):
            if idx % 10 == 0:
                print('Polygon:', idx)

            geomPoly = shape(featPoly['geometry'])
            # copy the attributes into a plain dict so a new field can be added
            attriPoly = dict(featPoly['properties'])

            # use the bounding box to find candidate points; they are close
            # to the polygon but may not actually intersect it
            fids = [int(i) for i in index.intersection(geomPoly.bounds)]

            # count the number of accidents
            count = 0

            # test each candidate point against the polygon's exact geometry
            for fid in fids:
                featPnt = pnt_lyr[fid]
                geomPnt = shape(featPnt['geometry'])

                # count the point only if it really intersects the polygon
                if geomPoly.intersects(geomPnt):
                    count = count + 1

            attriPoly['AcciNum'] = count
            output.write({'geometry': mapping(geomPoly), 'properties': attriPoly})
            

Polygon: 0
Polygon: 10
Polygon: 20
Polygon: 30
Polygon: 40
Polygon: 50
Polygon: 60
Polygon: 70
Polygon: 80
Polygon: 90
Polygon: 100
Polygon: 110
Polygon: 120
Polygon: 130
Polygon: 140
Polygon: 150
Polygon: 160
Polygon: 170
Polygon: 180
Polygon: 190
Polygon: 200
Polygon: 210
Polygon: 220
Polygon: 230
Polygon: 240
Polygon: 250
Polygon: 260
Polygon: 270
Polygon: 280
Polygon: 290
Polygon: 300
Polygon: 310
Polygon: 320
Polygon: 330
Polygon: 340
Polygon: 350
Polygon: 360
Polygon: 370
Polygon: 380


The count is: 10
Polygon: 282
The count is: 18
Polygon: 283
The count is: 23
Polygon: 284
The count is: 133
Polygon: 285
The count is: 6
Polygon: 286
The count is: 12
Polygon: 287
The count is: 12
Polygon: 288
The count is: 6
Polygon: 289
The count is: 31
Polygon: 290
The count is: 7
Polygon: 291
The count is: 15
Polygon: 292
The count is: 9
Polygon: 293
The count is: 15
Polygon: 294
The count is: 37
Polygon: 295
The count is: 62
Polygon: 296
The count is: 17
Polygon: 297
The count is: 13
Polygon: 298
The count is: 26
Polygon: 299
The count is: 10
Polygon: 300
The count is: 15
Polygon: 301
The count is: 5
Polygon: 302
The count is: 6
Polygon: 303
The count is: 21
Polygon: 304
The count is: 18
Polygon: 305
The count is: 23
Polygon: 306
The count is: 17
Polygon: 307
The count is: 21
Polygon: 308
The count is: 13
Polygon: 309
The count is: 13
Polygon: 310
The count is: 6
Polygon: 311
The count is: 323
Polygon: 312
The count is: 10
Polygon: 313
The count is: 15
Polygon: 314
The count is: 8

In [17]:
featPnt['properties']['crash_year']

2011