# OSM Raster PolyPoints

This notebook contains analysis and visualizations for:

1. Loading polygons from masked raster 
2. Loading in points from OSM
3. Associating points with polygons

Updated: 2018-11-21

2018-11-30: Ran on WP1000c600 for 2000 & 2015 using 872 OSM cities with IDs and exported .shp polygons as
20181130_africa1k_20**XX**_mask_1000c600_polypoints.shp. 739 footprints for 2015 & 705 for 2000.

2018-12-04: Ran on WP1000c600 **with new function from Ryan** for 2000 & 2015 using 872 OSM cities with IDs and exported .shp polygons as
20181204_africa1k_20**XX**_mask_1000c600_polypoints.shp. 721 footprints for 2015 & 691 for 2000 (14 min).

### NOTE on 2018-11-30 DROP polygon doubles at the VERY END

### NOTE on 2018-12-04 Add array to function to capture OSM points that are not found within polygons

### NOTE on 2018-12-04 In loop, drop polygons as they are matched to lower the number of comparisons

### NEED TO FIND A WAY TO ASSOCIATE RASTER PIXELS AND POINTS WITH COUNTRIES before we run giant for loop - show Kelly problems in QGIS

1. Can likely clip points by polygon geometry and chunk
https://www.earthdatascience.org/courses/earth-analytics-python/spatial-data-vector-shapefiles/clip-vector-data-in-python-geopandas-shapely/

2. Can likely clip polygons by countries

https://gis.stackexchange.com/questions/168266/pyqgis-a-geometry-intersectsb-geometry-wouldnt-find-any-intersections

#### updated 2018-11-21 Loop isn't that big and a good Africa basemap has a ton of polygons ... better to chunk later
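
If chunking by country is picked up later, it could look roughly like the sketch below. This was not run for the exports above. The function name, the `name` column, and the toy geometries are hypothetical, and it assumes one row per country (for example after dissolving island polygons into their parent country).

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

def chunk_points_by_country(points_gdf, countries_gdf, name_col='name'):
    """Hypothetical helper: split points into one GeoDataFrame per country."""
    chunks = {}
    for _, country in countries_gdf.iterrows():
        # Boolean mask of points that fall inside this country's geometry
        inside = points_gdf[points_gdf.geometry.within(country['geometry'])]
        if len(inside) > 0:
            chunks[country[name_col]] = inside
    return chunks

# Toy data (not real countries): two squares and three points
points = gpd.GeoDataFrame(
    {'id': [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(5.5, 5.5), Point(20, 20)])
countries = gpd.GeoDataFrame(
    {'name': ['A', 'B']},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
              Polygon([(5, 5), (6, 5), (6, 6), (5, 6)])])

chunks = chunk_points_by_country(points, countries)
```

Points that fall in no country (the third toy point here) are simply left out of every chunk, so they would need to be collected separately.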

In [None]:
# Load africa countries -- 762 polygons because of islands 
# Africa_poly = gpd.read_file(outfilepath+"Africa_polys_test.shp")
# len(Africa_poly)

# Dependencies

In [7]:
import ast

import geopandas as gpd
import pandas as pd
import rasterio
from shapely.geometry import Point, Polygon, mapping, shape

# Load in OSM and Polygons from csv

In [10]:
# will build out folders later

# data folder git will ignore
#infilepath = "/home/cascade/tana-crunch-cascade/projects/NTL/data/" # git will ignore
#outfilepath = "/home/cascade/tana-crunch-cascade/projects/NTL/temp_data/" # git will not ignore - NO BIG FILES 

# Local computer 
infilepath = '/Users/cascade/Github/NTL/data/raw/worldpop/Africa-1km-Population/'
outfilepath = '/Users/cascade/Github/NTL/temp_data/'

In [5]:
def load_points(file):
    """Load a csv of points with 'lon' and 'lat' columns and turn it into a geopandas dataframe of shapely points"""
    df = pd.read_csv(file)

    # creating a geometry column 
    geometry = [Point(xy) for xy in zip(df['lon'], df['lat'])]

    # Coordinate reference system : WGS84
    crs = {'init': 'epsg:4326'}

    # Creating a Geographic data frame 
    point_gdf = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)
    
    return point_gdf

In [11]:
# Load OSM Points
osm_point_gdf = load_points(outfilepath+'20181129_osm_africa_cities.csv')

In [12]:
len(osm_point_gdf)

872

# Function for buffering points

In [29]:
def point_buffer(gpd_df, radius):
    """Make a shapely polygon buffer around each point; radius is in the units of the CRS (degrees here)"""
   
    new_gpd_df = gpd.GeoDataFrame()
    arr = []
    
    for point in gpd_df['geometry']:
        buffer = point.buffer(radius)
        arr.append((buffer))
    
    new_gpd_df['id'] = gpd_df['id']
    new_gpd_df['geometry'] = arr
    
    return new_gpd_df

In [17]:
# AGU 2018-12-04 - radius set to ~250m at the equator 

radius = 250*1/(111*1000)
radius

0.0022522522522522522
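
The 111 km per degree figure applies to latitude. A degree of longitude shrinks with the cosine of the latitude, so a fixed radius in degrees covers less east-west ground away from the equator. The sketch below makes that explicit; the helper name and the example latitude are assumptions for illustration, and the 111 km approximation is the same one used in the cell above.

```python
import math

METERS_PER_DEGREE = 111 * 1000  # same approximation as the cell above

def meters_to_degrees(meters, lat_deg=0.0):
    """Hypothetical helper: return (lat_degrees, lon_degrees) spanning `meters`."""
    lat_degrees = meters / METERS_PER_DEGREE
    # One degree of longitude covers cos(latitude) times the equatorial distance
    lon_degrees = meters / (METERS_PER_DEGREE * math.cos(math.radians(lat_deg)))
    return lat_degrees, lon_degrees

equator = meters_to_degrees(250)
north = meters_to_degrees(250, lat_deg=30.0)  # e.g. a city near 30 degrees N
```

At the equator both values equal the `radius` computed above. At 30 degrees the longitude value is larger, meaning the fixed 0.00225 degree buffer spans less than 250 m east-west there.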

In [42]:
osm_buffer_gdf = point_buffer(osm_point_gdf, radius)


In [44]:
# write out shape file buffer 2018-12-04
#osm_buffer_gdf.to_file(outfilepath+'test_250mBuffer.shp', driver='ESRI Shapefile')

# Function for searching if a point is within a polygon

In [None]:
def poly_point (poly, point):
    """
    This function will check if points are inside polygons if given two gpd dataframes with points and polygons
    Returns point ids, point geometry, polygon index # and polygon geometry
    """
    
    out_arr = [] # return an array <<< ---------------- ASK RYAN IF BETTER TO USE A DICT
    
    for index_point, row_point in point.iterrows():
        for index_poly, row_poly in poly.iterrows():
            if row_point['geometry'].within(row_poly['geometry']):
                point_id = row_point['id']
                point_geom = mapping(row_point['geometry']) # makes a dict w/ keys: type and coordinates
                poly_id = index_poly
                poly_geom = mapping(row_poly['geometry']) # makes a dict w/ keys: type and coordinates
                
                out_arr.append((point_id, 
                                point_geom, 
                                poly_id, 
                                poly_geom))

    return out_arr

# Note 2018-11-30 update arr to gpd_df 

# Function for polygon intersection

In [81]:
def poly_buffer (point_buffer, poly_raster):
    """
    This function checks whether point buffers intersect polygons,
    given two gpd dataframes with point buffers and polygons.
    Returns a geopandas DF with the OSM point id, the polygon FID, and the polygon geometry.
    It goes faster if the smaller dataframe goes first.
    """
    
    # make arrays to fill
    osm_id_arr = [] 
    FID_arr = [] 
    poly_geom_arr = []
    
    for index_point_buffer, row_point_buffer in point_buffer.iterrows():
        for index_poly_raster, row_poly_raster in poly_raster.iterrows():
            if row_point_buffer['geometry'].intersects(row_poly_raster['geometry']):
                osm_id = row_point_buffer['id']
                poly_id = row_poly_raster['FID']
                poly_geom = shape(mapping(row_poly_raster['geometry'])) # make polygon 

                osm_id_arr.append((osm_id))
                FID_arr.append((poly_id))
                poly_geom_arr.append((poly_geom))
    
    # put results into a geopandas df
    new_gpd_df = gpd.GeoDataFrame()
    new_gpd_df['osm_id'] = osm_id_arr
    new_gpd_df['FID'] = FID_arr
    new_gpd_df['geometry'] = poly_geom_arr
    
    return new_gpd_df

# Note 2018-11-30 update arr to gpd_df 

In [89]:
# Load Polygon

WP2000_poly = gpd.read_file(outfilepath+'20181204_africa1k_2000_mask_1000c600_poly.shp')
WP2015_poly = gpd.read_file(outfilepath+'20181204_africa1k_2015_mask_1000c600_poly.shp')

In [90]:
print(len(WP2015_poly))
print(len(WP2000_poly))
print(len(osm_buffer_gdf))

22029
15055
872


In [96]:
import time

checkpoint = time.time()

WP2015_polybuff = poly_buffer(osm_buffer_gdf, WP2015_poly)

print("elapsed time is: {}s".format(time.time()-checkpoint))

elapsed time is: 1293.72336435318s
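
The nested loop in `poly_buffer` tests every buffer against every polygon, which is about 19 million pairs for 872 buffers and 22,029 polygons. A possible alternative is geopandas' `sjoin`, which uses a spatial index and whose default predicate is `intersects`. This is a sketch only, not the code that produced the exported shapefiles. The function name `poly_buffer_sjoin` and the toy geometries are hypothetical, and it assumes the inputs carry `id` and `FID` columns as above.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

def poly_buffer_sjoin(point_buffer_gdf, poly_raster_gdf):
    """Hypothetical sjoin-based variant of poly_buffer."""
    buffers = point_buffer_gdf[['id', 'geometry']]
    polys = poly_raster_gdf[['FID', 'geometry']]
    # Polygons go on the left so the result keeps the polygon geometry;
    # the default join predicate is 'intersects'
    joined = gpd.sjoin(polys, buffers, how='inner')
    out = joined[['id', 'FID', 'geometry']].rename(columns={'id': 'osm_id'})
    return out.reset_index(drop=True)

# Toy data: one buffer overlaps the first square, the other overlaps nothing
buffers = gpd.GeoDataFrame(
    {'id': [1, 2]},
    geometry=[Point(0.5, 0.5).buffer(0.1), Point(10, 10).buffer(0.1)])
polys = gpd.GeoDataFrame(
    {'FID': [100, 200]},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
              Polygon([(5, 5), (6, 5), (6, 6), (5, 6)])])

matches = poly_buffer_sjoin(buffers, polys)
```

If the two inputs have different CRS metadata (the buffers built by `point_buffer` carry none), `sjoin` emits a warning, so setting the CRS on both first would be cleaner.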


In [98]:
len(WP2015_polybuff)

721
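
The notes at the top call for dropping polygon doubles at the very end and for capturing OSM points that matched no polygon. A minimal pandas sketch of both steps follows; the helper name and the toy values are hypothetical, and it assumes the `osm_id` and `FID` columns returned by `poly_buffer`.

```python
import pandas as pd

def summarize_matches(matches, all_point_ids):
    """Hypothetical helper: deduplicate polygons and list unmatched point ids."""
    # Keep the first row for each polygon FID (drops polygon doubles)
    unique_polys = matches.drop_duplicates(subset='FID')
    matched = set(matches['osm_id'])
    # Point ids that never intersected any polygon
    unmatched = [pid for pid in all_point_ids if pid not in matched]
    return unique_polys, unmatched

# Toy data: points 1 and 2 hit the same polygon, point 3 hits none
toy_matches = pd.DataFrame({'osm_id': [1, 2, 4], 'FID': [100, 100, 300]})
unique_polys, unmatched = summarize_matches(toy_matches, [1, 2, 3, 4])
```

Because `drop_duplicates` is given a `subset`, the same call should also work on the GeoDataFrame returned by `poly_buffer`, for example `summarize_matches(WP2015_polybuff, osm_buffer_gdf['id'])`. Note that it keeps only the first OSM id for a polygon shared by several cities.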

In [99]:
#WP2015_polybuff.to_file(outfilepath+'20181204_africa1k_2015_mask_1000c600_polypoints.shp', driver='ESRI Shapefile')

# Function to take dicts and make shapes

In [None]:
# from shapely.geometry import shape

def arr_gpd(gpd_df, incolname, newcolname):
    """Function takes a geopandas dataframe with dicts and returns proper geometry to make shapefiles"""
    arr = []

    for i in gpd_df[incolname]:
        i = shape(i)
        arr.append((i))

    # for poly in polypoints_2000_df.iloc[:,6]:
    #     poly = shape(ast.literal_eval(poly))
    #     test.append = (poly)

    #polypoints_2020_df['poly_geom'] = polypoints_2020_df['poly_geom'].apply(ast.literal_eval())
    gpd_df[newcolname] = arr
    
    return gpd_df

In [None]:
#WP2015_polypoints_df.to_file(outfilepath+'20181130_africa1k_2015_mask_1000c600_polypoints.shp', driver='ESRI Shapefile')


# Old Code

In [None]:
# building a function to check if points are in poly for lists of poly and points
# needs geopandas data frame with point and poly geometry 

# def poly_point (poly, point):
#     """
#     This function will check if points are inside polygons if given two gpds with points and polygons
#     Returns city names or no list 
#     """
    
#     out_arr = [] # return an array <<< ---------------- ASK RYAN IF BETTER TO USE A DICT
    
#     for index_point, row_point in point.iterrows():
#         for index_poly, row_poly in poly.iterrows():
#             if row_point['geometry'].within(row_poly['geometry']):
#                 country = row_point['Country']
#                 city = row_point['City']
#                 point_id = row_point['Id']
#                 point_geom = mapping(row_point['geometry']) # makes a dict w/ keys: type and coordinates
#                 poly_id = index_poly
#                 poly_geom = mapping(row_poly['geometry']) # makes a dict w/ keys: type and coordinates
                
#                 out_arr.append((country, 
#                                 city, 
#                                 point_id, 
#                                 point_geom, 
#                                 poly_id, 
#                                 poly_geom))
# #             else:
# #                 test.append('no')
#     return out_arr