# Brooklyn Turfs

Use basic spatial joins to find the STEW-MAP turfs in Brooklyn.  This is the starting point for building the purpose-built data set.

The process is straightforward:

  1.  Get the boundaries for the 5 boroughs.
  2.  Select the Brooklyn boundary and create/save to the processed/brooklyn directory.
  3.  Get the STEW-MAP turfs.
  4.  Use a spatial overlay (intersection) of the two GeoDataFrames to find the turfs that overlap Brooklyn.
  5.  Save the resulting turfs to processed/brooklyn.
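
The steps above can be sketched end to end on toy data.  This is a minimal sketch under stated assumptions, not the notebook's actual pipeline: the two boxes standing in for boroughs and the three boxes standing in for turfs are made up, and only the column names `boro_code`, `boro_name`, and `PopID` come from the real data.  File I/O (steps 1, 3, and 5) is left out so the block runs on its own.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-ins for the real layers (hypothetical geometries, not STEW-MAP data).
boroughs = gpd.GeoDataFrame(
    {"boro_code": [3, 4], "boro_name": ["Brooklyn", "Queens"]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)],
    crs="EPSG:4326",
)
turfs = gpd.GeoDataFrame(
    {"PopID": [1, 2, 3]},
    geometry=[
        box(0.5, 0.5, 1.0, 1.0),  # entirely inside the toy Brooklyn
        box(1.5, 0.5, 2.5, 1.0),  # straddles the Brooklyn/Queens edge
        box(3.0, 0.5, 3.5, 1.0),  # entirely inside the toy Queens
    ],
    crs="EPSG:4326",
)

# Step 2: select one borough by its code.
brooklyn = boroughs.loc[boroughs["boro_code"] == 3]

# Step 4: keep the parts of each turf that overlap the borough.
turfs_in_brooklyn = brooklyn.overlay(turfs, how="intersection")

print(sorted(turfs_in_brooklyn["PopID"].tolist()))
```

Only the first two toy turfs survive, because the third never touches the toy Brooklyn polygon.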

In [None]:
import geopandas as gpd

borough_gdf = gpd.read_parquet('../data/processed/admin-boundaries/boroughs.parq')

In [None]:
borough_gdf.info()

In [None]:
borough_gdf.explore(column='boro_name', 
                    cmap=["red", "blue", "green", "yellow", "purple"])

In [None]:
brooklyn_gdf = borough_gdf.loc[borough_gdf['boro_code'] == 3]

In [None]:
brooklyn_gdf.explore()

In [None]:
brooklyn_gdf.to_parquet('../data/processed/brooklyn/brooklyn-boundary.parq')

In [None]:
brooklyn_gdf.crs

In [None]:
turfs_gdf = gpd.read_parquet('../data/processed/turfs.parq')

Let's examine the GeoDataFrame.  Remember, we already did this in 01.1-explore-stew-geo.ipynb.

In [None]:
turfs_gdf.shape

In [None]:
turfs_gdf.info()  # (verbose=True, show_counts=True)

I am curious whether this data set has a unique id.  It is supposed to be PopID.

In [None]:
turfs_gdf.PopID.is_unique

We will come back and revisit this!
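
When `is_unique` comes back `False`, the next question is which ids repeat.  The block below is a small sketch of one way to list them with plain pandas; the ids and organization names are made up for illustration and are not taken from the turfs file.

```python
import pandas as pd

# Hypothetical rows; ids 102 and 104 are deliberately repeated.
df = pd.DataFrame(
    {
        "PopID": [101, 102, 102, 103, 104, 104],
        "OrgName": ["A", "B", "C", "D", "E", "F"],
    }
)

print(df["PopID"].is_unique)

# keep=False flags every member of a repeated group, not only the later copies.
dupes = df[df["PopID"].duplicated(keep=False)].sort_values("PopID")
print(dupes)
```

Using `keep=False` shows all the competing rows side by side, which makes it easier to decide which one to keep.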

In [None]:
%%time
turfs_in_brooklyn_gdf = brooklyn_gdf.to_crs('epsg:4326').overlay(turfs_gdf.to_crs('epsg:4326'), how='intersection')

In [None]:
turfs_in_brooklyn_gdf.info()  # (verbose=True, show_counts=True)

In [None]:
turfs_in_brooklyn_gdf.shape

In [None]:
309 / 751

So about **41%** of the turfs (309 of 751) overlap Brooklyn.
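
One thing worth keeping in mind about that count: the notebook talks about a spatial join, but the earlier cell calls `overlay(..., how='intersection')`.  Both find the turfs that touch Brooklyn, but they return different geometries.  The sketch below illustrates the difference on made-up boxes; the CRS `EPSG:2263` is only an assumption chosen so that `.area` is computed in a projected system, and `sjoin` with the `predicate` argument assumes a reasonably recent GeoPandas.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers: one borough polygon and two turfs.
brooklyn = gpd.GeoDataFrame(
    {"boro_code": [3]}, geometry=[box(0, 0, 2, 2)], crs="EPSG:2263"
)
turfs = gpd.GeoDataFrame(
    {"PopID": [1, 2]},
    geometry=[box(0.5, 0.5, 1, 1), box(1, 1, 3, 3)],  # turf 2 spills outside
    crs="EPSG:2263",
)

# overlay: geometries are clipped to the borough boundary.
clipped = brooklyn.overlay(turfs, how="intersection")
clipped_area = clipped.set_index("PopID").area

# sjoin: matching turfs keep their original, unclipped geometry.
joined = turfs.sjoin(brooklyn, predicate="intersects")
joined_area = joined.set_index("PopID").area

print(clipped_area.sort_index())
print(joined_area.sort_index())
```

Both results contain the same two turfs, but turf 2 keeps its full footprint after `sjoin` and only the part inside the borough after `overlay`.  Which one is wanted depends on whether later steps need the whole turf or just its Brooklyn portion.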

Before I save this dataframe I want to do some housekeeping.

In [None]:
dtypes_map = {
    "boro_code": "Int8",
"OBJECTID" : "Int32",
"ResID" : "Int32",
"Conserve" : "Int32",
"Manage" : "Int32",
"Transform_" : "Int32",
"Monitor" : "Int32",
"Advocate" : "Int32",
"Educate" : "Int32",
"Particip" : "Int32",
"NoneSTEWFn" : "Int32",
"OrgType" : "Int32",
"PS_Wtershd" : "Int32",
"PS_Stream" : "Int32",
"PS_Shrline" : "Int32",
"PS_Wetland" : "Int32",
"PSSaltMrsh": "Int32",
"PS_Forest" : "Int32",
"PS_Park" : "Int32",
"PS_CommGrd" : "Int32",
"PS_UrbFarm" : "Int32",
"PS_VacLot" : "Int32",
"PS_Brwnfld" : "Int32",
"PS_Ballfld" : "Int32",
"PS_PlayFld" : "Int32",
"PS_DogPark" : "Int32",
"PS_PubGrdn" : "Int32",
"PS_Grnways" : "Int32",
"PS_PROW" : "Int32",
"PS_PROW_1" : "Int32",
"PS_StrTree" : "Int32",
"PS_Planter" : "Int32",
"PS_ResBldg" : "Int32",
"PS_Schlyrd" : "Int32",
"PS_PubBldg" : "Int32",
"PS_Crtyard" : "Int32",
"PS_GrnRoof" : "Int32",
"PS_GrnBldg" : "Int32",
"PS_WsteSys" : "Int32",
"PS_EnrgySy" : "Int32",
"PS_FdSys" : "Int32",
"PS_StrmWtr" : "Int32",
"PS_Atmsphr" : "Int32",
"PS_Other" : "Int32",
"PS_None" : "Int32",
"PO_Local" : "Int32",
"PO_NGO" : "Int32",
"PO_PubPriv" : "Int32",
"PO_State" : "Int32",
"OF_Animal" : "Int32",
"OF_Arts" : "Int32",
"OF_CommImp" : "Int32",
"OF_Crime" : "Int32",
"OF_EconDev" : "Int32",
"OF_Educ" : "Int32",
"Of_ER" : "Int32",
"OF_Employ" : "Int32",
"OF_EngyEff" : "Int32",
"OF_Environ" : "Int32",
"OF_Faith" : "Int32",
"OF_Food" : "Int32",
"OF_Housing" : "Int32",
"OF_HumServ" : "Int32",
"OF_LeglSrv" : "Int32",
"OF_PwrGen" : "Int32",
"OF_GrantPR" : "Int32",
"OF_PubHlth" : "Int32",
"OF_SprtRec" : "Int32",
"OF_RandD" : "Int32",
"OF_Senior" : "Int32",
"OF_Pollute" : "Int32",
"OF_Transpo" : "Int32",
"OF_Youth" : "Int32",
"OrgFnOther" : "Int32",
"PctStew" : "Int32",
"FTStaff" : "Int32",
"PTStaff" : "Int32",
"Members" : "Int32",
"Volunteers" : "Int32",
"OccVolHrs" : "Int32",
"ComPartic" : "Int32",
"TrustBnNei" : "Int32",
"InflncPP" : "Int32",
"PltsHabQy" : "Int32",
"AirWatQlty" : "Int32",
"LndPrtctn" : "Int32",
"UrbnSustn" : "Int32",
"PlaNYC2007" : "Int32",
"MTNYC" : "Int32",
"DEP2010" : "Int32",
"Vis2020" : "Int32",
"PlaNYC2013" : "Int32",
"VZero2014" : "Int32",
"OneNYC2015" : "Int32",
"Waste2015" : "Int32",
"OthPlans" : "Int32",
"Dr_ExtrWth" : "Int32",
"DR_CC" : "Int32",
"Dr_FinanCr" : "Int32",
"Dr_SocialM" : "Int32",
"Dr_EO" : "Int32",
"Dr_NeighDe" : "Int32",
"Dr_Other" : "Int32",
"Serv_Data" : "Int32",
"Serv_Legal" : "Int32",
"Serv_Build" : "Int32",
"Serv_Equip" : "Int32",
"Serv_Tech" : "Int32",
"Serv_Labor" : "Int32",
"Serv_Grnts" : "Int32",
"Serv_Comp" : "Int32",
"Serv_PR" : "Int32",
"Serv_Data_" : "Int32",
"Serv_Ot" : "Int32",
"Shr_No" : "Int32",
"Shr_Natl" : "Int32",
"Shr_Local" : "Int32",
"Shr_Dir" : "Int32",
"Shr_MailBs" : "Int32",
"Shr_Door" : "Int32",
"Shr_WrdMth" : "Int32",
"Shr_Flyer" : "Int32",
"Shr_Web" : "Int32",
"Shr_Social" : "Int32",
"Shr_List" : "Int32",
"Shr_Blog" : "Int32",
"Shr_NtlCnf" : "Int32",
"Shr_City" : "Int32",
"Shr_Radio" : "Int32",
"Shr_TV" : "Int32",
"Shr_Ot" : "Int32",
"Inter_YN" : "Int32"
}

In [None]:
%%time
for key, value in dtypes_map.items():
    turfs_in_brooklyn_gdf[key] = turfs_in_brooklyn_gdf[key].astype(value)

I converted all those float64 columns to more appropriate nullable integer types.  It cuts the size a bit.
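
To see roughly where the saving comes from, here is a small sketch on a made-up frame.  The column names are borrowed from `dtypes_map`, but the values are invented; the point is only that pandas' nullable `Int32` and `Int8` types hold missing values while using fewer bytes per entry than `float64`.

```python
import numpy as np
import pandas as pd

# Hypothetical values; NaN forces the columns to load as float64.
df = pd.DataFrame(
    {
        "Conserve": [1.0, 0.0, np.nan, 1.0],
        "boro_code": [3.0, 3.0, 3.0, 3.0],
    }
)
before = df.memory_usage(deep=True).sum()

# astype accepts a column -> dtype mapping, so no explicit loop is needed.
small = df.astype({"Conserve": "Int32", "boro_code": "Int8"})
after = small.memory_usage(deep=True).sum()

print(small.dtypes)
print(before, after)
```

The missing entry in `Conserve` survives the conversion as `<NA>` rather than being forced to a number, which is why the capitalized nullable types are used instead of plain `int32`.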

In [None]:
turfs_in_brooklyn_gdf.info()

According to the data dictionary PopID is a unique id.  I don't think that is true!

First we can check the original STEW-MAP turfs.

In [None]:
turfs_gdf.PopID.value_counts()

Next we can look at the Brooklyn turfs obtained from the spatial join.

In [None]:
turfs_in_brooklyn_gdf.PopID.is_unique

In [None]:
turfs_in_brooklyn_gdf.PopID.value_counts()

Two PopIDs, 1190 and 20212, are not unique in the Brooklyn turfs.  At first glance this does not look like an artifact of the spatial join.

I am going to check them out and get rid of one of them.

**Note:**  There are only two of them, so removing them by hand is a purely mechanical step.  It might be better to do this right up front and get rid of the third one as well!

In [None]:
turfs_in_brooklyn_gdf[turfs_in_brooklyn_gdf['PopID'] == 1190]

In [None]:
turfs_in_brooklyn_gdf[turfs_in_brooklyn_gdf['PopID'] == 20212]

After visual inspection, index 298 seems a reasonable row to drop for PopID 1190.

For PopID 20212 I flipped a coin and chose to drop index 129.

In [None]:
turfs_in_brooklyn_gdf = turfs_in_brooklyn_gdf.drop(index=[298, 129])

In [None]:
turfs_in_brooklyn_gdf.PopID.is_unique

Now we have a unique id (as per the data dictionary) and 307 turfs in Brooklyn.
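
Dropping rows by hard-coded index label works for two duplicates but depends on the row order staying the same.  A possible alternative, sketched here on invented data, is `drop_duplicates` on the id column.  Note the assumption: `keep="first"` picks a survivor by row order, which is an arbitrary rule and not the visual-inspection choice made above.

```python
import pandas as pd

# Hypothetical rows; ids 8 and 9 each appear twice.
df = pd.DataFrame(
    {
        "PopID": [7, 8, 8, 9, 9],
        "OrgName": ["a", "b", "c", "d", "e"],
    }
)

# Keep the first row seen for each PopID and discard the rest.
deduped = df.drop_duplicates(subset="PopID", keep="first")

print(deduped)
print(deduped["PopID"].is_unique)
```

If a specific row should win, sorting by a chosen column before the call makes the "first" row a deliberate pick rather than an accident of ordering.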

In [None]:
turfs_in_brooklyn_gdf.shape

In [None]:
turfs_in_brooklyn_gdf.explore('OrgName')

Finally, we can save the brooklyn-turfs for next steps.

In [None]:
turfs_in_brooklyn_gdf.to_parquet('../data/processed/brooklyn/brooklyn-turfs.parq')