## The Spatial DataFrame

### What is a Spatial DataFrame?

A Spatial DataFrame is an extension of the Pandas DataFrame object, which is a labeled data structure with columns of potentially different types. The Spatial DataFrame can be thought of as a feature class or table loaded into memory. It has a geometry column and a series of attribute columns.

*Note:*
- The Spatial DataFrame can be used to work with feature classes and feature layer data.  
- The Spatial DataFrame is designed to work with Python **3**.
- The Spatial DataFrame can import/export data to and from services.
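
As a rough illustration of the structure described above, the sketch below builds a plain Pandas DataFrame with attribute columns of different types and a `SHAPE` column holding geometry dictionaries. It is not a SpatialDataFrame, and the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical attribute data plus a geometry column of point dictionaries.
df = pd.DataFrame({
    "NAME": ["Airport A", "Airport B"],   # text attribute
    "USERS": [120, 45],                   # integer attribute
    "SHAPE": [                            # geometry column
        {"x": 0.0, "y": 1.0, "spatialReference": {"wkid": 4326}},
        {"x": 2.0, "y": 3.0, "spatialReference": {"wkid": 4326}},
    ],
})

# Each column keeps its own dtype, as in any Pandas DataFrame.
print(df.dtypes)
```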

### Licensing and Data Frames

The Spatial DataFrame is designed to be used with an authenticated GIS object **or** with ArcPy from ArcGIS Pro.  If both of those are missing, the Spatial DataFrame will raise an exception.
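
Before creating a Spatial DataFrame, it can help to check which of these dependencies is present. The following is a minimal sketch using only the standard library; `module_available` is a hypothetical helper, and it only tests whether a package can be found, not whether a GIS connection is authenticated.

```python
import importlib.util

def module_available(name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None

# Report on the two packages mentioned above without importing them.
for package in ("arcpy", "arcgis"):
    status = "found" if module_available(package) else "not found"
    print(package, status)
```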



## Creating a SpatialDataFrame

A Spatial DataFrame lets you work with service data or local data using Pandas.

Spatial DataFrames can be created from local data, service data, or other data frames.

#### From Service Layer

In [1]:
from arcgis.features import SpatialDataFrame
from arcgis.features import FeatureLayer
from arcgis.gis import GIS
import getpass

gis = GIS(url="https://agsapipor1.esri.com/portal", 
          username="admin", 
          password=getpass.getpass(),
          verify_cert=False)



In [3]:
url = "https://sampleserver6.arcgisonline.com/arcgis/rest/services/MontgomeryQuarters/MapServer/0"
sdf = SpatialDataFrame.from_layer(layer=FeatureLayer(url), gis=gis)
sdf.head()

| | block | creationti | lastmodifi | objectid | res | shape_star | shape_stle | st_area(shape) | st_length(shape) | SHAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | | 37831 | 1 | 76366.376771 | 1124.239169 | 76366.369931 | 1124.239104 | {'rings': [[[505850.8437930122, 683364.4999688... |
| 1 | | | | 37832 | 1 | 76367.793389 | 1124.185333 | 76367.805749 | 1124.185328 | {'rings': [[[505758.46864967793, 683662.500029... |
| 2 | | | | 37833 | 0 | 81941.731426 | 1177.762622 | 81941.699125 | 1177.762331 | {'rings': [[[505643.6873831786, 683616.1251217... |
| 3 | | | | 37834 | 0 | 81954.882522 | 1177.729068 | 81954.845708 | 1177.728593 | {'rings': [[[505473.93739459664, 683213.812606... |
| 4 | | | | 37835 | 1 | 88451.463286 | 1234.824606 | 88451.431165 | 1234.824367 | {'rings': [[[506117.8750830963, 683809.9373981... |


#### From Feature Class

In [4]:
fc = r"data/airports.sdc/airports"
shp_sdf = SpatialDataFrame.from_featureclass(fc)
shp_sdf.head()

| | ObjectID | NAME | FCC | LOC_ID | USERS | SHAPE |
|---|---|---|---|---|---|---|
| 0 | 0 | Inuvik Airport | D59 | CYEV | | {'spatialReference': {'wkid': 4326, 'latestWki... |
| 1 | 1 | Inuvik Airport | D58 | CYEV | | {'spatialReference': {'wkid': 4326, 'latestWki... |
| 2 | 2 | Norman Wells Airport | D58 | CYVQ | | {'spatialReference': {'wkid': 4326, 'latestWki... |
| 3 | 3 | Norman Wells Airport | D59 | CYVQ | | {'spatialReference': {'wkid': 4326, 'latestWki... |
| 4 | 4 | Tuktoyaktuk Airport | D59 | CYUB | | {'spatialReference': {'wkid': 4326, 'latestWki... |


#### From Another Data Frame

In [6]:
import pandas as pd
import arcpy
from arcgis.features import SpatialDataFrame
from arcgis.geometry import _types
attributes = [['A', 1, 2],
              ['B', 13, -2.99],
              ['C', 1-(4**3), 2**9]]
coords = [[0,1], [1,1], [2,3]]
column_names = ['A', "B", "C"]
df = pd.DataFrame.from_records(data=attributes, columns=column_names)
sdf_manual = SpatialDataFrame(df,
                       geometry=[arcpy.PointGeometry(arcpy.Point(X=r[0], Y=r[1])) for r in coords])
sdf_manual.head()

| | A | B | C | SHAPE |
|---|---|---|---|---|
| 0 | A | 1 | 2.0 | {'x': 0, 'y': 1, 'spatialReference': {'wkid': ... |
| 1 | B | 13 | -2.99 | {'x': 1, 'y': 1, 'spatialReference': {'wkid': ... |
| 2 | C | -63 | 512.0 | {'x': 2, 'y': 3, 'spatialReference': {'wkid': ... |


## Working with Spatial DataFrames

SpatialDataFrame objects are very versatile.  They let a user quickly manipulate data and save it to disk or, with a bit of code, push it back to a service.
<hr/>
Some **common** tasks outlined are:

* creating a spatial index
* accessing geometry operations
* converting to feature classes

#### Creating and Using a Spatial Index

Spatial indexes allow for quick querying of spatial data based on a given extent (x/y lower left and x/y upper right).  By default, the spatial index creates a QuadTree index.  Indexes are generated after data is loaded into the dataframe:



In [7]:
spatial_index = sdf.sindex

In [8]:
sdf.geoextent

(504008.531204678, 683193.93764594896, 509180.59369517898, 687996.81255461299)

To query the spatial index, specify a bounding extent (x/y lower left, then x/y upper right). The query returns the set of row indexes of the features that intersect that extent.

In [9]:
found_rows = spatial_index.intersect([508008.531204678, 683193.93764594896, 
                                      509180.59369517898, 687996.81255461299])
found_rows

{53, 61, 62}
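
To make the extent query concrete, here is a brute-force sketch of what such a query computes. It does not use a QuadTree and is not the library's implementation; the function names and the sample extents are hypothetical.

```python
def extents_intersect(a, b):
    """True if two (xmin, ymin, xmax, ymax) extents overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def brute_force_intersect(feature_extents, query_extent):
    """Return the set of row positions whose extent intersects the query."""
    return {i for i, ext in enumerate(feature_extents)
            if extents_intersect(ext, query_extent)}

# Hypothetical feature extents, not taken from the layer above.
feature_extents = [
    (0, 0, 2, 2),
    (5, 5, 6, 6),
    (1, 1, 3, 3),
]
hits = brute_force_intersect(feature_extents, (1.5, 1.5, 4, 4))
print(hits)
```

A spatial index answers the same question without testing every feature, which is what makes it faster on large datasets.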

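Using the Pandas `loc` indexer, you can extract that subset of rows. The set is converted to a sorted list first because recent Pandas versions do not accept a set as an indexer:

In [10]:
sub_set = sdf.loc[sorted(found_rows)]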
sub_set.head()

| | block | creationti | lastmodifi | objectid | res | shape_star | shape_stle | st_area(shape) | st_length(shape) | SHAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| 53 | | | | 37884 | 0 | 190661.940156 | 1896.868382 | 190661.973507 | 1896.868152 | {'rings': [[[507958.8438420966, 685702.7498855... |
| 61 | | | | 37892 | 0 | 502170.893555 | 3931.285454 | 502170.925421 | 3931.285105 | {'rings': [[[508637.62496484444, 685089.562463... |
| 62 | | | | 37893 | 0 | 502197.29877 | 3929.750429 | 502197.244447 | 3929.75028 | {'rings': [[[507331.56261634454, 684289.000088... |
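
The row selection step can be tried on any Pandas DataFrame. This is a small sketch with made-up data; `loc` selects by index label, which matches row position only while the frame keeps its default integer index.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe with a default integer index.
df = pd.DataFrame({"objectid": [100, 101, 102, 103]})
found = {3, 1}  # row labels, for example the result of an index query

# Convert the set to a sorted list before indexing; Pandas 2.x raises
# a TypeError when a set is passed to loc.
subset = df.loc[sorted(found)]
print(subset)
```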


* **Note:** If arcpy is installed, the data can be narrowed down further using a selection shape.  Let's assume we have a polygon geometry.

### Spatial Selections

Spatial DataFrames have the ability to select by spatial location.

#### Example Selecting by Location

In [11]:
# Requires arcpy
geom = sdf.iloc[61]['SHAPE']
q = sub_set.disjoint(geom) == False
sub_set[q]

| | block | creationti | lastmodifi | objectid | res | shape_star | shape_stle | st_area(shape) | st_length(shape) | SHAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| 61 | | | | 37892 | 0 | 502170.893555 | 3931.285454 | 502170.925421 | 3931.285105 | {'rings': [[[508637.62496484444, 685089.562463... |
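
The selection above relies on a boolean mask: rows where the disjoint test is false are kept. The sketch below shows the same masking idiom in plain Pandas, using bounding boxes instead of real geometries. `bbox_disjoint` and the sample boxes are hypothetical and only approximate what a true geometry test does.

```python
import pandas as pd

def bbox_disjoint(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes share no point."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]

# Hypothetical bounding boxes standing in for a geometry column.
boxes = pd.Series([(0, 0, 1, 1), (2, 2, 3, 3), (0.5, 0.5, 2.5, 2.5)])
target = (0, 0, 1, 1)

# Boolean Series: True where a box is disjoint from the target.
disjoint = boxes.apply(lambda b: bbox_disjoint(b, target))

# Keep only the rows that are NOT disjoint, i.e. that touch or overlap.
selected = boxes[~disjoint]
print(selected)
```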


<hr/>
#### Working with Geometries



The geometry column can be accessed by the following: 

In [12]:
geometry_column = sdf.geometry

* the default geometry column name is 'SHAPE'

To access the geometry operations such as projectAs and clip, the **arcpy** module must be installed, or these operations will all return NULL results.  
<hr/>

The geometry column provides a rich selection of operations.

Common operations include:

- disjoint
- clip by extent
- reproject
- get length and area

These operations exist both on the geometry column and on each geometry object.  When an operation is executed on the geometry column, the result is returned as a Pandas Series object.
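
The column-wise behaviour can be mimicked in plain Pandas: applying a function to every geometry yields a Series of results. The sketch below computes polygon areas with the shoelace formula. It is an illustration only, with hypothetical helper names and sample shapes, and it is not how the Spatial DataFrame computes area.

```python
import pandas as pd

def ring_area(ring):
    """Unsigned area of a closed ring of [x, y] pairs (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_area(geom):
    """Sum of ring areas; holes are not subtracted in this sketch."""
    return sum(ring_area(r) for r in geom["rings"])

# Hypothetical geometries in the same dictionary layout as the SHAPE column.
shapes = pd.Series([
    {"rings": [[[0, 0], [0, 2], [2, 2], [2, 0], [0, 0]]]},  # 2 x 2 square
    {"rings": [[[0, 0], [0, 1], [3, 1], [3, 0], [0, 0]]]},  # 3 x 1 rectangle
])

# Applying the function to the whole column returns a Pandas Series.
areas = shapes.apply(polygon_area)
print(areas)
```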

<div>

<img src='staticimgs/SDF_Geometry.png' align='left'/>
</div>

The operations are further explained on the Spatial DataFrame help page.


##### Reprojecting the Dataframe

To reproject the dataframe, create an arcpy.SpatialReference object and pass it to the geometry column's project_as() method.

In [13]:
import arcpy
sr = arcpy.SpatialReference(4326)
sdf.geometry = sdf.geometry.project_as(sr)

### Saving Your Work

Just like reading data, Spatial DataFrames can save the data you work with to various output formats.  These operations require **arcpy, pyshp or an authenticated GIS object** to save locally or to a service.

Users can export either the whole dataset to disk or DataFrames with queries applied.

#### Example: Saving DataFrame to a Feature Class

In [17]:
import arcpy
scratch_gdb = arcpy.env.scratchGDB
ds = shp_sdf.to_featureclass(out_location=scratch_gdb, out_name="airports")
print(ds)

C:\Users\andr5624\AppData\Local\Temp\scratch.gdb\airports


#### Example: Import Spatial DataFrame to Enterprise GIS

The GIS content object can import Spatial DataFrames directly as a hosted feature service.  

In [22]:
gis = GIS(username="*****", password="*****")

In [23]:
gis.content.import_data(df=shp_sdf, title="WorldAirports")