In [16]:
import warnings
warnings.filterwarnings('ignore')

# Importing, Exporting, Geometry Engines, Oh My!

Now that we've seen the basics of Pandas, NumPy, and the Spatially Enabled DataFrame (SeDF), let's cover one of its most useful applications: reading and writing data. Beyond manipulation and analysis, the Spatially Enabled DataFrame gives you a wide range of options for loading and persisting data. We'll cover the basics of how it works, along with new functionality added in version 2.4.1 of the ArcGIS API for Python that gives users more control over how they read and write.

### Behind the Scenes

First, let's go over how things work behind the scenes. As many of you know, spatial data can be stored in many different formats. The two most common are the File Geodatabase and the Shapefile, which are the ones we'll focus on here. Both formats were designed by Esri, and each went hand in hand with a geometry engine that could process it: most notably ArcPy for File Geodatabases, and shapely/pyshp for Shapefiles. This meant that prior to 2.4.1, users needed ArcPy in their environment to read or write File Geodatabases, and a Shapefile-specific engine to read or write Shapefiles. However, we've made some changes.

We've worked GDAL into the reading and writing functions behind the scenes, which means users can now work with both formats through a single geometry engine. For anyone unfamiliar, GDAL stands for Geospatial Data Abstraction Library; it is an open source library that can read and write a wide variety of raster and vector geospatial data formats. It already comes in the standard environment for ArcGIS Pro on Windows, and can also be installed on Linux and macOS via the Python package managers Conda and Pip. This means File Geodatabases can now easily be read into, and written from, the SeDF on Linux and Mac. Let's jump in!

In [17]:
# select the geometry engine by setting an environment variable with os.environ
# very important: do this before importing arcgis! otherwise you have to reset
# the environment (e.g. restart the kernel) and set it again
import os
os.environ["ARCGIS_GEOMETRY_ENGINE"] = "gdal" # 'gdal', 'shapefile', 'arcpy'
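
The ordering requirement in the comment above is easy to get wrong in a long notebook, so it can help to wrap it in a small guard. The following is a minimal sketch, not part of the ArcGIS API: the helper name `select_geometry_engine` is hypothetical, and the allowed values are simply the three listed in the comment above.

```python
import os
import sys

# The three engine names listed in the comment above.
VALID_ENGINES = ("gdal", "shapefile", "arcpy")


def select_geometry_engine(name):
    """Set ARCGIS_GEOMETRY_ENGINE, refusing to do so once arcgis is imported.

    Hypothetical helper for illustration; it is not part of the arcgis package.
    """
    engine = name.strip().lower()
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown engine {name!r}; expected one of {VALID_ENGINES}")
    if "arcgis" in sys.modules:
        # Per the note above, the variable has to be set before the import.
        raise RuntimeError("arcgis is already imported; restart the kernel first")
    os.environ["ARCGIS_GEOMETRY_ENGINE"] = engine
    return engine


selected = select_geometry_engine("gdal")
```

Run this in the first cell of the notebook; if `arcgis` has already been imported, it raises instead of silently setting a variable that would have no effect.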

In [18]:
from arcgis.features import GeoAccessor, GeoSeriesAccessor
import pandas as pd

In [19]:
# these are private (underscore-prefixed) module attributes that report environment info;
# being private, they may change between releases
from arcgis._impl._geometry_engine import SELECTED_ENGINE, HAS_ARCPY, HAS_GDAL, HAS_PYSHP
print("Has ArcPy: " + str(HAS_ARCPY))
print("Has GDAL: " + str(HAS_GDAL))
print("Has PySHP: " + str(HAS_PYSHP))
print("Selected Engine: " + str(SELECTED_ENGINE))

Has ArcPy: True
Has GDAL: True
Has PySHP: True
Selected Engine: GeometryEngine.GDAL


It's as easy as that! Now that we have our geometry engine set, let's look at all the different sources we can import data from.

#### Shapefile

In [20]:
# importing from a shapefile on disk
sedf = pd.DataFrame.spatial.from_featureclass(r"C:\Users\computer\Downloads\rewritten_to_shape\rewritten_to_shape.shp")
sedf

Unnamed: 0,fav_color,fav_number,time,SHAPE
0,green,7,10:30:23,"{""x"": 1049942.711199999, ""y"": 6331440.64400000..."
1,blue,10,21:34:20,"{""x"": 1044536.801399998, ""y"": 6334525.54599999..."
2,red,4,18:04:55,"{""x"": 1042749.384300001, ""y"": 6332430.48569999..."
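
A shapefile is really a small family of files: alongside the `.shp` geometry file, the format requires an `.shx` index and a `.dbf` attribute table with the same base name. If a read fails, a missing sidecar file is worth ruling out first. The sketch below checks for them using only the standard library; the helper name `missing_shapefile_parts` is hypothetical, and the check is independent of whichever geometry engine is selected.

```python
from pathlib import Path

# Mandatory components of a shapefile: geometry, index, and attribute table.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the names of required files that do not exist next to shp_path."""
    base = Path(shp_path)
    return [
        base.with_suffix(suffix).name
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).exists()
    ]
```

Calling `missing_shapefile_parts` on the path you intend to pass to `from_featureclass` returns an empty list when all three files are present.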


#### File Geodatabase

In [21]:
# importing from a file geodatabase (fgdb) on disk
sedf = pd.DataFrame.spatial.from_featureclass(r"C:\Users\computer\Downloads\gdal_to_gdb.gdb")
sedf

Unnamed: 0,fav_color,fav_number,time,SHAPE
0,green,7,10:30:23,"{""x"": 1049942.711199999, ""y"": 6331440.64400000..."
1,blue,10,21:34:20,"{""x"": 1044536.801399998, ""y"": 6334525.54599999..."
2,red,4,18:04:55,"{""x"": 1042749.384300001, ""y"": 6332430.48569999..."


#### Feature Layer

In [22]:
# importing from feature layer
from arcgis.gis import GIS
from arcgis.features import FeatureLayer
gis = GIS("Home")
flayer = FeatureLayer("https://services7.arcgis.com/JEwYeAy2cc8qOe3o/arcgis/rest/services/kc_house_data_XYTableToPoint/FeatureServer/0")
sedf = pd.DataFrame.spatial.from_layer(flayer)
sedf

Unnamed: 0,OBJECTID,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,...,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15,SHAPE
0,1,7129300520.0,20141013T000000,221900.0,3,1.0,1180,5650,1.0,0,...,1180,0,1955,0,98178,47.5112,-122.257,1340,5650,"{""x"": -122.25699999999995, ""y"": 47.51120000093..."
1,2,6414100192.0,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,...,2170,400,1951,1991,98125,47.721,-122.319,1690,7639,"{""x"": -122.31899999999996, ""y"": 47.72100000093..."
2,3,5631500400.0,20150225T000000,180000.0,2,1.0,770,10000,1.0,0,...,770,0,1933,0,98028,47.7379,-122.233,2720,8062,"{""x"": -122.23299999999996, ""y"": 47.73790000093..."
3,4,2487200875.0,20141209T000000,604000.0,4,3.0,1960,5000,1.0,0,...,1050,910,1965,0,98136,47.5208,-122.393,1360,5000,"{""x"": -122.39299999999997, ""y"": 47.52080000093..."
4,5,1954400510.0,20150218T000000,510000.0,3,2.0,1680,8080,1.0,0,...,1680,0,1987,0,98074,47.6168,-122.045,1800,7503,"{""x"": -122.04499999999996, ""y"": 47.61680000093..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21608,15609,3904100089.0,20150318T000000,300000.0,3,1.75,1350,7370,1.0,0,...,1350,0,1912,0,98118,47.5336,-122.278,1440,6000,"{""x"": -122.27799999999995, ""y"": 47.53360000093..."
21609,15610,1797500780.0,20140521T000000,540000.0,3,2.0,1470,1691,2.0,0,...,1000,470,2007,0,98115,47.6743,-122.316,1660,4000,"{""x"": -122.31599999999999, ""y"": 47.67430000093..."
21610,15611,7614100080.0,20150211T000000,140000.0,3,1.75,1270,8991,2.0,0,...,1270,0,1981,0,98042,47.3563,-122.149,1270,8993,"{""x"": -122.14899999999994, ""y"": 47.35630000093..."
21611,15612,87000006.0,20150413T000000,275000.0,4,1.75,1680,19405,1.0,0,...,1560,120,1959,0,98055,47.4552,-122.202,2000,12900,"{""x"": -122.20199999999996, ""y"": 47.45520000093..."


#### URL

In [23]:
# importing from shapefile at a URL
sedf = pd.DataFrame.spatial.from_featureclass(r"https://www.arcgis.com/sharing/rest/content/items/804abf2794b346828eeff285bffe9259/data")
sedf

Unnamed: 0,ACTIVITYCO,INPUT_DATE,EDIT_DATE,SHAPE
0,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[453602.48000000045, 4098028.34999..."
1,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[336426.4754999997, 4102376.7609],..."
2,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[503809.86510000005, 4113120.54409..."
3,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[496311.0599999996, 4112007.26], [..."
4,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[255773.25, 4117713.369999999], [2..."
...,...,...,...,...
362,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[319267.9144000001, 4262033.4571],..."
363,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[337010.0694000004, 4295655.9245],..."
364,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[337936.74509999994, 4267894.5615]..."
365,EKHC,2024-10-24,2024-10-24,"{""rings"": [[[354751.16980000027, 4260167.13660..."


#### CSV

In [24]:
# read a CSV into a DataFrame, then build point geometries from its "x" and "y" columns with from_xy
df = pd.read_csv(r"C:\Users\computer\Downloads\new.csv")
sedf = pd.DataFrame.spatial.from_xy(df, "x", "y")
sedf

Unnamed: 0,FID,city_id,name,state,capital,pop2000,pop2007,longitude,latitude,ObjectId,x,y,SHAPE
0,1,1,Honolulu,HI,State,37165,37858,-157.823436,21.305782,1,-157.823436,21.305782,"{""spatialReference"": {""wkid"": 4326}, ""x"": -157..."
1,2,2,Juneau,AK,State,30711,31592,-134.511582,58.351418,2,-134.511582,58.351418,"{""spatialReference"": {""wkid"": 4326}, ""x"": -134..."
2,3,3,Boise City,ID,State,18578,20352,-116.237655,43.613736,3,-116.237655,43.613736,"{""spatialReference"": {""wkid"": 4326}, ""x"": -116..."
3,4,4,Olympia,WA,State,27514,45523,-122.893073,47.042418,4,-122.893073,47.042418,"{""spatialReference"": {""wkid"": 4326}, ""x"": -122..."
4,5,5,Salem,OR,State,13692,15203,-123.029155,44.931109,5,-123.029155,44.931109,"{""spatialReference"": {""wkid"": 4326}, ""x"": -123..."
5,6,6,Carson,NV,State,52457,56641,-119.753873,39.160946,6,-119.753873,39.160946,"{""spatialReference"": {""wkid"": 4326}, ""x"": -119..."
6,7,7,Sacramento,CA,State,40701,46291,-121.468927,38.555609,7,-121.468927,38.555609,"{""spatialReference"": {""wkid"": 4326}, ""x"": -121..."
7,8,8,Phoenix,AZ,State,13210,15021,-112.0763,33.528373,8,-112.0763,33.528373,"{""spatialReference"": {""wkid"": 4326}, ""x"": -112..."
8,9,9,Salt Lake City,UT,State,18174,18536,-111.892618,40.7547,9,-111.892618,40.7547,"{""spatialReference"": {""wkid"": 4326}, ""x"": -111..."
9,10,10,Cheyenne,WY,State,53011,54750,104.802046,-41.145545,10,104.802046,-41.145545,"{""spatialReference"": {""wkid"": 4326}, ""x"": 104...."
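
The `SHAPE` values in the output above are point geometries in Esri JSON form: an `x`, a `y`, and a `spatialReference` carrying a `wkid` (4326 here). To make that structure concrete, the sketch below builds the same kind of dictionary from CSV rows using only the standard library. It illustrates the shape of the data and is not how `from_xy` is implemented; the function name `row_to_point` is hypothetical, and the sample rows are copied from the table above.

```python
import csv
import io
import json

# Two rows copied from the table above, trimmed to the columns used here.
SAMPLE_CSV = """name,x,y
Honolulu,-157.823436,21.305782
Juneau,-134.511582,58.351418
"""


def row_to_point(row, x_column="x", y_column="y", wkid=4326):
    """Build an Esri JSON style point dictionary from one CSV row."""
    return {
        "spatialReference": {"wkid": wkid},
        "x": float(row[x_column]),
        "y": float(row[y_column]),
    }


points = [row_to_point(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(json.dumps(points[0]))
```

Note that `x` holds the longitude and `y` the latitude, which is why the call above passes the column names in that order.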


### Writing Out

Now that we have more SeDFs than we know what to do with, let's spam our local disk and GIS org with them in different ways!

In [25]:
# re-read the feature layer so we have a SeDF to write out
flayer = FeatureLayer("https://services7.arcgis.com/JEwYeAy2cc8qOe3o/arcgis/rest/services/kc_house_data_XYTableToPoint/FeatureServer/0")
sedf = pd.DataFrame.spatial.from_layer(flayer)

In [26]:
# first, let's write to a shapefile
sedf.spatial.to_featureclass(r"C:\Users\computer\Downloads\DevSummit\to_shape\out_example.shp")

'C:\\Users\\computer\\Downloads\\DevSummit\\to_shape\\out_example.shp'

In [27]:
# then, let's write to a file geodatabase
sedf.spatial.to_featureclass(r"C:\Users\computer\Downloads\DevSummit\to_fgdb\out_example.gdb")

'C:\\Users\\computer\\Downloads\\DevSummit\\to_fgdb\\out_example.gdb\\out_example'
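
Comparing the two return values above: the shapefile call returns the path it was given, while the file geodatabase call returns the `.gdb` path with a feature class name appended, here matching the geodatabase's own name. The sketch below only mirrors that observed pattern so you can predict where the data ends up. It is an assumption drawn from these two outputs, not a documented rule; `expected_featureclass_path` is a hypothetical helper and the `C:\data\...` path is made up for the example. `PureWindowsPath` is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def expected_featureclass_path(out_path):
    """Guess the path reported after writing, based on the outputs above."""
    path = PureWindowsPath(out_path)
    if path.suffix.lower() == ".gdb":
        # Observed above: the feature class is named after the geodatabase.
        return str(path / path.stem)
    return str(path)


print(expected_featureclass_path(r"C:\data\to_fgdb\out_example.gdb"))
```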

In [28]:
# finally, read the shapefile back in and publish it to our GIS org as a feature layer
sedf = pd.DataFrame.spatial.from_featureclass(r"C:\Users\computer\Downloads\rewritten_to_shape\rewritten_to_shape.shp")
new_item = sedf.spatial.to_featurelayer("Dev Summit Example")
new_item