# geopandas
GeoPandas is a Python library that extends `pandas` with support for geospatial data.

In [1]:
import os
import geopandas as gpd
import matplotlib.pyplot as plt

To import the data, we first use the `os` package to build a reproducible file path:

In [14]:
%pwd

'/Users/vedikashirtekar/Documents/MEDS/eds-220/eds-220-2025-in-class'

In [13]:
os.path.exists("data/data2/gbif_sus_scrofa_california/gbif_sus_scrofa_california.shp")

False

In [17]:
fp = os.path.join("data/data 2", "gbif_sus_scrofa_california", "gbif_sus_scrofa_california.shp")
fp

'data/data 2/gbif_sus_scrofa_california/gbif_sus_scrofa_california.shp'
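A minimal sketch of the path-building step, using only the standard library. `os.path.join` assembles the path with the separator of the current operating system, and `os.path.exists` confirms the result before reading. The folder names mirror the ones in this notebook; the helper `build_path` and the variable `fp_example` are hypothetical names used only for illustration.

```python
import os

def build_path(*parts):
    """Join path components using the separator of the current operating system."""
    return os.path.join(*parts)

# Same components as the notebook path, passed one folder at a time
fp_example = build_path(
    "data",
    "data 2",
    "gbif_sus_scrofa_california",
    "gbif_sus_scrofa_california.shp",
)

# The file name is always the last component, whatever the separator
print(os.path.basename(fp_example))

# Check that the file exists before trying to read it
if not os.path.exists(fp_example):
    print("File not found, check the folder names:", fp_example)
```

The earlier `os.path.exists` check most likely returned `False` because the folder was typed as `data2` rather than `data 2`; checking the path first makes that kind of typo easy to spot.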

In [18]:
pigs = gpd.read_file(fp)
pigs.head()

Unnamed: 0,gbifID,species,state,individual,day,month,year,inst,collection,catalogNum,identified,geometry
0,899953814,Sus scrofa,California,,22.0,3.0,2014.0,iNaturalist,Observations,581956,edwardrooks,POINT (-121.53812 37.08846)
1,899951348,Sus scrofa,California,,9.0,6.0,2007.0,iNaturalist,Observations,576047,Bruce Freeman,POINT (-120.54942 35.47354)
2,896560733,Sus scrofa,California,,20.0,12.0,1937.0,MVZ,Hild,MVZ:Hild:195,"Museum of Vertebrate Zoology, University of Ca...",POINT (-122.27063 37.87610)
3,896559958,Sus scrofa,California,,1.0,4.0,1969.0,MVZ,Hild,MVZ:Hild:1213,"Museum of Vertebrate Zoology, University of Ca...",POINT (-121.82297 38.44543)
4,896559722,Sus scrofa,California,,1.0,1.0,1961.0,MVZ,Hild,MVZ:Hild:1004,"Museum of Vertebrate Zoology, University of Ca...",POINT (-121.74559 38.54882)


In [20]:
fp = os.path.join("data/data 2", "ca_state_boundary", "ca_state_boundary.shp")
ca_boundary = gpd.read_file(fp)
ca_boundary

Unnamed: 0,REGION,DIVISION,STATEFP,STATENS,GEOID,STUSPS,NAME,LSAD,MTFCC,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,geometry
0,4,9,6,1779778,6,CA,California,0,G4000,A,403501101370,20466718403,37.1551773,-119.5434183,"MULTIPOLYGON (((-119.63473 33.26545, -119.6363..."


## `GeoSeries` and the `GeoDataFrame`
`geopandas.GeoDataFrame` is the core data structure in geopandas: a `pandas.DataFrame` plus a dedicated geometry column that supports spatial operations.
The **geometry column** holds the geometry (point, polygon, etc.) of each spatial feature. Its type is `geopandas.GeoSeries`.
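To make the relationship concrete, here is a small sketch that builds a `GeoDataFrame` by hand from a regular `pandas.DataFrame` and a list of `shapely` points. The column names (`site`, `n_obs`), the values, and the coordinates are made up for illustration and are not part of the pigs dataset.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Toy attribute table (hypothetical values)
df = pd.DataFrame({"site": ["a", "b"], "n_obs": [3, 5]})

# One shapely geometry per row, given as (longitude, latitude)
points = [Point(-121.5, 37.1), Point(-120.5, 35.5)]

# Attach the geometries and declare the CRS to get a GeoDataFrame
toy = gpd.GeoDataFrame(df, geometry=points, crs="EPSG:4326")

print(type(toy))           # the table itself is a GeoDataFrame
print(type(toy.geometry))  # the geometry column is a GeoSeries
print(type(toy["site"]))   # other columns remain regular pandas Series
```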

In [21]:
pigs.head(3)

Unnamed: 0,gbifID,species,state,individual,day,month,year,inst,collection,catalogNum,identified,geometry
0,899953814,Sus scrofa,California,,22.0,3.0,2014.0,iNaturalist,Observations,581956,edwardrooks,POINT (-121.53812 37.08846)
1,899951348,Sus scrofa,California,,9.0,6.0,2007.0,iNaturalist,Observations,576047,Bruce Freeman,POINT (-120.54942 35.47354)
2,896560733,Sus scrofa,California,,20.0,12.0,1937.0,MVZ,Hild,MVZ:Hild:195,"Museum of Vertebrate Zoology, University of Ca...",POINT (-122.27063 37.87610)


In [24]:
# Check the data type of the pigs data frame
print(type(pigs))

# Check the data type of the geometry column
print(type(pigs.geometry))

# Check the data type of the gbifID column
print(type(pigs.gbifID))

# Check the data type of each column
pigs.dtypes

<class 'geopandas.geodataframe.GeoDataFrame'>
<class 'geopandas.geoseries.GeoSeries'>
<class 'pandas.core.series.Series'>


gbifID           int64
species         object
state           object
individual     float64
day            float64
month          float64
year           float64
inst            object
collection      object
catalogNum      object
identified      object
geometry      geometry
dtype: object

In [26]:
# Check the geometry type of each element in the geometry column
pigs.geom_type

0       Point
1       Point
2       Point
3       Point
4       Point
        ...  
1041    Point
1042    Point
1043    Point
1044    Point
1045    Point
Length: 1046, dtype: object

### What is the geometry type of the single feature in the CA state boundary?

In [28]:
ca_boundary.geom_type


0    MultiPolygon
dtype: object

#### ALWAYS CHECK CRS 
A CRS (coordinate reference system) is the framework used to locate each spatial feature of the data frame on the surface of the Earth.

In [30]:
pigs.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
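One reason to check the CRS every time is that two layers can only be combined meaningfully when their CRSs match. The sketch below uses a toy point rather than the notebook data, and the choice of EPSG:3310 (California Albers) as the second CRS is an assumption made purely for illustration. It compares two CRSs and reprojects with `to_crs` when they differ.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy layer in geographic coordinates (longitude, latitude)
points_wgs84 = gpd.GeoDataFrame(geometry=[Point(-121.5, 37.1)], crs="EPSG:4326")

# Same point reprojected to a projected CRS (assumed here: EPSG:3310)
points_albers = points_wgs84.to_crs("EPSG:3310")

# Compare the CRSs before combining layers
print("CRSs match:", points_wgs84.crs == points_albers.crs)

# If they differ, reproject one layer to the CRS of the other
if points_wgs84.crs != points_albers.crs:
    points_matched = points_albers.to_crs(points_wgs84.crs)
else:
    points_matched = points_albers

print("CRSs match after reprojection:", points_matched.crs == points_wgs84.crs)
```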

In [None]:
# Examine CRS details
print("Ellipsoid:", pigs.crs.ellipsoid)
print("Datum:", pigs.crs.datum)
print("Is Geographic:", pigs.crs.is_geographic)
print("Is Projected:", pigs.crs.is_projected)

Ellipsoid: WGS 84
Datum: World Geodetic System 1984 ensemble
Is Geographic: True
Is Projected: False