# Opening Shapefiles with GeoPandas (Detailed Guide)
This notebook gives a step-by-step breakdown of how to open shapefiles using GeoPandas, including an explanation of common pitfalls and how GeoPandas handles geospatial data.

## Introduction
GeoPandas makes it simple to work with shapefiles. If your system is properly set up, opening a shapefile requires just one line of code. In this guide, we'll break down what that line does and how to avoid common mistakes.

## Understanding Your File Structure
Imagine we have a folder named `states` that contains the shapefile we want to open. Inside that folder, there are multiple files including `States.shp`, `States.dbf`, `States.prj`, and more. All these files together make up the shapefile.
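
As a quick pre-flight check, the sketch below reports which of a shapefile's mandatory companion files are missing. The helper name `missing_components` is hypothetical (it is not part of GeoPandas), and it assumes the companions share the `.shp` file's base name and use lowercase extensions.

```python
from pathlib import Path

# The shapefile format defines these three files as mandatory.
# A .prj file is optional but carries the coordinate reference system.
REQUIRED = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the mandatory extensions that have no file next to shp_path.

    Hypothetical helper for illustration; assumes lowercase extensions.
    """
    shp = Path(shp_path)
    return [ext for ext in REQUIRED if not shp.with_suffix(ext).exists()]


# Path taken from the example folder described above; adjust to your own data.
print(missing_components("states/States.shp"))
```

An empty list means all three mandatory files were found; anything else names the extensions to look for before calling `read_file()`.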

In [1]:
import geopandas as gpd

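# Option 1: read the .shp from an unzipped 'states' directory
# states = gpd.read_file("states/States.shp")

# Option 2: read the zipped shapefile straight from the Census Bureau URL
# states = gpd.read_file('https://www2.census.gov/geo/tiger/GENZ2022/shp/cb_2022_us_state_500k.zip')

# Option 3 (used here): read a local zip archive without extracting it
states = gpd.read_file('../../geopandas_101_DATA/us/cb_2022_us_state_500k.zip')

# Cities: either a local JSON file or an ArcGIS feature service queried as GeoJSON
# cities = gpd.read_file('../../geopandas_101_DATA/us/cities.json')
cities = gpd.read_file('https://services3.arcgis.com/GVgbJbqm8hXASVYi/ArcGIS/rest/services/USA_Major_Cities_/FeatureServer/0/query?where=1=1&outFields=*&f=geojson')

# Sort cities from most to least populous and renumber the index
cities = cities.sort_values(['POPULATION'], ascending=False, ignore_index=True)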

In [2]:
states.head()

| | STATEFP | STATENS | AFFGEOID | GEOID | STUSPS | NAME | LSAD | ALAND | AWATER | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 35 | 897535 | 0400000US35 | 35 | NM | New Mexico | 0 | 314198573403 | 726463825 | POLYGON ((-109.05017 31.48, -109.04984 31.4995... |
| 1 | 46 | 1785534 | 0400000US46 | 46 | SD | South Dakota | 0 | 196341552329 | 3387681983 | POLYGON ((-104.05788 44.9976, -104.05078 44.99... |
| 2 | 6 | 1779778 | 0400000US06 | 6 | CA | California | 0 | 403673617862 | 20291712025 | MULTIPOLYGON (((-118.60442 33.47855, -118.5987... |
| 3 | 21 | 1779786 | 0400000US21 | 21 | KY | Kentucky | 0 | 102266581101 | 2384240769 | MULTIPOLYGON (((-89.40565 36.52816, -89.39868 ... |
| 4 | 1 | 1779775 | 0400000US01 | 1 | AL | Alabama | 0 | 131185042550 | 4582333181 | MULTIPOLYGON (((-88.05338 30.50699, -88.05109 ... |


In [3]:
cities.head()

| | OBJECTID | NAME | CLASS | STATE_ABBR | STATE_FIPS | PLACE_FIPS | POPULATION | POP_CLASS | POP_SQMI | SQMI | CAPITAL | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1574 | New York | city | NY | 36 | 3651000 | 8804190 | 10 | 29255.6 | 300.94 | | POINT (-74.01013 40.71057) |
| 1 | 1025 | Los Angeles | city | CA | 6 | 644000 | 3898747 | 10 | 8239.8 | 473.16 | | POINT (-118.27058 34.05279) |
| 2 | 378 | Phoenix | city | AZ | 4 | 455000 | 1608139 | 10 | 3097.8 | 519.12 | State | POINT (-112.07387 33.44611) |
| 3 | 1354 | San Diego | city | CA | 6 | 666000 | 1386932 | 10 | 4210.0 | 329.44 | | POINT (-117.14556 32.72033) |
| 4 | 1386 | San Jose | city | CA | 6 | 668000 | 1013240 | 10 | 5616.6 | 180.4 | | POINT (-121.88641 37.33941) |


## Why `gpd.read_file()`?
Unlike Pandas, where you use different functions like `read_csv()` or `read_excel()` depending on the file type, GeoPandas uses a single function: `read_file()`.

### This function works for:
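- `.shp` (shapefiles)
- `.geojson` (GeoJSON)
- `.json` (JSON files that contain GeoJSON data)
- `.zip` archives containing a shapefile, and URLs that return any of the above (both appear in the first cell)
- other vector formats such as GeoPackage (`.gpkg`)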

> 📌 There is **no** `read_shapefile()` function in GeoPandas. Always use `read_file()`.

## Use GeoPandas, Not Pandas
Make sure you're using `gpd.read_file`, not `pd.read_file`.
- `pd.read_file` raises an `AttributeError`: Pandas has no `read_file` function and does not read geospatial formats.
- `gpd` refers to GeoPandas, which extends Pandas to handle spatial data.
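
The difference can be checked without reading any file. The sketch below uses a hypothetical helper, `has_function`, to report whether an installed module exposes `read_file`; it returns `None` when the module is not installed.

```python
import importlib
import importlib.util


def has_function(module_name, func_name="read_file"):
    """Return True/False if the module is installed, or None if it is not.

    Hypothetical helper for illustration only.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return callable(getattr(module, func_name, None))


for name in ("pandas", "geopandas"):
    print(name, has_function(name))
```

With both libraries installed, this should report that only GeoPandas provides `read_file`.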

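## Handling Zipped Files
When you download a shapefile, it's often zipped. GeoPandas can usually read the archive directly, as the first cell does with `cb_2022_us_state_500k.zip`. If that fails, or if you want to inspect the files yourself:
1. **Unzip** the archive.
2. **Find the `.shp` file** (along with the `.dbf`, `.prj`, `.shx` files, etc.).
3. Pass the path to `.shp` into `read_file()`.

GeoPandas will automatically read the other needed files, provided they sit in the same folder and share the `.shp` file's base name.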
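
The first cell of this notebook passes a `.zip` path straight to `read_file()`, so extraction is not always necessary. Before doing either, it can help to see what the archive contains. The sketch below uses only the standard library; `shapefiles_in_zip` is a hypothetical helper name, and the archive name in the commented usage is the one from the first cell.

```python
import zipfile


def shapefiles_in_zip(zip_path):
    """Return the names of all .shp members inside a zip archive.

    Hypothetical helper for illustration; matches the extension
    case-insensitively and keeps the archive's own ordering.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".shp")]


# Usage (assumes the archive exists locally and GeoPandas is installed):
# print(shapefiles_in_zip("cb_2022_us_state_500k.zip"))
# import geopandas as gpd
# states = gpd.read_file("cb_2022_us_state_500k.zip")
```

If the list contains exactly one entry, reading the archive directly is usually enough. If it contains several, extracting the archive and pointing `read_file()` at the specific `.shp` is the simplest route; the GeoPandas documentation also describes a `zip://` path form for selecting a member, which is worth checking for the exact syntax.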

## Summary
- Use `gpd.read_file('path/to/file.shp')` to open shapefiles.
- Zipped shapefiles can usually be read directly; extract the `.zip` archive if that fails.
- Only give the `.shp` file to GeoPandas; it will handle the rest.
- Don't use Pandas (`pd`) to read geospatial files.

📌 One simple line can open complex geographic data. Just be sure the file is ready!