# In-Class Assignment: Spatial Data Analysis with Geopandas

**Course:** Data Science for Economists  
**Topic:** Introduction to Geospatial Data, Shapely, and CRS

## Instructions
Complete the following exercises to practice manipulating spatial data. You will need the `geopandas`, `shapely`, and `matplotlib` libraries installed.

### Part 1: Conceptual Understanding

**1.1. Spatial Data Components** 
Spatial data is unique because it encodes two distinct types of information. What are they?

*Answer:*
> [Double click to edit]

**1.2. Coordinate Reference Systems (CRS)** 
Explain the difference between a **Datum** and a **Projection** in the context of a CRS. Why does flattening the 3D Earth onto a 2D plane inevitably create distortions?

*Answer:*
> [Double click to edit]
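
To make the distortion question concrete, here is a minimal sketch that estimates how long one degree of longitude is at several latitudes. It assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification and not part of any particular datum; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; a spherical approximation


def km_per_degree_longitude(lat_deg: float) -> float:
    """Approximate east-west length of one degree of longitude at a latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))


for lat in (0, 40, 60, 80):
    print(f"latitude {lat:>2}: {km_per_degree_longitude(lat):6.1f} km per degree")
```

A degree of longitude shrinks with the cosine of the latitude, so any flat map that draws meridians as evenly spaced parallel lines must stretch features that lie away from the equator.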

---

### Part 2: Working with Shapely Geometries

The `shapely` library is the engine `geopandas` uses to handle geometric objects. Let's create some shapes manually.

In [None]:
# Import necessary libraries
import shapely
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon, LineString

**2.1. Create a Point** 
Create a Shapely `Point` representing the location of New York City (approximate coordinates: Longitude -74.006, Latitude 40.712). Assign it to the variable `nyc_point`.

In [None]:
# Your code here
nyc_point = None
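
As a reference for the syntax, the sketch below builds a `Point` for a different, arbitrarily chosen city (London, with approximate coordinates). The variable name `london` is only illustrative. Note that Shapely expects `(x, y)` order, which means longitude first and latitude second.

```python
from shapely.geometry import Point

# Illustrative location: London (approx. longitude -0.1276, latitude 51.5072).
# Shapely takes (x, y), so longitude comes first.
london = Point(-0.1276, 51.5072)

print(london.x, london.y)  # the stored x (longitude) and y (latitude)
print(london.wkt)          # Well-Known Text representation
print(london.geom_type)    # geometry type name
```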

**2.2. Create a Polygon** 
Create a Shapely `Polygon` representing a triangle connecting three arbitrary points. Assign it to the variable `my_triangle`.

In [None]:
# Your code here
my_triangle = None
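
The same pattern works for any polygon. As a sketch, the block below builds a square (not the triangle asked for above) from a list of `(x, y)` tuples and inspects a few of its properties; the coordinates are arbitrary.

```python
from shapely.geometry import Polygon

# A 2 x 2 square given as a list of (x, y) vertices.
# Shapely closes the ring automatically, so the first point need not be repeated.
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])

print(square.area)                   # 4.0
print(square.length)                 # perimeter: 8.0
print(list(square.exterior.coords))  # five pairs, because the ring is closed
print(square.is_valid)
```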

**2.3. Visualization** 
Use `matplotlib` to plot the `nyc_point` and `my_triangle` you created above.  
*Hint: You can extract x and y coordinates from a shapely object using `.x` and `.y` for points, or `.exterior.xy` for polygons.*

In [None]:
# Your code here
fig, ax = plt.subplots()

# Plot the triangle
# ax.plot(...)

# Plot the point
# ax.scatter(...)

plt.show()
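
The hint above can be sketched on separate toy shapes as follows. The square and the marker point are illustrative and are not the objects the exercise asks for.

```python
import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon

square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
marker = Point(1, 1)

fig, ax = plt.subplots()

# .exterior.xy returns two sequences: all x values, then all y values.
xs, ys = square.exterior.xy
ax.plot(xs, ys, color="tab:blue", label="square outline")

# A point exposes its coordinates directly through .x and .y.
ax.scatter([marker.x], [marker.y], color="tab:red", label="point")

ax.set_aspect("equal")  # keep one unit on x the same length as one unit on y
ax.legend()
```

In a notebook the figure renders at the end of the cell; in a plain script, call `plt.show()` afterwards.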

---

### Part 3: Loading and Inspecting Geospatial Data

For this section, we assume you have a shapefile available (such as the US states file discussed in lecture). If you do not have the local file, you can use the dataset bundled with geopandas versions before 1.0 (the `geopandas.datasets` module was removed in 1.0).

**3.1. Load Data** 
Load the `'naturalearth_lowres'` dataset provided by geopandas (or `cb_2018_us_state_5m.shp` if you have it locally). Save it to a GeoDataFrame named `world`.

In [None]:
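# Load the built-in dataset.
# Note: gpd.datasets was removed in geopandas 1.0. On newer versions, pass the
# path of a local file (e.g. the US states shapefile) to gpd.read_file() instead.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))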

# Display the first 5 rows
# Your code here

**3.2. Inspect the Geometry** 
GeoDataFrames are like pandas DataFrames but with a special column.  
1. What is the name of the column that holds spatial information?  
2. Print the head of just that column.

In [None]:
# Your code here
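
One detail worth knowing here: the spatial column is whichever column is registered as the *active geometry*, and `.geometry` always returns it regardless of its name. A minimal sketch on invented toy data, where the column is deliberately renamed to the hypothetical name `geom`:

```python
import geopandas as gpd
from shapely.geometry import Point

toy = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:4326",
)

# rename_geometry returns a new GeoDataFrame whose active geometry column is renamed.
renamed = toy.rename_geometry("geom")

print(renamed.geometry.name)            # name of the active geometry column
print(type(renamed.geometry).__name__)  # the column is a GeoSeries
print(renamed.geometry.head())
```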

---

### Part 4: Coordinate Reference Systems (CRS)

**4.1. Check the CRS** 
Check the current CRS of the `world` GeoDataFrame. What is the EPSG code?

In [None]:
# Your code here
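
For reference, the CRS of a GeoDataFrame lives in its `.crs` attribute, which is a `pyproj` CRS object in current geopandas versions. The sketch below uses invented toy data that is declared to be in UTM zone 18N (EPSG:32618), chosen only so that the output differs from what you will see for `world`.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy point declared to be in UTM zone 18N; the coordinates are arbitrary metres.
toy = gpd.GeoDataFrame(
    {"name": ["sample"]},
    geometry=[Point(500000, 4500000)],
    crs="EPSG:32618",
)

print(toy.crs)               # short identifier of the CRS
print(toy.crs.to_epsg())     # EPSG code as an integer
print(toy.crs.is_projected)  # True for projected, False for geographic CRSs
```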

**4.2. Understanding the PROJ4 String** 
Based on the PROJ4 string output from the previous cell (or by looking up the EPSG code), what are the **units** of this CRS (e.g., degrees, meters, feet)?

*Answer:*
> [Double click to edit]
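
Units can also be read programmatically from a `pyproj` CRS through its axis information. This is a sketch for two example codes that are not the CRS of `world`: EPSG:32618 (UTM zone 18N) and EPSG:2263 (a New York state plane system). Both were picked purely for illustration.

```python
from pyproj import CRS

for code in (32618, 2263):
    crs = CRS.from_epsg(code)
    # Each axis of the coordinate system carries its own unit name.
    units = [axis.unit_name for axis in crs.axis_info]
    print(code, crs.name, units)
```

The same attribute is available on `world.crs`, since geopandas stores its CRS as a `pyproj` object.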

**4.3. Reprojection** 
The current CRS is likely WGS84 (EPSG:4326), whose coordinates are in degrees. Plotting degrees as if they were planar x/y values stretches countries east-west toward the poles.  
1. Reproject the `world` data to the **Mercator** projection (EPSG:3395) or **Robinson** (ESRI:54030, if available; note that this is an ESRI code, not an EPSG code).  
2. Save this new GeoDataFrame as `world_projected`.

In [None]:
# Your code here
world_projected = None
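
Reprojection is done with `to_crs`, which returns a new GeoDataFrame and leaves the original untouched. The sketch below reprojects a single invented point from EPSG:4326 to UTM zone 18N (EPSG:32618). The target CRS is an assumption made for this example and is not one of the projections requested above.

```python
import geopandas as gpd
from shapely.geometry import Point

# One toy point at longitude -75, latitude 40 (on the central meridian of UTM 18N).
pts = gpd.GeoDataFrame(
    {"name": ["sample"]},
    geometry=[Point(-75.0, 40.0)],
    crs="EPSG:4326",
)

pts_utm = pts.to_crs(epsg=32618)  # returns a reprojected copy

print(pts.geometry.iloc[0])      # still in degrees
print(pts_utm.geometry.iloc[0])  # now in metres
print(pts.crs.to_epsg(), pts_utm.crs.to_epsg())
```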

**4.4. Calculate Area** 
Try to calculate the area of the countries using the *original* `world` dataframe (EPSG:4326). You will likely get a warning. Why?  
Then, calculate the area using the `world_projected` dataframe.

In [None]:
# Attempt area calculation on original data
# Your code here

# Area calculation on projected data
# Your code here
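
To see why areas computed in degrees are not meaningful, consider two invented 1° x 1° cells, one at the equator and one at 60°N. The sketch below compares their areas in EPSG:4326 with their areas after reprojecting to EPSG:6933, an equal-area cylindrical CRS. EPSG:6933 is an assumption swapped in for this illustration; Mercator (EPSG:3395) is not equal-area, so areas computed from it are inflated at high latitudes.

```python
import geopandas as gpd
from shapely.geometry import box

# Two invented 1-degree cells: box(minx, miny, maxx, maxy) in lon/lat.
cells = gpd.GeoDataFrame(
    {"cell": ["equator", "60N"]},
    geometry=[box(0, 0, 1, 1), box(0, 60, 1, 61)],
    crs="EPSG:4326",
)

# In a geographic CRS this is measured in "square degrees" and emits a UserWarning.
area_deg = cells.area

# After reprojection the result is in square metres; convert to square kilometres.
area_km2 = cells.to_crs(epsg=6933).area / 1e6

print(area_deg.tolist())
print(area_km2.round(0).tolist())
print("ratio 60N / equator:", area_km2.iloc[1] / area_km2.iloc[0])
```

Both cells have the same area in square degrees, while on the ground the cell at 60°N covers roughly half the area of the equatorial one.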

**4.5. Visualization Comparison** 
Create a side-by-side plot using `matplotlib` subplots.  
* **Left Plot:** The map in its original CRS (WGS84).  
* **Right Plot:** The map in the projected CRS.  
* Add titles to each plot indicating the CRS used.

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 8))

# Plot original
# Your code here

# Plot projected
# Your code here

plt.show()
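
The side-by-side layout can be sketched on toy data as follows. The two cells and the equal-area target CRS (EPSG:6933) are assumptions made for this illustration and stand in for `world` and `world_projected`.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Two invented 1-degree cells, one at the equator and one at 60N.
cells = gpd.GeoDataFrame(
    {"cell": ["equator", "60N"]},
    geometry=[box(0, 0, 1, 1), box(0, 60, 1, 61)],
    crs="EPSG:4326",
)
cells_projected = cells.to_crs(epsg=6933)

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Passing ax= tells geopandas which subplot to draw on.
cells.plot(ax=ax_left, edgecolor="black")
ax_left.set_title(f"Original ({cells.crs.to_string()})")

cells_projected.plot(ax=ax_right, edgecolor="black")
ax_right.set_title(f"Projected ({cells_projected.crs.to_string()})")

fig.tight_layout()
```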