# Class on Spatial Data Analysis (Python and R)
**Duration**: 2 hours 15 minutes (15 + 45 + 45 + 30 minutes)

## 0. Environment Setup (15 minutes)

### Python Setup on Ubuntu:
1. **Install Python**:
- Ubuntu typically comes with Python pre-installed. Check with:
```bash
python3 --version
```
- If Python isn't installed, use:
```bash
sudo apt update
sudo apt install python3 python3-pip
```

2. **Install Required Libraries**:
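- Install the spatial libraries, plus scikit-learn for the regression examples. On Ubuntu 23.04 and later, system-wide `pip3 install` is blocked by default, so run this inside a virtual environment (created with `python3 -m venv`, which needs the `python3-venv` package):
```bash
pip3 install geopandas shapely folium matplotlib pysal scikit-learn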
```

3. **Jupyter Notebook Installation**:
- Install Jupyter to run notebooks:
```bash
pip3 install notebook
```
- Launch Jupyter Notebook:
```bash
jupyter notebook
```

### R Setup on Ubuntu:
1. **Install R**:
- Install R with:
```bash
sudo apt update
sudo apt install r-base
```

2. **Install Required Packages**:
- In R, run:
```r
install.packages(c('sf', 'sp', 'spdep', 'ggplot2', 'leaflet'))
```

### Setting Up R Kernel in Jupyter Notebooks (for Visual Studio Code):
1. **Install IRkernel** in R:
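- Run the following in R (Jupyter must already be installed and on the `PATH`):
```r
install.packages('IRkernel')
# Registers the R kernel system-wide, which needs root privileges;
# call IRkernel::installspec() instead to register it for the current user only
IRkernel::installspec(user = FALSE)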
```

2. **Install Jupyter Support in Visual Studio Code**:
- Install the 'Jupyter' and 'R' extensions from the Visual Studio Code Marketplace.

3. **Running R in Jupyter via Visual Studio Code**:
- Open a Jupyter notebook in Visual Studio Code.
- Select the R kernel from the notebook interface to run R commands in Jupyter.

## 1. Introductions and Theoretical Explanation (45 minutes)

### Introduction to Spatial Data (15 min):
- **What is Spatial Data?**:
  - Spatial data is information tied to a geographic location.
  - **Types of Spatial Data**:
    - **Vector Data**: Points, lines, and polygons (e.g., cities, roads, administrative boundaries).
    - **Raster Data**: Regular grids of cells, often used for satellite imagery or terrain data.
- **Spatial Data Formats**:
  - Shapefiles, GeoJSON, and CSV with coordinates are common formats.
- **Coordinate Reference Systems (CRS)**:
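  - **WGS84 (EPSG:4326)** is the most common CRS for global data. Its coordinates are latitude and longitude in degrees.
  - Another example is **UTM** (Universal Transverse Mercator), a projected system that divides the globe into 60 zones and uses meters, which makes it suitable for distance and area calculations at local to regional scale.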
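
To make the difference between the two systems concrete, the sketch below works out which UTM zone a WGS84 longitude/latitude pair falls in and the EPSG code of the matching WGS84-based UTM CRS. The function name `utm_epsg` is hypothetical, and the sketch deliberately ignores the irregular zones around Norway and Svalbard.

```python
import math

def utm_epsg(lon, lat):
    """Return (zone, EPSG code) of the WGS84 / UTM zone containing a point.

    Assumes lon/lat are decimal degrees in WGS84 (EPSG:4326) and ignores
    the irregular zones around Norway and Svalbard.
    """
    if not -180.0 <= lon <= 180.0 or not -80.0 <= lat <= 84.0:
        raise ValueError("point outside the area covered by UTM")
    # UTM zones are 6 degrees wide, numbered 1..60 eastward from 180°W.
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    # EPSG:326xx covers the northern hemisphere, EPSG:327xx the southern.
    epsg = (32600 if lat >= 0 else 32700) + zone
    return zone, epsg

# Brasília, the location used in the mapping examples of this class.
zone, epsg = utm_epsg(lon=-47.882778, lat=-15.793889)
print(zone, epsg)  # 23 32723
```

With GeoPandas, a code obtained this way can be passed to `gdf.to_crs(epsg=32723)` to reproject a layer from degrees to meters.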

### Autocorrelation and Heterogeneity in Spatial Data (10 min):
- **Autocorrelation**:
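  - Spatial autocorrelation occurs when values at nearby locations are related. Positive autocorrelation means nearby locations have similar values; negative autocorrelation means they have dissimilar values.
  - **Moran’s I** is a statistical measure of global spatial autocorrelation. It typically falls between -1 and +1: values near +1 indicate clustering of similar values, values near -1 indicate a checkerboard-like pattern, and values near 0 indicate no spatial pattern: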

  **Python**:
```python
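from esda.moran import Moran  # esda and libpysal are installed with pysal
from libpysal.weights import Queen
# gdf is assumed to be a GeoDataFrame of polygons with a numeric column
w = Queen.from_dataframe(gdf)  # neighbours = polygons sharing an edge or corner
moran = Moran(gdf['column_name'], w)
print(moran.I, moran.p_sim)  # statistic and permutation-based pseudo p-value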
```

  **R**:
```r
library(spdep)
nb <- poly2nb(gdf)  # contiguity neighbours; gdf is assumed to be an sf polygon layer
moran <- moran.test(gdf$column_name, nb2listw(nb, style = "W"))
print(moran)
```
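
To show what these library calls compute, here is a minimal pure-Python sketch of the statistic, I = (n / S0) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where S0 is the sum of all weights. The function name `morans_i` and the four-location example are made up for illustration; real analyses should use `esda` or `spdep`, which also provide significance tests.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square weights matrix.

    weights[i][j] > 0 means location j is a neighbour of location i;
    the diagonal is expected to be zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

# Four locations in a row; each is a neighbour of the one next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain)       # neighbours have similar values
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours have dissimilar values
print(round(smooth, 3), round(alternating, 3))  # 0.333 -1.0
```

The smoothly increasing values give a positive I, while the alternating values give the most negative value this layout allows.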

### Spatial Visualization (10 min):
- **Importance of Visualization**:
  - Maps and visual plots help identify patterns, clusters, and trends in spatial data.
- **Python Example using Folium**:
```python
import folium
m = folium.Map(location=[-15.793889, -47.882778], zoom_start=12)
folium.Marker([-15.793889, -47.882778], popup='Brasília').add_to(m)
m.save('map.html')
```

- **R Example using Leaflet**:
```r
library(leaflet)
m <- leaflet() %>%
  addTiles() %>%
  addMarkers(lng = -47.882778, lat = -15.793889, popup = 'Brasília')
m
```

### Introduction to Spatial Regression (10 min):
- **What is Spatial Regression?**:
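  - Spatial regression models the relationship between variables while accounting for spatial dependence, for example through spatial lag or spatial error terms.
  - The examples here fit an ordinary least squares (OLS) model, which ignores spatial dependence and serves as a baseline. Running Moran's I on its residuals indicates whether a spatial model is needed.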
- **Python** Example:
```python
from sklearn.linear_model import LinearRegression
# X: 2-D array of predictors, y: 1-D array of the response (e.g. columns of gdf)
model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)
```

- **R** Example:
```r
model <- lm(y ~ x1 + x2, data = df)
summary(model)
```
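
One way to see why plain regression may not be enough is to test its residuals for spatial autocorrelation. The sketch below does this in pure Python on made-up data for six locations along a transect; `fit_line`, `morans_i`, and the data are illustrative assumptions, not part of any library, and no significance test is included.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def morans_i(values, neighbours):
    """Global Moran's I with binary weights given as {index: [neighbour indices]}."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(js) for js in neighbours.values())  # number of neighbour links
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in dev)

# Hypothetical data: each location is a neighbour of the ones next to it.
x = [1, 2, 3, 4, 5, 6]
y = [4, 5, 6, 8, 11, 14]
neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(x)]
              for i in range(len(x))}

intercept, slope = fit_line(x, y)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
print(round(intercept, 3), round(slope, 3))       # 1.0 2.0
print(round(morans_i(residuals, neighbours), 3))  # 0.3
```

A positive Moran's I on the residuals hints that neighbouring errors are not independent, which is the situation spatial regression models are designed for. On six toy points this is only a demonstration of the mechanics.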

## 2. Classroom Exercises (45 minutes)

### Exercise 1: Load and Plot Spatial Data
- **Objective**: Load a shapefile or GeoJSON file and plot the data.
- **Python**:
```python
gdf = geopandas.read_file("path/to/file.geojson")  # needs `import geopandas`; placeholder path
gdf.plot()
```
- **R**:
```r
gdf <- sf::st_read("path/to/file.geojson")  # placeholder path
plot(sf::st_geometry(gdf))
```
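
Before plotting, it can help to know what a file actually contains. As a rough sketch using only the standard library, the snippet below parses a small, made-up GeoJSON `FeatureCollection` and reports its geometry types and bounding box. GeoPandas does this parsing for you in `read_file` and exposes the same box as `gdf.total_bounds`; the helper `flatten` is hypothetical.

```python
import json
from collections import Counter

# Made-up GeoJSON for illustration; coordinates are [longitude, latitude].
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"name": "Brasília"},
     "geometry": {"type": "Point", "coordinates": [-47.882778, -15.793889]}},
    {"type": "Feature", "properties": {"name": "sample road"},
     "geometry": {"type": "LineString",
                  "coordinates": [[-47.90, -15.80], [-47.88, -15.79], [-47.86, -15.78]]}}
  ]
}
"""

def flatten(coords):
    """Yield every [x, y] position from arbitrarily nested GeoJSON coordinates."""
    if isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from flatten(part)

collection = json.loads(geojson_text)
features = collection["features"]
geometry_types = Counter(f["geometry"]["type"] for f in features)
positions = [p for f in features for p in flatten(f["geometry"]["coordinates"])]
xs, ys = [p[0] for p in positions], [p[1] for p in positions]
bounds = (min(xs), min(ys), max(xs), max(ys))  # (min lon, min lat, max lon, max lat)

print(dict(geometry_types))  # {'Point': 1, 'LineString': 1}
print(bounds)                # (-47.9, -15.8, -47.86, -15.78)
```

If the bounding box does not look like longitude/latitude degrees, the layer is probably stored in a projected CRS, which is worth checking before combining it with other data.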

### Exercise 2: Spatial Autocorrelation (Moran's I)
- **Objective**: Use Moran’s I to test for spatial autocorrelation.
- **Python**:
```python
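from esda.moran import Moran
from libpysal.weights import Queen
moran = Moran(gdf['column'], Queen.from_dataframe(gdf))  # queen-contiguity weights
print(moran.I, moran.p_sim)  # statistic and permutation-based pseudo p-value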
```
- **R**:
```r
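listw <- spdep::nb2listw(spdep::poly2nb(gdf), style = "W")  # row-standardised contiguity weights
spdep::moran.test(gdf$column, listw)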
```

### Exercise 3: Simple Spatial Regression
- **Objective**: Perform a linear regression on spatial data.
- **Python**:
```python
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(X, y)  # X: predictor columns, y: response
print(model.coef_)
```
- **R**:
```r
model <- lm(y ~ x1 + x2, data = df)
summary(model)
```

### Exercise 4: Interactive Mapping
- **Objective**: Create an interactive map.
- **Python**:
```python
import folium
folium.Map(location=[lat, lon], zoom_start=12)  # lat, lon: your coordinates in decimal degrees
```
- **R**:
```r
library(leaflet)
leaflet() %>% addTiles() %>% addMarkers(lng = lon, lat = lat)
```

## 3. Solving the Exercises (30 minutes)

- **Review and Discussion**:
  - Go over the solutions for each exercise.
  - Discuss key takeaways such as:
    - The importance of aligning coordinate reference systems (CRS).
    - Understanding Moran's I output for autocorrelation.
    - Interpreting regression coefficients for spatial data.