# Intro to Spatial Data Loading and Wrangling

Load the following libraries to get started.
```r
library(sf)
library(dplyr)
```
Download the bus stop shapefile from https://data.cityofchicago.org/Transportation/CTA-Bus-Stops-Shapefile/pxug-u72f

Move the downloaded zip file to your working directory, unzip it, and load the shapefile into R with the `st_read()` function.

```r
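# Extract the shapefile components (.shp, .dbf, .shx, ...) into the working directory
unzip(zipfile = "CTA_BusStops.zip")
# Read the shapefile into an sf object
Chi_stops <- st_read(dsn = "CTA_BusStops.shp")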
```
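
Before plotting, it can help to check what `st_read()` returned. The sketch below is only an illustration: it reads the North Carolina shapefile that ships with the `sf` package as a stand-in (an assumption, so that it runs without the CTA download), and the same inspection calls can be applied to `Chi_stops`.

```r
library(sf)

# Stand-in data: the sample shapefile bundled with the sf package,
# used here instead of the CTA file so the example runs anywhere.
nc_path <- system.file("shape/nc.shp", package = "sf")
nc <- st_read(dsn = nc_path, quiet = TRUE)

class(nc)                      # an sf object is also a data.frame
names(nc)                      # attribute columns plus the geometry column
nrow(nc)                       # number of features in the layer
st_crs(nc)                     # coordinate reference system read from the file
unique(st_geometry_type(nc))   # geometry type(s) stored in the layer
```

Checking the coordinate reference system early is useful because distance-based operations later on depend on its units.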
## Subsetting

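First, create a basic plot of the bus stops. Called on an `sf` object, `plot()` draws a separate small map for each attribute column (only the first few by default), all sharing the same point geometry.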
```r
plot(Chi_stops)
```

With every stop drawn, the map is too dense to be meaningful. To see stops on north-south routes, we can filter the bus stops to the northbound direction.

```r
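# Keep only stops whose direction of travel is northbound
NB_s <- Chi_stops %>% filter(DIR == "NB")
# Plot the geometry alone, with smaller point symbols
plot(NB_s$geometry, cex = 0.5)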
```
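
To see how attribute filtering behaves on an `sf` object, here is a minimal sketch on a toy layer. The stop IDs, directions, and coordinates are made up for illustration; only the `OBJECTID` and `DIR` column names are taken from the code above.

```r
library(sf)
library(dplyr)

# Hypothetical stops: invented IDs, directions, and lon/lat coordinates.
toy_stops <- data.frame(
  OBJECTID = 1:4,
  DIR      = c("NB", "SB", "NB", "EB"),
  lon      = c(-87.63, -87.64, -87.65, -87.66),
  lat      = c(41.88, 41.89, 41.90, 41.91)
) %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326)

# filter() tests the attribute table; the geometry column comes along.
toy_nb <- toy_stops %>% filter(DIR == "NB")

# Keep both directions of a north-south route with %in%.
toy_ns <- toy_stops %>% filter(DIR %in% c("NB", "SB"))

table(toy_stops$DIR)   # stops per direction before filtering
nrow(toy_nb)           # northbound stops only
nrow(toy_ns)           # northbound and southbound stops
class(toy_nb)          # the result is still an sf object
```

Running `table()` on the direction column first shows which codes actually occur in the data, which avoids filtering on a value that is spelled differently in the file.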

## Joins
Let's add ridership data from https://data.cityofchicago.org/Transportation/CTA-Ridership-Avg-Weekday-Bus-Stop-Boardings-in-Oc/mq3i-nnqe

```r
ridership <- read.csv("CTA_-_Ridership_-_Avg._Weekday_Bus_Stop_Boardings_in_October_2012.csv")
```

To see attributes from both datasets we will do a full join:

```r
joinstop <- full_join(Chi_stops, ridership, by = c("OBJECTID" = "stop_id"))
plot(joinstop["boardings"], breaks="quantile")
```

In the code above, `OBJECTID` from the shapefile is equivalent to `stop_id` from the CSV, so we join on those two fields. We plot the `boardings` column using quantile breaks, and the legend shows that some outliers exist.
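
A full join keeps unmatched rows from both tables, and a ridership row with no matching stop has no location to plot. The sketch below, on made-up toy data, shows one way to check which keys fail to match and, as an alternative under that assumption, a left join that keeps exactly one row per stop. The values are hypothetical; only the `OBJECTID`, `stop_id`, and `boardings` names come from the code above.

```r
library(sf)
library(dplyr)

# Hypothetical stops (sf points) and ridership table with partly overlapping keys.
toy_stops <- st_as_sf(
  data.frame(
    OBJECTID = 1:3,
    lon      = c(-87.63, -87.64, -87.65),
    lat      = c(41.88, 41.89, 41.90)
  ),
  coords = c("lon", "lat"), crs = 4326
)
toy_ridership <- data.frame(
  stop_id   = c(2L, 3L, 4L),
  boardings = c(120.5, 30.0, 75.2)
)

# Keys that appear on one side only (geometry dropped for a plain table check).
stops_only <- anti_join(st_drop_geometry(toy_stops), toy_ridership,
                        by = c("OBJECTID" = "stop_id"))
riders_only <- anti_join(toy_ridership, st_drop_geometry(toy_stops),
                         by = c("stop_id" = "OBJECTID"))

# A left join keeps every stop and its geometry; unmatched stops get NA boardings.
toy_left <- left_join(toy_stops, toy_ridership, by = c("OBJECTID" = "stop_id"))

stops_only$OBJECTID    # stops with no ridership record
riders_only$stop_id    # ridership records with no stop
toy_left$boardings     # NA where a stop had no match
```

Whether a full or a left join is the better choice depends on whether ridership records without a mapped stop still matter for the analysis.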

## Buffers
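When subsetting a smaller region, buffers are a useful tool. The `dist` argument sets the buffer distance, in meters here; in general its units depend on the layer's coordinate reference system, which `st_crs()` reports. With this many points, buffering every stop does not plot in an interpretable way. Learning to geometrically subset data with bounding boxes is the next step.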

```r
buf <- st_buffer(joinstop, dist=100)
```
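
A buffer becomes easier to interpret when it is drawn around a single feature and then used to select its neighbors. The following is a minimal sketch on invented points, assuming a projected CRS in meters (UTM zone 16N, EPSG:26916, chosen here only for illustration) so that `dist = 100` means 100 meters.

```r
library(sf)

# Hypothetical stops along an east-west line, in UTM zone 16N (units: meters).
toy_stops <- st_as_sf(
  data.frame(
    OBJECTID = 1:4,
    x        = c(447000, 447050, 447150, 447500),
    y        = c(4636000, 4636000, 4636000, 4636000)
  ),
  coords = c("x", "y"), crs = 26916
)

# Buffer only the first stop by 100 m; the result is a polygon feature.
buf_one <- st_buffer(toy_stops[1, ], dist = 100)

# Spatial subset: keep the stops that intersect the buffer polygon.
nearby <- toy_stops[buf_one, ]

st_area(buf_one)    # close to pi * 100^2 square meters
nearby$OBJECTID     # stops within 100 m of the first stop
```

Indexing one `sf` object by another, as in `toy_stops[buf_one, ]`, selects features by intersection, which is one simple form of geometric subsetting.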