# From csv to slippy map: Intro to spatial data in R

Anna Spiers, CU Boulder  
08 June 2021
R Ladies Boulder Chapter Meetup

AIS sexy map here

**Learning objectives**  
1. Describe what distinguishes spatial data
2. Understand the basic differences between the `sp` and `sf` packages
3. Reproject 
4. 

### What is spatial data?

**AIS BIG IMAGE HERE OF ALL THE EXAMPLES**
Examples: field sites, rivers, county boundaries, elevation maps, point clouds, ...

* Spatial data are data that contain information about a specific location in space, in relative (e.g., two trees are 2m apart) or absolute (e.g., on Earth) terms 
* Different types:
    * Vector: graphical representations of the world 
        * points, lines, polygons
        * File type: .shp, .kml, .kmz, .gpkg, .gpx
    * Raster: grid of numbers
        * ex: digital elevation model (DEM), microscope image
        * File type: .tif
    * 3D: three-dimensional
        * point clouds
        * File type: .las, .laz

### Vector data
* We will work with vector data in this tutorial 
**AIS another image of the vector data examples**
* Data types:
    * Points: field sites, state capitals, sensor locations
    * Lines: rivers, division in topography, flight line
    * Polygons: field plot, county boundaries, buffer around a point or line
* Vector data typically store a **geometry** (vertex coordinates) and **attributes** (metadata) for each feature
* File types
    * .shp - [ESRI](https://www.esri.com/en-us/home) developed the shapefile. Consists of several files, so zip them before sharing.
    * .kml/.kmz - Keyhole Markup Language. KML was developed by Keyhole, Inc., which was later acquired by Google. It is a simple, XML-based spatial data format that was originally designed for Google Earth, but is now compatible with non-Google products. A .kmz file is a zipped KML.
    * .gpkg - The GeoPackage is a more modern alternative to the shapefile. You can save different types of spatial data together in one .gpkg file. Often used in QGIS. (Not to be confused with .gpx, the GPS Exchange Format used for GPS tracks and waypoints.)
    * Many more file types, but we'll stop here
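* We will work with shapefiles because we want to work with attributes, and shapefiles store them in a simple table that R reads directly. KML can carry attributes too (through `name`, `description`, and `ExtendedData` elements), but support for them is less consistent across tools.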
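
To make the geometry/attributes split concrete, here is a minimal sketch using `sf`. The trap IDs, habitat labels, and coordinates are invented for illustration and are not the real Wog Wog data.

```r
library(sf)

# Hypothetical traps: IDs, habitats, and coordinates are made up for this sketch.
traps <- data.frame(
  trap_id = c("T1", "T2", "T3"),
  habitat = c("forest", "forest", "pine"),
  lon     = c(149.470, 149.472, 149.475),
  lat     = c(-37.080, -37.081, -37.083)
)

# crs = 4326 declares the coordinates as WGS84 longitude/latitude.
traps_sf <- st_as_sf(traps, coords = c("lon", "lat"), crs = 4326)

# The geometry: one POINT per feature
st_geometry(traps_sf)

# The attributes: an ordinary data frame with the geometry column dropped
st_drop_geometry(traps_sf)
```

Each row is one feature; the `geometry` column holds its coordinates and every other column is an attribute.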
    

### Case example
* Wog Wog Habitat Fragmentation Experiment
        **AIS IMAGE OF WOG WOG**
* Join me as a graduate student in an Ecology & Evolutionary Biology department. We have the coordinates for the invertebrate pitfall traps across the site. We want to make an interactive map of the field site. How do we do this? 
* High-level workflow
    1. CSV with coordinates of traps
    2. Convert CSV to SHP
    3. Plot SHP as map

### Getting started in R: spatial packages
* There are tons of spatial packages
* Two main ones are `sp` and `sf`.
* `sp`
    * First released on CRAN in 2005
    * Offers functionality to create and modify vector data and grids
    * Many (about 350) of R's spatial packages use `sp` as a dependency
    * Objects are S4 classes made up of 'slots' that describe the spatial object: always a bounding box and a coordinate reference system (CRS), and optionally attributes (e.g., name, soil type, replicate)
    * Examples of `sp` object classes: `SpatialPoints`, `SpatialLines`, `SpatialLinesDataFrame`, `SpatialPolygonsDataFrame`, and more
    * To read in a shapefile: `readOGR()` (from the companion `rgdal` package)
* `sf`
    * Stands for 'Simple Features'
    * First released on CRAN in 2016
    * Implements the Simple Features standard, whose geometries can be written as [WKT (well-known text)](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry)
    * Examples of `sf` geometry types: `POINT`, `LINESTRING`, `POLYGON`, and more
    * To read in a shapefile: `st_read()` - slightly faster than `readOGR()`
* Data are structured and conceptualized differently between `sp` and `sf`. Other spatial packages may have different formatting standards and class definitions, but many are built as wrappers around `sp` and, increasingly, `sf`.
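
A small side-by-side sketch of how the two packages structure the same points. The data are invented, and the example assumes both `sp` and `sf` are installed.

```r
library(sp)
library(sf)

# Made-up trap coordinates for illustration
traps <- data.frame(
  trap_id = c("T1", "T2"),
  lon     = c(149.470, 149.472),
  lat     = c(-37.080, -37.081)
)

# sp: an S4 object whose pieces live in slots, accessed with @
traps_sp <- SpatialPointsDataFrame(
  coords      = as.matrix(traps[, c("lon", "lat")]),
  data        = traps["trap_id"],
  proj4string = CRS("+proj=longlat +datum=WGS84"),
  match.ID    = FALSE
)
class(traps_sp)
slotNames(traps_sp)  # includes the bounding box and CRS slots
traps_sp@bbox

# sf: a regular data frame with an extra geometry list-column
traps_sf <- st_as_sf(traps, coords = c("lon", "lat"), crs = 4326)
class(traps_sf)
names(traps_sf)

# Converting between the two representations
sp_from_sf <- as(traps_sf, "Spatial")
sf_from_sp <- st_as_sf(traps_sp)
```

Because an `sf` object is still a data frame, the usual data frame tools (subsetting, `merge()`, dplyr verbs) work on it directly, whereas `sp` attributes sit in the `@data` slot.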

* Load your csv and look at the data
* convert to shp - with sp and sf
* look at difference between sp and sf shapefile
* save shapefile locally and see that it's 5 files
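
A sketch of the "save and look at the files" step, assuming `sf` and made-up coordinates. Which component files appear can depend on the data and the GDAL version, so the code lists whatever was written rather than assuming a fixed count.

```r
library(sf)

# Made-up trap coordinates for illustration
traps <- data.frame(
  trap_id = c("T1", "T2", "T3"),
  lon     = c(149.470, 149.472, 149.475),
  lat     = c(-37.080, -37.081, -37.083)
)

# Look at the data before converting it
str(traps)

traps_sf <- st_as_sf(traps, coords = c("lon", "lat"), crs = 4326)

# Write the shapefile into its own fresh folder so the pieces are easy to see
out_dir <- tempfile("traps_shp_")
dir.create(out_dir)
st_write(traps_sf, file.path(out_dir, "traps.shp"), quiet = TRUE)

# One "shapefile" is really several files sharing a base name
shp_files <- list.files(out_dir)
shp_files
```

All of the listed files need to travel together, which is why zipping the folder before sharing is a good habit.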

* sp: SpatialLines to SpatialPointsDataFrame (with attributes)
* sp: see resource for list of functions

sf



##### Mapping
3. Work through some advanced mapping techniques to show the range of versatility of mapping in R. There are plenty of examples online that offer a few cookie cutter ways to map your data basically, but showing some cool expansions or extra features that are harder to find would be a good use of time  

* geojson.io or google earth - use to extract polygons or lines
* https://www.vdatum.noaa.gov/ - if you need a niche CRS
* view raster data: see what scott sent
* view point cloud data: website from Leah's lesson
* just Leah's lessons in general
* what is a gpx file?
* store all spatial data in one QGIS project


* So you have a CSV of coordinates. What do you do?
  * convert to shp/kml - **Basics of converting CSVs to shapefiles etc. and why**
  * but what CRS? 
    * geographic (latitude/longitude) vs projected
    * Google Maps coordinates are latitude/longitude in WGS84 (EPSG:4326)
    * best to do spatial analyses using a projection like your UTM zone (CO is 13N, Wog Wog is 55S) - include image of UTM zones 
* **Once the data is in working order, very brief basics of mapping**
* static maps... introduce this or no?
* interactive maps in R (tmap, leaflet). 
  * many ways to do this, but I will use `tmap`
  * *Reference CU Boulder libraries guide*
  * how to change which label is visible (change order in spdf)
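
A minimal interactive-map sketch with `tmap`, assuming the package is installed and using invented trap data. Putting the preferred label column first follows the note above about column order; whether the first column is used as the hover label may depend on the `tmap` version, so treat that part as an assumption to check.

```r
library(sf)
library(tmap)

# Made-up trap data for illustration
traps_sf <- st_as_sf(
  data.frame(
    habitat = c("forest", "forest", "pine"),
    trap_id = c("T1", "T2", "T3"),
    lon     = c(149.470, 149.472, 149.475),
    lat     = c(-37.080, -37.081, -37.083)
  ),
  coords = c("lon", "lat"),
  crs    = 4326
)

# Reorder attributes so trap_id comes first (the geometry column is kept automatically)
traps_sf <- traps_sf[, c("trap_id", "habitat")]

# "view" mode gives an interactive slippy map; "plot" mode gives a static one
tmap_mode("view")
trap_map <- tm_shape(traps_sf) + tm_dots()

# Display the map when working interactively
if (interactive()) print(trap_map)
```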


##### Misc
* **Cool examples of the range of versatility, harder to find stuff, maybe some sticking points that took you a while to work through and the solution.**
* use naming convention - agreed upon by team or standard (e.g., remote sensing data: L0, L1, etc.)
* pay attention to CRS
  * recommended to do analysis in UTM, but mapping in WGS84 (e.g., the `leaflet` maps behind `tmap`'s interactive mode expect WGS84 longitude/latitude)
* best practices in storing data (save large data files locally)
* I recommend visualizing your data in a GIS first, like QGIS, as a sanity check. Is the CRS right? Does the layout look right? Do features that should be near each other appear near each other?
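
The CRS advice above (analyze in UTM, map in WGS84) can be sketched with `sf::st_transform()`. The coordinates are invented; EPSG:32755 is the code for WGS84 / UTM zone 55S, the zone noted for Wog Wog.

```r
library(sf)

# Made-up trap coordinates in WGS84 longitude/latitude
traps_wgs84 <- st_as_sf(
  data.frame(
    trap_id = c("T1", "T2", "T3"),
    lon     = c(149.470, 149.472, 149.475),
    lat     = c(-37.080, -37.081, -37.083)
  ),
  coords = c("lon", "lat"),
  crs    = 4326
)

# Always check which CRS you are starting from
st_crs(traps_wgs84)$epsg
st_is_longlat(traps_wgs84)

# Reproject to UTM zone 55S so that coordinates and distances are in metres
traps_utm <- st_transform(traps_wgs84, crs = 32755)
st_coordinates(traps_utm)

# Analysis in the projected CRS: pairwise distances and 50 m buffers
trap_dist    <- st_distance(traps_utm)
trap_buffers <- st_buffer(traps_utm, dist = 50)

# Back to WGS84 for web mapping
traps_back <- st_transform(traps_utm, crs = 4326)
```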


##### Troubleshooting
* always ask Stack Overflow and your spatial friends


##### Relevant topics that I will not cover
* conversion from proj4 strings to the newer PROJ (version 6 and later) CRS representation - when and why?
* working with raster data (this tutorial involves only vector data)
* QGIS stuff... how to share QGIS data? 

##### My resources for this presentation:
* https://cengel.github.io/rspatial/2_spDataTypes.nb.html
* https://www.nceas.ucsb.edu/sites/default/files/2020-04/OverviewCoordinateReferenceSystems.pdf
*
