# Spatial Modeling and Analytics

### Segment 3 of 4
## Spatial Data Frames

```python
# This code cell starts the necessary setup for Hour of CI lesson notebooks.
# First, it enables users to hide and unhide code by producing a 'Toggle raw code' button below.
# Second, it imports the hourofci package, which is necessary for lessons and interactive Jupyter Widgets.
# Third, it helps hide/control other aspects of Jupyter Notebooks to improve the user experience
# This is an initialization cell
# It is not displayed because the Slide Type is 'Skip'

from IPython.display import HTML, IFrame, Javascript, display
from ipywidgets import interactive
import ipywidgets as widgets
from ipywidgets import Layout
import warnings
import getpass # This library allows us to get the username (User agent string)

# import package for hourofci project
import sys
sys.path.append('../../supplementary') # relative path (may change depending on the location of the lesson notebook)
import hourofci


# load javascript to initialize/hide cells, get user agent string, and hide output indicator
# hide code by introducing a toggle button "Toggle raw code"
HTML(''' 
    <script type="text/javascript" src=\"../../supplementary/js/custom.js\"></script>
    
    <style>
        .output_prompt{opacity:0;}
    </style>
    
    <input id="toggle_code" type="button" value="Toggle raw code">
''')
```

```python
%load_ext rpy2.ipython
```

## R Spatial Data Types

If you completed the introductory lesson on Geospatial Data, you already know the two most common types of spatial data:
- Rasters (grids of values)
- Vectors (points, lines and polygons)

In R, these different kinds of data are handled by different packages:
- Raster data use the <code>raster</code> package
- Vector data use the <code>sp</code> and <code>sf</code> packages
    - <code>sp</code> and <code>sf</code> represent spatial information efficiently as spatial data frames
    - these spatial data frames are accepted by most R packages that require spatial data

## Raster Data Representation in R

Raster data have Information + Attributes
- Information is the metadata that describes the grid (its size, location and projection)
- Attributes are the data values associated with the grid cells

Necessary raster information includes:
- nrows: Number of rows
- ncols: Number of columns
- nbands: Number of bands
- Extent: Spatial extent
- Projection: Projection information
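
As a minimal sketch, each of these pieces of information can be read back from a raster object with an accessor function from the <code>raster</code> package. The grid size, extent, values and projection string below are made up for illustration only.

```r
library(raster)

# Build a small single-band raster: 4 rows x 6 columns over a made-up extent
r <- raster(nrows = 4, ncols = 6,
            xmn = 0, xmx = 60, ymn = 0, ymx = 40,
            crs = "+proj=longlat +datum=WGS84",
            vals = 1:24)

nrow(r)      # number of rows
ncol(r)      # number of columns
nlayers(r)   # number of bands (called layers in the raster package)
extent(r)    # spatial extent: xmin, xmax, ymin, ymax
crs(r)       # projection information
res(r)       # cell size in x and y, derived from the extent and grid size
```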

## Raster Data Representation in R

There are three types of raster objects
- RasterLayer: Single variable (band) rasters<br>
<img src ='supplementary/red.png' width=53>
- RasterStack: Multi variable (band) rasters. Data can reside in different files on disk. Preferred for flexibility.<br>
<img src ='supplementary/on_disk1.png' width=130>
- RasterBrick: Multi variable (band) rasters. Data has to be from a single file on disk. Preferred for performance.<br>
<img src ='supplementary/on_disk2.png' width=130>
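
The three object types can be sketched in memory as follows. The band names and values are hypothetical; in practice the layers would usually be read from files on disk.

```r
library(raster)

# One single-band layer with made-up values: a RasterLayer
band1 <- raster(nrows = 3, ncols = 3, vals = 1:9)

# Two more layers derived from the first, standing in for other bands
band2 <- band1 * 2
band3 <- band1 + 100

# Combine separate layers into a multi-band RasterStack
s <- stack(band1, band2, band3)

# Convert the stack into a RasterBrick
b <- brick(s)

class(band1)
class(s)
class(b)
nlayers(b)   # number of bands in the brick
```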

## A Closer Look at a Raster Data Frame

Because rasters can hold very large volumes of data, R displays only header information when a raster object is printed

```r
class      : RasterBrick 
dimensions : 9545, 9340, 89150300, 3         (= nrow, ncol, ncell, nlayers)
resolution : 30, 30                          (= x, y)
extent     : -2210387, -1930187, 1556342, 1842692  (= xmin, xmax, ymin, ymax)
crs        : NA 
source     : memory
names      : Band_1, Band_2, Band_3 
min values :      0,      0,      0 
max values :  65535,  65535,  65535 
```

The utility function <code>values()</code> can be used to access raster values
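
A short sketch of <code>values()</code> on a tiny raster with made-up cell values is shown here. For a single-band raster it returns a vector; for a multi-band raster it returns a matrix with one column per band.

```r
library(raster)

# A 2 x 3 single-band raster with made-up values
r <- raster(nrows = 2, ncols = 3, vals = c(5, 1, 4, 2, 6, 3))

r                # printing shows only the header summary
v <- values(r)   # all cell values as a vector, ordered row by row
v

# For a multi-band object, values() returns a matrix (cells x bands)
m <- values(stack(r, r * 10))
dim(m)
```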

## Spatial Vectors in R: the sf standard

While both <code>sp</code> and <code>sf</code> data frames are used in spatial R, the <code>sf</code> package is newer and is preferred whenever possible
- the <code>sf</code> package uses the simple features (sf) standard of the Open Geospatial Consortium (OGC - https://www.ogc.org/)

The simple feature geometry types are:
1. Point
1. Polygon
1. LineString
1. MultiPoint
1. MultiPolygon
1. MultiLineString
1. GeometryCollection
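
As an illustration, a few of these geometry types can be built by hand with constructor functions from the <code>sf</code> package. The coordinates are arbitrary and carry no coordinate reference system.

```r
library(sf)

# Single geometries built from made-up coordinates
pt   <- st_point(c(1, 2))
line <- st_linestring(rbind(c(0, 0), c(1, 1), c(2, 1)))
# A polygon ring must close: the last vertex repeats the first
poly <- st_polygon(list(rbind(c(0, 0), c(4, 0), c(4, 4), c(0, 4), c(0, 0))))
mpt  <- st_multipoint(rbind(c(0, 0), c(1, 1)))

# Collect them into a geometry column and check their types
geoms <- st_sfc(pt, line, poly, mpt)
st_geometry_type(geoms)
```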

## Spatial Vector Data Representation in R

Same pattern as raster data: Geometry Info + Attributes
- sf and sp only differ in how they represent geometry information

Once vector data is in sf or sp format, you can
- use spatial operators such as join, dissolve, merge, etc.
- reproject the data (change projection system)
- spatially subset the data
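
Two of these operations, reprojection and spatial subsetting, might look like the sketch below using <code>sf</code>. The place names, coordinates and bounding box are hypothetical, and EPSG codes 4326 and 3857 are chosen only as familiar examples.

```r
library(sf)

# Hypothetical point data: three places with made-up lon/lat coordinates
df <- data.frame(name = c("A", "B", "C"),
                 lon  = c(-93.2, -88.2, -120.5),
                 lat  = c(45.0, 40.1, 37.0))
pts <- st_as_sf(df, coords = c("lon", "lat"), crs = 4326)

# Reproject from WGS84 longitude/latitude to Web Mercator
pts_merc <- st_transform(pts, 3857)

# Spatial subset: keep only the points that intersect a made-up box
box <- st_as_sfc(st_bbox(c(xmin = -100, ymin = 35, xmax = -80, ymax = 50),
                         crs = st_crs(4326)))
pts_in_box <- pts[box, ]
pts_in_box$name
```

Indexing an sf data frame with another geometry, as in <code>pts[box, ]</code>, keeps the rows whose geometries intersect it.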

## A Closer Look at a Point Vector Data Frame
Here is an example sf data frame containing a **point** dataset

<center><img src = 'supplementary/vector_df2.png' width = 100%></center>

## A Closer Look at a Polygon Vector Data Frame
Here is an example sf data frame containing a **polygon** dataset

<center><img src = 'supplementary/vector_df.png' width = 100%></center>
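
To see the Geometry Info + Attributes pattern in code, here is a small sketch that assembles a polygon sf data frame and then pulls the two parts apart. The parcel ids, land-use labels and coordinates are invented for illustration.

```r
library(sf)

# Two made-up square parcels (no coordinate reference system assigned)
sq1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
sq2 <- st_polygon(list(rbind(c(1, 0), c(3, 0), c(3, 2), c(1, 2), c(1, 0))))

# Attributes plus a geometry column make an sf data frame
parcels <- st_sf(id = c(1, 2),
                 landuse = c("park", "school"),
                 geometry = st_sfc(sq1, sq2))

class(parcels)              # an sf object is also a data.frame
st_drop_geometry(parcels)   # attribute table only
st_geometry(parcels)        # geometry column only
st_area(parcels)            # area of each polygon in coordinate units
```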

Now we're ready to work with some geospatial data using R. 

<font size="+1"><a style="background-color:blue;color:white;padding:12px;margin:10px;font-weight:bold;" 
href="sma-tryit-raster.ipynb">Click here to go to the next notebook.</a></font>
