### Introduction to Raster Data
<blockquote class="objectives">
  <h2>Overview</h2>

<div class="row">
    <div class="col-md-3">
      <strong>Teaching:</strong> 15 min
      <br>
      <strong>Exercises:</strong> 0 min
    </div>
<div class="col-md-9">
      <strong>Questions</strong>
<ul>
	
<li><p>What is a raster?</p>
</li>
	
<li><p>What are the main attributes of raster data?</p>
</li>
	
</ul>
    </div>
  </div>
  
<div class="row">
    <div class="col-md-3">
    </div>
    <div class="col-md-9">
      <strong>Objectives</strong>
<ul>
	
<li><p>Understand the raster data model</p>
</li>
	
<li><p>Describe the strengths and weaknesses of storing data in raster format.</p>
</li>
	
<li><p>Distinguish between continuous and categorical raster data and identify types of datasets that would be stored in each format.</p>
</li>
	
</ul>
    </div>
  </div>
  
</blockquote>
<h1 id="data-structures-raster-and-vector">Data Structures: Raster and Vector</h1>
<p>The two primary types of geospatial data are raster and vector data. 
Vector data structures represent specific features on the Earth’s surface, and assign attributes to those features. 
Raster data is stored as a grid of values which are rendered on a map as pixels. 
Each pixel value represents an area on the Earth’s surface.</p>
<h2 id="about-raster-data">About Raster Data</h2>
<p>Raster data is any pixelated (or gridded) data where each pixel is associated
with a specific geographical location. The value of a pixel can be
continuous (e.g. elevation) or categorical (e.g. land use).</p>
<p>If this sounds
familiar, it is because this data structure is very common: it’s how
we represent any digital image. A geospatial raster is only different
from a digital photo in that it is accompanied by spatial information
that connects the data to a particular location. This includes the
raster’s extent and cell size, the number of rows and columns, and
its coordinate reference system (or CRS).</p>

<p><img src="raster_concept.png" alt="Raster Concept"></p>

<p class="text-center">Source: National Ecological Observatory Network (NEON)</p>
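<p>To make the idea concrete, here is a minimal sketch of a raster as a grid of values plus the spatial information listed above. The <code>Raster</code> class, its field names, and the coordinates are all hypothetical and chosen only for illustration; real workflows would normally rely on a geospatial library rather than a hand-written class.</p>

```python
from dataclasses import dataclass


@dataclass
class Raster:
    """A toy raster: a grid of values plus the metadata that locates it."""
    values: list      # rows of pixel values, top row first
    x_min: float      # left edge of the extent
    y_max: float      # top edge of the extent
    cell_size: float  # ground width and height of one square pixel
    crs: str          # label for the coordinate reference system

    @property
    def n_rows(self):
        return len(self.values)

    @property
    def n_cols(self):
        return len(self.values[0])

    def extent(self):
        """Return (x_min, y_min, x_max, y_max) implied by the grid shape."""
        x_max = self.x_min + self.n_cols * self.cell_size
        y_min = self.y_max - self.n_rows * self.cell_size
        return (self.x_min, y_min, x_max, self.y_max)


# Hypothetical 2 x 3 elevation grid (meters) with made-up coordinates.
dem = Raster(
    values=[[310.0, 315.5, 322.0],
            [305.2, 311.8, 318.4]],
    x_min=731000.0,
    y_max=4713000.0,
    cell_size=10.0,
    crs="EPSG:32618",  # assumed label, for illustration only
)
print(dem.n_rows, dem.n_cols, dem.extent())
```

<p>Without the last four fields, <code>values</code> would just be a small image; with them, every pixel can be tied to a location on the ground.</p>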

<p>In the 1950s, raster graphics were noted as a faster and cheaper (but
lower-resolution) alternative to vector graphics.</p>

<p>Some examples of continuous rasters include:</p>

<ol>
  <li>Orthorectified multispectral imagery such as that acquired by the <a href="https://landsat.usgs.gov">Landsat</a> or <a href="https://modis.gsfc.nasa.gov">MODIS</a> sensors</li>
  <li>Digital Elevation Models (DEMs) such as <a href="https://asterweb.jpl.nasa.gov/gdem.asp">ASTER GDEM</a></li>
  <li>Maps of canopy height derived from LiDAR data.</li>
</ol>

<p>A map of elevation for Harvard Forest derived from the <a href="http://www.neonscience.org/data-collection/airborne-remote-sensing">NEON AOP LiDAR sensor</a>
is below. Elevation is represented as a continuous numeric variable in this map. The legend
shows the continuous range of values in the data, from around 300 to 420 meters.</p>

<p><img src="rmd-01-elevation-map-1.png" alt="NEON AOP LiDAR Render"></p>

<p>Some rasters contain categorical data where each pixel represents a discrete
class such as a landcover type (e.g., “forest” or “grassland”) rather than a
continuous value such as elevation or temperature. Some examples of classified
maps include:</p>

<ol>
  <li>Landcover / land-use maps.</li>
  <li>Tree height maps classified as short, medium, and tall trees.</li>
  <li>Snowcover masks (binary snow or no snow)</li>
</ol>

<p>The following map shows the contiguous United States with landcover as categorical
data. Each color is a different landcover category. (Source: Homer, C.G., et
al., 2015, Completion of the 2011 National Land Cover Database for the
conterminous United States-Representing a decade of land cover change
information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p.
345-354)</p>

<p><img src="USA_landcover_classification_sm.jpg" alt="USA landcover classification"></p>

<p>The following map shows elevation data for the NEON Harvard Forest field
site. In this map, the elevation data (a continuous variable) has been divided
up into categories to yield a categorical raster.</p>

<p><img src="rmd-01-classified-elevation-map-1.png" alt="NEON Classified"></p>
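<p>Dividing a continuous raster into categories can be sketched in a few lines. The class boundaries and labels below are hypothetical (they are not the ones used to draw the map above), and the grid is a made-up 2 x 3 example.</p>

```python
from bisect import bisect_right

# Hypothetical class boundaries in meters: values below 340 fall in class 0,
# values from 340 up to (but not including) 380 in class 1, and 380 or more
# in class 2.
BREAKS = [340.0, 380.0]
LABELS = ["low", "mid", "high"]


def classify(grid, breaks):
    """Replace each continuous value with the index of the class it falls in."""
    return [[bisect_right(breaks, value) for value in row] for row in grid]


elevation = [[305.0, 342.5, 391.0],
             [339.9, 380.0, 415.3]]

classes = classify(elevation, BREAKS)
for row in classes:
    print([LABELS[c] for c in row])
```

<p>The output grid has the same shape as the input, but each pixel now holds a class index instead of a measurement, so the original elevation values can no longer be recovered from it.</p>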

<h2 id="advantages-and-disadvantages">Advantages and Disadvantages</h2>

<p>Advantages:</p>

<ul>
  <li>representation of continuous surfaces</li>
  <li>potentially very high levels of detail</li>
  <li>data is ‘unweighted’ across its extent - the geometry doesn’t implicitly highlight features</li>
  <li>cell-by-cell calculations can be very fast and efficient</li>
</ul>
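<p>As an illustration of a cell-by-cell calculation, the sketch below subtracts one grid from another. It assumes two hypothetical rasters on exactly the same grid: a surface model (<code>dsm</code>, top of canopy) and a terrain model (<code>dtm</code>, bare ground); the names and values are invented for the example.</p>

```python
# Two hypothetical rasters covering the same cells (values in meters).
dsm = [[320.0, 325.5],
       [318.0, 330.0]]  # elevation of the top surface
dtm = [[310.0, 311.5],
       [309.0, 312.0]]  # elevation of the bare ground


def subtract(a, b):
    """Cell-by-cell difference of two grids with identical shape."""
    return [[x - y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


canopy_height = subtract(dsm, dtm)
print(canopy_height)
```

<p>Because both grids line up cell for cell, no geometry has to be intersected or matched; each output value depends only on the two input values at the same position.</p>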

<p>Disadvantages:</p>

<ul>
  <li>very large file sizes as cell size gets smaller</li>
  <li>can be difficult to represent complex information</li>
  <li>measurements are spatially arranged in a regular grid, which may not be an
accurate representation of real-world phenomena</li>
  <li>space-filling model assumes that all pixels have a value</li>
  <li>changes in resolution can drastically change the meaning of values in a dataset</li>
</ul>
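<p>The first disadvantage can be quantified with some quick arithmetic. The sketch below assumes a hypothetical 10 km x 10 km area, square cells, and 4 bytes per cell with no compression; actual file sizes depend on the data type and compression used.</p>

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to tile a width x height area."""
    n_cols = math.ceil(width_m / cell_size_m)
    n_rows = math.ceil(height_m / cell_size_m)
    return n_rows * n_cols


def uncompressed_bytes(width_m, height_m, cell_size_m, bytes_per_cell=4):
    """Rough storage estimate, assuming a fixed number of bytes per cell."""
    return cell_count(width_m, height_m, cell_size_m) * bytes_per_cell


for cell_size in (30, 10, 1):
    n = cell_count(10_000, 10_000, cell_size)
    mb = uncompressed_bytes(10_000, 10_000, cell_size) / 1e6
    print(f"{cell_size:>2} m cells: {n:>11,} cells, about {mb:,.1f} MB")
```

<p>Halving the cell size quadruples the number of cells, so storage grows with the square of the improvement in resolution.</p>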

<h2 id="important-attributes-of-raster-data">Important Attributes of Raster Data</h2>

<p>A raster is just an image in local pixel coordinates until we specify what part
of the earth the image covers. This is done through the following pieces of metadata:</p>

<h3 id="extent">Extent</h3>

<p>The spatial extent is the geographic area that the raster data covers - a bounding box defined by the 
minimum and maximum x and y coordinates of the data.</p>
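<p>As a minimal sketch, the bounding box of any set of coordinates can be computed from the minimum and maximum of its x and y values. The function name and the sample coordinates are hypothetical.</p>

```python
def extent(coords):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Made-up coordinates, for illustration only.
sample_points = [(2, 3), (5, 7), (8, 4)]
print(extent(sample_points))
```

<p>Two datasets share an extent only if all four of these numbers match, which is worth checking before overlaying or comparing them.</p>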

<p><img src="spatial_extent.png" alt="Spatial extent image"></p>

<p class="text-center">(Image Source: National Ecological Observatory Network (NEON))</p>

<blockquote class="challenge">
  <h2 id="extent-challenge">Extent Challenge<span class="fold-unfold glyphicon glyphicon-collapse-down"></span></h2>
<p>In the image above, the dashed boxes around each set of objects seem to imply that the three objects have the same extent. Is this accurate? If not, which object(s) have a different extent?</p>

  <blockquote class="solution" style="display: none;">
    <h2 id="solution">Solution<span class="fold-unfold glyphicon glyphicon-collapse-down"></span></h2>

    <p style="display: none;">The lines and polygon objects have the same extent. The extent for
the points object is smaller in the vertical direction than the 
other two because there are no points on the line at y = 8.</p>
  </blockquote>
</blockquote>

<h3 id="resolution">Resolution</h3>

<p>The resolution of a raster represents the area on the ground that each
pixel of the raster covers. The image below illustrates the effect
of changes in resolution.</p>

<p><img src="raster_resolution.png" alt="Resolution image"></p>

<p class="text-center">(Source: National Ecological Observatory Network (NEON))</p>
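<p>One way to see the effect of a coarser resolution is to average blocks of neighboring cells. The sketch below is only one of several possible resampling approaches (block averaging), applied to a made-up 4 x 4 grid; the <code>coarsen</code> function is hypothetical.</p>

```python
def coarsen(grid, factor):
    """Average non-overlapping factor x factor blocks of a grid.

    Rows or columns that do not fill a complete block are dropped.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(0, n_rows - n_rows % factor, factor):
        out_row = []
        for c in range(0, n_cols - n_cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


fine = [[1, 3, 5, 7],
        [1, 3, 5, 7],
        [2, 2, 8, 8],
        [4, 4, 6, 6]]

coarse = coarsen(fine, 2)
print(coarse)
```

<p>Each output cell now covers four times the ground area of an input cell, and the variation inside each block is lost.</p>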

<h3 id="coordinate-reference-system-crs">Coordinate Reference System (CRS)</h3>

<p>This specifies the datum, projection, and additional parameters needed to 
place the raster in geographic space.</p>

<p><img src="us_crs.jpg" alt="Maps of the United States in different projections"></p>

<p>For a dedicated lesson on CRSs, see: 
<a href="https://datacarpentry.org/organization-geospatial/03-crs/index.html">https://datacarpentry.org/organization-geospatial/03-crs/index.html</a></p>

<h3 id="affine-geotransformation">Affine Geotransformation</h3>

<p>This is the essential matrix that relates the raster pixel coordinates (rows, columns) to the geographic coordinates (x and y defined by the CRS).</p>

<p>This is typically a 6-parameter matrix that defines the origin, pixel size and rotation of the raster in the geographic coordinate system:</p>

<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
    Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)
</code></pre></div></div>
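<p>The two equations above translate directly into code. In the sketch below, <code>gt</code> is a 6-element tuple in the same order as <code>GT(0)</code> to <code>GT(5)</code>; the function names and the example values are hypothetical. The example assumes a north-up raster with 10 m pixels, so the rotation terms are zero and <code>GT(5)</code> is negative (y decreases as the line number increases).</p>

```python
def pixel_to_geo(gt, x_pixel, y_line):
    """Apply the 6-parameter geotransform to a pixel/line position."""
    x_geo = gt[0] + x_pixel * gt[1] + y_line * gt[2]
    y_geo = gt[3] + x_pixel * gt[4] + y_line * gt[5]
    return x_geo, y_geo


def geo_to_pixel(gt, x_geo, y_geo):
    """Invert the geotransform (requires a non-zero determinant)."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    dx = x_geo - gt[0]
    dy = y_geo - gt[3]
    x_pixel = (gt[5] * dx - gt[2] * dy) / det
    y_line = (-gt[4] * dx + gt[1] * dy) / det
    return x_pixel, y_line


# Made-up geotransform: origin at the top-left corner, 10 m square pixels,
# no rotation.
gt = (731000.0, 10.0, 0.0, 4713000.0, 0.0, -10.0)

corner = pixel_to_geo(gt, 3, 2)
print(corner)
print(geo_to_pixel(gt, *corner))
```

<p>Going from pixel to geographic coordinates and back returns the original pixel and line, which is a quick sanity check on any geotransform.</p>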

<p>You may have encountered an <a href="https://en.wikipedia.org/wiki/World_file">ESRI World file</a>, which defines this matrix.</p>

<p>For more information about the common GDAL data model used by most GIS applications, see:
<a href="https://www.gdal.org/gdal_datamodel.html">https://www.gdal.org/gdal_datamodel.html</a></p>

<h2 id="raster-data-format">Raster Data Format</h2>

<p>Raster data can come in many different formats. For this workshop, we will use
the GeoTIFF format which has the extension <code class="highlighter-rouge">.tif</code>. A <code class="highlighter-rouge">.tif</code> file stores metadata
or attributes about the file as embedded <code class="highlighter-rouge">tif tags</code>. For instance, your camera
might store EXIF tags that describe the make and model of the camera or the date
the photo was taken when it saves a <code class="highlighter-rouge">.tif</code>. A GeoTIFF is a standard <code class="highlighter-rouge">.tif</code> image
format with additional spatial (georeferencing) information embedded in the file
as tags. These tags should include the following raster metadata:</p>

<ol>
  <li>Geotransform (defines extent, resolution)</li>
  <li>Coordinate Reference System (CRS)</li>
  <li>Values that represent missing data (<code class="highlighter-rouge">NoDataValue</code>)</li>
</ol>

<p>Spatially-aware applications are careful to interpret this metadata
appropriately.  If we aren’t careful (or are using a raster-editing application
that ignores spatial information), we can accidentally strip this spatial
metadata.  Photoshop, for example, can edit GeoTIFFs, but we’ll lose the embedded
CRS and geotransform if we save to the same file!</p>
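<p>The <code class="highlighter-rouge">NoDataValue</code> matters for the same reason: software that ignores it will treat the placeholder as a real measurement. The sketch below assumes a hypothetical sentinel of <code>-9999.0</code> and a made-up 2 x 2 grid; the actual value varies from file to file and should be read from the file's metadata.</p>

```python
NODATA = -9999.0  # hypothetical sentinel; real files declare their own


def mean_ignoring_nodata(grid, nodata):
    """Mean of all cells except those equal to the NoData sentinel."""
    valid = [v for row in grid for v in row if v != nodata]
    if not valid:
        return None  # every cell is missing
    return sum(valid) / len(valid)


grid = [[310.0, NODATA],
        [320.0, 330.0]]

# Treating the sentinel as data drags the mean far below any real value.
naive_mean = sum(v for row in grid for v in row) / 4
masked_mean = mean_ignoring_nodata(grid, NODATA)
print(naive_mean, masked_mean)
```

<p>The naive mean is negative even though every real measurement is above 300, which shows how a single unmasked missing value can distort a summary statistic.</p>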

<blockquote class="callout">
  <h2 id="more-resources-on-the--tif-format">More Resources on the  <code class="highlighter-rouge">.tif</code> format</h2>

  <ul>
    <li><a href="https://en.wikipedia.org/wiki/GeoTIFF">GeoTIFF on Wikipedia</a></li>
    <li><a href="https://trac.osgeo.org/geotiff/">OSGEO TIFF documentation</a></li>
  </ul>
</blockquote>

<h2 id="multi-band-raster-data">Multi-band Raster Data</h2>

<p>A raster can contain one or more bands. One type of multi-band raster
dataset that is familiar to many of us is a color
image. A basic color image consists of three bands: red, green, and blue.</p>

<p><img src="RGBSTack_1.jpg" alt="RGB multi-band raster image"></p>

<p class="text-center">(Source: National Ecological Observatory Network (NEON).)</p>

<p>Each band represents light reflected from the red, green, or blue portion of
the electromagnetic spectrum. The pixel brightness values for each band, when
composited, create the colors that we see in an image.</p>

<p class="text-center"><img src="ETM+vOLI-TIRS-web_Feb20131_sm.jpg" alt="Bands in Landsat 7 (bottom row of rectangles) and Landsat 8 (top row)">
(Source: L.Rocchio &amp; J.Barsi)</p>

<p>We can plot each band of a multi-band image individually.</p>

<p><img src="rmd-01-demonstrate-RGB-Image-1.png" alt="RGB individual bands"></p>

<p>Or we can composite all three bands together to make a color image.</p>

<p><img src="rmd-01-plot-RGB-now-1.png" alt="RGB composite"></p>
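<p>Compositing can be sketched as stacking the three bands so that each pixel carries a (red, green, blue) triple. The band values below are invented 8-bit brightness values on a 2 x 2 grid, and the <code>composite</code> function is hypothetical.</p>

```python
def composite(red, green, blue):
    """Combine three single-band grids into one grid of (r, g, b) tuples."""
    shapes = {(len(band), len(band[0])) for band in (red, green, blue)}
    if len(shapes) != 1:
        raise ValueError("all bands must share the same grid shape")
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(red, green, blue)]


# Made-up 8-bit brightness values for each band.
red = [[255, 0],
       [0, 128]]
green = [[0, 255],
         [0, 128]]
blue = [[0, 0],
        [255, 128]]

rgb = composite(red, green, blue)
print(rgb)
```

<p>The shape check reflects the fact that bands can only be combined pixel by pixel when they cover the same grid.</p>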

<p>In a multi-band dataset, the rasters will always have the same extent,
resolution, and CRS.</p>

<blockquote class="callout">
  <h2 id="other-types-of-multi-band-raster-data">Other Types of Multi-band Raster Data</h2>

  <p>Multi-band raster data might also contain:</p>

  <ol>
    <li><strong>Time series:</strong> the same variable, over the same area, over time.</li>
    <li><strong>Multi- or hyperspectral imagery:</strong> image rasters that have 4 or
more (multispectral) or more than 10-15 (hyperspectral) bands</li>
  </ol>
</blockquote>

<ul>
  <li>Much of this lesson was adapted from: <a href="https://datacarpentry.org/organization-geospatial/01-intro-raster-data/index.html">https://datacarpentry.org/organization-geospatial/01-intro-raster-data/index.html</a></li>
</ul>

<blockquote class="keypoints">
<h2>Key Points</h2>
  <ul>
    
<li><p>Raster data is pixelated data where each pixel is associated with a specific location.</p>
</li>
    
<li><p>Raster data always has an extent and a resolution.</p>
</li>
    
<li><p>The extent is the geographical area covered by a raster.</p>
</li>
    
<li><p>The resolution is the area covered by each pixel of a raster.</p>
</li>
    
  </ul>
</blockquote>      
<hr>

This notebook is inspired by the material on the <a href="https://geohackweek.github.io/raster/04-workingwithrasters/">GeoHackWeek</a> website.

</body>
