## Lab 2 Point Pattern Analysis

In this lab we will perform point pattern analysis using R. As in previous labs, we will load a set of libraries and then create charts and maps of our data.

## Loading libraries and our earthquakes data

Let's first import our data and load up our libraries.

We will use a library called **readr**.
Our data is in the `data` directory, in a file named `earthquakes.csv`.

First, write a line of code below that imports the `readr` library.
Remember that you can import libraries via:
`library(name_of_library)`

In [8]:
# Line of code to import readr


Next, write a line of code below to read in the earthquake data.

Remember that in R, we name the variable on the left-hand side of the `<-` and then use a function on the right-hand side to read in the data.

The structure is like this:

    variable_name <- read_csv('file location')

In our case, the file is `earthquakes.csv` and the file location is `data/earthquakes.csv`

In [11]:
# Line of code to make a variable called quakes and read in csv from 'data/earthquakes.csv'


Parsed with column specification:
cols(
  CUSP_ID = col_integer(),
  NZMGE = col_integer(),
  NZMGN = col_integer(),
  ELAPSED_DAYS = col_double(),
  MAG = col_double(),
  DEPTH = col_double(),
  YEAR = col_integer(),
  MONTH = col_integer(),
  DAY = col_integer(),
  HOUR = col_integer(),
  MINUTE = col_integer(),
  SECOND = col_double()
)


You should also see a message appear in red as part of the output, listing each column and its type. The data themselves are laid out much like a spreadsheet.

In R, data tables are known as dataframes and each column is an attribute or variable.
The various variables that appear in the table are:

    CUSP_ID is a unique identifier for each earthquake or aftershock event
    NZMGE and NZMGN are the New Zealand Map Grid Easting and Northing coordinates
    ELAPSED_DAYS is the number of days after September 3, 2010, when the big earthquake was recorded
    MAG is the earthquake or aftershock magnitude
    DEPTH is the estimated depth at which the earthquake or aftershock occurred
    YEAR, MONTH, DAY, HOUR, MINUTE, SECOND provide detailed time information
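To get a feel for how a dataframe is organised, it can help to inspect a small one by hand. The sketch below builds a toy dataframe with a few of the column names listed above; the name `toy_quakes` and all of its values are invented for illustration and are not taken from `earthquakes.csv`. The same inspection functions should work on the real data once it is loaded.

```r
# Toy stand-in for the earthquake dataframe.
# All values are invented for illustration only.
toy_quakes <- data.frame(
  CUSP_ID = c(1001L, 1002L, 1003L, 1004L),
  NZMGE   = c(2480000L, 2481500L, 2479200L, 2482300L),
  NZMGN   = c(5741000L, 5742500L, 5740300L, 5743100L),
  MAG     = c(3.2, 4.1, 2.8, 5.0),
  DEPTH   = c(5.0, 12.3, 8.1, 10.6)
)

dim(toy_quakes)      # number of rows, then number of columns
names(toy_quakes)    # the column (variable) names
head(toy_quakes, 2)  # the first two rows
str(toy_quakes)      # the type of each column
summary(toy_quakes)  # basic summary statistics for each column
```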

## Examining the Data

The data are now stored in a variable named `quakes` (that is the name in my example; you may have called yours something different), so we can use R to compute some statistics.

You can refer to a column in the dataframe with the `$` operator, for example `quakes$MAG`.

In [15]:
# Get the mean of earthquake magnitudes
mean(quakes$MAG)

In [16]:
# Write a line of code here to get the mean of earthquake depths (the DEPTH column)

The boxplot function returns a helpful boxplot of our data. 
For example, a box plot of earthquake depth is made via:
`boxplot(quakes$DEPTH)`
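To see what a boxplot actually draws, the sketch below uses simulated values rather than the real earthquake data (the name `toy_depth` and its values are assumptions made for this example). `boxplot.stats()` returns the five numbers behind the picture: the lower whisker, lower hinge, median, upper hinge and upper whisker.

```r
set.seed(42)  # make the simulated values reproducible

# Simulated depths, NOT the real earthquake data
toy_depth <- runif(200, min = 0, max = 30)

# The five values that boxplot() draws, from bottom to top
five_numbers <- boxplot.stats(toy_depth)$stats
five_numbers

# The same data as a labelled boxplot
boxplot(toy_depth, main = "Simulated depths", ylab = "Depth")
```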

Make a boxplot of earthquake magnitude below:

In [19]:
# Line of code to make boxplot of earthquake magnitude

In [21]:
# The hist() function accepts a column of data, similar to boxplot(). 
# Try out the hist() function here and make a histogram of earthquake magnitudes.

It gets tedious typing `quakes` all the time, so you can attach the dataframe. The variable names are then directly accessible without the `quakes$` prefix:

    attach(quakes)
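As a small illustration of what attaching does, the sketch below uses a toy dataframe (the name `toy_quakes` and its values are invented for this example). It also shows `with()`, a base R function that gives the same shorthand for a single expression without leaving the dataframe attached, and `detach()`, which undoes `attach()`.

```r
# Toy dataframe; values are invented for illustration only
toy_quakes <- data.frame(
  MAG   = c(3.2, 4.1, 2.8, 5.0),
  DEPTH = c(5.0, 12.3, 8.1, 10.6)
)

# After attach(), columns can be used by name alone
attach(toy_quakes)
mean_attached <- mean(MAG)
detach(toy_quakes)  # undo the attach when finished

# with() evaluates one expression using the columns as variables
mean_with <- with(toy_quakes, mean(MAG))

mean_attached
mean_with
```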

In [20]:
# Attach the dataframe here

In [22]:
# Try calling the hist() function on just MAG instead of quakes$MAG here

You can make a simple map of all the data by plotting the NZMGE variable as the x (i.e. horizontal) axis and NZMGN as the y axis of a scatterplot:

    plot(NZMGE, NZMGN)

Because R is not a GIS it doesn’t automatically know about things like projections, so this is a very crude map. For example, if you resize the plot window, R rescales the east-west and north-south directions independently, which is not helpful for a map. To prevent this, we can pass an option to the plot command that fixes the aspect ratio at 1:

    plot(NZMGE, NZMGN, asp=1)
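The effect of `asp=1` is easiest to see side by side. The sketch below uses simulated coordinates (the names `toy_e` and `toy_n` and their values are assumptions for this example, not the real data) that spread much further east-west than north-south.

```r
set.seed(1)  # make the simulated coordinates reproducible

# Simulated coordinates: a wide east-west spread, a narrow north-south spread
toy_e <- 2480000 + runif(100, min = 0, max = 20000)
toy_n <- 5741000 + runif(100, min = 0, max = 5000)

par(mfrow = c(1, 2))  # two plots side by side

# Default: each axis is stretched to fill the panel
plot(toy_e, toy_n, main = "Default scaling")

# asp = 1: one unit on the x axis is drawn the same length as one unit on the y axis
plot(toy_e, toy_n, asp = 1, main = "asp = 1")

par(mfrow = c(1, 1))  # restore the single-plot layout
```

In the left panel the points appear to fill a roughly square area; in the right panel the pattern keeps its elongated shape.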

To see if there is a relationship between earthquake depth and magnitude, try this:

    plot(DEPTH, MAG)

In [None]:
# Try out plotting depth and mag

Because R is a statistics package, we can easily fit a simple regression model to the data and add the fitted line to the existing plot:

    regmodel <- lm(MAG ~ DEPTH)
    abline(regmodel, col='red')
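The same steps can be tried on simulated data to see what the fitted model contains. In the sketch below the dataframe `toy`, the slope of 0.02 and the noise level are all invented for illustration; they say nothing about the real earthquakes. Passing `data = toy` to `lm()` is an alternative to attaching the dataframe first.

```r
set.seed(7)  # make the simulated values reproducible

# Simulated data with a weak built-in linear relationship (invented values)
toy <- data.frame(DEPTH = runif(150, min = 0, max = 30))
toy$MAG <- 3 + 0.02 * toy$DEPTH + rnorm(150, sd = 0.5)

# Scatterplot, then fit the regression and overlay the fitted line
plot(toy$DEPTH, toy$MAG, xlab = "Depth", ylab = "Magnitude")
toy_model <- lm(MAG ~ DEPTH, data = toy)
abline(toy_model, col = "red")

coef(toy_model)               # estimated intercept and slope
summary(toy_model)$r.squared  # proportion of variance explained
```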

## Point Processes in R

The major strength of R is the wide range of packages that have been developed for performing specialized statistical analysis (like spatial analysis). For point patterns, the go-to package is `spatstat`.

To use `spatstat`, we have to load it with the `library()` function, just as we did for `readr` above.
Below, import the `spatstat` package.

In [None]:
# Line of code to import spatstat

The user guide for **spatstat** is here: http://www.stats.uwo.ca/faculty/kulperger/S9934a/Papers/SpatStatIntro.pdf

For our purposes, what you really need to know is that spatstat is one of the most powerful tools available in R for doing point pattern analysis.
    
Among other things, you can use spatstat to generate a wide variety of point patterns using a range of point processes. The relevant R commands include:
    
    rpoint() – generates a specified number of random, uniformly distributed points
    rpoispp() – generates random points distributed according to some specified intensity pattern (which may be uniform)
    rThomas() – generates clustered points via a ‘parent-child’ process
    rSSI() – generates points that exclude one another from being within some specified distance of each other

There are many others. You can see the full list in the general help page for the package, in the subsection entitled "To simulate a random point pattern".
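As a rough sketch of how the four commands listed above might be called, the code below simulates one pattern of each kind in the default unit square and plots them together. It assumes the `spatstat` package is installed; the parameter values (100 points, an intensity of 100, an inhibition distance of 0.05, and so on) are arbitrary choices made for this example.

```r
library(spatstat)

set.seed(123)  # make the simulated patterns reproducible

# Exactly 100 points, uniformly distributed
uniform_pts <- rpoint(100)

# Poisson process with intensity 100 points per unit area
# (the actual number of points is random)
poisson_pts <- rpoispp(100)

# Thomas cluster process: parent intensity 10, cluster spread 0.05,
# and an average of 5 'child' points per parent
clustered_pts <- rThomas(10, 0.05, 5)

# Simple sequential inhibition: up to 100 points, none closer than 0.05
inhibited_pts <- rSSI(0.05, 100)

par(mfrow = c(2, 2))  # 2 x 2 grid of plots
plot(uniform_pts, main = "rpoint")
plot(poisson_pts, main = "rpoispp")
plot(clustered_pts, main = "rThomas")
plot(inhibited_pts, main = "rSSI")
par(mfrow = c(1, 1))  # restore the single-plot layout

npoints(uniform_pts)         # number of points in the pattern
min(nndist(inhibited_pts))   # smallest nearest-neighbour distance
```

Comparing the four panels gives a feel for how random, clustered and evenly spaced patterns differ.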