---
title: "Self-organising map (SOM) analysis"
author: "Robert Schlegel"
date: "2019-06-04"
output: workflowr::wflow_html
editor_options:
  chunk_output_type: console
csl: FMars.csl
bibliography: MHWNWA.bib
---
```{r global_options, include = FALSE}
knitr::opts_chunk$set(fig.width = 8, fig.align = 'center',
                      echo = TRUE, warning = FALSE, message = FALSE,
                      eval = TRUE, tidy = FALSE)
```
## Introduction
This vignette contains the code used to perform the self-organising map (SOM) analysis on the mean synoptic states created in the [Variable preparation](https://robwschlegel.github.io/MHWNWA/var-prep.html) vignette. We'll start by creating custom packets that meet certain experimental criteria and then feed them into a SOM. We will finish by creating some cursory visuals of the results. The full summary of the results may be seen in the [Node summary vignette](https://robwschlegel.github.io/MHWNWA/node-summary.html).
```{r libraries}
# Install from GitHub
# .libPaths(c("~/R-packages", .libPaths()))
# devtools::install_github("fabrice-rossi/yasomi")
# Packages used in this vignette
library(jsonlite, lib.loc = "../R-packages/")
library(tidyverse) # Base suite of functions
library(lubridate) # For convenient date manipulation
library(yasomi, lib.loc = "../R-packages/") # The SOM package of choice due to PCI compliance
library(data.table) # For working with massive dataframes
# Load functions and objects to be used below
source("code/functions.R")
```
## Data packet
In this last stage before running our SOM analysis we will create a data packet that can be fed directly into the SOM algorithm, which means the data must be converted into a super-wide matrix format. The first run of this analysis on the NAPA model data found that including the Labrador Sea complicated the results quite a bit. It was also unclear whether the Gulf of St Lawrence (gsl) region should be included in the analysis. In the second run, multiple SOM variations were therefore employed, and it was decided that the gsl region should be included.
### Unnest synoptic state packets
Up first we must simply load and unnest the synoptic state packets made previously.
```{r unnest-packets, eval=FALSE}
# Load the synoptic states data packet
system.time(
  synoptic_states <- readRDS("data/SOM/synoptic_states.Rda")
) # 3 seconds

# Unnest the synoptic data
system.time(
  synoptic_states_unnest <- synoptic_states %>%
    select(region, event_no, synoptic) %>%
    unnest()
) # 8 seconds
```
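To make the unnesting step concrete, here is a minimal sketch on toy data. The column names `lon`, `lat` and `sst` and all of the values are assumptions made up for illustration; only `region`, `event_no` and `synoptic` come from the chunk above. It also assumes tidyr 1.0 or later, where the list-column is named explicitly via `cols`.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Toy stand-in for synoptic_states: one row per event, with a list-column
# holding that event's gridded mean state (lon, lat and sst are assumed names)
toy_states <- tibble(
  region = c("gsl", "gsl"),
  event_no = c(1L, 2L),
  synoptic = list(
    tibble(lon = c(-60, -59), lat = c(45, 45), sst = c(0.5, 0.7)),
    tibble(lon = c(-60, -59), lat = c(45, 45), sst = c(1.1, 0.9))
  )
)

# Unnesting repeats region and event_no for every pixel of each event
toy_unnest <- toy_states %>%
  select(region, event_no, synoptic) %>%
  unnest(cols = c(synoptic))

toy_unnest
```

Each event contributes one row per pixel, so the result is a long data frame that still needs to be cast wide before it can be given to the SOM.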
### Create packet
With all of our data ready we may now prepare and save them for the SOM.
```{r create-SOM-packet, eval=FALSE}
# Packet for entire study region
system.time(
  packet <- wide_packet_func(synoptic_states_unnest)
) # 187 seconds
saveRDS(packet, "data/SOM/packet.Rda")
```
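The real `wide_packet_func()` lives in `code/functions.R` and is not shown in this vignette. As a rough idea of what a long-to-wide cast of this kind might look like, the sketch below builds one row per event and one column per pixel-variable combination. The toy data, the `lon`/`lat`/`sst` column names, the `_` separator and the use of `pivot_wider()` are all assumptions, not a description of the actual function.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Toy long-format data: two events, two pixels each (values are made up)
toy_long <- tibble(
  region = rep("gsl", 4),
  event_no = rep(c(1L, 2L), each = 2),
  lon = rep(c(-60, -59), times = 2),
  lat = 45,
  sst = c(0.5, 0.7, 1.1, 0.9)
)

toy_wide <- toy_long %>%
  # Stack all measured variables into one value column (only sst here)
  pivot_longer(cols = sst, names_to = "variable", values_to = "value") %>%
  # One identifier per pixel-variable combination and one per event
  unite("coords", lon, lat, variable, sep = "_") %>%
  unite("event_ID", region, event_no, sep = "_") %>%
  # Cast wide: one row per event, one column per pixel-variable
  pivot_wider(names_from = coords, values_from = value)

# Drop the ID column to get the numeric matrix a SOM would ingest
toy_matrix <- as.matrix(toy_wide[, -1])
```

With real gridded data the number of columns is the number of pixels multiplied by the number of variables, which is why the matrix becomes so wide.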
## Run SOM models
Now that we have our data packet, we need to feed it to the SOM with a function that ingests it and produces results for us. The function below has been greatly expanded from the previous version of this project and now performs all of the SOM-related work in one go. This allowed me to remove a couple hundred lines of code and text from this vignette.
```{r som-run, eval=FALSE}
# The SOM on the entire study area
packet <- readRDS("data/SOM/packet.Rda")
system.time(som <- som_model_PCI(packet)) # 132 seconds
# som$ANOSIM # p = 0.001
saveRDS(som, file = "data/SOM/som.Rda")
```
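For readers unfamiliar with what a SOM does to a packet like this, the following is a minimal base-R sketch of a batch SOM on random toy data. It is not `som_model_PCI()` (which is defined in `code/functions.R`) and it does not use yasomi. The grid size, iteration count, neighbourhood radii and random initialisation are all arbitrary assumptions chosen to keep the example short.

```r
set.seed(13)

# Toy "packet": 60 events (rows) by 5 pixel-variables (columns), scaled
toy_matrix <- scale(matrix(rnorm(60 * 5), nrow = 60, ncol = 5))

# A 4 x 3 rectangular grid of nodes and the distances between them
grid <- expand.grid(x = 1:4, y = 1:3)
n_nodes <- nrow(grid)
grid_dist <- as.matrix(dist(grid))

# Squared Euclidean distance from every event to every node prototype
sq_dist <- function(x, protos) {
  sapply(seq_len(nrow(protos)),
         function(k) rowSums(sweep(x, 2, protos[k, ])^2))
}

# Initialise the prototypes with randomly drawn events
protos <- toy_matrix[sample(nrow(toy_matrix), n_nodes), , drop = FALSE]

n_iter <- 50
for (i in seq_len(n_iter)) {
  # Neighbourhood radius shrinks linearly from 2 to 0.5 grid units
  radius <- 2 - (2 - 0.5) * (i - 1) / (n_iter - 1)
  # Best matching node for every event
  bmu <- max.col(-sq_dist(toy_matrix, protos), ties.method = "first")
  # Gaussian weights between each node and each event's best matching node
  h <- exp(-grid_dist[, bmu]^2 / (2 * radius^2))
  # Batch update: each prototype becomes a weighted mean of all events
  protos <- (h %*% toy_matrix) / rowSums(h)
}

# Final node assignment for each event
node <- max.col(-sq_dist(toy_matrix, protos), ties.method = "first")
table(node)
```

The two outputs of interest are `node`, which assigns every event to one of the 12 nodes, and `protos`, which holds the mean pattern each node represents.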
And there we have our SOM results. Up next in the [Node summary vignette](https://robwschlegel.github.io/MHWNWA/node-summary.html) we will show the results with a range of visuals.
## References