160715_Paper_draft.Rmd

---
title: "Mapping Social Ecological Systems Archetypes"
author: "Juan Rocha, Katja Malmborg, Line Gordon, Kate Brauman and Fabrice DeClerk"
date: "7 September 2016"
output:
  word_document: null
  pdf_document:
    citation_package: natbib
    toc: yes
    toc_depth: 2
fontsize: 11pt
cls: pnastwo.cls
bibliography: MappingSES_bibliography.bib
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, dev = 'pdf')

## load libraries
library  ('gdata')
library ('dplyr')
library ('reshape')
library (corrgram)
library(vegan)
library (ggplot2)

# load libraries for mapping
library(maptools)
library(rgeos)
library(RColorBrewer)

# load libraries for clustering
library (vegan)
# library(rgl)
# library(cluster)
library(NbClust)
library(clValid)
# library(MASS)
library(kohonen)

```

## Outline
The manuscript is written with the PNAS audience in mind (max ~3500 words). It should focus on the method development, how it is useful on the context of poverty alleviation and food security, and how it can be applied to other developing contexts. Roughly:

* Abstract 200w
* Intro and problem setting 1000w
    + SDG's: how do we target, measure and monitor progress in developing context?
    + volta river basing as case study
* Method 500w (it's a methods paper so it should have a methods section, but technical details should go at the end)
    + In a nutshell it's a clustering problem: how others have done it
    + Space and time
* Results 500w
    + Figure 1. Clustering all: an schema of the SES framework + clusters based on full data
    + Figure 2. the cube
    + Figure 3. Interactions vs others * 12 = the movie -> change of archetype over time?
* Discussion 1000w
* Conclusion 300w
* Methods
    + Data
    + Clustering & mapping
    + Temporal trends
* Refs 30max
* Appendix / complementary material
    + App1. Master table (var_name, units, source, time period, Ostrom match)
    + App2. All variables mapped with normalized and raw distribution
    + App2. Correlogram

-------


## Abstract
Achieving sustainable development goals requires targeting and monitoring sustainable solutions tailored to different social and ecological contexts. Elinor Ostrom stressed that there is no panaceas or universal solutions to environmental problems, and developed a social-ecological systems' (SES) framework -a nested multi tier set of variables- to help diagnose problems, identify complex interactions, and solutions tailored to each SES arena. However, to our knowledge, the SES framework has only been applied to over a hundred cases, and typically reflect the analysis of local case studies with relatively small coverage in space and time. While case studies are context rich and necessary, their conclusions might not reach policy making instances. Here we develop a data driven method for upscaling Ostrom's SES framework and applied to a context where we expect data is scarce, incomplete, but also where sustainable solutions are badly needed. The purpose of upscaling the framework is to create a tool that facilitates decision making on data scarce environments such as developing countries. We mapped SES by applying the SES framework to poverty alleviation and food security issues in the Volta River basin in Ghana and Burkina Faso. We found archetypical configurations of SES in space given data availability, we study their change over time, and discuss where agricultural innovations such as water reservoirs might have a stronger impact at increasing food security and therefore alleviating poverty and hunger. We conclude outlining how the method can be used in other SES comparative studies.

## Introduction
Zero hunger and no poverty are the first two sustainable development goals [@Assembly:2015th]. Together with clean water and sanitation, they conform the most basic needs of human beings. Understanding how societies and ecosystems self-organize to provide these goods and services, to meet these basic needs, is a core challenge of sustainability science. Countries around the world have agreed on pursuing 17 sustainable development goals; achieving them, however, requires targeting and monitoring solutions that fit each different social-ecological contexts [@Griggs:2013cm]. These countries need to meet this challenge by understanding the diversity and dynamics of social and ecological characteristics of their territories. But data to meet these demands are not always available or even collected, and methods development for data scarce developing countries is imperative. There is a need for mapping social-ecological systems (SES) to better understand their differences, context dependent attributes, and most importantly the generalizability of sustainable solutions.

Nobel prize winner Elinor Ostrom advocated embracing social-ecological complexity. Ostrom recognized that there is no universal solution to problems of destruction of overuse of natural resources [@Ostrom:2007je] and further developed a Social-Ecological Systems' (SES) framework hoping that it will helps us accumulate knowledge and better understanding of what works and what does not in different SES arenas [@Ostrom:2009p6785]. The SES framework is a nested multi tier set of variables that has been suggested as features that characterize distinctive aspects of SES. The SES framework has been typically applied to local case studies that cover relatively small areas and short periods of time (documented in two publicly available datasets: the [SES Library](https://seslibrary.asu.edu)  and the [SES Meta Analysis Database](https://sesmad.dartmouth.edu)). Over a hundred case studies have been coded in these databases, however the scale at which they are coded make it hard to bring their lessons to policy making relevant arenas.

The purpose of this paper is to develop a data driven method to upscale Ostrom's SES framework. It is targeted to developing countries, where available data is restricted in quality and monitoring programs might not be on place. As a working example we studied the Volta River basin, a cross national watershed that covers roughly two thirds of Burkina Faso's and Ghana's joint territories. The West African Sahel is a highly vulnerable area due to wide-spread of poverty, recurrent droughts and dry spells, political upheaval, emergent diseases (e.g. ebola), rapid urbanization, and growing food demand [@Lambin:2014eg]. The region offers a sharp gradient in climate, humidity, as well as a strong gradient of economic development, from the relatively rich southern Ghanaian urban regions in the south to the poor regions of northern Burkina Faso dominated by smallholder agriculture and pastoral systems. In particular, we will look at how the construction of water reservoirs might have an impact on food security, a key sustainable development goal [_I don't want to put here poverty reduction or clean water since we don't really have data for it_]. The following section outlines how we operationalize the Ostrom's SES framework to the scale of the Volta River basin based on publicly available data and national statistics. Next we describe the SES archetypes found, how they change over time and how reservoir development explain the trends. We then discuss the overall results and the applicability of our methods to other developing context.  

[_Needed?_]

>* _Elaborate on the SES framework - what variables from volta were coded_
>* _Volta basin and sub-saharan SES context_

## Method summary: mapping SES archetypes
Identifying SES archetypes from data is in essence a clustering problem. A plethora of methods exist to perform clustering analysis, but before explaining the details of our choices, first we present a brief review of what others have done when trying to characterize SES.

The idea that SES are intertwined and interdependent systems is not new: SES are human and natural coupled systems where people interact with natural components; they often exhibit nonlinear dynamics, reciprocal feedback loops, time lags, heterogeneity and resilience [@Liu:2007dq]. It has been suggested that complex adaptive systems, such as SES, should leave statistical signatures on social and ecological data that allows pattern identification of typologies and makes possible to follow their spatial patterns as well as trajectories through time [@Holland:2012ur; @Levin:2000vk]. Earlier efforts to map SES have been more general in purpose, and global in scale, such as the attempt to identify Anthromes (“human biomes”) [@Ellis:2010p6408; @Ellis:2008p314], or general land system archetypes [@SurendranNair:2016tz; @Vaclavik:2013cs; @Ropero:2015bp]. Reflecting on global consequences of land use, Foley _et al._ [-@Foley:2005p190] proposed a conceptual framework for bundles of ecosystem services, the idea that landscape units can be classified by the sets of goods and services that a SES co-produces, or more generally, a set of social-ecological interactions. The idea of mapping bundles of ecosystem services has gained empirical support with studies that range from the watershed to national scales in Canada [@RaudseppHearne:2010p5327; @Renard:2015ew], Sweden [@Meacham:2016cy; @Queiroz:2015cv], Germany [@Rabe:2016hp], and South Africa [@Hamann:2015cx]. Similar ordination methods has also been used to study regime shifts from foraging to farming societies in ancient SES [@Ullah:2015eb].

Despite the differences in purpose, scale, resolution and datasets used, what the aforementioned studies have in common is that they attempt to map SES by combining multivariate methods of ordination and clustering algorithms to find out i) systems' typologies and ii) potential underlying variables of change. Here we follow a similar rationale but using Ostrom's SES framework. In Ostrom's parlance a SES has 6 key subsystems: i) resource users (RU), resource system (RS), governance system (GS), users (U), interactions (I) and outcomes (O); all framed by social, economic and political settings (S) as well as by related ecosystems (ECO). Each of these subsystems have a nested second tier of variables aimed to capture key features of the first tier. First we collected publicly available datasets that matched as proxies of any of the Ostrom's variables but at the second administrative level for Ghana and Burkina Faso, namely districts and provinces respectively (See Methods). Figure 1 summarizes the framework and the proxies used, while Appendix 1 present a table with data sources, units and other relevant metadata.

Since our analysis focus on food security issues the defining key interaction (I) of our SES characterization is crop production. It is a dummy variable that reflect both the capacity of the ecosystem to provide food services, but also the human labor and preferences necessary as input to co-produce the service. Crop data, both production and cropped area, were collected from national statistical bureaux. While data does exist for 32 crops from 1993-2012, here we only used 7 crops [_10 crops on Katjas table?; in addition do we include here cattle?_] with minimum missing values, and the last 5 year averages to correct for outliers. Users (RU and S) were here characterized by national census statistics and their change over the period 1996-2006 for Burkina Faso and 2000-2010 for Ghana. The ecological system (ECO) is characterized by biophysical variables from CRU [_Katja clarify here_] that summarizes aridity, mean temperature, precipitation and slope. The resource system (RS) is a combination of variables that facilitate or not agriculture (our key interaction) such as the percentage of area dedicated to crops, presence of water reservoirs, tree cover, cattle density or market access. All data is standardized to the range 0:1.

>* _Normalization_

>* _Clustering SES archetypes: routine to find out optimal number of clusters_

To test the optimal number of clusters, 30 different algorithms were compared following the protocol described by Charrad et al [-@Charrad:2014tp]. We further test the internal validation and stability validation of 9 different clustering techniques: hierarchical clustering, self-organizing maps, k-means, partitioning around medoids, divisible hierarchical algorithm _diana_,  a sampling based clustering _clara_, a fuzzy clustering _fanny_, self-organizing trees _sota_, and model-based algorithm [@Brock:2008vz]. Results are typically presented as non-metric multi dimensional ordinations and their spacial distribution in maps of the Volta basin.

We further investigate the interdependences between Ostrom's nested variables, by reiterating the ordination on a set of variables of interest (e.g. interactions [I]) and performing vector fitting with the remaining variable sets (e.g. resource system, users, biophysical). Temporal trends were explored only on the interaction dataset.

```{r data, cache = TRUE, echo = FALSE, error=FALSE, warning=FALSE, message=FALSE, results = 'hide'}
setwd("~/Documents/Projects/TAI/TransformedData/Data_Burkina&Ganha")
# load map for visualizations and data
volta.shp <- readShapeSpatial("~/Documents/Projects/TAI/TransformedData/Bundling/Volta_bundling1.shp")

##### Useful functions for later:

### Select only districts on the Volta Basin
volta.only <- function(x){
  x <- x [ is.element (x$TAI_ID1, volta.shp@data$TAI_ID1 ), ]
  x <- droplevels(x)
}

### Normalizing by scaling everything to 0:1
rescale_hw <- function (x){ # This is Hadley Wickman function from his book R for DataScience
  rng <- range(x, na.rm= TRUE, finite = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

### Correct the numeric format from excel files
correct.num <- function (x){
  l <- length(x)
  x[3:l] <- apply (x[3:l], 2, function (x){as.numeric(x)} )
  return(x)
}

## Read data
#### New file with raw values from Katja: 160901

file <- 'Volta_Vars_raw_160901.xlsx'
sn <- sheetNames(xls=file)

users <- read.xls(file, sheet=2, dec=',')
users <- correct.num(users)

resource <- read.xls (file, sheet=3, dec=',')
biophys <- resource [c(1,2,11:15)]
resource <- resource [-c(11:15)]
biophys <- correct.num(biophys)
resource <- correct.num (resource)
# biophys <- read.xls(file, sheet=3, dec=',')

# Threre is 3 datasets for interactions depending on the sheet
# 4 = kcals per distric area in sq.km
# 5 = kcals_harv.area
# 6 = ratio_harv.area
# 7 = other
interact <- read.xls(file,sheet=5, dec=',')
interact <- correct.num(interact)
interact <- volta.only(interact)

### Combine all datasets for clustering all
dat <- left_join (users, resource, by = c("TAI_ID1", "ADM_2"))
dat <- left_join(dat, biophys, by = c("TAI_ID1", "ADM_2"))
dat <- left_join(dat, interact, by = "TAI_ID1")

dat <- dat[-26] # delete Province

# Note this is non-normalized data, original values. After the following line all is "rescaled" to 0-1
dat[3:35] <- apply(dat[3:35], 2, rescale_hw)
## You can delete "Pop_dens_log" since it's rescaled and you already have pop_density.
dat <- dat[-4]


## Validating number of clusters
## Number of clusters with NdClust: uses 30 different index and compare them to decide the optimal partition
# library (NbClust)
# # euclidean and manhattan are not good for gradient separation (see help vegdist)
clust_num <- NbClust( data = dat[-c(1,2)], dist = 'manhattan',  # diss = designdist(full [-c(2,23)], '1-J/sqrt(A*B)' )
                      min.nc = 3, max.nc = 12, method = 'ward.D2', alphaBeale = 0.1,
                      index = 'all')

## 6 algorithms propose that the number of clusters is 7 and another 5 that it should be 3 clusters
# *******************************************************************
#   * Among all indices:                                                
#   * 5 proposed 3 as the best number of clusters
# * 4 proposed 4 as the best number of clusters
# * 4 proposed 5 as the best number of clusters
# * 1 proposed 6 as the best number of clusters
# * 6 proposed 7 as the best number of clusters
# * 1 proposed 11 as the best number of clusters
# * 1 proposed 12 as the best number of clusters
#
# ***** Conclusion *****                            
#   
#   * According to the majority rule, the best number of clusters is  7
# *******************************************************************


# quartz()
# par(mfrow=c(1,1))
# hist(clust_num$Best.nc[1,],breaks=max(na.omit(clust_num$Best.nc[1,])),
#      xlab="Number of clusters",col="lightgrey", main="Optimal number of clusters?")

# library (clValid)
# ## Internal validation
# intern <- clValid(obj=as.data.frame(dat[-c(1,2)]), nClust=c(3:9),
#                   clMethods=c('hierarchical', 'kmeans', 'diana', 'fanny','som',
#                               'pam', 'sota', 'clara', 'model'),
#                   validation='internal')
#
# ## Stability validation
# stab <- clValid(obj=as.data.frame(dat[-c(1,2)]), nClust=c(3:9),
#                 clMethods=c('hierarchical', 'kmeans', 'diana', 'fanny', 'som',
#                             'pam', 'sota', 'clara', 'model'),
#                 validation='stability')
#
# summary (intern)
# Optimal Scores:
#   
#   Score   Method       Clusters
# Connectivity 10.2159 hierarchical 3       
# Dunn          0.3697 hierarchical 5       
# Silhouette    0.3444 hierarchical 3     


# summary (stab)
# Optimal Scores:
#   
#   Score  Method       Clusters
# APN 0.0006 hierarchical 5       
# AD  0.9623 kmeans       9       
# ADM 0.0026 hierarchical 5       
# FOM 0.1385 som          9


## Prepare the clustering result dataset
mds <- metaMDS(dat[-c(1,2)], distance = 'manhattan', trymax = 1000, verbose = FALSE)
setwd('~/Documents/Projects/TAI/scripts/TAI-Volta')
```


## Results

The Volta river basin has different sets of social-ecological systems. The clustering search identifies an optimal number of 7 archetypes suggested by 6 out of 30 algorithms, followed by 3 clusters (5 algorithms), and 4-5 clusters (4 algorithms each). However, when looking at internal validation hierarchical clustering is suggested as the best performing technique with 3 or 5 clusters; while stability validation suggest higher number of clusters with hierarchical clustering best suited for 5 clusters, or 9 clusters with self-organizing maps or k-means. Figure 1 summarizes the data used to operationalize the Ostrom's SES framework on the Volta Basin, as well as the clustering of second level administrative units with the best performing clustering methods for 7 and 9 archetypes. Lower numbers of clustering often group all provinces of Burkina Faso on one set regardless of the algorithm used. Here we favor higher number of archetypes that enhance stability of the analysis, this is, that results are robust when removing columns of the dataset -a feature desired in highly correlated data [@Brock:2008vz].

```{r Fig1, echo = FALSE, fig.align='center', fig.height=5, fig.width=6, error=FALSE, warning=FALSE, message=FALSE, fig.cap = 'Clustering results. Social-Ecological systems archetypes found by applying different clustering techniques. Optimal numer of clusters is 7 but 9 performs best on stability validation. Upper panel shows non-metric multi dimensional scaling of the data and maps for 7 clusters, lower panel for 9 clusters. Best performing algorithms are hierarchical clustering, k-means and self-organizing maps (som)', dev.args= list(pointsize = 8)}
########################
### Clustering all
########################

## TODO: Need to introduce a framework figure here

# quartz(width=5, height=6, pointsize = 8)
# number of clusters
k <- c(7,9)

par(mfrow=c(4,3))

for (i in 1:length(k)){
  par(mai = c(0.1, 0.2, 0.1 , 0.1))

  ## Best selected algorithms: hierarchical, k-means and som
  fitKM <- kmeans(dat[-c(1,2)], centers= k[i] ,iter.max=2000,nstart=50)
  fitSOM <- som(as.matrix(dat[-c(1,2)]), grid= somgrid(3,2,'hexagonal'))
  dis <- vegdist(dat[-c(1,2)], method='manhattan')
  clus <- hclust(dis, method='ward.D2')
  grp <- cutree(clus, k[i])
  fitHier <- grp

  # create a clusters dataframe
  clusters <- data.frame( TAI_ID1 = dat$TAI_ID1, # codes
                        #clara = as.vector(fitClara$clustering ),
                        k_means = as.vector(fitKM$cluster),
                        #PAM = as.vector(fitPAM$clustering),
                        som = as.vector(fitSOM$unit.classif),
                        hierarchical = as.vector(fitHier))

  # create colors
  levelCols <- brewer.pal(k [i], "Set1")

  # plot
  for (j in 2:4){
    # Plot MDS with grouping according with each clustering algorithm
    ordiplot(mds, type='none', main= paste(k[i], colnames( clusters)[j],  sep = ' '), xlab = NULL, ylab = NULL )
    points(mds$points, cex=1, lwd=1.5,
         col= levelCols[clusters[,j]]) #volta.shp@data[,i]
    ordihull(mds, groups= clusters[,j], label = TRUE, cex=0.7,
           col="purple", lty=1)
    #plot(ef1, p.max=0.05, col='blue', cex=0.5)
    #plot(ef2, p.max=0.05, col='purple', cex=0.5)
  }


  for (j in 2:4){
    par(mai = c(0.2, 0.2, 0.2 , 0.2))
    # Plot map with resulting clusters
    plot(volta.shp, col= levelCols[clusters[,j]],lwd=0.1)  
         # main= paste(k[i], colnames( clusters)[j], sep = ' '))

  }
}


# quartz.save(file = '160912_clustering.png', type = 'png', dpi = 200,width=5, height=6, pointsize = 8 )
```

Following the analogy of boundles of ecosystem services [@Foley:2005p190; @RaudseppHearne:2010p5327], here we also map the bundles of SES variables that co-variate in space using the 7 archetypes suggested by the clustering analysis (Figure 2). Resource users of SES in northern Burkina Faso (Fig 2a) are characterized by low literacy, lower levels of migration both within regions (first level of administrative division) and amongst regions (external migration), with lower rates of urbanization and population density. Their population is dominated by farmers with a high ratio of women and children. Their main crops are sorghum, millet and rice (in kcals per harvested area). The biophysical conditions are dominated by very short wet season and the highest mean temperature; while the resource conditions present very little tree cover and the highest density of cattle per capita as well as per $Km^{2}$. Southern Burkina Faso SES (Fig 2c) shows very similar conditions. Resource users present higher literacy rates and migration. Cowpeas and maize have higher importance in the south; the aridity index is slightly higher than in the north and consequently the ratio of cropped area is smaller. Southern Burkina Faso have lower market access, a similar ratio of cattle per capita but lower in density.

SES in Ghanaian region of the Volta basin have in general higher literacy, population growth and migration rates, longer wet seasons, roughest landscapes (Slope_75) and a very distinctive sets of crops they produce and relay upon (Fig 2b,d,e,g). While Fig 2g specializes in rice and cassava, two crops suited for a longer wet season; Fig 2e has a higher share of yam, cocoyam, platain and groundnuts. A similar set of crops is co-produced in Fig 2b with exception of groundnuts. Fig 2b counts with higher tree cover and the highest concentration of water reservoirs of the basin and longer wet season. Similar production levels (in kcals per cultivated area) are found in central Ghana (Fig 2d), but the favored set of crops combine groundnuts, millet and sorghum as in Burkina Faso, in addition to rice, cassava, yam and cocoyam.


SES characterized by high urbanization were identified in Fig 2f. Urban districts and provinces in the basin also have an important role on food production, especially  of rice, millet, sorghum and groundnuts; in fact they have the higher ratio of cropland and dams density. They also have higher density cattle and small rumiants (sheep) per area but lower per capita compared with other SES archetypes. [_Katja please check this is correct!!_]

```{r Fig2, echo = FALSE, fig.align='center', fig.height=7, fig.width=6, error=FALSE, warning=FALSE, message=FALSE, fig.cap = "Bundles of SES variables. The SES framework was operationalized by analysing datasets that are proxi of Ostrom's suggested variables at spatial units that correspond to the second administrative level", dev.args= list(pointsize = 8) }

k <- 7 # optimal number of clusters
avg <- apply (dat[-c(1,2)], 2, mean)

dis <- vegdist(dat[-c(1,2)], method='manhattan')
clus <- hclust(dis, method='ward.D2')
grp <- cutree(clus, k)
fitHier <- grp

# following Garry's code with k-means
fitKM <- kmeans(dat[-c(1,2)], centers= k ,iter.max=2000,nstart=50)
#residFKM <- fitKM$centers - rbind (replicate (k, avg))

## Colors
clusColors <- c('gray90', 'gray10')
freq <- order (fitKM$size, decreasing = TRUE)

U_Cols <- brewer.pal(length(users) - 2, "Reds")
R_Cols <- brewer.pal(length(resource)- 2, "Greys")
B_Cols <- brewer.pal(length(biophys)- 2, "PuBu")
I_Cols <- brewer.pal(length(interact)- 2, "YlGn")
FlowerCol <- c(U_Cols, R_Cols, B_Cols, I_Cols)

## ploting
# quartz(width=7, height = 7, pointsize = 8)
par(mfrow = c(4,4), mai = c(.02, 0.2, 0.2, 0.2))
layout (matrix(c(1:14, 15,15), 4,4, byrow = T))

for ( i in 1:k) {
  varCols <- clusColors [(fitKM$cluster == freq [i]) + 1]
  plot (volta.shp, col = varCols, lwd = 0.1, main= letters[i], cex.main = 1.5 )

  stars (t (sqrt(fitKM$centers[freq[i], ])), len = 0.6, full = TRUE, scale = FALSE,
         radius = TRUE, draw.segments = TRUE, col.segments = FlowerCol)
}

# legend
par( mai = c(1,1,1,1))
leg <- data.frame (leg =rep (1, length(dat) -2 ), row.names =  colnames(dat[-c(1,2)]))
stars (t(leg), len = 1, full = TRUE, scale = FALSE, key.loc = c(2.3, 2.3), locations = c(2.3, 2.3),
         radius = TRUE, draw.segments = TRUE, col.segments = FlowerCol,
       axes = F, plot = T, cex = 0.7)

# quartz.save(file = '160912_SESarchetypes.png', type = 'png', dpi = 200,width=7, height=7, pointsize = 8 )

```

The intedepences [_any better word? we cannot use interaction because that's one of the ostroms variable set_] between different components of the Ostrom's framework were further investigated in our dataset by applying vector fitting to non-metric multi-dimensional scaling on the variable set of interest (Fig 3). The ordination method applied to each set of variables reveals that the clustering (Fig 1) is highly driven by the crops data.

```{r cube, include = FALSE, cache = TRUE}

######################
### The cube
######################

cube <- function(s1, s2, s3, s4){ # each side of the cube is s_
  mod1 <- metaMDS(s1, distance = 'manhattan', trymax = 1000)
  ef1 <- envfit(mod1, s2, permu=999)
  ef2 <- envfit(mod1, s3, permu=999)
  ef3 <- envfit(mod1, s4, permu=999)

  # ## plot
  # plot(mod1, type='p', display='sites', cex=0.8)
  # plot(ef1, p.max=0.05, col='blue', cex=0.8)
  # plot(ef2, p.max=0.05, col='purple', cex=.8)
  # plot(ef3, p.max=0.05, col='grey', cex=.8)

  return(list(mod1, ef1, ef2, ef3))
}


int.side <- cube(s1 = interact[-c(1,2)], s2 = resource[-c(1,2)], s3= biophys[-c(1,2)], s4 = users[-c(1,2)])
bio.side <- cube(s1 = biophys[-c(1,2)] , s2 =  resource[-c(1,2)], interact[-c(1,2)], s4 = users[-c(1,2)])
res.side <- cube(s1 =  resource[-c(1,2)] , s2 =  biophys[-c(1,2)], s3= interact[-c(1,2)], s4 = users[-c(1,2)])
soc.side <- cube(s1 = users[-c(1,2,4)] , s2 =  resource[-c(1,2)], s3= interact[-c(1,2)], s4 =  biophys[-c(1,2)])

```


```{r Fig3, echo = FALSE, fig.align='center', fig.height=7, fig.width=5, error=FALSE, warning=FALSE, message=FALSE, fig.cap = "**Interactions between SES framework components**. Each subsystem in the Ostrom's framework is ordered with non-metric multidimensional scaling and vectors are fitted for all other variables that significanlty (p < 0.001) explain the variation of the ordination", dev.args= list(pointsize = 8) }

# create colors
levelCols <- brewer.pal(k, "Set1") # check for better colours with alpha, k = 7

## an especial plot function that does all at once
plot.cube <- function (x, main, c1, c2, c3, sub, ... ){ #takes a cube object
  c <- c("purple",c1, c2, c3) # will never use purple on loop
  for (i in 2:4){
    ## plot
    ordiplot(x[[1]], type='none', main= paste( main, "vs", sub[i], sep = " "), xlab = NULL, ylab = NULL)#,
             #xlim = c(-1.2,1.2), ylim = c(-1,1))
    points(x[[1]]$points, cex=0.7, lwd=0.8,
           col= levelCols[fitKM$cluster])

    ordihull(x[[1]], groups= fitKM$cluster, label = FALSE, cex=0.2,
             col= levelCols, lty = 0, draw = 'polygon', alpha = 0.25)

    ## Plot environmental fitting
    plot(x[[i]], p.max=0.001, col= c[i], cex=1)
    # plot(x[[3]], p.max=0.05, col= c2, cex=.5)
    # plot(x[[4]], p.max=0.05, col= c3, cex=.5)
  }
}

# quartz(width = 5, height = 7, pointsize = 8)
par(mfrow = c(4,3), mai = c(0.2,0.2,0.2,0.2))
# Interactions
plot.cube (int.side, main = "Interactions", sub = c( " ","Resource", "Biophysic", "Users"),
           c1 = "gray20", # resource
           c2 = "dodgerblue4", # biophysic
           c3 = "darkred") # users
# Resouce
plot.cube(res.side, main = "Resource", sub = c("", "Biophysic", "Interactions", "Users"),
          c1 = "dodgerblue4", c2 = "darkgreen", c3 = "darkred")
# biophysic
plot.cube (bio.side, main = "Biophysic", sub = c("", "Resource", "Interactions", "Users"),
           c1 = "gray20", c2 = "darkgreen", c3 = "darkred")
# Users
plot.cube(soc.side, main = "Users", sub = c('', "Resource", "Interactions", "Biophysic"),
          c1 = "gray20", c2 = "darkgreen", c3 = "dodgerblue4")


# quartz.save(file = '160912_Cube.png', type = 'png', dpi = 200,width=5, height=7, pointsize = 8 )
```


## Discussion
(and how our work complements previous efforts)

+ We address the problem of operationalizing the Ostrom's SES framework for larger study areas, their change over time and how the results can inform policy relevant arenas.
+ We develop a methodological routine to identify the optimal number of clusters, compare different clustering available techniques and map SES bundles in space and time. The method relies on national data sources and can be applied in other developing context (both countries and developemnt problems, here we applied to food security in drought prone areas)
+ Limitations: at this scale is difficult to find governance indicators. Ground thruting is necessary step to follow up.

## Conclusion

## Methods
**Data**
**Clustering and mapping**
**Temporal trends**

### List of appendixes

1. Master table Katja
2. All variables mapped
3. Figure with clusters selection?

# References