# Methodology for the marine wilderness analysis

---
## The data
The [cumulative marine pressure](http://www.nature.com/ncomms/2015/150714/ncomms8615/full/ncomms8615.html) data (hereafter referred to as *the Marine Data* unless otherwise specified) was used as the aggregated indicator for wilderness in the marine realm.

## [The methods](./wilderness_analysis.ipynb)
Because this analysis is exploratory, the pressure threshold below which a marine area counts as wilderness has to be tried and tested, and it may change. The method therefore needs to be sustainable: when the threshold changes, subsequent iterations should require minimal repetition, especially of spatial analyses, which are usually costly in time and prone to error.

- **Preparing the input data**

    With the potential change in threshold value in mind, the input data were prepared as follows:
    - the Marine Data
    - within the bounds of the EEZ, a *synthesized* biogeography layer compiled from MEOW ([Marine Ecoregions of the World](http://bioscience.oxfordjournals.org/content/57/7/573.abstract)) up to 200 meter depth, MEOW up to 200 nautical miles, and [Pelagic Provinces](http://nora.nerc.ac.uk/18017/). The idea of such a layer is to defer to a later stage the decision of which base units to group (for example, identifying gaps in MEOW, in Pelagic Provinces, or in anything within the EEZ regardless). The end result is a set of spatially disjoint boundaries with shared attributes. More details can be found in the [detailed methodology notebook](./wilderness_analysis.ipynb).
    - Antarctica was removed from the biogeography layer, as the World Heritage Convention does not currently apply there
    - a marine subset of World Heritage sites (47 sites)
    - the intersection of the biogeography layer and the marine World Heritage sites


- **Clipping rasters**

    Each feature of the biogeography layer above was used programmatically to clip the part of the Marine Data within its spatial extent. The clipped raster was then flattened into a one-dimensional array, as subsequent analysis would be non-spatial. The same was done for the marine World Heritage sites and their intersection with the biogeography layer.
    
    
- **Exploring thresholds**

    The threshold for classifying marine wilderness was set at the 10th percentile of pressure values for all marine areas within the EEZ. This value is empirical: it was chosen by comparing thresholds at the 1st, 3rd, 5th, and 10th percentiles and by comparing the resulting distribution of wilderness extent with existing marine World Heritage sites. The 10th percentile was selected because it reasonably highlights areas of marine wilderness according to expert knowledge.


- **Small multiples**

    The total wilderness area of each feature was calculated by summing the cell size of every qualifying pixel (as defined by the threshold) within that feature. Similarly, by grouping features under the same province (or WH site), the total wilderness area and its percentage of the province's area could be computed without running any further spatial analysis.

A step-by-step analysis is included in the [detailed methodology notebook](./wilderness_analysis.ipynb).
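
The non-spatial part of the steps above can be sketched as follows. This is a minimal illustration and not the notebook's code: the small `pressure` grid and the boolean `feature_mask` are made-up stand-ins for the Marine Data and for one rasterized biogeography feature, the names `clip_and_flatten` and `candidates` are hypothetical, and a strict less-than comparison against the threshold is assumed.

```python
import numpy as np

# Hypothetical stand-in for the Marine Data: a tiny grid of cumulative pressure scores.
pressure = np.array([
    [0.5, 1.2, 3.4, 2.1],
    [0.1, 0.9, 4.0, 2.8],
    [0.3, 0.2, 5.5, 6.1],
])

# Hypothetical boolean mask marking the cells that fall inside one feature.
# In practice this would come from clipping the raster with the feature geometry.
feature_mask = np.array([
    [True, True, False, False],
    [True, True, False, False],
    [True, True, True, False],
])


def clip_and_flatten(raster, mask):
    """Return the raster cells selected by the mask as a one-dimensional array."""
    return raster[mask]


feature_values = clip_and_flatten(pressure, feature_mask)

# Candidate thresholds: the 1st, 3rd, 5th and 10th percentiles of all cells.
# Here "all cells" plays the role of all marine areas within the EEZ.
all_values = pressure.ravel()
candidates = {p: float(np.percentile(all_values, p)) for p in (1, 3, 5, 10)}

# Count the cells of the feature that fall below the chosen threshold.
threshold = candidates[10]
wilderness_cells = int((feature_values < threshold).sum())

print(candidates)
print(wilderness_cells)
```

Because each feature is reduced to a flat array once, trying a different percentile only repeats the last two statements and never the clipping itself.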

##  Observations of preliminary results
The map below shows the distribution of marine wilderness areas within the EEZ (green pixels, defined as the 10% least pressured areas within the EEZ, excluding Antarctica).
![dist-map](dist_EEZ.png)
The map below overlays existing marine World Heritage sites on marine wilderness areas. It seems apparent that large parts of the existing wilderness remain to be explored.
![dist-map-wh](dist_EEZ_WH.png)

**Top 20 MEOW (200m + 200nm) Provinces, by percentage of wilderness area**

`per_ltt`: percentage of less than threshold area, i.e. percentage of wilderness area in relation to the total area of the province
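
As a rough illustration of how such a column could be derived from per-feature results, the sketch below groups features by province and divides the below-threshold area by the total area. Only `PROVINCE` and `per_ltt` come from the exported table; the other column names, the sample counts and the constant `CELL_AREA_KM2` (which assumes an equal-area grid) are hypothetical.

```python
import pandas as pd

# Assumed constant cell size in square kilometres (equal-area grid).
CELL_AREA_KM2 = 1.0

# Hypothetical per-feature summary: cells below the threshold and total cells.
features = pd.DataFrame({
    "PROVINCE": ["A", "A", "B", "B"],
    "cells_ltt": [30, 10, 5, 0],
    "cells_total": [50, 50, 40, 60],
})

# Convert cell counts to areas.
features["area_ltt"] = features["cells_ltt"] * CELL_AREA_KM2
features["area_total"] = features["cells_total"] * CELL_AREA_KM2

# Sum the areas of all features in each province, then take the ratio.
provinces = features.groupby("PROVINCE", as_index=False)[["area_ltt", "area_total"]].sum()
provinces["per_ltt"] = provinces["area_ltt"] / provinces["area_total"]

print(provinces[["PROVINCE", "per_ltt"]])
```

The ratio is taken after summing, so large features weigh more than small ones within the same province.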

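```python
import pandas as pd

# Load the per-province summary exported by the analysis
a = pd.read_csv('export_meow_province.csv', encoding='Latin1')
# Top 20 provinces by share of wilderness area
print(a.sort_values('per_ltt', ascending=False)[['PROVINCE', 'per_ltt']].head(20))
del a
```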

```
                                PROVINCE   per_ltt
24                             Marquesas  0.681158
45              Subantarctic New Zealand  0.664097
44                  Subantarctic Islands  0.528809
3                                 Arctic  0.467926
41                  Southern New Zealand  0.340892
1                      Amsterdam-St Paul  0.303475
8                      Central Polynesia  0.242988
29            Northeast Australian Shelf  0.238981
34                           Sahul Shelf  0.209192
35                            Scotia Sea  0.161785
27                                  None  0.159536
28                    North Brazil Shelf  0.102526
40                   Southeast Polynesia  0.082473
23                            Magellanic  0.079355
54     Warm Temperate Northwest Atlantic  0.078015
57  Warm Temperate Southwestern Atlantic  0.071286
20      Juan Fernández and Desventuradas  0.069873
33              Red Sea and Gulf of Aden  0.059408
48                 Tropical Eas
```

**MEOW (200m + 200nm) gaps, by percentage of wilderness area covered by existing WH sites**

It is worth noting that although a MEOW province may have a high number of marine WH sites, its wilderness area is not necessarily well represented by them. For example, Northern European Seas has four WH sites, yet only 5.6% of the wilderness area in this province is covered by WH sites, while Northeast Australian Shelf, with only one WH site, has most of its wilderness area (82%) inside that site.
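
The coverage figure quoted here is a ratio of two wilderness areas, not a share of the province's total area. The sketch below spells this out. The function name `wilderness_coverage` and the sample areas are hypothetical and are not taken from the exported table.

```python
import math


def wilderness_coverage(wilderness_in_wh, wilderness_total):
    """Share of a province's wilderness area that lies inside WH sites."""
    if wilderness_total == 0:
        # A province without any wilderness has no meaningful coverage value.
        return float("nan")
    return wilderness_in_wh / wilderness_total


# Hypothetical provinces: areas in arbitrary but consistent units.
many_sites = {"num_wh": 4, "wilderness_total": 1000.0, "wilderness_in_wh": 50.0}
one_site = {"num_wh": 1, "wilderness_total": 1000.0, "wilderness_in_wh": 800.0}

cov_many = wilderness_coverage(many_sites["wilderness_in_wh"], many_sites["wilderness_total"])
cov_one = wilderness_coverage(one_site["wilderness_in_wh"], one_site["wilderness_total"])

print(cov_many, cov_one)
```

In this made-up case the province with four sites covers far less of its wilderness than the province with one site, which is the pattern the paragraph above describes.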

```python
import pandas as pd

# Load the gap summary: share of each province's wilderness covered by WH sites
a = pd.read_csv('export_gap_meow_province.csv', encoding='Latin1')
# print(a.columns)
print(a.sort_values('per_wilderness_covered_by_WH', ascending=False)[['PROVINCE', 'per_wilderness_covered_by_WH', 'num_wh']])
del a
```

```
                                PROVINCE  per_wilderness_covered_by_WH  num_wh
59         West Central Australian Shelf                      0.974230     2.0
29            Northeast Australian Shelf                      0.820772     1.0
12         East Central Australian Shelf                      0.216204     1.0
58               West African Transition                      0.169010     1.0
53      Warm Temperate Northeast Pacific                      0.085135     2.0
30                Northern European Seas                      0.055703     4.0
15                             Galapagos                      0.042391     1.0
32            Northwest Australian Shelf                      0.031749     1.0
47                         Tristan Gough                      0.028653     1.0
17                                Hawaii                      0.020901     1.0
8                      Central Polynesia                      0.020306     1.0
52         Tropical Southwestern Pacific
```