# Open Ocean
# Open Earth Foundation

<h1> Step 2: calculate different metrics for each modulating factor </h1>

This notebook continues from `Step1_Curate_IUCN_RedList.ipynb`.

<h2> Modulating Factor 1: Normalize Biodiversity Score </h2>

Species diversity refers to the variety of species present in a given area, together with their abundance and distribution: the number of species, their relative abundances, and how evenly they are distributed.
Our proposal is to apply the Simpson and Shannon indices to obtain a local diversity value for the MPA, and then normalize the value for each square kilometer.
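
The Simpson index is named here but not defined in this notebook. As a minimal sketch, assuming the Gini-Simpson form $1 - \sum p_i^2$ (other variants exist, and the notebook does not say which one is intended), it can be computed from a vector of abundances. The function name `simpson_index` is hypothetical and not part of `MBU_utils`.

```python
import numpy as np

def simpson_index(abundances):
    """Gini-Simpson diversity, 1 - sum(p_i**2), for a vector of species abundances."""
    counts = np.asarray(abundances, dtype=float)
    total = counts.sum()
    if total <= 0:
        raise ValueError("abundances must contain at least one positive count")
    p = counts / total  # proportion of the community made up of each species
    return float(1.0 - np.sum(p ** 2))

# Four equally abundant species: high evenness, higher index
even = simpson_index([10, 10, 10, 10])
# One dominant species: low evenness, lower index
skewed = simpson_index([37, 1, 1, 1])
print(even, skewed)
```

With this form, values closer to 1 indicate a more diverse and more even community.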

### Data needed for this project

- Species names
- Species abundance
- Species distribution

Next Steps:

1. Find a database or datasets with abundance and distribution information for the entire ACMC
2. If that isn't realistic, try to simulate the data

Options:
1. IUCN RED List and simulate abundance information
2. GBIF species information and simulate abundance and distribution information
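
Both options rely on simulated abundance. A minimal sketch of one way to simulate it reproducibly is shown here, assuming uniformly distributed integer counts; the function name `simulate_abundance`, the bounds, and the seed are illustrative assumptions, not values taken from any dataset.

```python
import numpy as np

def simulate_abundance(n_species, low=1, high=50, seed=0):
    """Draw one integer abundance per species, uniform on [low, high)."""
    rng = np.random.default_rng(seed)  # fixed seed so reruns give the same counts
    # low=1 avoids zero counts, which would give ln(0) in the Shannon index
    return rng.integers(low, high, size=n_species)

abundance = simulate_abundance(100)
print(abundance[:10])
```

Fixing the seed keeps downstream index values stable between notebook runs, which makes the results easier to compare.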

### 1. Importing libraries.

In [3]:
# load basic libraries
import os
import glob
import boto3

import math
import numpy as np
import pandas as pd

# to plot
import matplotlib.pyplot as plt

# to manage shapefiles
import shapely
import geopandas as gpd
from shapely.geometry import Polygon, Point, box
from shapely.ops import linemerge, unary_union, polygonize

In [4]:
import fiona; #help(fiona.open)

**Import OEF functions**

In [5]:
%load_ext autoreload

In [6]:
#Run this to reload the python file
%autoreload 2
from MBU_utils import *

### 2. Load data

In [7]:
ACMC = gpd.read_file('https://ocean-program.s3.amazonaws.com/data/raw/MPAs/ACMC.geojson')

In [8]:
df = gpd.read_file('/Users/maureenfonseca/Desktop/Data-Oceans/ACMC_IUCN_data/gdf_ACMC_IUCN_range_status_filtered.shp')

In [9]:
grid = create_grid(ACMC, grid_shape="hexagon", grid_size_deg=1.)
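
`create_grid` comes from `MBU_utils` and its implementation is not shown in this notebook. Purely as an illustration of the geometry behind a hexagonal grid, the sketch below builds flat-top hexagons under the assumption that the side length equals the grid size in degrees. The helpers `hexagon_vertices` and `hexagon_centers` are hypothetical and are not the OEF implementation.

```python
import math

def hexagon_vertices(cx, cy, side):
    """Vertices of a flat-top regular hexagon, counter-clockwise from the rightmost vertex."""
    return [
        (cx + side * math.cos(math.radians(60 * k)),
         cy + side * math.sin(math.radians(60 * k)))
        for k in range(6)
    ]

def hexagon_centers(xmin, ymin, xmax, ymax, side):
    """Centers of flat-top hexagons covering a bounding box.

    Columns are 1.5 * side apart, rows are sqrt(3) * side apart,
    and every other column is shifted up by half a row.
    """
    dx = 1.5 * side
    dy = math.sqrt(3) * side
    centers = []
    col = 0
    x = xmin
    while x <= xmax:
        y = ymin + (dy / 2 if col % 2 else 0.0)
        while y <= ymax:
            centers.append((x, y))
            y += dy
        x += dx
        col += 1
    return centers

verts = hexagon_vertices(0.0, 0.0, 1.0)
centers = hexagon_centers(0.0, 0.0, 3.0, 3.0, 1.0)
print(len(verts), len(centers))
```

Each list of vertices could be passed to `shapely.geometry.Polygon` to obtain a grid cell.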

### 3. Preliminary calculations


In [10]:
len(df)

627

**Shannon Index**

$H = -\sum_{i} p_i \ln(p_i)$

where $p_i$ is the proportion of the entire community made up of species $i$.
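
As a minimal sketch of the formula, the function below computes $H$ from a vector of abundances. It assumes the usual convention that species with zero abundance contribute nothing to the sum; the name `shannon_index` is hypothetical and not part of `MBU_utils`.

```python
import numpy as np

def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) for a vector of species abundances."""
    counts = np.asarray(abundances, dtype=float)
    counts = counts[counts > 0]  # drop zeros: 0 * ln(0) is treated as 0
    if counts.size == 0:
        raise ValueError("abundances must contain at least one positive count")
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

# Four equally abundant species: H equals ln(4)
h_even = shannon_index([10, 10, 10, 10])
# A single species present: H equals 0
h_single = shannon_index([25, 0, 0])
print(h_even, h_single)
```

For $S$ equally abundant species, $H$ reaches its maximum value $\ln(S)$, which gives a natural scale for normalizing the index.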

In [11]:
#Polygons of species distribution to be clipped to roi
df2 = gpd.clip(df.set_crs(epsg=4326, allow_override=True), ACMC)

In [12]:
# Keep the first 100 rows; copy so the columns added below are not written to a view
df2 = df2[0:100].copy()

In [28]:
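# Simulated abundance per species; start at 1 because a count of 0 gives ln(0) = -inf in the Shannon term
fake_abundance = np.random.randint(1, 50, size=len(df2))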

In [30]:
df2['abundance'] = fake_abundance

In [34]:
# Proportion of the community made up of each species
pi = df2['abundance'] / df2['abundance'].sum()

In [37]:
# Per-species Shannon term; H is the negative sum of this column
df2['pi_logpi'] = pi*np.log(pi)


In [38]:
df2

| | index | ASSESSMENT | ID_NO | BINOMIAL | PRESENCE | ORIGIN | SEASONAL | COMPILER | YEAR | CITATION | ... | SUBSPECIES | SUBPOP | DIST_COMM | ISLAND | TAX_COMM | redlistCat | scientific | geometry | abundance | pi_logpi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 277 | 2165 | 123324348 | 123324238 | Scopelosaurus hubbsi | 1 | 1 | 1 | IUCN Marine Biodiversity Unit/GMSA | 2019 | IUCN Marine Biodiversity Unit/GMSA | ... | | | | | | Least Concern | Scopelosaurus hubbsi | POLYGON ((-86.03401 2.31691, -86.04534 2.31310... | 3 | -0.008309 |
| 154 | 1074 | 8177399 | 183789 | Nexilosus latifrons | 1 | 1 | 1 | IUCN | 2010 | International Union for Conservation of Nature... | ... | | | | | | Least Concern | Nexilosus latifrons | POLYGON ((-86.12913 5.53945, -86.12950 5.51332... | 40 | -0.067897 |
| 346 | 2574 | 141564326 | 141364461 | Scopeloberyx pequenoi | 1 | 1 | 1 | IUCN Marine Biodiversity Unit/GMSA | 2019 | IUCN Marine Biodiversity Unit/GMSA | ... | | | | | | Least Concern | Scopeloberyx pequenoi | POLYGON ((-86.03401 2.31691, -86.04534 2.31310... | 41 | -0.069176 |
| 275 | 2162 | 123323700 | 123323371 | Bathypterois pectinatus | 1 | 1 | 1 | IUCN Marine Biodiversity Unit/GMSA | 2019 | IUCN Marine Biodiversity Unit/GMSA | ... | | | | | | Least Concern | Bathypterois pectinatus | POLYGON ((-89.31776 5.51181, -89.30912 5.51770... | 1 | -0.003224 |
| 390 | 292 | 42691774 | 190223 | Manducus maderensis | 1 | 1 | 1 | GMSA | 2015 | International Union for Conservation of Nature | ... | | | | | | Data Deficient | Manducus maderensis | MULTIPOLYGON (((-86.03573 5.37152, -86.04910 5... | 46 | -0.075421 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 196 | 1385 | 15603090 | 190385 | Antimora rostrata | 1 | 1 | 1 | GMSA | 2015 | International Union for Conservation of Nature | ... | | | | | | Least Concern | Antimora rostrata | POLYGON ((-86.03401 2.31691, -86.04534 2.31310... | 19 | -0.038106 |
| 461 | 11 | 1641229 | 180512 | Holothuria impatiens | 1 | 1 | 1 | IUCN | 2012 | International Union for Conservation of Nature | ... | | | | | | Data Deficient | Holothuria impatiens | POLYGON ((-86.28201 6.04859, -86.25097 5.99806... | 32 | -0.057273 |
| 467 | 59 | 2365291 | 19488 | Rhincodon typus | 1 | 1 | 1 | IUCN Marine Biodiversity Unit/GMSA | 2016 | IUCN Shark Specialist Group | ... | | | | | | Endangered | Rhincodon typus | POLYGON ((-86.03401 2.31691, -86.04534 2.31310... | 12 | -0.026349 |
| 408 | 5 | 1512937 | 177989 | Acanthurus xanthopterus | 1 | 1 | 1 | Jonnell Sanciangco | 2010 | International Union for Conservation of Nature... | ... | | | | | | Least Concern | Acanthurus xanthopterus | POLYGON ((-86.28201 6.04859, -86.25097 5.99806... | 7 | -0.016932 |


In [23]:
# Clip all species geometries to the ROI
joined = gpd.clip(df2.set_crs(epsg=4326, allow_override=True), ACMC)

# Count the number of overlapping geometries in the clipped GeoDataFrame
overlap_geo = count_overlapping_geometries(joined)

# Assign each overlap geometry to the grid cell(s) it intersects
merged2 = gpd.sjoin(overlap_geo, grid, how='left')
merged2['n_species'] = overlap_geo['count_intersections']

# Compute stats per grid cell
# Note: aggfunc="count" counts the non-null rows in each cell, not the values stored in n_species
dissolve = merged2.dissolve(by="index_right", aggfunc="count")

# Write the per-cell result back to the grid
grid.loc[dissolve.index, 'n_species'] = dissolve.n_species.values

# Calculate the normalization factor
#normalized_factor = grid_gdf['n_habitats']/grid_gdf['n_habitats'].max()

# Convert area from square degrees to square kilometers
# This projection applies only to Central America
# https://epsg.io/31970
#grid_gdf['area_sqkm'] = (grid_gdf.to_crs(crs=31970).area)*10**(-6)
#grid_gdf['mbu_habitat_survey'] = normalized_factor*grid_gdf['area_sqkm']


In [24]:
grid

| | geometry | Grid_ID | n_species |
|---|---|---|---|
| 0 | POLYGON ((-88.32201 2.15063, -88.82201 3.01666... | 0 | NaN |
| 1 | POLYGON ((-88.32201 3.88268, -88.82201 4.74871... | 1 | 39.0 |
| 2 | POLYGON ((-88.32201 5.61474, -88.82201 6.48076... | 2 | 18.0 |
| 3 | POLYGON ((-88.32201 7.34679, -88.82201 8.21281... | 3 | 1.0 |
| 4 | POLYGON ((-86.82201 1.28461, -87.32201 2.15063... | 4 | NaN |
| 5 | POLYGON ((-86.82201 3.01666, -87.32201 3.88268... | 5 | 111.0 |
| 6 | POLYGON ((-86.82201 4.74871, -87.32201 5.61474... | 6 | 1186.0 |
| 7 | POLYGON ((-86.82201 6.48076, -87.32201 7.34679... | 7 | 1446.0 |
| 8 | POLYGON ((-86.82201 8.21281, -87.32201 9.07884... | 8 | NaN |
| 9 | POLYGON ((-85.32201 2.15063, -85.82201 3.01666... | 9 | 3.0 |
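
Several grid cells have no `n_species` value, so any normalization by the maximum has to handle missing entries. The sketch below applies the max-normalization from the commented-out `normalized_factor` line to the `n_species` values shown in this output. Treating a missing value as zero is an assumption made for illustration; the notebook does not state how such cells should be handled.

```python
import numpy as np

# n_species per grid cell, copied from the output above (nan = no value)
n_species = np.array(
    [np.nan, 39, 18, 1, np.nan, 111, 1186, 1446, np.nan, 3], dtype=float
)

# Divide by the largest observed value, ignoring missing cells
normalized = n_species / np.nanmax(n_species)

# Assumption: cells with no value get a score of 0
normalized = np.nan_to_num(normalized, nan=0.0)
print(normalized.round(4))
```

The cell with the largest count gets a score of 1, and every other cell is scored relative to it.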
