# Livestock Mortality Index

This notebook attempts to replicate and extend a model that forecasts the risk of livestock mortality in the aimags (provinces) of Mongolia. The index is based on work by the NGO People in Need; the English-language report is available in this repository.

## Data import

The data used in the model is published as tables in a PDF report. These tables were copied into an Excel spreadsheet and then cleaned and transformed. The original data copied from the PDF report is in **data/DATASET for MVDI report.xlsx** and the cleaned data is stored at **data/MVDI Tables Cleaned.xlsx**.

In [72]:
#Import required libraries
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('/storage/mds.mplstyle')

In [73]:
xls = pd.ExcelFile('data/MVDI Tables Cleaned.xlsx')

In [74]:
sheets = xls.sheet_names
sheets

['temp',
 'loss',
 'index',
 'pasture_anomaly',
 'pasture',
 'biomass_anomaly',
 'biomass',
 'zootechnical',
 'mortality',
 'fecundity',
 'snowfall_anomaly']

In [75]:
temp = xls.parse('temp')
loss = xls.parse('loss')
index = xls.parse('index')
pasture_anomaly = xls.parse('pasture_anomaly')
pasture = xls.parse('pasture')
biomass_anomaly = xls.parse('biomass_anomaly')
biomass = xls.parse('biomass')
zootechnical = xls.parse('zootechnical')
mortality = xls.parse('mortality')
fecundity = xls.parse('fecundity')
snowfall_anomaly = xls.parse('snowfall_anomaly')

In [76]:
dfs = [temp, loss, index, pasture_anomaly, pasture, biomass_anomaly, biomass,
       zootechnical, mortality, fecundity, snowfall_anomaly]
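
The sheet-by-sheet parsing above could also be written as a loop over `sheet_names`. The following is only a sketch: `parse_all_sheets` is a hypothetical helper name, and it assumes every sheet in the workbook should be loaded.

```python
import pandas as pd

def parse_all_sheets(workbook):
    """Return {sheet_name: DataFrame} for every sheet of an opened workbook.

    `workbook` is expected to behave like pd.ExcelFile, i.e. expose
    `.sheet_names` and `.parse(name)`.
    """
    return {name: workbook.parse(name) for name in workbook.sheet_names}

# Usage with the workbook from this notebook (not executed here):
# frames = parse_all_sheets(pd.ExcelFile('data/MVDI Tables Cleaned.xlsx'))
# dfs = list(frames.values())
```

Keeping the frames in a dictionary keyed by sheet name avoids one variable per sheet. `pd.read_excel(path, sheet_name=None)` returns a similar dictionary in a single call.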

In [77]:
from functools import reduce
df = reduce(lambda left, right: pd.merge(left, right, on=['Aimag', 'Year'], how='inner'), dfs)
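
Because the merge uses `how='inner'`, any (Aimag, Year) pair missing from even one sheet is dropped from the result. The sketch below illustrates this on made-up toy frames (the aimag labels `'A'` and `'B'` and all values are hypothetical). The helper `merge_on_keys` is an illustrative name, and `validate='one_to_one'` is an optional extra check, not part of the original cell.

```python
from functools import reduce
import pandas as pd

# Toy frames standing in for the parsed sheets (all values are made up).
temp = pd.DataFrame({'Aimag': ['A', 'A', 'B'], 'Year': [1999, 2000, 1999],
                     'temp': [1.0, 2.0, 3.0]})
loss = pd.DataFrame({'Aimag': ['A', 'A'], 'Year': [1999, 2000],
                     'loss': [0.1, 0.2]})
snow = pd.DataFrame({'Aimag': ['A', 'B'], 'Year': [1999, 1999],
                     'snow': [5.0, 6.0]})

def merge_on_keys(frames, keys=('Aimag', 'Year'), how='inner'):
    """Chain-merge frames on the key columns.

    validate='one_to_one' makes pandas raise if a key pair is duplicated
    in either side of a merge.
    """
    return reduce(
        lambda left, right: pd.merge(left, right, on=list(keys), how=how,
                                     validate='one_to_one'),
        frames,
    )

inner = merge_on_keys([temp, loss, snow])               # only ('A', 1999) is in all three
outer = merge_on_keys([temp, loss, snow], how='outer')  # keeps every key pair, with NaN gaps
print(len(inner), len(outer))
```

Comparing the inner and outer row counts on the real sheets is a quick way to see how many aimag-year rows the inner join discards.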

In [80]:
df['Aimag'].unique()

array(['Arkhangai', 'Bayankhongor', 'Bayan‐Ulgii', 'Bulgan', 'Dornod',
       'Dornogovi', 'Dundgovi', 'Govisumber', 'Khentii', 'Khovd',
       'Khuvsgul', 'Orkhon', 'Selenge', 'Sukhbaatar', 'Tuv', 'Ulaanbaatar',
       'Umnugovi', 'Uvs', 'Uvurkhangai', 'Zavkhan'], dtype=object)

In [81]:
df.head()

| | Aimag | Year | Temprature anomalies | Livestock Loss Rates from - 1998-2017 | Past values of the vulnerability index according to Aimag - 1: Average weighed by livestock numbers in SFU | Pasture Use Anomaly | Pasture Use | Standing forage biomass anomaly | Standing forage biomass (tons) | Zootechnical score | Mortality score | Fecundity score | Snowfall anomalies |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Arkhangai | 1999 | 2.75 | 0.01 | 1.69 | 0.04 | 0.11 | 0.05 | 2.96E+ 07 | 0 | 0 | 0 | -3.8 |
| 1 | Arkhangai | 2000 | -0.39 | 0.06 | 1.94 | 0.18 | 0.12 | -0.13 | 2.44E+ 07 | 0 | 0 | 0 | 4.57 |
| 2 | Arkhangai | 2001 | -1.62 | 0.24 | 2.63 | -0.02 | 0.1 | -0.22 | 2.20E+ 07 | 2 | 1 | 1 | 8.1 |
| 3 | Arkhangai | 2002 | 0.86 | 0.04 | 2.81 | 0.63 | 0.17 | -0.57 | 1.22E+ 07 | 0 | 0 | 0 | 5.44 |
| 4 | Arkhangai | 2003 | -0.1 | 0.07 | 1.15 | -0.25 | 0.08 | -0.07 | 2.61E+ 07 | 0 | 0 | 0 | 9.7 |


## Data Cleaning

Now that all of our features are in one dataframe, we can clean the dataset into a more usable format.

To do:
- Rename the features to shorter names that are easier to reference
- Convert standing forage biomass to a numeric type. The original report printed it in scientific notation, so it was imported as text.

### Rename Columns

In [83]:
df.columns

Index(['Aimag', 'Year', 'Temprature anomalies',
       'Livestock Loss Rates from - 1998-2017',
       'Past values of the vulnerability index according to Aimag - 1: Average weighed by livestock numbers in SFU',
       'Pasture Use Anomaly', 'Pasture Use', 'Standing forage biomass anomaly',
       'Standing forage biomass (tons)', 'Zootechnical score',
       'Mortality score', 'Fecundity score', 'Snowfall anomalies'],
      dtype='object')

In [85]:
# Replace the verbose report headers with short snake_case names
df.rename(columns={
    'Aimag': 'aimag', 'Year': 'year',
    'Temprature anomalies': 'temperature_anomalies',
    'Livestock Loss Rates from - 1998-2017': 'livestock_loss',
    'Past values of the vulnerability index according to Aimag - 1: Average weighed by livestock numbers in SFU': 'index',
    'Pasture Use Anomaly': 'pasture_use_anomaly', 'Pasture Use': 'pasture_use',
    'Standing forage biomass anomaly': 'biomass_anomaly', 'Standing forage biomass (tons)': 'biomass',
    'Zootechnical score': 'zootechnical', 'Mortality score': 'mortality',
    'Fecundity score': 'fecundity', 'Snowfall anomalies': 'snowfall_anomaly',
}, inplace=True)
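
By default, `rename` silently ignores mapping keys that match no column. That is easy to miss here because the source headers contain irregular spellings such as 'Temprature'. The sketch below, on a toy frame with two of the original headers, shows how passing `errors='raise'` surfaces a mismatched key. The frame `toy` and the flag `typo_caught` are illustrative names only.

```python
import pandas as pd

# Toy frame with two of the original headers (values taken from df.head() above).
toy = pd.DataFrame({'Aimag': ['Arkhangai'], 'Temprature anomalies': [2.75]})

mapping = {'Aimag': 'aimag', 'Temprature anomalies': 'temperature_anomalies'}
renamed = toy.rename(columns=mapping, errors='raise')

# The correctly spelled 'Temperature anomalies' does not match the header
# in the data, so errors='raise' turns the silent no-op into a KeyError.
try:
    toy.rename(columns={'Temperature anomalies': 'temperature_anomalies'},
               errors='raise')
    typo_caught = False
except KeyError:
    typo_caught = True

print(renamed.columns.tolist(), typo_caught)
```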

### Convert biomass to numeric feature

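The biomass feature is stored as text in the format '2.96E+ 07' (scientific notation with a stray space before the exponent digits). The report does not give the complete number, so we truncate each string to its first four characters, which keeps the three-significant-digit mantissa (for example '2.96'), and convert that to a number. This discards the exponent, so the resulting values are only comparable across rows that share the same exponent.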

In [93]:
# Keep the first four characters, e.g. '2.96E+ 07' -> '2.96'
df['biomass'] = df['biomass'].str[:4]

In [96]:
df['biomass'] = pd.to_numeric(df['biomass'])
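
As an alternative to truncation, the exponent could be kept by removing the stray space before conversion. This is a sketch rather than the approach used in this notebook. It assumes every value follows the '2.96E+ 07' pattern seen in `df.head()`, and the series `raw` is a stand-in for the original text column.

```python
import pandas as pd

# Sample strings copied from the df.head() output above.
raw = pd.Series(['2.96E+ 07', '2.44E+ 07', '1.22E+ 07'])

# Approach used in this notebook: keep only the mantissa.
mantissa = pd.to_numeric(raw.str[:4])

# Alternative: drop the space so the full value, exponent included, is parsed.
full = pd.to_numeric(raw.str.replace(' ', '', regex=False))

print(mantissa.tolist())
print(full.tolist())
```

Parsing the full value keeps rows with different exponents on the same scale, which the mantissa-only version cannot do.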