#### Livestock Health Impact on Nutrient Shadow Pricing

Two analyses evaluate the impact of livestock health on nutrient shadow prices. The first uses a fixed effect model to test whether livestock illness and disorders explain significant shifts in nutrient prices; for bovine, goat, and sheep alike, no effect was significant. The second uses logistic regression to estimate how livestock general illness changes the odds that a nutrient price is at or above its mean.

#### Fixed Effect Model Analysis

For each livestock species (bovine, goat, and sheep), a fixed effect model specifies the nutrient shadow price as a function of general livestock illness and seven disorder categories: reproductive, respiratory, digestive, urogenital, muscle, skin, and nerve.
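As a minimal sketch of this specification, the block below regresses a synthetic price on a constant and 0/1 health indicators by ordinary least squares, the same estimator the notebook's `fe_mod` function applies through `sm.OLS`. The data, the two indicator columns, and the coefficient values are invented for illustration and are not taken from the study.

```python
import numpy as np

# hypothetical 0/1 health indicators for eight animals (illustrative only)
general_illness = np.array([0, 1, 0, 1, 0, 1, 0, 1])
resp_disorders = np.array([0, 0, 1, 1, 0, 0, 1, 1])

# synthetic, noise-free price built from assumed coefficients
true_beta = np.array([0.40, -0.02, 0.03])  # constant, illness, respiratory
X = np.column_stack([np.ones(len(general_illness)), general_illness, resp_disorders])
price = X @ true_beta

# least-squares estimate of the coefficients
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)
print(np.round(beta_hat, 4))
```

Each indicator's coefficient is the shift in expected price when that condition is reported, holding the other indicators fixed.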

The output for bovine health observations and the protein shadow price is provided below as an example. The results are not informative: no health coefficient is significant at conventional levels, the overall F-test has a p-value of 0.904, and R-squared is 0.006. The goat and sheep models, and the models with lipid and carbohydrate shadow prices as dependent variables, are consistent with this result.

(Results for the logistic analysis, I believe, are informative)

In [4]:
#### livestock health impact on nutrient prices ####
####################################################

def get_y(df_species):
    return df_species[[p for p in df_species.columns if '_defl' in p]]

def get_X(df_species):
    return df_species[[d for d in df_species.columns if 'Disorders' in d or 'Illness' in d]]

y_bovine, X_bovine = get_y(df_p_bovine), get_X(df_p_bovine)
y_goat, X_goat = get_y(df_p_goat), get_X(df_p_goat)
y_sheep, X_sheep = get_y(df_p_sheep), get_X(df_p_sheep)

# check to see if non-values are entered in observations
print('Specie unique health observation values:')
print()
for i in X_bovine.columns:
    print('Bovine ' + i, X_bovine[i].unique())
print()
for i in X_goat.columns:
    print('Goat ' + i, X_goat[i].unique())
print()
for i in X_sheep.columns:
    print('Sheep ' + i, X_sheep[i].unique())
print()

# recode 77 ('don't know') as 0 ('no') wherever it appears in the health observations
def recode_unknown(X, label):
    for i in X.columns:
        if 77 in X[i].unique():
            print(label, i, 'value changed from 77 to 0 at index', X[X[i] == 77].index.tolist())
            # assign the result back; an inplace replace on a selected column may not update X
            X[i] = X[i].replace(77, 0)
    print()

recode_unknown(X_bovine, 'Bovine')
recode_unknown(X_goat, 'Goat')
recode_unknown(X_sheep, 'Sheep')

# fixed effect model for livestock health impacts on nutrient shadow prices
def fe_mod(y, X):
    '''
    Performs linear regression for livestock health fixed effects on nutrient shadow prices. Dependent variables
    include protein, fat, and carb shadow prices. Model results are stored in list format accessible by index.

    Parameters: endogenous array, y
                exogenous array, X
    '''

    cols = y.columns
    model_list = []

    for i in cols:
        mod = sm.OLS(y[i], sm.add_constant(X)).fit()
        model_list.append(mod)

    return model_list

bovine_lm = fe_mod(y_bovine, X_bovine)
goat_lm = fe_mod(y_goat, X_goat)
sheep_lm = fe_mod(y_sheep, X_sheep)

bovine_lm[0].summary()

Specie unique health observation values:
Bovine GeneralIllness [1 0]
Bovine RepDisorders [0 1]
Bovine RespDisorders [1 0]
Bovine DigestDisorders [1 0]
Bovine UrogenDisorders [0 1]
Bovine MuscleDisorders [ 0  1 77]
Bovine SkinDisorders [ 0  1 77]
Bovine NerveDisorders [ 0  1 77]

Goat GeneralIllness [1 0]
Goat RepDisorders [ 0 77  1]
Goat RespDisorders [1 0]
Goat DigestDisorders [0 1]
Goat UrogenDisorders [0 1]
Goat MuscleDisorders [0 1]
Goat SkinDisorders [0 1]
Goat NerveDisorders [0 1]

Sheep GeneralIllness [0 1]
Sheep RepDisorders [0 1]
Sheep RespDisorders [0 1]
Sheep DigestDisorders [1 0]
Sheep UrogenDisorders [0 1]
Sheep MuscleDisorders [ 0  1 77]
Sheep SkinDisorders [0 1]
Sheep NerveDisorders [0 1]

Bovine MuscleDisorders value changed from 77 to 0 at index [248]
Bovine SkinDisorders value changed from 77 to 0 at index [381]
Bovine NerveDisorders value changed from 77 to 0 at index [133, 174]

Goat RepDisorders value changed from 77 to 0 at index [34]

Sheep MuscleDisorders value 

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Dep. Variable: | protein_p_defl | R-squared: | 0.006 |
| Model: | OLS | Adj. R-squared: | -0.007 |
| Method: | Least Squares | F-statistic: | 0.4283 |
| Date: | Fri, 07 Jun 2019 | Prob (F-statistic): | 0.904 |
| Time: | 15:27:29 | Log-Likelihood: | -97.388 |
| No. Observations: | 621 | AIC: | 212.8 |
| Df Residuals: | 612 | BIC: | 252.7 |
| Df Model: | 8 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 0.4027 | 0.033 | 12.060 | 0.000 | 0.337 | 0.468 |
| GeneralIllness | -0.0180 | 0.026 | -0.690 | 0.490 | -0.069 | 0.033 |
| RepDisorders | 0.0498 | 0.168 | 0.296 | 0.767 | -0.281 | 0.380 |
| RespDisorders | 0.0262 | 0.036 | 0.738 | 0.461 | -0.044 | 0.096 |
| DigestDisorders | 0.0285 | 0.034 | 0.834 | 0.405 | -0.039 | 0.096 |
| UrogenDisorders | -0.0143 | 0.095 | -0.151 | 0.880 | -0.200 | 0.172 |
| MuscleDisorders | 0.0520 | 0.041 | 1.265 | 0.206 | -0.029 | 0.133 |
| SkinDisorders | 0.0362 | 0.036 | 1.020 | 0.308 | -0.034 | 0.106 |
| NerveDisorders | -0.0624 | 0.084 | -0.741 | 0.459 | -0.228 | 0.103 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Omnibus: | 491.875 | Durbin-Watson: | 1.803 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 12190.562 |
| Skew: | 3.319 | Prob(JB): | 0.0 |
| Kurtosis: | 23.665 | Cond. No. | 20.7 |


#### Logistic Analysis

For the logistic analysis of livestock general illness impacts on nutrient shadow prices, each nutrient's shadow price (protein, lipid, and carbohydrate) is coded as 1 if it is greater than or equal to that nutrient's mean and 0 if it is below the mean. The logistic function is specified with a vector containing a constant and the general livestock illness observation. Each species has its own logistic function corresponding to that species' illness observations.
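The coding rule can be sketched in a few lines. This is an illustrative stand-alone version, not the notebook's `get_bin_p`; the function name `binarize_at_mean` and the price values are hypothetical.

```python
from statistics import fmean

def binarize_at_mean(prices):
    """Return 1 where a price is >= the sample mean, else 0."""
    threshold = fmean(prices)
    return [1 if p >= threshold else 0 for p in prices]

# hypothetical deflated protein shadow prices (illustrative values only)
protein_prices = [0.2, 0.3, 0.4, 0.5, 1.6]
coded = binarize_at_mean(protein_prices)
print(coded)
```

Because the mean is sensitive to large values, a right-skewed price series can place fewer than half of its observations in the 1 class, as in this toy example.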

In [16]:
# logistic construction and analysis for odds ratio impacts of livestock health on nutrient shadow prices
def get_bin_p(y_species):
    bin_p_list = ['bin_' + p for p in nutr_p_list]
    for i in range(len(bin_p_list)):
        
        print('Observation', y_species.columns[i], 'nutrient price values >=',
              round(np.mean(y_species[y_species.columns[i]]), 3),
              'coded as 1 with values less than the mean coded as 0')
        
        y_species[bin_p_list[i]] = np.where(y_species[y_species.columns[i]] >=
                                            np.mean(y_species[y_species.columns[i]]), 1, 0)
              

Y = [y_bovine, y_goat, y_sheep]
for s, y in zip(species_list, Y):
    print(s + ':')
    get_bin_p(y)
    print()

def get_logit_results(y_species, X_species):
    '''
    Performs logistic regression to evaluate effects of livestock general illness on nutrient prices greater than or
     equal to its mean or less than its mean.

    Returns a list with elements consisting of intercept and general illness coefs, odds ratio, probability
     of nutrient price being greater than the mean given general livestock illness occurs, partial effects of
     general livestock illness given nutrient prices greater than the mean.

    Nutrient list positions:
    0 - protein shadow price
    1 - fat shadow price
    2 - carb shadow price
    '''

    y_bin_list = [p for p in y_species.columns if 'bin_' in p]
    params = []

    for i in y_bin_list:
        lr = LogisticRegression(solver='lbfgs').fit(np.array(X_species[X_species.columns[0]]).reshape(-1, 1),
                                                    y_species[i])

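        # probability nutrient price is above mean given general livestock illness
        # (computed with the standard normal CDF; the inverse link of a logit model is the
        #  logistic function, e.g. scipy.special.expit, so these values are not logit probabilities)
        prob = stats.norm.cdf(float(lr.intercept_ + lr.coef_))
        # partial effect between healthy and sick livestock on probability of nutrient price being above the mean
        partial_effect = stats.norm.cdf(float(lr.intercept_)) - prob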
        results = [float(lr.intercept_), float(lr.coef_), np.exp(float(lr.coef_)),
                   prob, partial_effect]

        params.append(results)

    return params


bovine_lr = get_logit_results(y_bovine, X_bovine)
goat_lr = get_logit_results(y_goat, X_goat)
sheep_lr = get_logit_results(y_sheep, X_sheep)

df_lr_results = pd.DataFrame()
df_lr_results['Species'] = np.array(['Bovine', '', '',
                                     'Goat', '', '',
                                     'Sheep', '', ''])
df_lr_results['Nutrient'] = np.array(['Protein', 'Fat', 'Carbohydrate'] * 3)


odds_list = []
prob_list = []
pe_list = []
lr_list = [bovine_lr, goat_lr, sheep_lr]
for species in lr_list:
    for i in range(len(species)):
        odds_list.append(species[i][2])
        prob_list.append(species[i][3])
        pe_list.append(species[i][4])

df_lr_results['Odds_Ratio'] = np.round(odds_list, 4)
df_lr_results['Probability'] = np.round(prob_list, 4)
df_lr_results['Partial_Effect_Healthy'] = np.round(pe_list, 4)
df_lr_results

bovine:
Observation protein_p_defl nutrient price values >= 0.428 coded as 1 with values less than the mean coded as 0
Observation fat_p_defl nutrient price values >= 0.58 coded as 1 with values less than the mean coded as 0
Observation carb_p_defl nutrient price values >= 0.089 coded as 1 with values less than the mean coded as 0

goat:
Observation protein_p_defl nutrient price values >= 0.473 coded as 1 with values less than the mean coded as 0
Observation fat_p_defl nutrient price values >= 0.64 coded as 1 with values less than the mean coded as 0
Observation carb_p_defl nutrient price values >= 0.109 coded as 1 with values less than the mean coded as 0

sheep:
Observation protein_p_defl nutrient price values >= 0.441 coded as 1 with values less than the mean coded as 0
Observation fat_p_defl nutrient price values >= 0.599 coded as 1 with values less than the mean coded as 0
Observation carb_p_defl nutrient price values >= 0.095 coded as 1 with values less than the mean coded as 0



| | Species | Nutrient | Odds_Ratio | Probability | Partial_Effect_Healthy |
|---|---|---|---|---|---|
| 0 | Bovine | Protein | 0.7001 | 0.3167 | 0.1354 |
| 1 | | Fat | 0.7603 | 0.3207 | 0.1033 |
| 2 | | Carbohydrate | 0.7237 | 0.2039 | 0.1031 |
| 3 | Goat | Protein | 1.2579 | 0.4351 | -0.0879 |
| 4 | | Fat | 1.3039 | 0.3184 | -0.088 |
| 5 | | Carbohydrate | 1.0157 | 0.1767 | -0.004 |
| 6 | Sheep | Protein | 0.9231 | 0.382 | 0.0308 |
| 7 | | Fat | 1.2386 | 0.4758 | -0.084 |
| 8 | | Carbohydrate | 0.8792 | 0.178 | 0.0355 |


For goats, general illness is associated with odds ratios greater than one for all three nutrients (1.26 for protein, 1.30 for fat, and 1.02 for carbohydrate). That is, the odds that a nutrient's price is at or above its mean are higher when goat general illness is present than when it is absent, although the carbohydrate ratio is close to one.
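To make the odds ratio interpretation concrete, the sketch below computes one directly from a two-by-two table of counts. The counts are hypothetical. With a single 0/1 regressor, an unpenalized logistic regression's `exp(coef)` equals this ratio, whereas scikit-learn's `LogisticRegression` applies an L2 penalty by default, so its estimate can differ slightly.

```python
def odds(successes, failures):
    """Odds of an event given counts of occurrences and non-occurrences."""
    return successes / failures

# hypothetical household counts (illustrative only):
# "high" means the nutrient price is at or above its mean
ill_high, ill_low = 30, 70
healthy_high, healthy_low = 40, 60

odds_ill = odds(ill_high, ill_low)
odds_healthy = odds(healthy_high, healthy_low)

# odds of a high price with illness relative to the odds without illness
odds_ratio = odds_ill / odds_healthy
print(round(odds_ratio, 4))
```

A ratio below one, as in this toy table, means a high price is less likely in odds terms when illness is present; a ratio above one means the reverse.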

The probability values show the probability that a nutrient's price is at or above its mean when general illness is present in the respective livestock species.

The partial effect values show the difference between the probability that a nutrient's price is at or above its mean when no general livestock illness is present and the same probability when general illness is present. Positive partial effects mean the probability of an above-mean price is higher when livestock are healthy; negative partial effects mean it is higher when livestock are unhealthy.
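The three reported quantities can be derived from a fitted intercept and illness coefficient. The sketch below does so with the logistic function, which is the inverse link of a logit model. The intercept and coefficient values are hypothetical, and `expit` and `logit_summary` are illustrative helpers rather than part of the notebook.

```python
import math

def expit(z):
    """Logistic function: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_summary(intercept, coef):
    """Odds ratio, P(high price | ill), and healthy-minus-ill partial effect."""
    p_ill = expit(intercept + coef)
    p_healthy = expit(intercept)
    return {
        'odds_ratio': math.exp(coef),
        'prob_ill': p_ill,
        'partial_effect_healthy': p_healthy - p_ill,
    }

# hypothetical coefficients (illustrative only, not estimates from the data)
result = logit_summary(intercept=-0.1, coef=-0.35)
print({k: round(v, 4) for k, v in result.items()})
```

A negative illness coefficient gives an odds ratio below one together with a positive partial effect, which is the sign pattern in the bovine rows of the results table.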

I believe there's a story to be told with these results. For example, for bovine consumption the positive partial effects indicate a greater probability of higher nutrient prices when consuming healthy livestock. These higher prices could result from lower consumption of healthy livestock or from greater expense when consuming healthy livestock. Increasing the supply of healthy livestock (i.e., providing the means to produce healthier livestock) could lower the price of healthy livestock consumption.

#### Data management for production code above

In [2]:
### livestock health impacts on nutrient shadow pricing ###

import numpy as np
import pandas as pd
import scipy.stats as stats
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
import glob
import warnings
warnings.filterwarnings('ignore')


#### data management ####
#########################

def get_files(subpath):
    return glob.glob('/home/ajkappes/research/africa/' + subpath)

health_dfs = np.array(get_files('Livestock_Human_Health/data/*.csv'))
nutrient_dfs = np.array(get_files('Nutrient_Demand/*.csv'))

healthcare_df = pd.read_csv(health_dfs[0])
liv_gen_health_df = pd.read_csv(health_dfs[1])
liv_health_df = pd.read_csv(health_dfs[2])


for i in range(len(nutrient_dfs)):
    if 'pricing' in nutrient_dfs[i]:
        nutr_pricing_df = pd.read_csv(nutrient_dfs[i])
        print('Nutrient pricing data read at position', i)

    if 'cpi' in nutrient_dfs[i]:
        cpi_df = pd.read_csv(nutrient_dfs[i])
        print('CPI data read at position', i)
print()

month_year = pd.DataFrame(nutr_pricing_df['date'].unique().tolist()).rename(columns={0: 'date'})

d = {}
for m_y in month_year['date']:
    d[m_y] = pd.DataFrame()

for key in d:
    d[key] = nutr_pricing_df[nutr_pricing_df['date'] == key].drop_duplicates(['HousehldID'], keep='last').merge(

        healthcare_df[healthcare_df['date'] == key].drop_duplicates(['HousehldID'], keep='last'),
        how='inner', on='HousehldID').merge(

        liv_gen_health_df[liv_gen_health_df['date'] == key].drop_duplicates(['HousehldID'], keep='last'),
        how='inner', on='HousehldID')

def remove_no_obs(data_dict):
    no_obs = []
    for key in data_dict:
        if len(data_dict[key]) == 0:
            no_obs.append(key)

    for key in no_obs:
        del data_dict[key]
        print(key, 'has been removed due to no entries')

print('Entries removed from dictionary d')
remove_no_obs(d)
print()

def data_combine(data_dict):
    # stack the per-month frames and renumber rows from 0
    return pd.concat(list(data_dict.values()), axis=0, ignore_index=True)

# aggregate data construction
df = data_combine(d)

# micro livestock health data construction
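# species codes: verify against the data dictionary, since 'OV' conventionally abbreviates
# ovine (sheep) and 'CP' caprine (goat), the reverse of the assignment below
bovine_health = liv_health_df[liv_health_df['Species'] == 'BO']
goat_health = liv_health_df[liv_health_df['Species'] == 'OV']
sheep_health = liv_health_df[liv_health_df['Species'] == 'CP']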

# merge livestock health data with aggregate data
d_bov_health = {}
d_goat_health = {}
d_sheep_health = {}
for m_y in month_year['date']:
    d_bov_health[m_y] = pd.DataFrame()
    d_goat_health[m_y] = pd.DataFrame()
    d_sheep_health[m_y] = pd.DataFrame()

for key in d_bov_health:
    d_bov_health[key] = df[df['date'] == key].drop_duplicates(['HousehldID'], keep='last').merge(
        bovine_health[bovine_health['date'] == key].drop_duplicates(['HousehldID'], keep='last'),
        how='inner', on='HousehldID')

    d_goat_health[key] = df[df['date'] == key].drop_duplicates(['HousehldID'], keep='last').merge(
        goat_health[goat_health['date'] == key].drop_duplicates(['HousehldID'], keep='last'),
        how='inner', on='HousehldID')

    d_sheep_health[key] = df[df['date'] == key].drop_duplicates(['HousehldID'], keep='last').merge(
        sheep_health[sheep_health['date'] == key].drop_duplicates(['HousehldID'], keep='last'),
        how='inner', on='HousehldID')

species_dict_list = [d_bov_health, d_goat_health, d_sheep_health]
species_list = ['bovine', 'goat', 'sheep']

# pair each species name with its dictionary of per-month frames
for name, species_dict in zip(species_list, species_dict_list):
    print('Entries removed from', name, 'data')
    remove_no_obs(species_dict)
    print()

df_p_bovine = data_combine(d_bov_health)
df_p_goat = data_combine(d_goat_health)
df_p_sheep = data_combine(d_sheep_health)

# df_p_bovine.to_csv('/home/ajkappes/research/africa/Nutrient_Demand/np_bovine_health.csv')
# df_p_goat.to_csv('/home/ajkappes/research/africa/Nutrient_Demand/np_goat_health.csv')
# df_p_sheep.to_csv('/home/ajkappes/research/africa/Nutrient_Demand/np_sheep_health.csv')

nutr_p_list = ['protein_p', 'fat_p', 'carb_p']

def defl_n_prices(df_species):
    df_key_list = df_species.iloc[:, 0].unique().tolist()
    df_defl_p = pd.DataFrame(columns=[p + '_defl' for p in nutr_p_list], index=df_species.index, dtype='float64')

    for i in range(len(nutr_p_list)):
        for key in df_key_list:
            df_defl_p.iloc[df_species[df_species.iloc[:, 0] == key].index.tolist(), i] = \
                df_species[df_species.iloc[:, 0] == key][nutr_p_list[i]] * \
                float(cpi_df.loc[cpi_df['date'] == key, 'defl'].iloc[0])

    return pd.concat([df_species, df_defl_p], axis=1)

df_p_bovine = defl_n_prices(df_p_bovine)
df_p_goat = defl_n_prices(df_p_goat)
df_p_sheep = defl_n_prices(df_p_sheep)


def norm_price(df_species):
    norm_price_list = ['norm_' + p for p in nutr_p_list]
    defl_p_list = [p for p in df_species.columns if '_defl' in p]

    for i in range(len(norm_price_list)):
        df_species[norm_price_list[i]] = (df_species[defl_p_list[i]] - np.min(df_species[defl_p_list[i]])) / \
                                         (np.max(df_species[defl_p_list[i]]) - np.min(df_species[defl_p_list[i]]))

norm_price(df_p_bovine)
norm_price(df_p_goat)
norm_price(df_p_sheep)

Nutrient pricing data read at position 1
CPI data read at position 6

Entries removed from dictionary d
Jul-16 has been removed due to no entries

Entries removed from bovine data
Jun-15 has been removed due to no entries
Jul-16 has been removed due to no entries

Entries removed from goat data
May-15 has been removed due to no entries
Jun-15 has been removed due to no entries
Oct-15 has been removed due to no entries
Nov-15 has been removed due to no entries
May-16 has been removed due to no entries
Jul-16 has been removed due to no entries

Entries removed from sheep data
Feb-16 has been removed due to no entries
Jun-16 has been removed due to no entries
Jul-16 has been removed due to no entries

