## Cost Benefit Analysis of Spraying

In this notebook, we assess the efficacy of spraying and provide recommendations on its costs and benefits.

In [None]:
# Libraries used in this notebook
from datetime import datetime, timedelta

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from ipywidgets import interact

%matplotlib inline

In [None]:
# We first read in our 2 files
train = pd.read_csv('./data/train_clean.csv')
spray = pd.read_csv('./data/spray.csv')

In [None]:
# We then create an interactive plot to see where traps and spray areas lie
mapdata = np.loadtxt("./data/mapdata_copyright_openstreetmap_contributors.txt")
origin = [41.6, -88.1]       
upperRight = [42.4, -87.5]


def plot(date_obvs, date_spray):
    lats = np.array(train.loc[train['Date'] == date_obvs]['Latitude'])
    longs = np.array(train.loc[train['Date'] == date_obvs]['Longitude'])
    
    lats_spray = np.array(spray.loc[spray['Date'] == date_spray]['Latitude'])
    longs_spray = np.array(spray.loc[spray['Date'] == date_spray]['Longitude'])
    
    plt.imshow(mapdata, cmap=plt.get_cmap('gray'), extent=[origin[1], upperRight[1], origin[0], upperRight[0]])
    plt.scatter(x=longs, y=lats, c='r', alpha =0.5, s=20)
    plt.scatter(x=longs_spray, y=lats_spray, c='b',alpha =0.1, s=10)

In [None]:
interact(plot,
         date_obvs = train['Date'].unique(),
         date_spray = spray['Date'].unique(),
        )


Toggling the dates in the plot shows that the traps lie between latitudes of roughly 41.6 and 42.1 and longitudes of roughly -88.0 and -87.5.

From this, we can ignore the first spray date in the data (2011-08-29), as the sprayed area lies very far from the traps and the trap data cannot tell us anything about its effect.

Secondly, spraying occurs in one or two concentrated areas on each spray day. It is also important to note that spraying occurs at night, and we have to take this into consideration when we analyse its effects.

We now look at the effects of spraying using a few key data points. We segment the data to look at the impact spraying has on specific areas, and we take the areas outside the spray area as a 'control'. We use a timeframe of 10 days before and 10 days after spraying, as the entire mosquito lifecycle spans approximately 10 days.


In [None]:
# We convert the dates to date time format
spray['Date'] = pd.to_datetime(spray['Date'])
train['Date'] = pd.to_datetime(train['Date'])

In [None]:
# Let us look at all the spraying dates
spray['Date'].unique()

array(['2011-08-29T00:00:00.000000000', '2011-09-07T00:00:00.000000000',
       '2013-07-17T00:00:00.000000000', '2013-07-25T00:00:00.000000000',
       '2013-08-08T00:00:00.000000000', '2013-08-15T00:00:00.000000000',
       '2013-08-16T00:00:00.000000000', '2013-08-22T00:00:00.000000000',
       '2013-08-29T00:00:00.000000000', '2013-09-05T00:00:00.000000000'],
      dtype='datetime64[ns]')

In [None]:
# We create a function to give us some values we are interested in finding

def spray_effect(spray_date):
    spray_date = datetime.strptime(spray_date, '%Y-%m-%d')
    start_date = spray_date - timedelta(days=10)
    end_date = spray_date + timedelta(days=10)
    
    # Create dataframe of dates we are interested in
    mask = (train['Date'] >= start_date) & (train['Date'] <= end_date)
    train_1 = train.loc[mask]
    
    # First, find the max and min coordinates of the spray
    margin = 0.05
    min_lat = spray.loc[spray['Date'] == spray_date]['Latitude'].min() - margin
    max_lat = spray.loc[spray['Date'] == spray_date]['Latitude'].max() + margin
    min_long = spray.loc[spray['Date'] == spray_date]['Longitude'].min() - margin
    max_long = spray.loc[spray['Date'] == spray_date]['Longitude'].max() + margin
    
    # We filter our DF into spray and non spray areas
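    mask_2 = (
        (train_1['Latitude'] >= min_lat) & (train_1['Latitude'] <= max_lat)
        & (train_1['Longitude'] >= min_long) & (train_1['Longitude'] <= max_long)
    )
    spray_area = train_1.loc[mask_2]
    # Note: the control keeps only traps outside BOTH the latitude and the longitude range.
    # Traps outside just one of the two ranges fall in neither group (~mask_2 would include them).
    mask_3 = (
        ((train_1['Latitude'] < min_lat) | (train_1['Latitude'] > max_lat))
        & ((train_1['Longitude'] < min_long) | (train_1['Longitude'] > max_long))
    )
    non_spray_area = train_1.loc[mask_3]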
    
    # We filter these 2 DF into before and after spray
    spray_area_before = spray_area.loc[spray_area['Date']<= spray_date]
    spray_area_after = spray_area.loc[spray_area['Date']> spray_date]
    non_spray_area_before = non_spray_area.loc[non_spray_area['Date']<= spray_date]
    non_spray_area_after = non_spray_area.loc[non_spray_area['Date']> spray_date]
    
    # We then create a dataframe to house the values we are interested in.
    # The rows are built in one step because DataFrame.append was removed in pandas 2.0.
    groups = {
        'Non spray area (before spray date)': non_spray_area_before,
        'Non spray area (after spray date)': non_spray_area_after,
        'Spray area (before spray date)': spray_area_before,
        'Spray area (after spray date)': spray_area_after,
    }
    rows = [{'Location': location,
             'Avg Num Mosquitos': group['tot_mosquitos'].mean(),
             'Wnv Rate': group['WnvPresent'].mean()}
            for location, group in groups.items()]
    results = pd.DataFrame(rows, columns=['Location', 'Avg Num Mosquitos', 'Wnv Rate'])
    return results

In [None]:
spray_effect('2011-09-07')

| | Location | Avg Num Mosquitos | Wnv Rate |
|---|---|---|---|
| 0 | Non spray area (before spray date) | 5.851852 | 0.018519 |
| 1 | Non spray area (after spray date) | 4.895833 | 0.020833 |
| 2 | Spray area (before spray date) | 7.047619 | 0.095238 |
| 3 | Spray area (after spray date) | 4.722222 | 0.027778 |


The table above covers the spray date of 7 Sep 2011. In the sprayed area, both the average number of mosquitos and the West Nile Virus (WNV) detection rate fell after spraying. In our control area, the WNV detection rate rose very slightly (from about 1.9% to 2.1%), whereas in the sprayed area it fell by almost 7 percentage points (from about 9.5% to 2.8%). Note, however, that the average number of mosquitos also fell in the control area.
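One way to make the comparison against the control explicit is a difference-in-differences calculation: the change in the sprayed area minus the change in the control area. The snippet below is a minimal sketch using only the values from the 2011-09-07 table; the helper name `difference_in_differences` is our own and is not part of the original analysis.

```python
# Values copied from the 2011-09-07 results table
wnv_rate = {
    "control": {"before": 0.018519, "after": 0.020833},
    "spray": {"before": 0.095238, "after": 0.027778},
}
avg_mosquitos = {
    "control": {"before": 5.851852, "after": 4.895833},
    "spray": {"before": 7.047619, "after": 4.722222},
}


def difference_in_differences(table):
    """Change in the spray area minus the change in the control area."""
    spray_change = table["spray"]["after"] - table["spray"]["before"]
    control_change = table["control"]["after"] - table["control"]["before"]
    return spray_change - control_change


wnv_did = difference_in_differences(wnv_rate)
mosquito_did = difference_in_differences(avg_mosquitos)

print(f"WNV rate, difference-in-differences: {wnv_did:+.4f}")
print(f"Avg mosquitos, difference-in-differences: {mosquito_did:+.2f}")
```

A negative value means the sprayed area improved by more than the control did over the same window. It says nothing about whether the difference is statistically meaningful.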

We move on to look at other dates in 2013.

In [None]:
spray_effect('2013-07-25')

| | Location | Avg Num Mosquitos | Wnv Rate |
|---|---|---|---|
| 0 | Non spray area (before spray date) | 25.797468 | 0.050633 |
| 1 | Non spray area (after spray date) | 8.333333 | 0.044444 |
| 2 | Spray area (before spray date) | 24.493151 | 0.09589 |
| 3 | Spray area (after spray date) | 17.583333 | 0.166667 |


In [None]:
spray_effect('2013-08-08')

| | Location | Avg Num Mosquitos | Wnv Rate |
|---|---|---|---|
| 0 | Non spray area (before spray date) | 15.559524 | 0.083333 |
| 1 | Non spray area (after spray date) | 15.357143 | 0.107143 |
| 2 | Spray area (before spray date) | 12.752809 | 0.146067 |
| 3 | Spray area (after spray date) | 14.659091 | 0.159091 |


In [None]:
spray_effect('2013-08-15')

| | Location | Avg Num Mosquitos | Wnv Rate |
|---|---|---|---|
| 0 | Non spray area (before spray date) | 13.717172 | 0.141414 |
| 1 | Non spray area (after spray date) | 10.090909 | 0.113636 |
| 2 | Spray area (before spray date) | 19.845361 | 0.226804 |
| 3 | Spray area (after spray date) | 17.295455 | 0.272727 |


In [None]:
spray_effect('2013-09-05')

| | Location | Avg Num Mosquitos | Wnv Rate |
|---|---|---|---|
| 0 | Non spray area (before spray date) | 16.870968 | 0.177419 |
| 1 | Non spray area (after spray date) | 15.311258 | 0.139073 |
| 2 | Spray area (before spray date) | 48.384615 | 0.346154 |
| 3 | Spray area (after spray date) | 33.659574 | 0.276596 |


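Across the spray dates examined, the effect of spraying is not consistent. The WNV rate in the sprayed area fell on only two of the five dates (2011-09-07 and 2013-09-05) and rose on the other three. The average number of mosquitos in the sprayed area fell on four of the five dates, but it also fell in the control area on every date. We therefore do not see clear evidence that spraying reduces the WNV rate.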
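To gauge whether a before/after change in the WNV rate could simply be noise, one option is Fisher's exact test on the underlying counts. The sketch below computes the two-sided p-value directly from the hypergeometric distribution using only the standard library. The counts are an assumption: they are inferred from the 2011-09-07 table (2/21 and 1/36 are the smallest fractions matching the rates 0.095238 and 0.027778), not read from the data, and the function name `fisher_exact_two_sided` is our own.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability of k positives in the first row under the null hypothesis
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= observed * (1 + 1e-9)
    )


# ASSUMPTION: counts inferred from the reported rates, not taken from the data
before_pos, before_n = 2, 21   # spray area, before spraying
after_pos, after_n = 1, 36     # spray area, after spraying

p_value = fisher_exact_two_sided(
    before_pos, before_n - before_pos,
    after_pos, after_n - after_pos,
)
print(f"Two-sided p-value: {p_value:.3f}")
```

With counts this small, the p-value comes out well above 0.05, which is consistent with treating the apparent drop on this date as inconclusive. Running the same check on the real counts for each spray date would be the natural next step.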

## External Research

To get a sense of the costs and benefits of spraying, we first need to estimate the cost of West Nile Virus. [A paper](https://www.sciencedaily.com/releases/2014/02/140210184713.htm) released by the American Society of Tropical Medicine and Hygiene studied the economic impact of West Nile virus in the United States over the 14 years from 1999 through 2012.

The 37,088 WNV disease cases reported to CDC from 1999 through 2012 included more than 16,000 patients with neurologic disease, over 18,000 patients who required hospitalization, and over 1,500 deaths. According to the CDC, individuals over 50 years of age are more likely to develop severe neurologic disease if infected. The estimated cumulative cost was $778 million in health care expenditures and lost productivity.

This gives us an average cost of approximately $21,000 per case reported.

A [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241786/) hosted by the US National Library of Medicine reported 66 human cases of WNV in Chicago, Illinois in 2013. At approximately $21,000 per case, this brings the total cost to approximately $1,386,000.

The cost of spraying itself is not specified anywhere. However, the [2013 Chicago government budget](https://www.chicago.gov/content/dam/city/depts/obm/supp_info/2013%20Budget/2013Overview.pdf) lists a budget of $1,191,811 allocated to Environmental Health. The budget describes this function as performing routine and complaint-generated inspections of facilities to ensure the City's ordinances related to environmental hazards are enforced, coordinating mosquito surveillance and control activities, and providing public education to reduce the risk of vector-borne diseases, principally the West Nile virus.

It is hard to make a dollar-for-dollar comparison of the costs and benefits of spraying, but we can construct a simple equation to estimate them.
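The arithmetic behind these two figures can be checked in a few lines. This is a small sketch using only the numbers quoted above; the variable names are our own.

```python
# Figures quoted from the study and the Chicago case count above
total_cost_usd = 778_000_000     # cumulative US cost, 1999 through 2012
reported_cases = 37_088          # WNV disease cases reported to CDC

cost_per_case = total_cost_usd / reported_cases
rounded_cost_per_case = 21_000   # the rounded figure used in this notebook

chicago_cases_2013 = 66
chicago_cost_2013 = chicago_cases_2013 * rounded_cost_per_case

print(f"Average cost per reported case: ${cost_per_case:,.0f}")
print(f"Estimated cost for Chicago in 2013: ${chicago_cost_2013:,}")
```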





![image.png](attachment:image.png)

We propose the simple equation above as a guide to estimating the costs and benefits of spraying. From our modelling, we have a good sense of the accuracy of our prediction. [Mosquito infection rate](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241786/) and population density information is widely available.
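As a hedged restatement, inferred only from the worked calculation that follows and not from the image itself, the comparison appears to take the form below. The term names are our own labels for the quantities used in that calculation.

```latex
\begin{align}
\text{Adjusted cost of spraying} &= \frac{\text{spray cost}}{\text{prediction accuracy} \times \text{spraying efficacy}} \\
\text{Potential economic impact} &= \text{infection rate} \times \text{population} \times \text{cost per case}
\end{align}
```

Under this reading, spraying looks worthwhile when the potential economic impact exceeds the adjusted cost of spraying.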

The only pieces of information we still need are the actual dollar cost of spraying and a quantification of how effective spraying is at eradicating mosquitos and WNV. On the latter, the results from our data were inconclusive.

We can run a quick calculation with fictitious numbers:

In [None]:
# We first estimate cost for spraying
spray_cost = 500000              # estimate $500,000 for spraying
prediction_accuracy = 0.73       # 73% model prediction accuracy
spraying_efficacy = 0.65         # 65% spraying efficacy

cost_of_spraying = spray_cost * (1/prediction_accuracy) * (1/spraying_efficacy)

# We then estimate the potential economic impact
mosquito_infection_rate = 0.005  # 0.5%
population = 10000               # We estimate 10,000 people living in affected area
cost_per_case = 21000            # provided by study mentioned above

economic_impact = mosquito_infection_rate * population * cost_per_case

print(f'The cost for spraying will be ${cost_of_spraying:,.2f}.')
print(f'The potential economic impact will be ${economic_impact:,.2f}.')

The cost for spraying will be $1,053,740.78.
The potential economic impact will be $1,050,000.00.


We can see above that the costs more or less even out, and in such instances it will be up to the Department of Health to make the call. Our calculations also do not take into account the various intangible benefits and costs arising from spraying and the West Nile Virus.
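Because the two figures are so close, it can help to ask at what raw spray cost they are exactly equal. The sketch below rearranges the same calculation to find that break-even point, reusing the fictitious inputs from the cell above. The function names are our own.

```python
def adjusted_spray_cost(spray_cost, prediction_accuracy, spraying_efficacy):
    """Spray cost scaled up for imperfect prediction and imperfect efficacy."""
    return spray_cost / (prediction_accuracy * spraying_efficacy)


def break_even_spray_cost(economic_impact, prediction_accuracy, spraying_efficacy):
    """Raw spray cost at which the adjusted cost equals the economic impact."""
    return economic_impact * prediction_accuracy * spraying_efficacy


# Same fictitious inputs as in the calculation above
economic_impact = 0.005 * 10_000 * 21_000
break_even = break_even_spray_cost(economic_impact, 0.73, 0.65)
print(f"Break-even spray cost: ${break_even:,.2f}")

# How the adjusted cost of a $500,000 programme moves with assumed efficacy
for efficacy in (0.50, 0.65, 0.80):
    cost = adjusted_spray_cost(500_000, 0.73, efficacy)
    print(f"efficacy={efficacy:.2f}: adjusted cost ${cost:,.2f}")
```

With these inputs, any raw spray cost below the break-even value makes the adjusted cost smaller than the estimated economic impact. The result is sensitive to the assumed efficacy, which is the quantity our data could not pin down.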

## Other Recommendations

There are also other, cheaper ways for the government to help reduce the spread of WNV. The Chicago budget in 2013 showed $9,000,000 dedicated to community engaged care, which promotes health through education, policy and service. The government can ramp up education on West Nile Virus and on how to prevent mosquito breeding. It can educate the public on preventing stagnant water, applying insect repellent, and wearing proper attire to reduce the chances of being bitten by mosquitos.

Other measures that the government has already undertaken but could place more emphasis on are the reporting of dead birds (a key factor in the transmission of WNV) and the reporting of tall grass and weeds.