In [1]:
import pandas as pd
import numpy as np
import re

# Flag for Extreme Temperatures

## Motivation
With climates rapidly changing, failures to sufficiently protect animals from both low and high temperatures may become more common. This flag captures narratives related to extreme ambient temperatures and their consequences on animals. 

## Conditions

We use two conditions to capture relevant reports. Condition 1 AND Condition 2 must both be true OR condition 2 must be true for a narrative to be flagged. 

1.  Recorded Temperatures
    - We use a Regex pattern to flag narratives containing recorded temperatures (80.5 degrees, 52 F, 25 Celsius) but exclude non-temperature related mentions of "degrees" (degrees of rust, degrees of hair loss). We also exclude mentions of "180 degrees" which is the water temperature required for proper sanitization and an extremely improbable ambient temperature on Earth. 
2. Keywords 
    - We use keywords related to overheating/excessive cooling to flag relevant narratives. To avoid capture of narratives mentioning heating equipment (heating pads, heated blankets), the majority of keywords are results of extreme ambient temperatures: heat stress, hypothermia, heat stroke, frostbite, etc.
    
This logic is a starting point; we will continue to iterate on improving accurate capture of narratives.

## Output
The code will create a copy of the inspections-citations CSV file in the flagged_citations folder. The file will have the following new columns:
- 'flag_cond_1': indicator column for flag condition 1
- 'flag_cond_2': indicator column for flag condition 2
- 'flag_extreme_temperatures': indicator column for flag

In [2]:
# Read in most recent aphis inspection-citations.csv
combined_dir = '../aphis-inspection-reports/data/combined/'

citations = pd.read_csv(combined_dir + 'inspections-citations.csv')
citations.shape

(38749, 6)

In [3]:
# Condition 1
temperature_pattern = re.compile(r'\b(?!180\b)\d+(\.\d+)?\s*(f|degrees|fahrenheit|deg f|celsius)\b')

citations['flag_cond_1'] = citations['narrative'].apply(lambda x: bool(temperature_pattern.search(x.lower())))
citations['flag_cond_1'].value_counts()

flag_cond_1
False    38094
True       655
Name: count, dtype: int64

In [4]:
# Condition 2
extreme_temps_keywords = [
    'climatic',
    'ambient temperature', 
    'temperature extremes', 
    'atmospheric temperature', 
    
    'extreme heat', 
    'heat index'
    'heat warning', 
    'excessive heat',
    'hot weather',
    'heat stroke', 
    'heat stress',
    
    'extreme cold',
    'cold temperature', 
    'cold weather', 
    'low temperature',
    'cold stress', 
    'frostbite', 
    'hyperthermia',
    'hypothermic', 
   
    'weather service',
    'accuweather', 
    'noaa'
]

citations['flag_cond_2'] = citations['narrative'].apply(lambda x: any(keyword in x.lower() for keyword in extreme_temps_keywords))
citations['flag_cond_2'].value_counts()

flag_cond_2
False    38019
True       730
Name: count, dtype: int64

## Creating Extreme Temperatures Flag Column

In [5]:
# Overheating flag
citations['flag_extreme_temperatures'] = ((citations['flag_cond_1'] & citations['flag_cond_2'])| (citations['flag_cond_2']))
citations['flag_extreme_temperatures'].value_counts()

flag_extreme_temperatures
False    38019
True       730
Name: count, dtype: int64

## Spot-Checking Flag

In [6]:
# Spot-check for positives
citations[citations['flag_extreme_temperatures'] == True]['narrative'].sample(100).tolist()

["The Raccoon Building has a high ambient temperature and humidity. The building is constructed of metal and is not\ninsulated. There are large fans running at both ends of the building and there are windows that are open along the\nsides. Towards the middle of the building there is little to no movement of air and the temperature in this part of the\nbuilding as taken by the Kestrel at approximately 2:00 pm is 96.1 degrees F and the heat index is 119.4 degrees F.\nThe outdoor air temperature at the time of the inspection as read by the Kestrel 101 degrees and the heat index is\n113 degrees. Multiple raccoons were displaying behaviors such as panting and lying on their sides and abdomens\nwith their legs splayed out that could indicate they are uncomfortable with the temperature. There are\napproximately 290 raccoons housed in this building. High temperature and humidity could have a negative impact\non the health and well-being of the animals.\nWhen climatic conditions present a threa

In [7]:
# Spot-check for negatives
citations[citations['flag_extreme_temperatures'] == False]['narrative'].sample(100).tolist()

['A female yellow Lab named "Evie" (chip #956000012765446) had an open sore with hair loss on its right front foot. The\nlicensee had observed the hair loss and noted it two days ago, but was not planning on notifying the Attending\nVeterinarian for a few more days. An open sore could become infected. Left untreated, this infection could become\nsystemic and result in serious health issues. The Attending Veterinarian must be contacted in a timely manner and the\ndog must be treated per his/her instructions.',
 'The freezer storing meat for the big cats and exotics at the facility needs to be addressed. There are boxes of\nunidentified meat in bags in the freezer. There is one box with a pig covered with a towel stacked on other boxes.\nThere are no dates on any of the meat and no rotation schedule to assure food is used in a timely manner.\nAdditionally there is no way to clean the area and there is ice build up along several boxes indicating that moisture\nis getting into the freezer 

In [8]:
# Save citations with new flag columns
citations.to_csv('../flagged_citations/extreme_temperatures.csv')