In [1]:
import pandas as pd
import numpy as np
import re

# Flag for Air Transport

## Motivation

We noticed that the most frequent APHIS inspection sites were airlines. While being transported, animals can be subject to unsafe conditions, lack of food and water, or otherwise improper handling. Investigating the report narratives could reveal critical areas for improvement for airlines.

## Conditions

1. Keywords
    - A narrative is flagged when it contains at least one air transport keyword, matched as a case-insensitive substring.

This logic is a starting point. We will keep iterating to capture relevant narratives more accurately.
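
As a minimal sketch of the keyword condition, the snippet below applies the same substring test to a toy DataFrame. The narratives and the helper name `has_keyword` are invented for illustration and do not come from the APHIS data.

```python
import pandas as pd

# Toy narratives, invented for illustration only.
toy = pd.DataFrame({
    'narrative': [
        'The dog arrived at the Airport without water.',
        'The enclosure floor was dirty.',
        'The AWB number was missing from the crate.',
    ]
})

# A small subset of the keyword list, enough to show the matching behavior.
keywords = ['airport', ' awb']


def has_keyword(text, keywords):
    """Hypothetical helper: True if any keyword occurs in the lowercased text."""
    lowered = text.lower()
    return any(word in lowered for word in keywords)


toy['flag_cond_1'] = toy['narrative'].apply(lambda x: has_keyword(x, keywords))
print(toy['flag_cond_1'].tolist())
```

Because matching is on raw substrings, a keyword such as `' awb'` (with a leading space) will not match a narrative that begins with "AWB" or where "AWB" directly follows a line break.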

## Output
The code will save a copy of the inspections-citations CSV file to the flagged_citations folder as air_transport.csv. The file will have the following new columns:
- 'flag_cond_1': indicator column for flag condition 1 (keyword match)
- 'flag_air_transport': indicator column for the air transport flag

In [2]:
# Read in the most recent APHIS inspections-citations file.
combined_dir = '../aphis-inspection-reports/data/combined/'

citations = pd.read_csv(combined_dir + 'inspections-citations.csv')
citations.shape

(38749, 6)

In [3]:
# Condition 1: the narrative contains at least one air transport keyword
# (case-insensitive substring match).
air_transport_keywords = [
    'airport',
    'airline',
    'waybill',
    'airway bill',
    'air transport',
    'flight',
    'passenger terminal',
    ' awb',
    'aircraft',
    'international terminal',
    'air cargo',
]

citations['flag_cond_1'] = citations['narrative'].apply(lambda x: any(word in x.lower() for word in air_transport_keywords))
citations['flag_cond_1'].value_counts()

flag_cond_1
False    38362
True       387
Name: count, dtype: int64
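
The same condition could also be written with pandas string methods, which handle missing narratives and let us collapse the embedded line breaks that appear in the narrative text. This is a sketch on invented data, not the method used above; the toy rows and the shortened keyword list are assumptions.

```python
import re
import pandas as pd

# Shortened keyword list for illustration.
keywords = ['airport', 'airway bill', 'air transport', ' awb']

# Escape each keyword so it is matched literally, then join into one alternation.
pattern = '|'.join(re.escape(k) for k in keywords)

# Toy narratives, invented for illustration; one is missing, one has a line break.
toy = pd.DataFrame({
    'narrative': [
        'Flight delayed at the AIRPORT.',
        None,
        'Missing airway\nbill.',
        'Clean kennel.',
    ]
})

# Collapse runs of whitespace (including newlines) into single spaces.
normalized = toy['narrative'].str.replace(r'\s+', ' ', regex=True)

# case=False gives a case-insensitive match; na=False treats missing text as no match.
toy['flag_cond_1'] = normalized.str.contains(pattern, case=False, regex=True, na=False)
print(toy['flag_cond_1'].tolist())
```

Without the whitespace normalization, a multi-word keyword such as `'airway bill'` would miss narratives where the two words are split across lines.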

## Creating Air Transport Flag Column

In [4]:
citations['flag_air_transport'] = citations['flag_cond_1']
citations['flag_air_transport'].value_counts()

flag_air_transport
False    38362
True       387
Name: count, dtype: int64
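
The flag currently equals condition 1. If more conditions are added later, one possible way to combine them is a row-wise "any" over the condition columns. The column `flag_cond_2` below is hypothetical and only illustrates the pattern.

```python
import pandas as pd

# Toy flags, invented for illustration. flag_cond_2 is a hypothetical
# second condition that does not exist in this notebook.
toy = pd.DataFrame({
    'flag_cond_1': [True, False, False],
    'flag_cond_2': [False, False, True],
})

# Collect every condition column by its shared prefix.
cond_cols = [c for c in toy.columns if c.startswith('flag_cond_')]

# A row is flagged when at least one condition is True.
toy['flag_air_transport'] = toy[cond_cols].any(axis=1)
print(toy['flag_air_transport'].tolist())
```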

## Spot-Checking Flag

In [5]:
# Spot-check positives
citations.loc[citations['flag_air_transport'], 'narrative'].sample(100).tolist()

['Although the certifications of the last offerings of food and water, with a signature, for 20 dogs on flight CI512, AWB\n29716472912, from PEK to TPE to LAX, were available, the specific instructions for the next feeding and watering for a\n24-hour period were not. This information is required to complete the certifications. A system should be in place to ensure\nfood and water certifications, for dogs in transit, contain the required information.',
 'A puppy had been received by Aloha Air Cargo for shipment from Honolulu to Maui - with a waybill number 687\nHNL 0736 1406. There was a "Live Animal" sticker on top of the carrier, but not any other "Live Animal" stickers on\nthe enclosure. A "Live Animal" sticker should be applied to the top, and one or more sides of the enclosure.\nTo be corrected immediately.',
 "An adult cat was observed in the international baggage claim area at SFO, upon arrival from Mumbai. Its baggage\nticket was # 0220287967. The owner's feeding and wat

In [6]:
# Spot-check negatives
citations.loc[~citations['flag_air_transport'], 'narrative'].sample(100).tolist()

['A responsible adult was not available to accompany APHIS Officials during the inspection process at 5:05 pm on\n02-Oct-19.\nThe inspector arrived at the facility at approximately 5:05pm and knocked on the door to the house, honked the\nhorn, and called the phone number and left a message. The inspector waited 30 minutes and then left the facility.',
 'The food receptacle boxes for the enclosure housing two American black bears and the enclosure housing two\nAsian black bears contained a large amount of rotting food debris. Any food fed in this box is going to be\ncontaminated by the rotting material present in the box. These feeders need to be cleaned and made sanitary to\nensure that the food fed to the bears is wholesome and free from bacterial or pest infestation which could result in\nhealth issues for the bears.',
 'A responsible adult was not available to accompany APHIS Officials during the inspection process at 11:05 am on\n18-DEC-17. The gate was locked and no one answered t

In [7]:
# Save citations with new flag column
citations.to_csv('../flagged_citations/air_transport.csv')
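
One thing to be aware of when saving: by default `to_csv` also writes the DataFrame index, so the output gains an unnamed leading column that the input file did not have. The sketch below shows the difference on an invented toy frame, using in-memory buffers instead of real files; whether to pass `index=False` depends on what downstream code expects.

```python
import io
import pandas as pd

# Toy frame, invented for illustration.
toy = pd.DataFrame({
    'narrative': ['a', 'b'],
    'flag_air_transport': [True, False],
})

# Default behavior: the index is written as an extra, unnamed first column.
buf_default = io.StringIO()
toy.to_csv(buf_default)
default_cols = pd.read_csv(io.StringIO(buf_default.getvalue())).columns.tolist()

# With index=False the saved columns match the DataFrame's columns.
buf_noindex = io.StringIO()
toy.to_csv(buf_noindex, index=False)
noindex_cols = pd.read_csv(io.StringIO(buf_noindex.getvalue())).columns.tolist()

print(default_cols)
print(noindex_cols)
```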