In [1]:
import pandas as pd
import numpy as np

# Flag for Air Transit

## Motivation

This flag captures violations related to transporting animals by air. We noticed a number of violations that trace back to specific air waybills. Transport by air is also subject to further regulation and policies.

## Logic Overview

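We flag a citation when its narrative contains any of several air-transit keywords, such as 'airport' or 'waybill'. Matching is a case-insensitive substring check.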

## Output
Creates `air_transit.csv` in the flagged_citations folder. The file contains only the flagged citations, with the following new columns, all prefixed with "flag_".
Indicator columns:
- 'flag_include_1'
- 'flag_air_transit'

In [2]:
# Read in the most recent APHIS inspection citations.
combined_dir = '../aphis-inspection-reports/data/combined/'

citations = pd.read_csv(combined_dir + 'inspections-citations.csv')
citations.shape

(38749, 6)

## Flag Conditions

In [3]:
air_transit_keywords = [
    'airport',
    'airlines', 
    'waybill', 
    'plane', 
    'air transport'
]

# Case-insensitive substring match: True if the narrative contains any keyword.
citations['flag_include_1'] = citations['narrative'].apply(
    lambda x: any(word in x.lower() for word in air_transit_keywords)
)
citations['flag_include_1'].value_counts()

flag_include_1
False    38425
True       324
Name: count, dtype: int64
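
Because the check above is a plain substring match, 'plane' would also match a word such as 'planet'. The following is a minimal sketch of a stricter alternative, assuming word-boundary matching is what is wanted. The toy narratives and the names `pattern` and `toy` are illustrative and do not come from the dataset.

```python
import re
import pandas as pd

air_transit_keywords = ['airport', 'airlines', 'waybill', 'plane', 'air transport']

# One case-insensitive pattern; \b stops 'plane' from matching inside 'planet'.
pattern = r'\b(?:' + '|'.join(re.escape(k) for k in air_transit_keywords) + r')\b'

# Hypothetical narratives, including a missing value.
toy = pd.DataFrame({'narrative': [
    'The dog arrived at the Airport without water.',
    'Enclosure decorated with planet stickers.',
    'See air waybill number 123.',
    None,
]})

# na=False treats missing narratives as non-matches instead of raising.
toy['flag_include_1'] = toy['narrative'].str.contains(
    pattern, case=False, regex=True, na=False
)
print(toy['flag_include_1'].tolist())
```

Strict boundaries also stop matching 'airplane' and 'airports', so the counts would differ from those shown above. Whether that trade-off is acceptable depends on what the spot-checks turn up.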

## Creating Air Transit Flag Column

In [4]:
citations['flag_air_transit'] = citations['flag_include_1']
citations['flag_air_transit'].value_counts()

flag_air_transit
False    38425
True       324
Name: count, dtype: int64

## Spot-checking Flag

In [5]:
# Spot-check positives
citations.loc[citations['flag_air_transit'], 'narrative'].sample(100).tolist()

['The food and water certificate attached to the primary enclosure housing a miniature pincher puppy was not\ncomplete ( see air waybill number 001TUL69811416). The food and water instructions were not filled out for the\nnext feeding(s) and watering(s) for a 24 hour period. Failure to complete the food and water instructions at the\nairport origin may result in the animal not being feed and watered during transport. The food and water certificate\nmust be complete and attached to the primary enclosure with instructions for the next feeding(s) and watering(s) for\na 24-hour period.',
 'On May 11, 2023, dog “Poplar” (Microchip #933082607296029) arrived at the Dulles International Airport as excess\nbaggage on Emirates flight #EK231 AWB 176-05112023 from Afghanistan (approximately 17 hours). Upon examination\nby CDC officials, it was noted that “Poplar was physically unable to right himself, stand, or walk due to a medical\ncondition. Poplar also had a swollen ulcerated skin wound on the

In [6]:
# Spot-check negatives
citations.loc[~citations['flag_air_transit'], 'narrative'].sample(100).tolist()

['Section 2.126(b) - Access and inspection of records and property: A responsible adult shall be made available to\naccompany APHIS officials during the inspection process.\nA responsible adult was not available to accompany APHIS Officials during the inspection process at 12:15pm on\n04/02/2014.',
 'On 1/3/2014, the Licensee was cited for three adult Rottweiler(cid:25)s that are housed in the upper outdoor housing\nfacility without large enough shelters.\nDuring today(cid:25)s inspection, 12/17/2014, all dogs that were housed in the upper and lower outdoor housing facilities\nhave shelters that are not large enough for the Rottweiler or Boxer dogs that are housed in these enclosures. The\nLicensee is continuing to use plastic barrels that do not allow these dogs to sit, stand, and lie in a normal manner\nand to turn about freely while in these shelters. The Inspector observed one of the larger Rottweiler(cid:25)s in the lower\noutside facility struggling to get out of the barrel. The 
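
The spot-checks draw a different random sample on every run, and `sample(100)` raises an error if a group has fewer than 100 rows. Below is a sketch of a reproducible variant, assuming a fixed seed is acceptable. The helper `spot_check`, its parameters, and the toy data are hypothetical.

```python
import pandas as pd

def spot_check(df, flag_col, value, n=100, seed=0):
    """Return up to n narratives where df[flag_col] == value, sampled reproducibly."""
    subset = df.loc[df[flag_col] == value, 'narrative']
    # Cap n at the subset size so small groups do not raise ValueError.
    return subset.sample(min(n, len(subset)), random_state=seed).tolist()

# Hypothetical stand-in for the citations DataFrame.
toy = pd.DataFrame({
    'narrative': ['air waybill missing', 'shelter too small', 'dog arrived at airport'],
    'flag_air_transit': [True, False, True],
})

positives = spot_check(toy, 'flag_air_transit', True)
negatives = spot_check(toy, 'flag_air_transit', False)
print(len(positives), len(negatives))
```

Reusing the same `seed` returns the same narratives, which makes it easier to discuss specific examples with collaborators.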

In [7]:
# Keep only the flagged citations
flagged_citations = citations[citations['flag_air_transit']]
flagged_citations.shape

(324, 8)

In [8]:
# Save citations with new flag column
flagged_citations.to_csv('../flagged_citations/air_transit.csv')
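
By default, `to_csv` writes the DataFrame index as an unnamed first column. If that column is not wanted downstream, passing `index=False` is one option. The sketch below uses a temporary directory and a toy DataFrame as stand-ins for the real output folder and data.

```python
import os
import tempfile
import pandas as pd

# Hypothetical flagged row with a non-default index, as boolean filtering leaves behind.
toy = pd.DataFrame(
    {'narrative': ['air waybill missing'], 'flag_air_transit': [True]},
    index=[17],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'air_transit.csv')
    toy.to_csv(path, index=False)  # omit the index column
    reloaded = pd.read_csv(path)

print(reloaded.columns.tolist())
```

Keeping the index, as the cell above does, can still be useful for tracing flagged rows back to their positions in `inspections-citations.csv`.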