# Project: Long Beach Animal Shelter Intakes and Outcomes

## Description

**Objective**: \
\
Answer questions for various stakeholders in the city of Long Beach, CA, concerning intakes and outcomes at the local animal shelter.

**Dataset**: \
\
This dataset was pulled from the [Long Beach Open Data Portal](https://data.longbeach.gov/explore/dataset/animal-shelter-intakes-and-outcomes/). \
It is a 7.8MB CSV file containing intake and outcome data for animals captured by or surrendered to the city.

**Tools Used**:

- pandas
- Matplotlib
- Seaborn

# Introduction

For any city that has at least one animal shelter, various stakeholders are interested in how that shelter is run and what happens to the animals that pass through its doors.\
\
This analysis looks to answer questions for the following parties in Long Beach, CA:
- **Shelter managers:**
  - How long do animals typically stay in the shelter by species or intake condition?
  - What intake reasons are most strongly correlated with negative outcomes (e.g., euthanasia)?
  - Are there seasonal trends in animal intakes or outcomes?
- **Animal welfare advocates:**
  - What percentage of animals are adopted vs. euthanized, and how does that vary by type, sex, or condition?
  - Are there disparities in outcomes for specific breeds or geographic areas?
  - How many animals are returned to owners vs. adopted?
- **Local government officials:**
  - Is there a correlation between specific neighborhoods and high intake rates?
  - Has the shelter’s performance improved over time (e.g., reduced euthanasia rates)?
  - What’s the annual intake/outcome volume and trend?
- **Local citizenry:**
  - When is the best time of year to adopt (e.g., more animals available)?
  - What types of animals are most commonly available for adoption?
  - Can geographic patterns inform community outreach for fostering or adoption?
- **Internal analysts:**
  - What features best predict positive outcomes using logistic regression or clustering?
  - Can intake condition be used to forecast outcome types?
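
Several of these questions reduce to grouped aggregations. The sketch below illustrates two of them (length of stay by species, and outcome shares by species) on a small invented frame. The rows are made up for illustration and are not drawn from the dataset; the column names follow the cleaned names used later in this notebook, and `stay_by_type` and `outcome_share` are hypothetical names.

```python
import pandas as pd

# Toy rows for illustration only; values are invented, not taken from the dataset
toy = pd.DataFrame({
    'animal_type': ['CAT', 'CAT', 'DOG', 'DOG', 'DOG'],
    'outcome_type': ['ADOPTION', 'EUTHANASIA', 'ADOPTION', 'RETURN TO OWNER', 'ADOPTION'],
    'intake_duration': [10.0, 2.0, 30.0, 1.0, 5.0],  # days in shelter
})

# Median length of stay per species
stay_by_type = toy.groupby('animal_type')['intake_duration'].median()

# Share of each outcome within each species (each row sums to 1)
outcome_share = pd.crosstab(toy['animal_type'], toy['outcome_type'], normalize='index')

print(stay_by_type)
print(outcome_share.round(2))
```

The median is used rather than the mean because stay lengths in the real data are heavily skewed (the summary below shows a median of 5 days against a maximum of 1410).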

# Data handling

## Preview

In [124]:
# Import packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re

# Import helper functions and variables
from utilities.config import get_path_obj, raw_data_path, processed_data_path, products_dir, images_dir, data_dir

In [125]:
# Load data
df = pd.read_csv(raw_data_path, parse_dates=['DOB', 'Intake Date', 'Outcome Date'],)

# Preview
df.head()

Unnamed: 0,Animal ID,Animal Name,Animal Type,Primary Color,Secondary Color,Sex,DOB,Intake Date,Intake Condition,Intake Type,...,Outcome Type,Outcome Subtype,latitude,longitude,intake_is_dead,outcome_is_dead,was_outcome_alive,geopoint,intake_duration,is_current_month
0,A594350,*HEAVY CREAM,CAT,BLACK,,Neutered,2014-07-28,2017-07-28,NORMAL,STRAY,...,ADOPTION,REPEAT ADT,33.79976,-118.126388,Alive on Intake,False,1,"33.7997598, -118.1263884",81.0,0
1,A347815,DUKE,DOG,BLACK,TAN,Neutered,2005-04-14,2018-11-30,NORMAL,OWNER SURRENDER,...,RESCUE,LIVELOVE,33.79976,-118.126388,Alive on Intake,False,1,"33.7997598, -118.1263884",27.0,0
2,A707449,*TABITHA,DOG,BLACK,WHITE,Spayed,2022-10-23,2023-09-23,NORMAL,STRAY,...,ADOPTION,,33.798953,-118.167334,Alive on Intake,False,1,"33.7989532, -118.167334",18.0,0
3,A712850,*KIWI,DOG,BLONDE,GOLD,Spayed,2022-07-06,2024-02-03,NORMAL,RETURN,...,ADOPTION,WEB,33.798936,-118.195889,Alive on Intake,False,1,"33.7989357, -118.1958891",0.0,0
4,A738972,KITTEN 2,CAT,BLACK,,Unknown,2025-03-28,2025-04-04,NORMAL,STRAY,...,RESCUE,LITTLELION,33.798936,-118.195889,Alive on Intake,False,1,"33.7989357, -118.1958891",0.0,0


### Structure

In [126]:
# Structure and summary
display(df.dtypes)
display(df.columns)
display(df.describe(include='all'))

Animal ID                    object
Animal Name                  object
Animal Type                  object
Primary Color                object
Secondary Color              object
Sex                          object
DOB                  datetime64[ns]
Intake Date          datetime64[ns]
Intake Condition             object
Intake Type                  object
Intake Subtype               object
Reason for Intake            object
Outcome Date         datetime64[ns]
Crossing                     object
Jurisdiction                 object
Outcome Type                 object
Outcome Subtype              object
latitude                    float64
longitude                   float64
intake_is_dead               object
outcome_is_dead                bool
was_outcome_alive             int64
geopoint                     object
intake_duration             float64
is_current_month              int64
dtype: object

Index(['Animal ID', 'Animal Name', 'Animal Type', 'Primary Color',
       'Secondary Color', 'Sex', 'DOB', 'Intake Date', 'Intake Condition',
       'Intake Type', 'Intake Subtype', 'Reason for Intake', 'Outcome Date',
       'Crossing', 'Jurisdiction', 'Outcome Type', 'Outcome Subtype',
       'latitude', 'longitude', 'intake_is_dead', 'outcome_is_dead',
       'was_outcome_alive', 'geopoint', 'intake_duration', 'is_current_month'],
      dtype='object')

Unnamed: 0,Animal ID,Animal Name,Animal Type,Primary Color,Secondary Color,Sex,DOB,Intake Date,Intake Condition,Intake Type,...,Outcome Type,Outcome Subtype,latitude,longitude,intake_is_dead,outcome_is_dead,was_outcome_alive,geopoint,intake_duration,is_current_month
count,33707,19956,33707,33707,15964,33707,29433,33707,33707,33707,...,33374,29842,33707.0,33707.0,33707,33707,33707.0,33707,33381.0,33707.0
unique,32557,9996,10,80,44,5,,,16,12,...,18,240,,,1,2,,10154,,
top,A637086,*,CAT,BLACK,WHITE,Male,,,NORMAL,STRAY,...,RESCUE,SPCALA,,,Alive on Intake,False,,"33.8096122, -118.0826161",,
freq,8,104,16083,8548,9380,7739,,,15297,23719,...,7842,4074,,,33707,26766,,570,,
mean,,,,,,,2018-11-03 22:44:42.295383040,2021-02-04 00:22:07.771679488,,,...,,,33.815444,-118.149526,,,0.794078,,18.741949,0.012075
min,,,,,,,1993-09-15 00:00:00,2017-01-01 00:00:00,,,...,,,19.297815,-122.695911,,,0.0,,0.0,0.0
25%,,,,,,,2016-09-16 00:00:00,2018-09-29 00:00:00,,,...,,,33.78399,-118.190865,,,1.0,,0.0,0.0
50%,,,,,,,2019-03-28 00:00:00,2021-01-02 00:00:00,,,...,,,33.806783,-118.173175,,,1.0,,5.0,0.0
75%,,,,,,,2022-04-06 00:00:00,2023-05-26 00:00:00,,,...,,,33.85121,-118.128915,,,1.0,,16.0,0.0
max,,,,,,,2025-07-06 00:00:00,2025-07-15 00:00:00,,,...,,,45.521885,-73.99236,,,1.0,,1410.0,1.0


In [127]:
# Rename columns

def rename(name: str):
    """Formats "name" by replacing spaces with underscores and changing the case to lower

    Args:
        name (str): the name to be formatted

    Returns:
        str: the formatted name
    """    
    name = name.replace(' ', '_')
    name = name.lower()
    if name == 'dob':
        name = 'date_of_birth'
    return name

df = df.rename(columns=rename)
df.columns

Index(['animal_id', 'animal_name', 'animal_type', 'primary_color',
       'secondary_color', 'sex', 'date_of_birth', 'intake_date',
       'intake_condition', 'intake_type', 'intake_subtype',
       'reason_for_intake', 'outcome_date', 'crossing', 'jurisdiction',
       'outcome_type', 'outcome_subtype', 'latitude', 'longitude',
       'intake_is_dead', 'outcome_is_dead', 'was_outcome_alive', 'geopoint',
       'intake_duration', 'is_current_month'],
      dtype='object')

### Variables (columns)

In [128]:
# Organize variables by attributes: animal, intake, outcome, datetime

def check_type_date(name: str):
    """Checks if a column is a datetime or timedelta type by searching the name for keywords

    Args:
        name (str): The string to be checked

    Returns:
        Match|None: A Match object if a match is found
    """      
    return re.search(r'date|month|duration', name)

animal_vars = [
    'animal_type',
    'primary_color',
    'secondary_color',
    'sex',
]
intake_vars = [x for x in df.columns if 'intake' in x and not check_type_date(x)]
outcome_vars = [x for x in df.columns if 'outcome' in x and not check_type_date(x)]
datetime_vars = [x for x in df.columns if check_type_date(x)]
geography_vars = [
    'latitude',
    'longitude',
    'geopoint',
    'crossing',
    'jurisdiction'
]
print(animal_vars, intake_vars, outcome_vars, datetime_vars, geography_vars, sep='\n')
vars_dict = {
    'animal': animal_vars,
    'intake': intake_vars,
    'outcome': outcome_vars,
    'datetime': datetime_vars,
    'geography': geography_vars
}

['animal_type', 'primary_color', 'secondary_color', 'sex']
['intake_condition', 'intake_type', 'intake_subtype', 'reason_for_intake', 'intake_is_dead']
['outcome_type', 'outcome_subtype', 'outcome_is_dead', 'was_outcome_alive']
['date_of_birth', 'intake_date', 'outcome_date', 'intake_duration', 'is_current_month']
['latitude', 'longitude', 'geopoint', 'crossing', 'jurisdiction']


#### Inspection

In [129]:
# Get counts for each variable and print to a CSV for visual inspection
for var_group in vars_dict.values():  # avoid shadowing the built-in vars()
    for var in var_group:
        df[var].value_counts().to_csv(get_path_obj(data_dir, 'variable counts', f'{var}.csv'))

In [130]:
# Export animal names to CSV for visual inspection
df.loc[~df.animal_name.isna()][['animal_id', 'animal_name']].to_csv(get_path_obj(data_dir, 'raw_animal_names.csv'), index=False)

In [131]:
# Export animal names with numbers in them for visual inspection
df.loc[df.animal_name.str.contains(r'.*\d.*', regex=True, na=False), ['animal_id', 'animal_name']].to_csv(get_path_obj(data_dir, 'raw_number_names.csv'), index=False)

In [132]:
# Export crossings which are missing zip codes to CSV for inspection
df_crossing = df.loc[~df.crossing.str.contains(r'\b\d{5} *$', regex=True)]
df_crossing.loc[~df_crossing.crossing.str.contains(r'TRANSFER', regex=True), ['animal_id', 'crossing']].to_csv(get_path_obj(data_dir, 'raw_crossing_wo_zip.csv'), index=False)

## Cleaning and preparation

#### Fix/remove data

In [133]:
# Remove superfluous columns
df_clean = df.drop(['was_outcome_alive'], axis=1)
df_clean.columns

Index(['animal_id', 'animal_name', 'animal_type', 'primary_color',
       'secondary_color', 'sex', 'date_of_birth', 'intake_date',
       'intake_condition', 'intake_type', 'intake_subtype',
       'reason_for_intake', 'outcome_date', 'crossing', 'jurisdiction',
       'outcome_type', 'outcome_subtype', 'latitude', 'longitude',
       'intake_is_dead', 'outcome_is_dead', 'geopoint', 'intake_duration',
       'is_current_month'],
      dtype='object')

In [134]:
# Compare latitude and longitude against geopoint

def check_lat_long():
    """Checks the "latitude" and "longitude" columns against the "geopoint" column to see if they match.
    Raises an AssertionError if they do not.
    """    
    df_clean['lat_from_geopoint'] = df.geopoint.str.split(', ').str[0].astype('float64')
    df_clean['long_from_geopoint'] = df.geopoint.str.split(', ').str[1].astype('float64')

    errors = []
    try: 
        assert df_clean.latitude.equals(df_clean.lat_from_geopoint), 'Latitude columns do not match'
    except AssertionError as err:
        errors.append(err)
    try: 
        assert df_clean.longitude.equals(df_clean.long_from_geopoint), 'Longitude columns do not match'
    except AssertionError as err:
        errors.append(err)
    if errors:
        return errors
    else:
        return ['Columns match']

check = check_lat_long()
print(*check, sep='\n')

Longitude columns do not match
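
One possible cause of a mismatch like this is rounding rather than genuinely different coordinates, since `Series.equals` requires exact equality. The sketch below compares with a tolerance instead, using two rows copied from the preview above. The tolerance of 1e-6 degrees is an assumption chosen for illustration, and whether rounding explains the mismatch in the full data has not been verified here.

```python
import numpy as np
import pandas as pd

# Two rows copied from the preview above
sample = pd.DataFrame({
    'longitude': [-118.126388, -118.167334],
    'geopoint': ['33.7997598, -118.1263884', '33.7989532, -118.167334'],
})

# Parse the longitude out of the "lat, long" string
long_from_geopoint = sample['geopoint'].str.split(', ').str[1].astype('float64')

# Exact comparison fails on the first row because of the extra decimal place
exact_match = sample['longitude'].equals(long_from_geopoint)

# Tolerance of 1e-6 degrees is an assumption chosen for illustration
close_match = np.isclose(sample['longitude'], long_from_geopoint, rtol=0, atol=1e-6).all()

print(exact_match, close_match)
```

If the tolerance-based check passes on the full data, the two sources agree up to rounding and either can be kept.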


In [135]:
# Update latitude, longitude, from geopoint
if check[0] != 'Columns match':
    df_clean.latitude = df_clean.lat_from_geopoint
    df_clean.longitude = df_clean.long_from_geopoint
check = check_lat_long()
print(*check, sep='\n')
if check[0] == 'Columns match':
    df_clean = df_clean.drop(['lat_from_geopoint', 'long_from_geopoint', 'geopoint'], axis=1)

Columns match


In [136]:
# Set specific columns' types to boolean
df_clean.intake_is_dead = df.intake_is_dead != 'Alive on Intake'
df_clean.is_current_month = df_clean.is_current_month.astype(bool)


In [137]:
# Check intake and outcome dates against intake_duration
df_clean['duration'] = df.outcome_date - df.intake_date
intake_duration_series = pd.to_timedelta(df.intake_duration, unit='day')

assert df_clean.duration.equals(intake_duration_series), '"Intake duration", "Intake date", "Outcome date" are inconsistent'

df_clean.intake_duration = pd.to_timedelta(df.intake_duration, unit='day')
df_clean.drop('duration', axis=1, inplace=True)

In [138]:
# Find animals whose names indicate that they were deceased on or shortly after arrival

filter_doa = df_clean.animal_name.str.contains(r'\bDOA|DEAD\b', regex=True, na=False)
not_dead = [  # These names are ambiguous, since they are indicated as "alive on intake" and not "dead on outcome"
    '*DEAD*BABADOOK',
    '*DEAD*HAZEL',
    '*DEAD CHOMPS',
    '*DEAD*MAUI',
    '*DEAD-TEDDY',
]
df_doa = df_clean.loc[(filter_doa) & ~(df_clean.animal_name.isin(not_dead))].copy()  # copy so the name edit below does not act on a slice of df_clean
print('"Dead on or shortly after arrival" (as indicated by name):', df_doa.shape[0])

# Export details about these animals to a CSV file
df_doa[['animal_id', 'outcome_type', 'outcome_subtype', 'intake_date', 'outcome_date']].to_csv(get_path_obj(data_dir, 'processed_dead_by_name.csv'), index=False)

# Remove the "dead" indicator from their name
df_doa.loc[:,'animal_name'] = df_doa.animal_name.str.replace(r'^\**(?:DEAD|DOA)\** *', '', regex=True)
df_clean.loc[df_clean.animal_id.isin(df_doa.animal_id), 'animal_name'] = df_doa.animal_name

"Dead on or shortly after arrival" (as indicated by name): 14


In [139]:
# Fix "alive on intake" and "outcome is dead" values for animals who were dead on arrival
ids = [
    'A717145',
    'A717146'
]
df_clean.loc[df_clean.animal_id.isin(ids), 'intake_is_dead'] = True
df_clean.loc[df_clean.animal_id.isin(ids), 'outcome_is_dead'] = True
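
A note on the name filter `\bDOA|DEAD\b` used above to find these animals: alternation binds more loosely than `\b`, so the pattern reads as "`\bDOA` or `DEAD\b`" and not as a whole-word match for either token. The sketch below shows the difference from a grouped variant on invented names. Whether the grouped form is what was intended, and whether it would change the count of 14, is an assumption that would need checking against the data.

```python
import re

ungrouped = re.compile(r'\bDOA|DEAD\b')
grouped = re.compile(r'\b(?:DOA|DEAD)\b')

# Invented names for illustration
names = ['*DOA', 'DEAD CAT', 'DOABLE', 'UNDEAD']

ungrouped_hits = [n for n in names if ungrouped.search(n)]
grouped_hits = [n for n in names if grouped.search(n)]

print(ungrouped_hits)  # matches all four names
print(grouped_hits)    # matches only '*DOA' and 'DEAD CAT'
```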

In [140]:
# Remove blank spaces from the beginning and end of names
# Keep a single leading "*" if one is present, but do not add one to names that lack it
df_clean.animal_name = df_clean.animal_name.str.replace(r'^(\*?)\** +', r'\1', regex=True)
df_clean.animal_name = df_clean.animal_name.str.replace(r' +$', '', regex=True)

assert df_clean.animal_name.str.contains(r'^\** +', regex=True, na=False).sum() == 0, 'Names starting with blanks found'
assert df_clean.animal_name.str.contains(r' +$', regex=True, na=False).sum() == 0, 'Names ending with blanks found'

In [141]:
# Remove animal ID from animal name, if necessary

def check_id_in_name(row):
    string = row.animal_name
    if not pd.isna(string):
        id = re.search(r'\d+', row.animal_id)
        if id:
            pattern = r'\w?' + id.group()
            return bool(re.search(pattern, string))
    return False

def remove_id_from_name(row):
    string = row.animal_name
    if not pd.isna(string):
        id_ = row.animal_id
        num_id = re.search(r"\d+", id_)
        if num_id:
            pattern = fr'\w?{num_id.group()}'
            return re.sub(pattern, '', string)
    return string

name_filter = df_clean.apply(check_id_in_name, axis=1)
df_clean.loc[name_filter, 'animal_name'] = df_clean.loc[name_filter].apply(remove_id_from_name, axis=1)
assert df_clean.apply(check_id_in_name, axis=1).sum() == 0, 'Names with IDs found'

In [142]:
# Filter legitimate names (e.g. "50 CENT") out of the names with numbers and nullify names with numbers

legit_names = [
    '50 CENT',
    'MARSHAL MATHERS 4TH',
    '7-UP'
]

def remove_names_with_numbers(row):
    string = row.animal_name
    pattern = r'\d+'
    if re.search(pattern, string) and string.strip('*') not in legit_names:
        return np.nan
    return string

number_names = df_clean.animal_name.str.contains(r'\d', regex=True, na=False)
df_clean.loc[number_names, 'animal_name'] = df_clean.loc[number_names].apply(remove_names_with_numbers, axis=1)

number_names = df_clean.animal_name.str.contains(r'\d', regex=True, na=False)
legit_number_names = df_clean.animal_name.str.strip('*').isin(legit_names)
assert df_clean.loc[(number_names) & ~(legit_number_names)].shape[0] == 0, 'Illegitimate number names found'

In [143]:
# Remove uncaught blank names
df_clean.loc[df_clean.animal_name.str.contains(r'^\* *$', regex=True, na=False), 'animal_name'] = np.nan

assert df_clean.animal_name.str.contains(r'^\* *$', regex=True, na=False).sum() == 0, 'Uncaught blank names found'

In [144]:
# Remove periods from "crossing" values

# Remove periods from end of string
df_clean.crossing = df_clean.crossing.str.replace(r'\.$', '', regex=True)

# Replace periods between word characters with a space
df_clean.crossing = df_clean.crossing.str.replace(r'(\w)(\.)(\w)', r'\1 \3', regex=True)

# Remove all other periods (literal match, not a regex wildcard)
df_clean.crossing = df_clean.crossing.str.replace('.', '', regex=False)

assert df_clean.crossing.str.contains('.', regex=False).sum() == 0, 'Crossings with periods found'

In [145]:
# Replace "&" and "AND" with "/"
df_clean.crossing = df_clean.crossing.str.replace('&', '/')
df_clean.crossing = df_clean.crossing.str.replace(r'(\s)(AND)(\s)', r'\1/\3', regex=True)

condition_1 = df_clean.crossing.str.contains('&').sum() == 0
condition_2 = df_clean.crossing.str.contains(r'\sAND\s', regex=True).sum() == 0
assert condition_1 and condition_2, 'Crossings with "&" or "AND" found'

In [146]:
# Fix "crossing" values so that "/" is always enclosed by spaces

def refactor_slash(string: str):
    array = string.split('/')
    array = [s.strip() for s in array]
    return ' / '.join(array)

df_clean.crossing = df_clean.crossing.apply(refactor_slash)
assert df_clean.crossing.str.contains(r'\w/\w', regex=True).sum() == 0, 'Crossings with sandwiched "/" found'

In [147]:
# Replace "STREET", "ROAD", "AVENUE" with their appropriate abbreviations
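
This step has no code yet. One way it could be done is a whole-word replacement driven by a mapping, sketched below on plain strings. The mapping, the abbreviations chosen (`ST`, `RD`, `AVE`), and the names `SUFFIXES`, `SUFFIX_PATTERN`, and `abbreviate_suffixes` are assumptions for illustration, not the notebook's implementation.

```python
import re

# Assumed mapping; extend as needed
SUFFIXES = {'STREET': 'ST', 'ROAD': 'RD', 'AVENUE': 'AVE'}

# One pattern matching any key as a whole word
SUFFIX_PATTERN = re.compile(r'\b(' + '|'.join(SUFFIXES) + r')\b')

def abbreviate_suffixes(crossing: str) -> str:
    """Replace whole-word street suffixes with their abbreviations."""
    return SUFFIX_PATTERN.sub(lambda m: SUFFIXES[m.group(1)], crossing)

print(abbreviate_suffixes('300 BLOCK ELM AVENUE / 4TH STREET, LONG BEACH, CA 90802'))
print(abbreviate_suffixes('STREETER ROAD'))
```

The word boundaries keep names such as `STREETER` intact. Applied to the frame, this could look like `df_clean.crossing.apply(abbreviate_suffixes)`, in the same way `refactor_slash` is applied above.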

In [148]:
# Fix all references to "Pacific Coast Highway"

# Replace "PCH HWY" with "PACIFIC COAST HWY"
df_clean.crossing = df_clean.crossing.str.replace('PCH HWY', 'PACIFIC COAST HWY')

# Replace "PACIFIC COAST HIGHWAY" with "PACIFIC COAST HWY"
df_clean.crossing = df_clean.crossing.str.replace('PACIFIC COAST HIGHWAY', 'PACIFIC COAST HWY')

In [149]:
# Change "LBB" to "LONG BEACH BLVD" in "crossing"
df_clean.crossing = df_clean.crossing.str.replace('LBB', 'LONG BEACH BLVD')

assert df_clean.crossing.str.contains('LBB').sum() == 0, 'Crossings with "LBB" found'

In [150]:
# Fix references to "Long Beach"

# Change "LB" to "LONG BEACH" for "crossing" values
df_clean.crossing = df_clean.crossing.str.replace(r'\bLB\b', 'LONG BEACH', regex=True)

# Change "LONGBEACH" to "LONG BEACH"
df_clean.crossing = df_clean.crossing.str.replace('LONGBEACH', 'LONG BEACH', regex=True)

In [151]:
# Replace "BLK" with "BLOCK" in "crossing"
df_clean.crossing = df_clean.crossing.str.replace(r'BLK\b', 'BLOCK', regex=True)
df_clean.crossing = df_clean.crossing.str.replace(r'BLK[KL]\b', 'BLOCK', regex=True)
df_clean.crossing = df_clean.crossing.str.replace(r'BLKW\b', 'BLOCK W', regex=True)
df_clean.crossing = df_clean.crossing.str.replace(r'BLKE\b', 'BLOCK E', regex=True)
df_clean.loc[2023, 'crossing'] = '3700 BLOCK ELM AVE, LONG BEACH, CA 90807'
df_clean.loc[26272, 'crossing'] = '20100 BLOCK BOUMA CT, CERRITOS, CA 90703'
df_clean.loc[29902, 'crossing'] = '300 BLOCK MEDITERRANEAN WAY LONG BEACH, CA 90802'
df_clean.loc[32825, 'crossing'] = '900 BLOCK MAINE AVE, LONG BEACH, CA 90813'

assert df_clean.crossing.str.contains('BLK').sum() == 0, 'Crossings with the word "BLK" found'

In [152]:
# Fix "crossing" values where "BLOCK" occurs more than once
df_clean.loc[910, 'crossing'] = '300 BLOCK E JANICE ST, LONG BEACH, CA 90805'
df_clean.loc[1003, 'crossing'] = '300 BLOCK ROYCROFT AVE, LONG BEACH, CA 90814'  # Fixed zip code
df_clean.crossing = df_clean.crossing.str.replace('BLOCK LINDEN BLOCK', 'LINDEN BLOCK')

assert df_clean.crossing.str.contains(r'(?:BLOCK.*){2,}').sum() == 0, 'Crossings with multiple occurrences of "BLOCK" found'

In [153]:
# Fix "crossing" values where "BLOCK" is sandwiched against another non-whitespace character
df_clean.crossing = df_clean.crossing.str.replace(r'(\w)(BLOCK)', r'\1 \2', regex=True)
df_clean.crossing = df_clean.crossing.str.replace(r'(BLOCK)(\w)', r'\1 \2', regex=True)

condition_1 = df_clean.crossing.str.contains(r'\wBLOCK').sum() == 0
condition_2 = df_clean.crossing.str.contains(r'BLOCK\w').sum() == 0

assert condition_1 and condition_2, 'Crossings found where "BLOCK" is sandwiched against a leading or trailing non-whitespace character'
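
For reference, Python's `re` module (which pandas' `str.replace` uses when `regex=True`) writes backreferences in the replacement string as `\1`, `\2`. A forward slash, as in `/1`, is inserted literally and the captured text is lost. A small sketch on an invented value:

```python
import re

value = '300BLOCK ELM AVE'  # invented example

# Backslash backreferences re-insert the captured groups
with_backslash = re.sub(r'(\w)(BLOCK)', r'\1 \2', value)

# Forward slashes are literal text, so the captured characters are dropped
with_slash = re.sub(r'(\w)(BLOCK)', r'/1 /2', value)

print(with_backslash)
print(with_slash)
```

Note that an assertion checking only for the absence of `\wBLOCK` passes in both cases, so it does not distinguish a correct split from a destroyed value.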

In [154]:
# Fix "crossing" values that have "BKL"
df_clean.crossing = df_clean.crossing.str.replace('BKL', 'BLOCK')

assert df_clean.crossing.str.contains('BKL').sum() == 0, 'Crossings with "BKL" found'

In [155]:
# Replace two or more consecutive spaces with a single space in "crossing"
df_clean.crossing = df_clean.crossing.str.replace(r'\s+', ' ', regex=True)

assert df_clean.crossing.str.contains(r'\s\s+', regex=True).sum() == 0, 'Crossings with multiple whitespaces found'

#### Add features

In [156]:
# Add an "age at intake" column
df_clean['age_at_intake'] = df.intake_date - df.date_of_birth

In [157]:
# Add an "age at outcome" column
df_clean['age_at_outcome'] = df.outcome_date - df.date_of_birth

In [158]:
# Add a "has_name" column
df_clean['has_name'] = ~df_clean.animal_name.isna()  # use cleaned names so nullified placeholders count as unnamed

In [159]:
# Add a "named_by_shelter" column
df_clean['named_by_shelter'] = df.animal_name.str.contains(r'\*', na=False)  # na=False: a missing name is not a shelter-given name
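
One thing to keep in mind with `str.contains` on an object-dtype column that has missing values, as `animal_name` does here: it returns `NaN` for those rows unless `na` is given, and `NaN` converts to `True` under `astype(bool)`. A sketch on invented names:

```python
import numpy as np
import pandas as pd

# Object dtype mirrors what read_csv produced for animal_name; last animal has no name
names = pd.Series(['*KIWI', 'DUKE', np.nan], dtype=object)

# The missing name becomes True after the cast
without_na = names.str.contains(r'\*').astype(bool)

# na=False maps missing names to False, so no cast is needed
with_na = names.str.contains(r'\*', na=False)

print(without_na.tolist())
print(with_na.tolist())
```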

In [160]:
# Remove "named by shelter indicator" from animal name
df_clean.animal_name = df_clean.animal_name.str.replace(r'^\*', '', regex=True)

In [161]:
# Export processed animal names to CSV
df_clean.loc[:, ['animal_id', 'animal_name']].to_csv(get_path_obj(data_dir, 'processed_animal_names.csv'), index=False)

In [172]:
# Export unique processed crossings to CSV
df_clean['crossing'].value_counts().to_csv(get_path_obj(data_dir, 'processed_crossings_unique.csv'))

In [163]:
# Check data types
df_clean.dtypes

animal_id                     object
animal_name                   object
animal_type                   object
primary_color                 object
secondary_color               object
sex                           object
date_of_birth         datetime64[ns]
intake_date           datetime64[ns]
intake_condition              object
intake_type                   object
intake_subtype                object
reason_for_intake             object
outcome_date          datetime64[ns]
crossing                      object
jurisdiction                  object
outcome_type                  object
outcome_subtype               object
latitude                     float64
longitude                    float64
intake_is_dead                  bool
outcome_is_dead                 bool
intake_duration      timedelta64[ns]
is_current_month                bool
age_at_intake        timedelta64[ns]
age_at_outcome       timedelta64[ns]
has_name                        bool
named_by_shelter                bool
dtype: object

In [164]:
# Export processed data to CSV
df_clean.to_csv(get_path_obj(processed_data_path), index=False)

## Exploratory Data Analysis (EDA)

## Deeper analysis and modeling

# Analysis

## Insights and recommendations

### Insights

### Recommendations

## Summary

*This report can also be found [here](../products/report.md).*

## Appendix