![pandas](https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Pandas_logo.svg/2880px-Pandas_logo.svg.png)

# Objectives

- Load .csv files into `pandas` DataFrames
- Describe and manipulate data in Series and DataFrames

# What is Pandas?

![I have no idea what I'm doing panda](https://cdn-images-1.medium.com/max/1600/1*oBx032ncOwLmCFX3Epo3Zg.jpeg)

Just kidding - not actual literal pandas.

Pandas, as [the Anaconda docs](https://docs.anaconda.com/anaconda/packages/py3.7_osx-64/) tell us, offers us "High-performance, easy-to-use data structures and data analysis tools." It's something like "Excel for Python", but it's quite a bit more powerful. The name comes from "panel data", a term used in certain academic circles (namely, statistics and econometrics) for the kind of multidimensional data we'll be working with. [[Source]](https://www.dlr.de/sc/Portaldata/15/Resources/dokumente/pyhpc2011/submissions/pyhpc2011_submission_9.pdf)

In order to use pandas, we'll need to import it into our notebook first.

In [1]:
# Import - using the common alias
import pandas as pd


## Accessing Data

![pandas documentation image showcasing the kinds of data it can both read and write to](https://pandas.pydata.org/docs/_images/02_io_readwrite.svg)

[[Image Source]](https://pandas.pydata.org/docs/getting_started/intro_tutorials/02_read_write.html)

Pandas can read a ton of different file formats, including some that should be familiar: CSVs and JSONs! That's right, no more `with` / `open` statements now that we're using pandas!

Most of the time, we'll see CSVs - so let's access a 'toy' data set quickly just to familiarize ourselves with using pandas. There's a heart dataset available in the data folder on this repository - let's read that in.

In [2]:
# Use read_csv to read in the heart csv file
# Need to assign it to a variable too - let's call this heart_df
heart_df = pd.read_csv('data/heart.csv')

Find out more about this dataset [here](https://archive.ics.uci.edu/ml/datasets/Statlog+%28Heart%29).

The output of the `pd.read_csv()` function is a pandas *DataFrame*, which has a familiar tabular structure of rows and columns.
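If you don't have the data folder handy, here is a minimal sketch of the same idea. It assumes a tiny made-up CSV held in a string (the name `csv_text` and its three columns are just for illustration) and wraps it in `io.StringIO`, since `read_csv` accepts file-like objects as well as paths and URLs.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file on disk (illustrative values only)
csv_text = """age,sex,chol
63,1,233
37,1,250
41,0,204
"""

# read_csv accepts a path, a URL, or any file-like object
toy_df = pd.read_csv(io.StringIO(csv_text))

print(type(toy_df))  # the result is a DataFrame
print(toy_df.shape)  # (number of rows, number of columns)
```

The header line of the CSV becomes the column names, and each following line becomes one row.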

In [3]:
# Let's check this variable out
heart_df

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
298,57,0,0,140,241,0,1,123,1,0.2,1,0,3,0
299,45,1,3,110,264,0,1,132,0,1.2,1,0,3,0
300,68,1,0,144,193,1,1,141,0,3.4,1,2,3,0
301,57,1,0,130,131,0,1,115,1,1.2,1,1,3,0


In [4]:
# What type is this variable?
type(heart_df)

pandas.core.frame.DataFrame

## DataFrames and Series

Two main types of pandas objects are the DataFrame and the Series, the latter being in effect a single column of the former:

In [9]:
# Let's grab just one column
age_series = heart_df['age']

In [6]:
age_series

0      63
1      37
2      41
3      56
4      57
       ..
298    57
299    45
300    68
301    57
302    57
Name: age, Length: 303, dtype: int64

Notice how we can isolate a column of our DataFrame simply by using square brackets together with the name of the column. We can also access columns as an attribute of the DataFrame - but that only works if the name of the column doesn't have any spaces or weird characters!

In [11]:
# Attribute access - same result as heart_df['age']
heart_df.age

0      63
1      37
2      41
3      56
4      57
       ..
298    57
299    45
300    68
301    57
302    57
Name: age, Length: 303, dtype: int64

In [12]:
# What type is the column?
type(age_series)

pandas.core.series.Series
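To make the brackets-versus-attribute point concrete, here is a small sketch on a made-up DataFrame (the `resting bp` column name is hypothetical, chosen only because it contains a space):

```python
import pandas as pd

# Hypothetical toy frame; "resting bp" deliberately contains a space
toy_df = pd.DataFrame({"age": [63, 37, 41], "resting bp": [145, 130, 130]})

by_brackets = toy_df["age"]    # square brackets always work
by_attribute = toy_df.age      # works because "age" is a valid Python name
spaced = toy_df["resting bp"]  # brackets are the only option for this name

# Both access styles hand back the same Series
print(by_brackets.equals(by_attribute))
```

Writing `toy_df.resting bp` would be a syntax error, which is why many people stick with square brackets everywhere.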

Both Series and DataFrames have an *index* as well:

In [13]:
heart_df.index

RangeIndex(start=0, stop=303, step=1)

In [14]:
age_series.index

RangeIndex(start=0, stop=303, step=1)

DataFrames have columns - but a Series is just a single column, so it doesn't have the columns attribute.

In [15]:
heart_df.columns

Index(['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
       'exang', 'oldpeak', 'slope', 'ca', 'thal', 'target'],
      dtype='object')

In [16]:
heart_df.keys()

Index(['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
       'exang', 'oldpeak', 'slope', 'ca', 'thal', 'target'],
      dtype='object')

In [None]:
# This will throw an error! A Series has no .columns attribute
age_series.columns

Pandas is built on top of NumPy, and we can always access the NumPy array underlying a DataFrame using `.values` (the pandas documentation now recommends the equivalent `.to_numpy()` method).

In [17]:
heart_df.values

array([[63.,  1.,  3., ...,  0.,  1.,  1.],
       [37.,  1.,  2., ...,  0.,  2.,  1.],
       [41.,  0.,  1., ...,  0.,  2.,  1.],
       ...,
       [68.,  1.,  0., ...,  2.,  3.,  0.],
       [57.,  1.,  0., ...,  1.,  3.,  0.],
       [57.,  0.,  1., ...,  1.,  2.,  0.]])
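Notice that every number in that array has a trailing dot, even the ages. A rough sketch of why, using a made-up two-column frame: a NumPy array holds a single dtype, so mixing integer and float columns gives an all-float array.

```python
import pandas as pd

# Hypothetical toy frame: one integer column, one float column
toy_df = pd.DataFrame({"age": [63, 37], "oldpeak": [2.3, 3.5]})

print(toy_df.dtypes)     # each column keeps its own dtype inside pandas

arr = toy_df.to_numpy()  # same data as toy_df.values
print(arr.dtype)         # one dtype for the whole array
```

Here the integers are converted to floats so that both columns fit into one array.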

## Basic DataFrame Attributes and Methods

### `.head()` : first 5 rows

In [24]:
heart_df.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [19]:
first_five = heart_df.head()

In [20]:
first_five

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


### `.tail()` : last 5 rows

In [21]:
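heart_df.tail()
# Careful: a notebook cell only displays its last expression, so the bare
# `heart_df` below replaces the .tail() output with the full DataFrame.
# Delete that line to see just the last 5 rows.
heart_df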

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
298,57,0,0,140,241,0,1,123,1,0.2,1,0,3,0
299,45,1,3,110,264,0,1,132,0,1.2,1,0,3,0
300,68,1,0,144,193,1,1,141,0,3.4,1,2,3,0
301,57,1,0,130,131,0,1,115,1,1.2,1,1,3,0


### `.info()` : information about the columns, including about nulls in those columns

In [25]:
heart_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       303 non-null    int64  
 1   sex       303 non-null    int64  
 2   cp        303 non-null    int64  
 3   trestbps  303 non-null    int64  
 4   chol      303 non-null    int64  
 5   fbs       303 non-null    int64  
 6   restecg   303 non-null    int64  
 7   thalach   303 non-null    int64  
 8   exang     303 non-null    int64  
 9   oldpeak   303 non-null    float64
 10  slope     303 non-null    int64  
 11  ca        303 non-null    int64  
 12  thal      303 non-null    int64  
 13  target    303 non-null    int64  
dtypes: float64(1), int64(13)
memory usage: 33.3 KB


### `.describe()` : statistics about the data

In [26]:
heart_df.describe()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
count,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0,303.0
mean,54.366337,0.683168,0.966997,131.623762,246.264026,0.148515,0.528053,149.646865,0.326733,1.039604,1.39934,0.729373,2.313531,0.544554
std,9.082101,0.466011,1.032052,17.538143,51.830751,0.356198,0.52586,22.905161,0.469794,1.161075,0.616226,1.022606,0.612277,0.498835
min,29.0,0.0,0.0,94.0,126.0,0.0,0.0,71.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,47.5,0.0,0.0,120.0,211.0,0.0,0.0,133.5,0.0,0.0,1.0,0.0,2.0,0.0
50%,55.0,1.0,1.0,130.0,240.0,0.0,1.0,153.0,0.0,0.8,1.0,0.0,2.0,1.0
75%,61.0,1.0,2.0,140.0,274.5,0.0,1.0,166.0,1.0,1.6,2.0,1.0,3.0,1.0
max,77.0,1.0,3.0,200.0,564.0,1.0,2.0,202.0,1.0,6.2,2.0,4.0,3.0,1.0


### `.dtypes` : data types of each column

In [27]:
heart_df.dtypes

age           int64
sex           int64
cp            int64
trestbps      int64
chol          int64
fbs           int64
restecg       int64
thalach       int64
exang         int64
oldpeak     float64
slope         int64
ca            int64
thal          int64
target        int64
dtype: object

### `.shape` : number of rows and columns

In [28]:
heart_df.shape

(303, 14)

### Statistics

We saw them above, in the `.describe()` output, but we can also calculate statistics by calling them individually.

In [30]:
# Calculate the mean - for the whole dataframe!
heart_df.mean()

age          54.366337
sex           0.683168
cp            0.966997
trestbps    131.623762
chol        246.264026
fbs           0.148515
restecg       0.528053
thalach     149.646865
exang         0.326733
oldpeak       1.039604
slope         1.399340
ca            0.729373
thal          2.313531
target        0.544554
dtype: float64

In [31]:
# Now min
heart_df.min()

age          29.0
sex           0.0
cp            0.0
trestbps     94.0
chol        126.0
fbs           0.0
restecg       0.0
thalach      71.0
exang         0.0
oldpeak       0.0
slope         0.0
ca            0.0
thal          0.0
target        0.0
dtype: float64

In [32]:
# And max
heart_df.max()

age          77.0
sex           1.0
cp            3.0
trestbps    200.0
chol        564.0
fbs           1.0
restecg       2.0
thalach     202.0
exang         1.0
oldpeak       6.2
slope         2.0
ca            4.0
thal          3.0
target        1.0
dtype: float64

In [33]:
# These also work on a single column (a Series) - here, the minimum age
heart_df['age'].min()

29
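Here is a short sketch of the same pattern on a made-up frame, showing the difference between calling a statistic on the whole DataFrame (one value per column) and on a single column (one value). The `.agg()` call at the end is an extra that isn't used elsewhere in this notebook; it computes several statistics in one go.

```python
import pandas as pd

# Hypothetical toy frame with two numeric columns
toy_df = pd.DataFrame({"age": [63, 37, 41], "chol": [233, 250, 204]})

column_means = toy_df.mean()    # Series: one mean per column
youngest = toy_df["age"].min()  # a single number

# Several statistics for one column at once
age_summary = toy_df["age"].agg(["min", "max", "mean"])

print(column_means)
print(youngest)
print(age_summary)
```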

## Enough With The Small Stuff - Bring On Real Data!

Let's access an open data portal and get some real live data!

Austin Animal Center Intake Data: https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Intakes/wter-evkm/

In [39]:
# Accessing a CSV from a url
intakes_url = pd.read_csv('https://data.austintexas.gov/resource/wter-evkm.csv') 
intakes_url.head()

Unnamed: 0,animal_id,name,datetime,datetime2,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color
0,A875253,,2023-02-24T09:33:00.000,2023-02-24T09:33:00.000,5300 Sh 71 Eastbound in Travis (TX),Stray,Injured,Dog,Unknown,2 years,Chow Chow/German Shepherd,Black/Black
1,A839506,Beau,2023-02-24T08:18:00.000,2023-02-24T08:18:00.000,13705 Fuchs Grove Road in Travis (TX),Stray,Injured,Dog,Neutered Male,8 years,Bull Terrier Mix,Gray/White
2,A875242,Butterfly,2023-02-24T08:14:00.000,2023-02-24T08:14:00.000,6816 Boyce Lane in Travis (TX),Public Assist,Normal,Dog,Intact Female,1 year,Jack Russell Terrier Mix,White/Black
3,A875239,A875239,2023-02-24T08:03:00.000,2023-02-24T08:03:00.000,4434 Frontier Trail in Austin (TX),Stray,Sick,Dog,Intact Male,2 years,German Shepherd Mix,Tan/White
4,A839495,Sammy,2023-02-24T00:43:00.000,2023-02-24T00:43:00.000,13705 Fuchs Grove Rd in Travis (TX),Public Assist,Injured,Dog,Neutered Male,3 years,Pit Bull Mix,Gray/White


In [40]:
# Same data from the JSON endpoint, but notice the dates: read_json parsed the 'datetime' column, while 'datetime2' is still text
intakes_url = pd.read_json('https://data.austintexas.gov/resource/wter-evkm.json') 
intakes_url.head()

Unnamed: 0,animal_id,datetime,datetime2,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,name
0,A875253,2023-02-24 09:33:00,2023-02-24T09:33:00.000,5300 Sh 71 Eastbound in Travis (TX),Stray,Injured,Dog,Unknown,2 years,Chow Chow/German Shepherd,Black/Black,
1,A839506,2023-02-24 08:18:00,2023-02-24T08:18:00.000,13705 Fuchs Grove Road in Travis (TX),Stray,Injured,Dog,Neutered Male,8 years,Bull Terrier Mix,Gray/White,Beau
2,A875242,2023-02-24 08:14:00,2023-02-24T08:14:00.000,6816 Boyce Lane in Travis (TX),Public Assist,Normal,Dog,Intact Female,1 year,Jack Russell Terrier Mix,White/Black,Butterfly
3,A875239,2023-02-24 08:03:00,2023-02-24T08:03:00.000,4434 Frontier Trail in Austin (TX),Stray,Sick,Dog,Intact Male,2 years,German Shepherd Mix,Tan/White,A875239
4,A839495,2023-02-24 00:43:00,2023-02-24T00:43:00.000,13705 Fuchs Grove Rd in Travis (TX),Public Assist,Injured,Dog,Neutered Male,3 years,Pit Bull Mix,Gray/White,Sammy


In [41]:
# But this is only 1000 rows... website says there's 136K rows!
intakes_url.shape

(1000, 12)
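If you'd rather stay with the API, one possible workaround is paging. Socrata-style open data endpoints generally document `$limit` and `$offset` query parameters; the sketch below assumes this endpoint honors them (not verified here) and only builds the URLs, without making any network requests. The helper name `page_url` is made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.austintexas.gov/resource/wter-evkm.csv"


def page_url(limit, offset=0):
    """Build the URL for one page of results (hypothetical helper)."""
    # safe="$" keeps the dollar signs readable instead of percent-encoding them
    query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return f"{BASE_URL}?{query}"


first_page = page_url(5000)
second_page = page_url(5000, offset=5000)
print(first_page)
print(second_page)

# Each URL could then be passed to pd.read_csv(...), assuming the
# endpoint accepts these parameters.
```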

In [43]:
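# It's a limitation of the API - let's just download the data instead
# It's in the data folder - note that this file holds the center's *Outcomes* data,
# so its columns differ from the Intakes endpoint above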
df = pd.read_csv('data/Austin_Animal_Center_Outcomes_022822.csv')
df.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby


In [45]:
# Now let's explore those earlier attributes and methods on this dataset!
# Check the first 5 rows
df.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby


In [46]:
# Check the last 5 rows
df.tail()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color
137092,A850166,Rainey,01/24/2022 06:20:00 PM,Jan 2022,11/19/2021,Adoption,,Cat,Intact Male,2 months,Siamese,Seal Point
137093,A852031,Noodle,02/28/2022 12:50:00 PM,Feb 2022,02/23/2020,Transfer,Partner,Dog,Neutered Male,2 years,Pomeranian/Chihuahua Longhair,Buff
137094,A845839,*Carmen,02/28/2022 01:49:00 PM,Feb 2022,05/05/2020,Adoption,Foster,Dog,Spayed Female,1 year,Pit Bull Mix,Brown
137095,A844321,Mia Marie,02/28/2022 01:04:00 PM,Feb 2022,10/15/2013,Adoption,Foster,Dog,Spayed Female,8 years,Pit Bull,Black/White
137096,A813933,Lucille,02/28/2022 02:19:00 PM,Feb 2022,12/21/2018,Adoption,,Dog,Spayed Female,3 years,Belgian Malinois,Brown/Black


In [47]:
# Check the shape
df.shape

(137097, 12)

In [48]:
# Check the datatypes
df.dtypes

Animal ID           object
Name                object
DateTime            object
MonthYear           object
Date of Birth       object
Outcome Type        object
Outcome Subtype     object
Animal Type         object
Sex upon Outcome    object
Age upon Outcome    object
Breed               object
Color               object
dtype: object

In [49]:
# Check more general information on the dataframe
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 137097 entries, 0 to 137096
Data columns (total 12 columns):
 #   Column            Non-Null Count   Dtype 
---  ------            --------------   ----- 
 0   Animal ID         137097 non-null  object
 1   Name              96095 non-null   object
 2   DateTime          137097 non-null  object
 3   MonthYear         137097 non-null  object
 4   Date of Birth     137097 non-null  object
 5   Outcome Type      137073 non-null  object
 6   Outcome Subtype   62653 non-null   object
 7   Animal Type       137097 non-null  object
 8   Sex upon Outcome  137095 non-null  object
 9   Age upon Outcome  137092 non-null  object
 10  Breed             137097 non-null  object
 11  Color             137097 non-null  object
dtypes: object(12)
memory usage: 12.6+ MB
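Every column here is an `object` (text) column, including the dates. As a hedged sketch of how one might turn date strings like these into real timestamps, here is `pd.to_datetime` applied to a made-up two-row frame. The format string is an assumption based on how the `DateTime` values look above.

```python
import pandas as pd

# Hypothetical toy frame with date strings shaped like the DateTime column
toy = pd.DataFrame(
    {"DateTime": ["05/08/2019 06:20:00 PM", "07/18/2018 04:02:00 PM"]}
)

# %I is the 12-hour clock hour and %p is the AM/PM marker
toy["DateTime"] = pd.to_datetime(toy["DateTime"], format="%m/%d/%Y %I:%M:%S %p")

print(toy.dtypes)
print(toy["DateTime"].dt.year.tolist())
```

Once a column is a datetime, the `.dt` accessor exposes pieces such as the year or hour.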


In [50]:
# Check summary/descriptive statistics on the dataframe
df.describe()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color
count,137097,96095,137097,137097,137097,137073,62653,137097,137095,137092,137097,137097
unique,122587,23022,113939,101,7499,9,26,5,5,53,2735,617
top,A721033,Max,04/18/2016 12:00:00 AM,Jun 2019,05/01/2016,Adoption,Partner,Dog,Neutered Male,1 year,Domestic Shorthair Mix,Black/White
freq,33,617,39,2244,119,62707,33259,77091,48299,23780,32387,14315


#### Any Observations?

- There are no numeric columns in this dataset, so `.describe()` falls back to summarizing the object columns: count, unique, top and freq

In [51]:
# We can ask describe for just the string (object) columns!
# Every column here is an object column, so the output matches the cell above
df.describe(include=[object])

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color
count,137097,96095,137097,137097,137097,137073,62653,137097,137095,137092,137097,137097
unique,122587,23022,113939,101,7499,9,26,5,5,53,2735,617
top,A721033,Max,04/18/2016 12:00:00 AM,Jun 2019,05/01/2016,Adoption,Partner,Dog,Neutered Male,1 year,Domestic Shorthair Mix,Black/White
freq,33,617,39,2244,119,62707,33259,77091,48299,23780,32387,14315


#### Any Observations?

- Can see non-null counts, number of uniques, plus the most frequent and how frequent it is
- Showcases the different kind of useful data you can explore when it's object vs numeric columns
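To see the two kinds of `.describe()` output side by side, here is a sketch on a made-up frame with one numeric column and one object column (the column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical toy frame: one numeric column, one object column
toy = pd.DataFrame({
    "age_years": [2, 8, 1, 2],
    "animal_type": pd.Series(["Dog", "Dog", "Cat", "Dog"], dtype=object),
})

numeric_summary = toy.describe()                # numeric columns only, by default
object_summary = toy.describe(include=[object])  # object columns only

print(numeric_summary)
print(object_summary)
```

With a mix of column types, the default call skips the object column entirely, while `include=[object]` reports count, unique, top and freq for it.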

## Adding to a DataFrame

### Adding Rows

We have a new animal coming in, captured here in a Python dictionary:

In [52]:
# Dictionary, where keys match the column names and values are the row values
# Note that the values are list-like - you could easily add more rows by adding to the lists!
next_row = {
    'Animal ID': ['A851755'],
    'Name': ["T'Challa"],
    'DateTime': ['2/28/2022 11:25:00 AM'],
    'Year': [2022],
    'Found Location': ['Houston (TX)'],
    'Intake Type': ['Public Assist'],
    'Intake Condition': ['Normal'],
    'Animal Type': ['Cat'],
    'Sex upon Intake': ['Neutered Male'],
    'Age upon Intake': ['4 years'],
    'Breed': ['Domestic Shorthair'],
    'Color': ['Black']
}
next_row

{'Animal ID': ['A851755'],
 'Name': ["T'Challa"],
 'DateTime': ['2/28/2022 11:25:00 AM'],
 'Year': [2022],
 'Found Location': ['Houston (TX)'],
 'Intake Type': ['Public Assist'],
 'Intake Condition': ['Normal'],
 'Animal Type': ['Cat'],
 'Sex upon Intake': ['Neutered Male'],
 'Age upon Intake': ['4 years'],
 'Breed': ['Domestic Shorthair'],
 'Color': ['Black']}

How can we add this to the bottom of our dataset?

In [53]:
# Let's first turn this into a DataFrame.
# The DataFrame constructor accepts a dictionary of lists directly
# (pd.DataFrame.from_dict(next_row) would also work).
new2 = pd.DataFrame(next_row)
new2

Unnamed: 0,Animal ID,Name,DateTime,Year,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color
0,A851755,T'Challa,2/28/2022 11:25:00 AM,2022,Houston (TX),Public Assist,Normal,Cat,Neutered Male,4 years,Domestic Shorthair,Black


In [56]:
# Now we just need to concatenate the two DataFrames together.
# Note the `ignore_index` parameter! We'll set that to True.

df_augmented = pd.concat([df,new2], ignore_index=True)

In [57]:
# Let's check the end to make sure we were successful!
df_augmented.tail()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake
137093,A852031,Noodle,02/28/2022 12:50:00 PM,Feb 2022,02/23/2020,Transfer,Partner,Dog,Neutered Male,2 years,Pomeranian/Chihuahua Longhair,Buff,,,,,,
137094,A845839,*Carmen,02/28/2022 01:49:00 PM,Feb 2022,05/05/2020,Adoption,Foster,Dog,Spayed Female,1 year,Pit Bull Mix,Brown,,,,,,
137095,A844321,Mia Marie,02/28/2022 01:04:00 PM,Feb 2022,10/15/2013,Adoption,Foster,Dog,Spayed Female,8 years,Pit Bull,Black/White,,,,,,
137096,A813933,Lucille,02/28/2022 02:19:00 PM,Feb 2022,12/21/2018,Adoption,,Dog,Spayed Female,3 years,Belgian Malinois,Brown/Black,,,,,,
137097,A851755,T'Challa,2/28/2022 11:25:00 AM,,,,,Cat,,,Domestic Shorthair,Black,2022.0,Houston (TX),Public Assist,Normal,Neutered Male,4 years
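Look closely at that last row: the new animal's dictionary used several column names that don't exist in `df`, so `pd.concat` created new columns and filled the gaps with NaN. A minimal sketch of that behavior on two made-up frames:

```python
import pandas as pd

# Hypothetical toy frames that share only the "Animal ID" column
existing = pd.DataFrame({
    "Animal ID": ["A1", "A2"],
    "Outcome Type": ["Adoption", "Transfer"],
})
new_row = pd.DataFrame({
    "Animal ID": ["A3"],
    "Intake Type": ["Stray"],
})

# Columns are matched by name; anything missing on one side becomes NaN
combined = pd.concat([existing, new_row], ignore_index=True)
print(combined)
```

If you want the new row to line up cleanly, its keys need to match the existing column names exactly.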


### Adding (and Deleting) Columns

Adding a column is very easy in `pandas`. Let's add a new column to our dataset called "test", and set all of its values to 0.

In [58]:
# Create a new column, 'test', where every value in the col is 0
df_augmented['test'] = 0

In [59]:
# Sanity check
df_augmented.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,,0
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,,0
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,,,,,,,0
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,,0


But we don't need that - let's drop that column.

In [60]:
# Drop that test column
# Note: .drop() returns a new DataFrame - df_augmented itself keeps 'test' unless we reassign
df_augmented.drop(columns=['test'])

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,,,,,,
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137093,A852031,Noodle,02/28/2022 12:50:00 PM,Feb 2022,02/23/2020,Transfer,Partner,Dog,Neutered Male,2 years,Pomeranian/Chihuahua Longhair,Buff,,,,,,
137094,A845839,*Carmen,02/28/2022 01:49:00 PM,Feb 2022,05/05/2020,Adoption,Foster,Dog,Spayed Female,1 year,Pit Bull Mix,Brown,,,,,,
137095,A844321,Mia Marie,02/28/2022 01:04:00 PM,Feb 2022,10/15/2013,Adoption,Foster,Dog,Spayed Female,8 years,Pit Bull,Black/White,,,,,,
137096,A813933,Lucille,02/28/2022 02:19:00 PM,Feb 2022,12/21/2018,Adoption,,Dog,Spayed Female,3 years,Belgian Malinois,Brown/Black,,,,,,


In [61]:
# Alternate way using the axis argument
df_augmented.drop('test', axis=1)

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,,,,,,
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137093,A852031,Noodle,02/28/2022 12:50:00 PM,Feb 2022,02/23/2020,Transfer,Partner,Dog,Neutered Male,2 years,Pomeranian/Chihuahua Longhair,Buff,,,,,,
137094,A845839,*Carmen,02/28/2022 01:49:00 PM,Feb 2022,05/05/2020,Adoption,Foster,Dog,Spayed Female,1 year,Pit Bull Mix,Brown,,,,,,
137095,A844321,Mia Marie,02/28/2022 01:04:00 PM,Feb 2022,10/15/2013,Adoption,Foster,Dog,Spayed Female,8 years,Pit Bull,Black/White,,,,,,
137096,A813933,Lucille,02/28/2022 02:19:00 PM,Feb 2022,12/21/2018,Adoption,,Dog,Spayed Female,3 years,Belgian Malinois,Brown/Black,,,,,,


In [None]:
# Sanity check - 'test' is still here, because the results of .drop() above were never assigned back
df_augmented.head()
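Since `.drop()` hands back a new DataFrame rather than changing the original, the usual pattern is to assign the result back. A small sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical toy frame with a throwaway "test" column
toy = pd.DataFrame({"Name": ["Chunk", "Gizmo"], "test": [0, 0]})

toy.drop(columns=["test"])           # result is discarded, toy is unchanged
still_there = "test" in toy.columns

toy = toy.drop(columns=["test"])     # reassign to keep the change
gone_now = "test" not in toy.columns

print(still_there, gone_now)
```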

We can also do math with columns, or use mathematical notation to combine columns even when they aren't numerical!

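We don't have any numeric data in this current dataset. But we can still create a combined "Type" column that combines the values of our Intake Type and Animal Type columns. Keep in mind that `Intake Type` is missing for every row except the one we just added, so the combined value comes out missing for those rows too.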

In [66]:
# Create a new column, 'Type', from the two 'Type' columns
df_augmented['Type'] = df_augmented['Intake Type'] + " " + df_augmented['Animal Type']

In [67]:
# Sanity check
df_augmented.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,,0,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,,0,
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,,,,,,,0,
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,,0,
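The new `Type` column is empty in those first rows because combining a missing value with a string gives a missing value. Here is a sketch of that behavior on a made-up frame, plus one possible way around it using `.fillna()` (the "Unknown" placeholder is just an illustrative choice):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: the first row has no intake type
toy = pd.DataFrame({
    "Intake Type": [np.nan, "Public Assist"],
    "Animal Type": ["Dog", "Cat"],
})

# A missing value on either side makes the combined value missing
toy["Type"] = toy["Intake Type"] + " " + toy["Animal Type"]

# Filling the gaps first gives every row a combined value
toy["Type filled"] = toy["Intake Type"].fillna("Unknown") + " " + toy["Animal Type"]

print(toy)
```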


## Filtering

We can use filtering techniques to see only certain rows of our data. Let's look at only the cats:

In [71]:
# Check which rows have an animal type of 'Cat'
df_augmented['Animal Type'] == 'Cat' 

0          True
1         False
2         False
3         False
4          True
          ...  
137093    False
137094    False
137095    False
137096    False
137097     True
Name: Animal Type, Length: 137098, dtype: bool

In [72]:
# Let's explore an interesting property of boolean columns...
# True counts as 1 and False as 0, so summing gives the total number of cats
sum(df_augmented['Animal Type'] == 'Cat')

52093
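The same trick works with the Series methods directly. In this sketch on a made-up Series, `.sum()` counts the True values and `.mean()` gives the share of True values:

```python
import pandas as pd

# Hypothetical toy column of animal types
animal_type = pd.Series(["Cat", "Dog", "Other", "Cat"], name="Animal Type")

is_cat = animal_type == "Cat"  # boolean Series

n_cats = is_cat.sum()          # True counts as 1, False as 0
share_cats = is_cat.mean()     # fraction of rows that are True

print(n_cats, share_cats)
```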

But this only gives us True/False outputs... what if we want to really filter?

### `.loc` 

We can locate and segment down to only rows where some condition is true using `.loc`. This takes in a condition, and only outputs the rows where that condition is True! 

> **Note:** locate (`.loc`) uses square brackets, not parentheses! Often, square brackets denote location-focused actions, like this one.

Let's try this first with the condition we just built, and locate all the cats.

In [73]:
# Create a subset dataframe containing only the cats
subset_cat = df_augmented.loc[df_augmented['Animal Type'] == 'Cat'] 

In [74]:
subset_cat.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0,
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,,0,
7,A689724,*Donatello,10/18/2014 06:52:00 PM,Oct 2014,08/01/2014,Adoption,,Cat,Neutered Male,2 months,Domestic Shorthair Mix,Black,,,,,,,0,
8,A680969,*Zeus,08/05/2014 04:59:00 PM,Aug 2014,06/03/2014,Adoption,,Cat,Neutered Male,2 months,Domestic Shorthair Mix,White/Orange Tabby,,,,,,,0,
10,A684617,,07/27/2014 09:00:00 AM,Jul 2014,07/26/2012,Transfer,SCRP,Cat,Intact Female,2 years,Domestic Shorthair Mix,Black,,,,,,,0,


We can return only certain columns when we do this, by adding an argument after the condition:

In [77]:
# Let's return just the 'Animal ID', 'Intake Type' and 'Color' columns
subset_cat2 = df_augmented.loc[df_augmented['Animal Type'] == 'Cat',['Animal ID','Intake Type','Color']]
subset_cat2

Unnamed: 0,Animal ID,Intake Type,Color
0,A794011,,Brown Tabby/White
4,A674754,,Orange Tabby
7,A689724,,Black
8,A680969,,White/Orange Tabby
10,A684617,,Black
...,...,...,...
137086,A852166,,Brown Tabby/White
137088,A851184,,Orange Tabby/White
137090,A847804,,Brown Tabby/White
137092,A850166,,Seal Point
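Here is the same rows-then-columns pattern as a sketch on a made-up frame. Note that the filtered result keeps the original index labels, just like the 0, 4, 7, 8, 10 in the output above.

```python
import pandas as pd

# Hypothetical toy frame
toy = pd.DataFrame({
    "Animal ID": ["A1", "A2", "A3"],
    "Animal Type": ["Cat", "Dog", "Cat"],
    "Color": ["Black", "Buff", "Gray"],
})

# First argument: which rows (a boolean condition)
# Second argument: which columns (a list of names)
cats = toy.loc[toy["Animal Type"] == "Cat", ["Animal ID", "Color"]]

print(cats)
```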


What if I want to segment using multiple conditions? Use `&` for "and" and `|` for "or" - and use parentheses around individual conditions!

In [84]:
# Find all the black cats
df_augmented.loc[(df_augmented['Animal Type'] == 'Cat') & (df_augmented['Color'] == 'Black')]

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
7,A689724,*Donatello,10/18/2014 06:52:00 PM,Oct 2014,08/01/2014,Adoption,,Cat,Neutered Male,2 months,Domestic Shorthair Mix,Black,,,,,,,0,
10,A684617,,07/27/2014 09:00:00 AM,Jul 2014,07/26/2012,Transfer,SCRP,Cat,Intact Female,2 years,Domestic Shorthair Mix,Black,,,,,,,0,
67,A696409,*Hans,02/09/2015 06:46:00 PM,Feb 2015,11/19/2014,Adoption,,Cat,Neutered Male,2 months,Domestic Shorthair Mix,Black,,,,,,,0,
83,A783412,*Yams,12/16/2018 12:45:00 PM,Dec 2018,10/30/2008,Adoption,,Cat,Neutered Male,10 years,Domestic Shorthair Mix,Black,,,,,,,0,
157,A662741,*Todd,10/05/2013 12:42:00 PM,Oct 2013,08/11/2013,Transfer,Partner,Cat,Intact Male,1 month,Domestic Shorthair Mix,Black,,,,,,,0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
136970,A851776,*Frito,02/23/2022 05:54:00 PM,Feb 2022,02/18/2021,Adoption,,Cat,Neutered Male,1 year,Domestic Medium Hair Mix,Black,,,,,,,0,
137048,A800430,Joey,02/26/2022 07:56:00 AM,Feb 2022,07/21/2014,Adoption,Foster,Cat,Neutered Male,7 years,Domestic Medium Hair,Black,,,,,,,0,
137074,A852015,,02/27/2022 08:17:00 AM,Feb 2022,02/23/2020,Died,Emergency,Cat,Unknown,2 years,Domestic Shorthair,Black,,,,,,,0,
137079,A852023,*Rain Shadow,02/28/2022 09:48:00 AM,Feb 2022,10/23/2021,Transfer,Snr,Cat,Unknown,4 months,Domestic Shorthair,Black,,,,,,,0,
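A sketch of combining conditions on a made-up frame. Storing each condition in its own variable is one way to sidestep the parentheses issue: `&` and `|` bind more tightly than `==`, which is why inline conditions need parentheses. The `~` (not) operator is an extra that isn't used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical toy frame
toy = pd.DataFrame({
    "Animal Type": ["Cat", "Dog", "Cat", "Dog"],
    "Color": ["Black", "Black", "Gray", "Buff"],
})

is_cat = toy["Animal Type"] == "Cat"
is_black = toy["Color"] == "Black"

black_cats = toy.loc[is_cat & is_black]    # both conditions true
cat_or_black = toy.loc[is_cat | is_black]  # at least one condition true
not_cats = toy.loc[~is_cat]                # condition flipped

print(black_cats)
print(cat_or_black)
print(not_cats)
```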


## Your turn!

### Exercise 1

You need to find dogs that need extra attention - How would you find all dogs where the intake condition is NOT normal?

In [None]:
# Your code here

<details>
    <summary>Answer</summary>

```python
df_augmented.loc[(df_augmented['Animal Type'] == 'Dog') & (df_augmented['Intake Condition'] != 'Normal')]
```
</details>

### Exercise 2

You need to find animals that might need to be fixed - How would you find all animals that are either Intact Male or Intact Female?

In [88]:
# Your code here
df_augmented.iloc[:]

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,,0,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,,0,
3,A720371,Moose,02/13/2016 05:59:00 PM,Feb 2016,10/08/2015,Adoption,,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,,,,,,,0,
4,A674754,,03/18/2014 11:47:00 AM,Mar 2014,03/12/2014,Transfer,Partner,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,,,,,,,0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137093,A852031,Noodle,02/28/2022 12:50:00 PM,Feb 2022,02/23/2020,Transfer,Partner,Dog,Neutered Male,2 years,Pomeranian/Chihuahua Longhair,Buff,,,,,,,0,
137094,A845839,*Carmen,02/28/2022 01:49:00 PM,Feb 2022,05/05/2020,Adoption,Foster,Dog,Spayed Female,1 year,Pit Bull Mix,Brown,,,,,,,0,
137095,A844321,Mia Marie,02/28/2022 01:04:00 PM,Feb 2022,10/15/2013,Adoption,Foster,Dog,Spayed Female,8 years,Pit Bull,Black/White,,,,,,,0,
137096,A813933,Lucille,02/28/2022 02:19:00 PM,Feb 2022,12/21/2018,Adoption,,Dog,Spayed Female,3 years,Belgian Malinois,Brown/Black,,,,,,,0,


<details>
    <summary>Answer</summary>

```python
df_augmented[(df_augmented['Sex upon Intake'] == 'Intact Male') |
             (df_augmented['Sex upon Intake'] == 'Intact Female')]
```
</details>
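
When the list of acceptable values grows, chaining `==` comparisons with `|` gets long. A minimal sketch of an equivalent filter using `.isin()` follows; `toy_df` and its rows are made-up stand-ins for `df_augmented`, not data from the real file.

```python
import pandas as pd

# Hypothetical toy data standing in for df_augmented
toy_df = pd.DataFrame({
    'Name': ['Chunk', 'Gizmo', 'Moose', 'Rain'],
    'Sex upon Intake': ['Intact Male', 'Neutered Male', 'Intact Female', 'Unknown'],
})

# Two comparisons joined with | (element-wise "or"); each needs its own parentheses
either_or = toy_df[(toy_df['Sex upon Intake'] == 'Intact Male') |
                   (toy_df['Sex upon Intake'] == 'Intact Female')]

# Same filter with .isin(), which takes a list of values to match
with_isin = toy_df[toy_df['Sex upon Intake'].isin(['Intact Male', 'Intact Female'])]

print(with_isin)
```

Both approaches build a boolean mask, so they should select the same rows; `.isin()` is simply easier to read once there are more than two values.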

### `.iloc`

`.iloc` is used for integer-location based indexing, i.e. selecting by position rather than by label. It accepts a single integer, a list of integers, or a Python slice. It can be a bit tricky when the index labels don't match the row positions!
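
As a quick sketch of the input forms `.iloc` accepts, here is each one applied to a small made-up DataFrame (`toy_df` is hypothetical, used only so the example runs on its own):

```python
import pandas as pd

# Hypothetical toy data, not the real animal outcomes file
toy_df = pd.DataFrame({
    'Name': ['Chunk', 'Gizmo', 'Moose', 'Noodle'],
    'Animal Type': ['Cat', 'Dog', 'Dog', 'Dog'],
})

single_row = toy_df.iloc[1]        # one integer: returns that row as a Series
row_slice = toy_df.iloc[0:3]       # slice: rows at positions 0, 1, 2 (end is excluded)
picked_rows = toy_df.iloc[[0, 3]]  # list: rows at exactly those positions
single_cell = toy_df.iloc[2, 0]    # row position, column position: one value

print(single_row)
print(single_cell)
```

Note that slices in `.iloc` exclude the end position, just like regular Python slices.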

In [86]:
# Find the first 3 rows
df_augmented.iloc[0:3]

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,,0,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,,0,


In [92]:
# Same as using head(3)
df_augmented.head(3)

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
0,A794011,Chunk,05/08/2019 06:20:00 PM,May 2019,05/02/2017,Rto-Adopt,,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,,,,,,,0,
1,A776359,Gizmo,07/18/2018 04:02:00 PM,Jul 2018,07/12/2017,Adoption,,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,,,,,,,0,
2,A821648,,08/16/2020 11:38:00 AM,Aug 2020,08/16/2019,Euthanasia,,Other,Unknown,1 year,Raccoon,Gray,,,,,,,0,


In [None]:
# Can look at exactly the row at position 0
df_augmented.iloc[0]


In [None]:
# But what about our subset dataframe above? Its index labels are no longer consecutive
subset_cat.head()

In [89]:
# Try it...
subset_cat.iloc[0]

Animal ID                          A794011
Name                                 Chunk
DateTime            05/08/2019 06:20:00 PM
MonthYear                         May 2019
Date of Birth                   05/02/2017
Outcome Type                     Rto-Adopt
Outcome Subtype                        NaN
Animal Type                            Cat
Sex upon Outcome             Neutered Male
Age upon Outcome                   2 years
Breed               Domestic Shorthair Mix
Color                    Brown Tabby/White
Year                                   NaN
Found Location                         NaN
Intake Type                            NaN
Intake Condition                       NaN
Sex upon Intake                        NaN
Age upon Intake                        NaN
test                                     0
Type                                   NaN
Name: 0, dtype: object
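
The key point is that `.iloc` counts positions within whatever DataFrame it is called on, while `.loc` looks up index labels, and a filtered subset keeps the labels of the original DataFrame. A minimal sketch with made-up data (`toy_df` and `toy_cats` are hypothetical names):

```python
import pandas as pd

# Hypothetical toy data; the first cat sits at label 1, not 0
toy_df = pd.DataFrame({
    'Name': ['Gizmo', 'Chunk', 'Moose', 'Hans'],
    'Animal Type': ['Dog', 'Cat', 'Dog', 'Cat'],
})

toy_cats = toy_df[toy_df['Animal Type'] == 'Cat']
print(toy_cats.index.tolist())  # labels carried over from toy_df: [1, 3]

# .iloc[0] is the first row of the subset, whatever its label is
first_by_position = toy_cats.iloc[0]

# .loc needs a label that exists; toy_cats.loc[0] would raise a KeyError here
first_by_label = toy_cats.loc[1]

# reset_index(drop=True) renumbers the rows 0, 1, ... if that is what you want
renumbered = toy_cats.reset_index(drop=True)
print(renumbered.index.tolist())
```

The `Name:` shown at the bottom of a row's output is that row's index label, which is a handy way to check which original row `.iloc` actually returned.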

## Series Methods

### `.value_counts()`

How many different values does the Animal Type column have? What about Breed?

In [94]:
# Check the value counts for Animal Type
df_augmented['Animal Type'].value_counts()

Dog          77091
Cat          52093
Other         7253
Bird           636
Livestock       25
Name: Animal Type, dtype: int64

In [95]:
# Now check Breed
df_augmented['Breed'].value_counts()

Domestic Shorthair Mix              32387
Domestic Shorthair                  10365
Pit Bull Mix                         8937
Labrador Retriever Mix               7397
Chihuahua Shorthair Mix              6518
                                    ...  
Blue Lacy/Basset Hound                  1
Bluetick Hound/Australian Kelpie        1
Belgian Hare                            1
Swiss Hound Mix                         1
Australian Shepherd/Beagle              1
Name: Breed, Length: 2735, dtype: int64

Raw counts are more useful in some cases than in others. We can also check each value's proportion of the total, which is often easier to interpret!

In [98]:
# Use the normalize argument to return proportions of the total instead of raw counts
df_augmented['Animal Type'].value_counts(normalize=True)

Dog          0.562306
Cat          0.379969
Other        0.052904
Bird         0.004639
Livestock    0.000182
Name: Animal Type, dtype: float64
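
To pull these pieces together, here is a small sketch on made-up data: `.nunique()` answers "how many different values?" directly, and multiplying the normalized counts by 100 turns proportions into percentages. The `animal_types` Series is hypothetical.

```python
import pandas as pd

# Hypothetical toy column: 4 dogs, 3 cats, 1 bird
animal_types = pd.Series(
    ['Dog', 'Cat', 'Dog', 'Bird', 'Dog', 'Cat', 'Dog', 'Cat'],
    name='Animal Type',
)

n_distinct = animal_types.nunique()                  # number of different values
counts = animal_types.value_counts()                 # raw counts, largest first
shares = animal_types.value_counts(normalize=True)   # proportions that sum to 1
percentages = (shares * 100).round(1)                # same thing as percentages

print(n_distinct)
print(percentages)
```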

### `.sort_values()`

As you can imagine, this works differently depending on whether you're using it on a numeric or non-numeric column: numbers are sorted by value, while strings are sorted alphabetically.
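
One place this difference bites is when numbers are stored as strings. A minimal sketch with made-up values (both Series are hypothetical):

```python
import pandas as pd

# The same three values, once as integers and once as strings
numeric_ages = pd.Series([10, 9, 100])
string_ages = pd.Series(['10', '9', '100'])

sorted_numeric = numeric_ages.sort_values().tolist()  # sorted by value
sorted_strings = string_ages.sort_values().tolist()   # sorted character by character

# Converting with pd.to_numeric first restores the numeric ordering
sorted_converted = pd.to_numeric(string_ages).sort_values().tolist()

print(sorted_numeric)    # [9, 10, 100]
print(sorted_strings)    # ['10', '100', '9']
print(sorted_converted)  # [9, 10, 100]
```

If a sort looks wrong, checking the column's `dtype` is a reasonable first step.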

In [101]:
# Let's sort the Year column
df_augmented['Year'].sort_values()

In [102]:
# Now, sort the Animal Type col
df_augmented['Animal Type'].sort_values()

81093      Bird
86254      Bird
65348      Bird
122172     Bird
60452      Bird
          ...  
117408    Other
106416    Other
82089     Other
96760     Other
32949     Other
Name: Animal Type, Length: 137098, dtype: object

In [107]:
# We can do this on the whole dataframe, it just needs to know what to sort by
# Ascending is the default; pass ascending=False to sort in descending order
sorted_df = df_augmented.sort_values(by='Animal Type')
sorted_df

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Date of Birth,Outcome Type,Outcome Subtype,Animal Type,Sex upon Outcome,Age upon Outcome,Breed,Color,Year,Found Location,Intake Type,Intake Condition,Sex upon Intake,Age upon Intake,test,Type
81093,A818041,,06/09/2020 09:47:00 PM,Jun 2020,05/31/2019,Adoption,,Bird,Intact Male,1 year,Chicken,White/Red,,,,,,,0,
86254,A739191,Prince,11/29/2016 06:03:00 PM,Nov 2016,11/29/2015,Return to Owner,,Bird,Intact Male,1 year,Turkey Mix,Black/Red,,,,,,,0,
65348,A719013,,01/17/2016 06:01:00 PM,Jan 2016,01/11/2014,Euthanasia,Suffering,Bird,Unknown,2 years,Hawk,Brown/Yellow,,,,,,,0,
122172,A770612,,04/26/2018 07:08:00 PM,Apr 2018,04/22/2017,Adoption,,Bird,Unknown,1 year,Cockatiel Mix,Gray/Yellow,,,,,,,0,
60452,A778193,,08/26/2018 03:43:00 PM,Aug 2018,08/09/2017,Transfer,Partner,Bird,Intact Male,1 year,Bantam,Tricolor,,,,,,,0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
117408,A766860,,02/20/2018 09:10:00 AM,Feb 2018,02/18/2015,Euthanasia,Rabies Risk,Other,Unknown,3 years,Bat Mix,Brown,,,,,,,0,
106416,A791924,,04/03/2019 03:05:00 PM,Apr 2019,04/03/2018,Euthanasia,Rabies Risk,Other,Unknown,1 year,Bat Mix,Brown,,,,,,,0,
82089,A670713,,01/12/2014 05:56:00 PM,Jan 2014,01/12/2011,Euthanasia,Rabies Risk,Other,Unknown,3 years,Bat,Brown/Black,,,,,,,0,
96760,A819484,,06/29/2020 05:13:00 PM,Jun 2020,06/29/2018,Euthanasia,Suffering,Other,Unknown,2 years,Raccoon,Gray/Black,,,,,,,0,
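
A short sketch of the sorting options mentioned above, using made-up data: `toy_df` and its `Age in Years` column are hypothetical and do not exist in the real dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for df_augmented
toy_df = pd.DataFrame({
    'Name': ['Gizmo', 'Chunk', 'Moose', 'Hans'],
    'Animal Type': ['Dog', 'Cat', 'Dog', 'Cat'],
    'Age in Years': [1, 2, 4, 10],
})

# Descending order on a single column
descending = toy_df.sort_values(by='Animal Type', ascending=False)

# Sort by type A-Z, then by age from oldest to youngest within each type
by_type_then_age = toy_df.sort_values(
    by=['Animal Type', 'Age in Years'],
    ascending=[True, False],
)

print(by_type_then_age)
```

As in the output above, the rows keep their original index labels after sorting, so the index no longer runs in order. `sort_values` returns a new DataFrame and leaves the original unchanged.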


# Extra Credit: Find a .csv file online and experiment with it.

Head to [dataportals.org](https://dataportals.org) to find a .csv file.