# Importing, Reading and Manipulating Data with ACTUAL LITERAL PANDAS

![I have no idea what I'm doing panda](https://cdn-images-1.medium.com/max/1600/1*oBx032ncOwLmCFX3Epo3Zg.jpeg)

Just kidding - but Pandas is a great library for working with tabular (relational) data.

[Check out the documentation!](https://pandas.pydata.org/pandas-docs/stable/) (always a great idea)

Before we get into Pandas, let's talk about Numpy for a second. Numpy stands for "Numerical Python", and it's designed for fast numerical computing. It stores data in multi-dimensional arrays where every element has the same type, which is more compact and better optimized than a 'pure' Python list. What that means is that Numpy is fast - typically much faster than looping in base Python.

We're not going to go into a lot of Numpy functionality here, but here's something cool - Pandas is built on top of Numpy! That means they work really well together, and that Pandas has some math functionality already built in.
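To make that connection a little more concrete, here is a minimal sketch with made-up numbers (the `ages` values are invented for illustration). It shows that a Pandas column can hand back its underlying Numpy array, and that arithmetic applies to every element at once, with no Python loop.

```python
import numpy as np
import pandas as pd

# Made-up ages in years, purely for illustration
ages = pd.Series([2, 8, 4])

# The values can be pulled out as a Numpy array
ages_array = ages.to_numpy()
print(type(ages_array))

# Vectorized math: one expression, applied to every element
ages_in_months = ages * 12
print(ages_in_months.tolist())  # [24, 96, 48]
```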

If you'd like to read more about Numpy and Pandas, [here is an interesting blog post](https://cloudxlab.com/blog/numpy-pandas-introduction/) discussing them.

Let's dive into some data from the Austin Animal Shelter. 

Data source: [intakes data](https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Intakes/wter-evkm) and [outcomes data](https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Outcomes/9t4d-g238).

Today we'll be working with the intakes data, which I've already downloaded and included in the repository.

In [3]:
# Import
import pandas as pd



In [4]:
intakes = pd.read_csv('data/Austin_Animal_Center_Intakes_10-08-20.csv',
                     parse_dates=['DateTime'])
outcomes = pd.read_csv('data/Austin_Animal_Center_Outcomes_10-14-20.csv',
                      parse_dates=['DateTime'])

Before reading in the data, we need to know what format the data is in and where exactly the data can be found, so we can tell Pandas what to do.

In [5]:
# Where is our data?
!ls 

00-Orientation.pdf
01-WhatIsDataScience.pdf
02-Terminal-Git-Github.pdf
03-Python-Loops-and-Functions-Matthew.ipynb
04-Pandas-Introduction-Matthew.ipynb
05-Pandas-Practice-Matthew.ipynb
Week1-PythonReview
Week2-PythonChallenge
complete-notebooks
data
images


In [8]:
# Read in the comma-separated values (csv) file as a dataframe
# (already done in the cell above), for example:
# intakes = pd.read_csv('data/Austin_Animal_Center_Intakes_10-08-20.csv')

What options do we have when we read in a csv? Let's look at the documentation!

[Convenient link](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

I happen to know that there is a column in the data named 'DateTime' - let's use an argument to read it in as a datetime object, then discuss.
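As a rough illustration of what `parse_dates` changes, the sketch below reads a tiny made-up CSV from a string (via `io.StringIO`) instead of the real file. The two rows and their IDs are invented; only the column names mirror the intakes data.

```python
import io
import pandas as pd

# A tiny made-up CSV standing in for the real file
csv_text = """Animal ID,DateTime
A000001,01/03/2019 04:19:00 PM
A000002,07/05/2015 12:59:00 PM
"""

# Without parse_dates the column is read as plain text
as_text = pd.read_csv(io.StringIO(csv_text))
print(as_text['DateTime'].dtype)

# With parse_dates the same column becomes a datetime column
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=['DateTime'])
print(as_dates['DateTime'].dtype)
print(as_dates['DateTime'].dt.year.tolist())  # [2019, 2015]
```

Once the column is a real datetime, the `.dt` tools used later in this notebook become available.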

### Initial Exploration of a Dataframe

Questions to ask yourself:

- How big is the data?
- Are there any empty cells? 
- What are the datatypes of the columns of data?

In [9]:
# What does this dataframe look like?
# Check out the first rows (.head() defaults to 5; pass a number to see more)
intakes.head(50)

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color
0,A786884,*Brock,2019-01-03 16:19:00,01/03/2019 04:19:00 PM,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor
1,A706918,Belle,2015-07-05 12:59:00,07/05/2015 12:59:00 PM,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver
2,A724273,Runster,2016-04-14 18:43:00,04/14/2016 06:43:00 PM,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White
3,A665644,,2013-10-21 07:59:00,10/21/2013 07:59:00 AM,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico
4,A682524,Rio,2014-06-29 10:38:00,06/29/2014 10:38:00 AM,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray
5,A743852,Odin,2017-02-18 12:46:00,02/18/2017 12:46:00 PM,Austin (TX),Owner Surrender,Normal,Dog,Neutered Male,2 years,Labrador Retriever Mix,Chocolate
6,A635072,Beowulf,2019-04-16 09:53:00,04/16/2019 09:53:00 AM,415 East Mary Street in Austin (TX),Public Assist,Normal,Dog,Neutered Male,6 years,Great Dane Mix,Black
7,A708452,Mumble,2015-07-30 14:37:00,07/30/2015 02:37:00 PM,Austin (TX),Public Assist,Normal,Dog,Intact Male,2 years,Labrador Retriever Mix,Black/White
8,A774147,,2018-06-11 07:45:00,06/11/2018 07:45:00 AM,6600 Elm Creek in Austin (TX),Stray,Injured,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Black/White
9,A731435,*Casey,2016-08-08 17:52:00,08/08/2016 05:52:00 PM,Austin (TX),Owner Surrender,Normal,Cat,Neutered Male,5 months,Domestic Shorthair Mix,Cream Tabby


In [10]:
# Check out the shape of the df
intakes.shape

(121051, 12)

In [11]:
# And then the size
intakes.size

1452612

In [12]:
# And then look at some info on the df
intakes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 121051 entries, 0 to 121050
Data columns (total 12 columns):
Animal ID           121051 non-null object
Name                82843 non-null object
DateTime            121051 non-null datetime64[ns]
MonthYear           121051 non-null object
Found Location      121051 non-null object
Intake Type         121051 non-null object
Intake Condition    121051 non-null object
Animal Type         121051 non-null object
Sex upon Intake     121050 non-null object
Age upon Intake     121051 non-null object
Breed               121051 non-null object
Color               121051 non-null object
dtypes: datetime64[ns](1), object(11)
memory usage: 11.1+ MB


In [13]:
# Describe the columns
intakes.describe()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color
count,121051,82843,121051,121051,121051,121051,121051,121051,121050,121051,121051,121051
unique,108222,19359,85336,85336,52519,6,10,5,5,51,2597,587
top,A721033,Max,2016-09-23 12:00:00,09/23/2016 12:00:00 PM,Austin (TX),Stray,Normal,Dog,Intact Male,1 year,Domestic Shorthair Mix,Black/White
freq,33,548,64,64,22227,84625,105629,68571,39184,21295,30770,12679
first,,,2013-10-01 07:51:00,,,,,,,,,
last,,,2020-10-05 22:48:00,,,,,,,,,


**A note on `.describe()`:** this function behaves differently depending on whether we feed in object (text) columns or numeric columns. We'll explore this more later.
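A small sketch of that difference, using made-up rows (`pets` and its values are invented for illustration):

```python
import pandas as pd

# Made-up rows for illustration only
pets = pd.DataFrame({
    'animal_type': ['Dog', 'Cat', 'Dog', 'Dog'],
    'intake_year': [2019, 2015, 2016, 2019],
})

# Text column: count, unique, top (most common value), freq (its count)
text_summary = pets['animal_type'].describe()
print(text_summary)

# Numeric column: count, mean, std, min, quartiles, max
numeric_summary = pets['intake_year'].describe()
print(numeric_summary)
```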

**And a question:** You see that some of the ways we dealt with our dataframe required `()` and some did not - why is that?

- The ones with `()` are methods: functions associated with the dataframe that you call, and that you can pass arguments into.
- The ones without `()` are attributes: they are simply stored information about the dataframe.
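Here is a quick sketch of the distinction on a made-up dataframe (`pets` is invented for illustration):

```python
import pandas as pd

# Made-up data for illustration
pets = pd.DataFrame({'name': ['Belle', 'Rio', 'Odin'], 'age': [8, 4, 2]})

# Attribute: stored information, no parentheses
print(pets.shape)   # (3, 2)

# Method: a function you call, optionally with arguments
first_two = pets.head(2)
print(len(first_two))   # 2

# Forgetting the parentheses hands back the method itself, not its result
not_called = pets.head
print(callable(not_called))   # True
```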


### Accessing Columns

Use brackets and the exact column name to access a particular column.

In [14]:
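# Access specific columns here
# (only the last line of a cell is displayed)
type(intakes['Color'])  # single brackets give a Series
type(intakes)           # the whole table is a DataFrame
intakes[['Color']]      # double brackets give a one-column DataFrame;
                        # prove it with type(intakes[['Color']])
                        # You can call methods on this too!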

Unnamed: 0,Color
0,Tricolor
1,White/Liver
2,Sable/White
3,Calico
4,Tan/Gray
...,...
121046,Black
121047,Fawn/White
121048,Brown/White
121049,Gray/White


In [38]:
intakes.Found Location  # Will not work... why?
# Dot access only works when the column name is a valid Python name
# (one word, no spaces). Use intakes['Found Location'] instead,
# or fix this by renaming the columns.

SyntaxError: invalid syntax (<ipython-input-38-d8c98ba7ca0d>, line 1)
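The sketch below puts the two access styles side by side on a made-up dataframe (`pets` and its values are invented; only the column names echo the intakes data):

```python
import pandas as pd

# Made-up frame with one single-word and one two-word column name
pets = pd.DataFrame({'Color': ['Calico', 'Black'],
                     'Found Location': ['Austin (TX)', 'Manor (TX)']})

# Single brackets give a Series, double brackets give a DataFrame
color_series = pets['Color']
color_frame = pets[['Color']]

# Dot access works for names that are valid Python identifiers
same_values = pets.Color.equals(color_series)
print(same_values)  # True

# Names with spaces need brackets (pets.Found Location is a SyntaxError)
locations = pets['Found Location']
print(locations.tolist())
```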

### Dealing with Datetime Objects

You can access parts of a datetime object using `.dt` - an attribute of the column, not a method!
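For instance, a minimal sketch on two made-up timestamps (`intake_times` is invented for illustration). Most pieces hanging off `.dt` are attributes, but a few, like `day_name`, are methods and need `()`:

```python
import pandas as pd

# Made-up timestamps for illustration
intake_times = pd.Series(pd.to_datetime(['2019-01-03 16:19:00',
                                         '2015-07-05 12:59:00']))

years = intake_times.dt.year
months = intake_times.dt.month
weekdays = intake_times.dt.day_name()  # this one is a method, so it needs ()

print(years.tolist())     # [2019, 2015]
print(months.tolist())    # [1, 7]
print(weekdays.tolist())
```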

In [1]:
# Let's check out the intake year, then other aspects of the date
intakes['DateTime'].dt.year


In [15]:
#Can run methods on a column
intakes['Name'].str.lower()

0          *brock
1           belle
2         runster
3             NaN
4             rio
           ...   
121046        NaN
121047    duchess
121048        NaN
121049        NaN
121050      rocky
Name: Name, Length: 121051, dtype: object

In [16]:
# How do we create a new column?
# Let's create a new column for intake year
intakes['Intake Year'] = intakes['DateTime'].dt.year

In [18]:
# Check our work
intakes.head()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color,Intake Year
0,A786884,*Brock,2019-01-03 16:19:00,01/03/2019 04:19:00 PM,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019
1,A706918,Belle,2015-07-05 12:59:00,07/05/2015 12:59:00 PM,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver,2015
2,A724273,Runster,2016-04-14 18:43:00,04/14/2016 06:43:00 PM,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White,2016
3,A665644,,2013-10-21 07:59:00,10/21/2013 07:59:00 AM,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico,2013
4,A682524,Rio,2014-06-29 10:38:00,06/29/2014 10:38:00 AM,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray,2014


In [22]:
# What datatype is the data in our new column?
intakes[['Intake Year']].describe()
# For a numeric column, describe shows the mean and the median (50%);
# you can also call .mean() or .median() directly
intake_year_desc = intakes[['Intake Year']].describe()

In [35]:
intake_year_desc.iloc[7]  # the 8th row by position, i.e. 'max' (not displayed)
intake_year_desc.index    # the row labels of the describe output

Index(['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], dtype='object')

### Checking for Null Values

You can use `.isna()` or `.isnull()` - they do the same thing!

In [40]:
# Check it - is the result what you expect?
intakes.isna()

Unnamed: 0,Animal ID,Name,DateTime,MonthYear,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color,Intake Year
0,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,True,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
121046,False,True,False,False,False,False,False,False,False,False,False,False,False
121047,False,False,False,False,False,False,False,False,False,False,False,False,False
121048,False,True,False,False,False,False,False,False,False,False,False,False,False
121049,False,True,False,False,False,False,False,False,False,False,False,False,False


In [42]:
# How can you make that result more usable?
intakes.isna().sum()
# True counts as 1 and False as 0, so the sum is the number of null values in each column

Animal ID               0
Name                38208
DateTime                0
MonthYear               0
Found Location          0
Intake Type             0
Intake Condition        0
Animal Type             0
Sex upon Intake         1
Age upon Intake         0
Breed                   0
Color                   0
Intake Year             0
dtype: int64
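The same counting trick on a made-up column (`names` is invented for illustration), including the fraction of missing values:

```python
import numpy as np
import pandas as pd

# Made-up names with two missing values
names = pd.Series(['Belle', np.nan, 'Rio', None])

missing_mask = names.isna()          # True where a value is missing
missing_count = missing_mask.sum()   # True counts as 1, False as 0
missing_share = missing_mask.mean()  # fraction of rows that are missing

print(missing_count)   # 2
print(missing_share)   # 0.5
```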

### Checking for Duplicate Rows

In [None]:
# Function is called duplicated - check the documentation!
intakes.duplicated()

In [43]:
null_names = intakes['Name'].isna().sum()

In [48]:
total_rows = len(intakes)

In [50]:
null_names/total_rows

0.31563555856622416

In [51]:
intakes['Name'].isna().mean() # fraction of null names (matches the division above)

0.31563555856622416

In [93]:
# Can use the same trick as above on duplicated
# Remember the parentheses on .sum() - without them you get the method
# itself back instead of a count
intakes.duplicated(keep=False).sum()
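To see what `.duplicated` actually flags, here is a sketch on three made-up rows (`pets` is invented for illustration). Note that `.sum()` needs its parentheses to return a count.

```python
import pandas as pd

# Made-up rows; the first and third are identical
pets = pd.DataFrame({'animal_id': ['A1', 'A2', 'A1'],
                     'animal_type': ['Dog', 'Cat', 'Dog']})

# Default keep='first': only the later copies are flagged
later_copies = pets.duplicated().sum()
print(later_copies)  # 1

# keep=False: every row that has a twin is flagged
all_copies = pets.duplicated(keep=False).sum()
print(all_copies)    # 2

# drop_duplicates keeps one copy of each row
deduped = pets.drop_duplicates()
print(len(deduped))  # 2
```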

### Dropping Columns or Rows

There are several different methods depending on what we're doing - but the one to discuss right now is `.drop`

In [58]:
# Let's drop the MonthYear column, which is the same as our DateTime
intakes.drop(columns='MonthYear')

Unnamed: 0,Animal ID,Name,DateTime,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color,Intake Year
0,A786884,*Brock,2019-01-03 16:19:00,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019
1,A706918,Belle,2015-07-05 12:59:00,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver,2015
2,A724273,Runster,2016-04-14 18:43:00,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White,2016
3,A665644,,2013-10-21 07:59:00,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico,2013
4,A682524,Rio,2014-06-29 10:38:00,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray,2014
...,...,...,...,...,...,...,...,...,...,...,...,...
121046,A824073,,2020-10-05 17:05:00,Manor (TX),Owner Surrender,Normal,Cat,Intact Male,1 year,Domestic Shorthair,Black,2020
121047,A824076,Duchess,2020-10-05 17:28:00,Manor (TX),Owner Surrender,Injured,Dog,Intact Female,5 years,Pit Bull,Fawn/White,2020
121048,A824083,,2020-10-05 20:57:00,4400 Avenue A in Austin (TX),Public Assist,Normal,Dog,Unknown,2 years,Pit Bull,Brown/White,2020
121049,A824084,,2020-10-05 20:57:00,4400 Avenue A in Austin (TX),Public Assist,Normal,Dog,Unknown,2 years,Pit Bull,Gray/White,2020


In [62]:
# Check our work here...
intakes.head()
# .drop returns a new dataframe - the change won't be saved unless you
# tell it to (reassign the dataframe, as in the next cell)

Unnamed: 0,Animal ID,Name,DateTime,Found Location,Intake Type,Intake Condition,Animal Type,Sex upon Intake,Age upon Intake,Breed,Color,Intake Year
0,A786884,*Brock,2019-01-03 16:19:00,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019
1,A706918,Belle,2015-07-05 12:59:00,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver,2015
2,A724273,Runster,2016-04-14 18:43:00,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White,2016
3,A665644,,2013-10-21 07:59:00,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico,2013
4,A682524,Rio,2014-06-29 10:38:00,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray,2014


In [61]:
# Reassign the variable to keep the change
intakes = intakes.drop(columns='MonthYear')
# or intakes.drop(columns='MonthYear', inplace=True),
# which changes intakes directly and returns None;
# reassigning is usually the clearer style

Why won't my changes save?

Fun thing about pandas - most methods, `.drop` included, return a new dataframe and leave the original alone. To keep the change, either reassign the variable or use `inplace`.
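A minimal sketch of that behavior on a made-up dataframe (`pets` and the `month_year` column are invented for illustration):

```python
import pandas as pd

pets = pd.DataFrame({'name': ['Belle', 'Rio'],
                     'month_year': ['07/05/2015', '06/29/2014']})

# .drop returns a new dataframe and leaves the original untouched
trimmed = pets.drop(columns='month_year')
print(list(pets.columns))     # ['name', 'month_year']
print(list(trimmed.columns))  # ['name']

# Reassigning the variable is what makes the change stick
pets = pets.drop(columns='month_year')
print(list(pets.columns))     # ['name']
```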

### Renaming Columns

[Documentation for `.rename`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html)

In [63]:
# Let's remove spaces from the columns, and make all column names lowercase to be easier
intakes.columns 

Index(['Animal ID', 'Name', 'DateTime', 'Found Location', 'Intake Type',
       'Intake Condition', 'Animal Type', 'Sex upon Intake', 'Age upon Intake',
       'Breed', 'Color', 'Intake Year'],
      dtype='object')

In [None]:
# Can use a dictionary to rename


In [65]:
# Check your work
list(intakes.columns)

['Animal ID',
 'Name',
 'DateTime',
 'Found Location',
 'Intake Type',
 'Intake Condition',
 'Animal Type',
 'Sex upon Intake',
 'Age upon Intake',
 'Breed',
 'Color',
 'Intake Year']

In [67]:
# Can also use a lambda function
intakes = intakes.rename(columns = lambda x: x.lower().replace(" ", "_"))
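For comparison, here is a sketch of both renaming styles on an empty made-up dataframe (`pets` is invented; the column names echo the intakes data). A dictionary renames only the columns you list, while a function is applied to every column name.

```python
import pandas as pd

pets = pd.DataFrame(columns=['Animal ID', 'Found Location', 'Name'])

# Dictionary: old name -> new name, only for the columns you list
by_dict = pets.rename(columns={'Animal ID': 'animal_id'})
print(list(by_dict.columns))  # ['animal_id', 'Found Location', 'Name']

# Function: applied to every column name
by_func = pets.rename(columns=lambda x: x.lower().replace(' ', '_'))
print(list(by_func.columns))  # ['animal_id', 'found_location', 'name']
```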

### Slicing and Dicing

Perhaps your biggest tool for exploring your dataframes will be `.loc` (and its companion `.iloc`). `.loc` selects by label or by a condition, while `.iloc` selects by integer position. This allows you to use conditionals to explore your data!
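A short sketch of the main patterns on a made-up dataframe (`pets` and its rows are invented for illustration):

```python
import pandas as pd

pets = pd.DataFrame({
    'animal_type': ['Dog', 'Cat', 'Dog', 'Cat'],
    'intake_type': ['Stray', 'Stray', 'Owner Surrender', 'Stray'],
    'intake_year': [2019, 2015, 2016, 2020],
})

# .loc with a boolean condition keeps the rows where it is True
strays = pets.loc[pets['intake_type'] == 'Stray']

# Combine conditions with & (and) or | (or); wrap each one in parentheses
stray_cats = pets.loc[(pets['intake_type'] == 'Stray') &
                      (pets['animal_type'] == 'Cat')]

# .loc can also pick columns by label; .iloc picks by integer position
years_of_strays = pets.loc[pets['intake_type'] == 'Stray', 'intake_year']
first_row = pets.iloc[0]

print(len(strays))               # 3
print(stray_cats.index.tolist()) # [1, 3]
```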

In [68]:
# Example: look only at animals with intake type 'Stray'
intakes.loc[intakes['intake_type'] == 'Stray'].head()

Unnamed: 0,animal_id,name,datetime,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,intake_year
0,A786884,*Brock,2019-01-03 16:19:00,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019
1,A706918,Belle,2015-07-05 12:59:00,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver,2015
2,A724273,Runster,2016-04-14 18:43:00,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White,2016
3,A665644,,2013-10-21 07:59:00,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico,2013
4,A682524,Rio,2014-06-29 10:38:00,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray,2014


In [71]:
# Second example: animals where the animal type is not dog
intakes.loc[intakes['animal_type'] != 'Dog'].head()

Unnamed: 0,animal_id,name,datetime,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,intake_year
3,A665644,,2013-10-21 07:59:00,Austin (TX),Stray,Sick,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Calico,2013
8,A774147,,2018-06-11 07:45:00,6600 Elm Creek in Austin (TX),Stray,Injured,Cat,Intact Female,4 weeks,Domestic Shorthair Mix,Black/White,2018
9,A731435,*Casey,2016-08-08 17:52:00,Austin (TX),Owner Surrender,Normal,Cat,Neutered Male,5 months,Domestic Shorthair Mix,Cream Tabby,2016
13,A790209,Ziggy,2019-03-06 14:31:00,4424 S Mopac Expwy in Austin (TX),Public Assist,Normal,Cat,Intact Female,4 years,Domestic Shorthair Mix,Brown Tabby/White,2019
14,A743114,,2017-02-04 10:10:00,208 Beaver St in Austin (TX),Stray,Injured,Cat,Intact Female,2 years,Domestic Shorthair Mix,Black/White,2017


In [73]:
# And a third - animals taken in after 2018
intakes.loc[intakes.intake_year > 2018]

Unnamed: 0,animal_id,name,datetime,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,intake_year
0,A786884,*Brock,2019-01-03 16:19:00,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019
6,A635072,Beowulf,2019-04-16 09:53:00,415 East Mary Street in Austin (TX),Public Assist,Normal,Dog,Neutered Male,6 years,Great Dane Mix,Black,2019
13,A790209,Ziggy,2019-03-06 14:31:00,4424 S Mopac Expwy in Austin (TX),Public Assist,Normal,Cat,Intact Female,4 years,Domestic Shorthair Mix,Brown Tabby/White,2019
21,A754715,Rheia,2019-07-29 17:19:00,Austin (TX),Owner Surrender,Normal,Dog,Spayed Female,2 years,Labrador Retriever Mix,Black/White,2019
22,A810994,,2019-12-25 00:05:00,7900 Rm 1826 Rd in Travis (TX),Wildlife,Normal,Other,Unknown,2 years,Bat,Brown,2019
...,...,...,...,...,...,...,...,...,...,...,...,...
121046,A824073,,2020-10-05 17:05:00,Manor (TX),Owner Surrender,Normal,Cat,Intact Male,1 year,Domestic Shorthair,Black,2020
121047,A824076,Duchess,2020-10-05 17:28:00,Manor (TX),Owner Surrender,Injured,Dog,Intact Female,5 years,Pit Bull,Fawn/White,2020
121048,A824083,,2020-10-05 20:57:00,4400 Avenue A in Austin (TX),Public Assist,Normal,Dog,Unknown,2 years,Pit Bull,Brown/White,2020
121049,A824084,,2020-10-05 20:57:00,4400 Avenue A in Austin (TX),Public Assist,Normal,Dog,Unknown,2 years,Pit Bull,Gray/White,2020


## Let's Start to Answer Questions!

#### Question 1: What is the most common Animal Type?

In [111]:
# Let's explore the Animal Type column to find out
def most_frequent(values):
    """Return the most common value by counting each one in a dictionary."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1

    # Pick the key with the highest count
    return max(counts, key=counts.get)

In [112]:
most_frequent(intakes['animal_type'])

'Dog'

In [None]:
# Another way - look above at describe, or run another describe
# 'Top' for an object column means 'most common'
intakes.animal_type.describe()
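Yet another option is `.value_counts()`, which tallies every category at once. The sketch below uses a made-up column (`animal_types` is invented for illustration):

```python
import pandas as pd

animal_types = pd.Series(['Dog', 'Cat', 'Dog', 'Bird', 'Dog', 'Cat'])

# Counts per category, sorted from most to least common
counts = animal_types.value_counts()
print(counts)

# Label with the highest count
most_common = counts.idxmax()
print(most_common)  # Dog

# Fractions instead of raw counts
shares = animal_types.value_counts(normalize=True)
print(shares['Dog'])  # 0.5
```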

#### Question 2: What is the most common dog breed to come into the shelter?

In [119]:
# Let's create a new df, dogs, for all dogs in the original data
dog_df = intakes.loc[intakes.animal_type == 'Dog']

In [131]:
dog_df.breed.describe()


count            68571
unique            2296
top       Pit Bull Mix
freq              8311
Name: breed, dtype: object

In [156]:
# Now it's easier to look at common dog breeds
# 'top' is the most common value; 'freq' is how many times it appears
top_dog = dog_df['breed'].describe()['top']

In [157]:
top_dog

'Pit Bull Mix'

#### Question 3: What percentage of animals have come into the shelter in a condition other than "Normal"?

In [None]:
# Reference pattern found while searching (df, 'domain' and 'ID' are not in our data):
# df = df.groupby('domain')['ID'].nunique()

In [172]:
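# Need to explore the proper column
intakes.intake_condition.nunique()  # number of distinct conditions (not displayed)
# Number of unique values in each column, per intake condition
# (these are not row counts, so they can't give us the percentage directly)
intakes.groupby(intakes.intake_condition).nunique()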

intake_condition,animal_id,name,datetime,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,intake_year
Aged,410,328,383,271,5,1,3,5,21,134,86,8
Behavior,10,10,10,10,3,1,1,3,5,10,10,2
Feral,106,38,74,70,4,1,3,5,17,15,35,8
Injured,6282,2373,5909,5150,6,1,4,5,45,552,249,8
Medical,48,36,44,35,3,1,2,4,18,26,20,2
Normal,93846,18177,74265,44963,6,1,5,5,49,2499,543,8
Nursing,3385,934,1009,768,5,1,3,5,31,134,145,8
Other,223,152,150,134,5,1,4,5,35,70,68,8
Pregnant,76,44,70,58,3,1,2,3,11,39,34,8
Sick,4833,1503,4184,3268,6,1,4,5,44,336,205,8


In [None]:
# Want to use pandas to calculate, not inputting number manually


In [None]:
# Calculate percentage


In [None]:
# Other way to calculate
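One possible way to fill in the cells above, sketched on a made-up column (`conditions` is invented for illustration, so its 40% is not the real answer). For the real data, the earlier `.describe()` output reports 'Normal' as the top intake condition with 105,629 of 121,051 rows, which puts the non-Normal share at roughly 12.7%.

```python
import pandas as pd

conditions = pd.Series(['Normal', 'Sick', 'Normal', 'Injured', 'Normal'],
                       name='intake_condition')

# Way 1: the mean of a boolean mask is the share of True values
pct_not_normal = (conditions != 'Normal').mean() * 100

# Way 2: normalized value counts, then subtract the 'Normal' share
shares = conditions.value_counts(normalize=True)
pct_other_way = (1 - shares['Normal']) * 100

print(round(pct_not_normal, 1))  # 40.0
print(round(pct_other_way, 1))   # 40.0
```

On the real dataframe, the same idea would be `(intakes['intake_condition'] != 'Normal').mean() * 100`, assuming the columns have already been renamed as above.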


## Now - Outcomes Data!

Let's explore the `outcomes` dataframe (already loaded above) together if we have time! If not - extra credit!