# Snowshoe hares at Bonanza Creek Experimental Forest

Date: 10/16/2024

[Link to website](https://meds-eds-220.github.io/MEDS-eds-220-course/discussion-sections/ds3-hares.html)

## 1. Archive exploration

Citation:

Kielland, K., F.S. Chapin, R.W. Ruess, and Bonanza Creek LTER. 2017. Snowshoe hare physical data in Bonanza Creek Experimental Forest: 1999-Present ver 22. Environmental Data Initiative. https://doi.org/10.6073/pasta/03dce4856d79b91557d8e6ce2cbcdc14 (Accessed 2024-10-17).
    
Date of access: 10/16/2024
    
[Link to archive](https://portal.edirepository.org/nis/mapbrowse?packageid=knb-lter-bnz.55.22)

## 2. Snowshoe hare image

![hare](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg/1089px-SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg?20170313021652)

## 3. Data loading and preliminary exploration

In [1]:
# Load libraries

import numpy as np
import pandas as pd

# Read in data
hares = pd.read_csv('https://portal.edirepository.org/nis/dataviewer?packageid=knb-lter-bnz.55.22&entityid=f01f5d71be949b8c700b6ecd1c42c701')

In [2]:
# Data exploration

# View first few rows
hares.head()

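|   | date | time | grid | trap | l_ear | r_ear | sex | age | weight | hindft | notes | b_key | session_id | study |
|---|------|------|------|------|-------|-------|-----|-----|--------|--------|-------|-------|------------|-------|
| 0 | 11/26/1998 | NaN | bonrip | 1A | 414D096A08 | NaN | NaN | NaN | 1370.0 | 160.0 | NaN | 917.0 | 51 | Population |
| 1 | 11/26/1998 | NaN | bonrip | 2C | 414D320671 | NaN | M | NaN | 1430.0 | NaN | NaN | 936.0 | 51 | Population |
| 2 | 11/26/1998 | NaN | bonrip | 2D | 414D103E3A | NaN | M | NaN | 1430.0 | NaN | NaN | 921.0 | 51 | Population |
| 3 | 11/26/1998 | NaN | bonrip | 2E | 414D262D43 | NaN | NaN | NaN | 1490.0 | 135.0 | NaN | 931.0 | 51 | Population |
| 4 | 11/26/1998 | NaN | bonrip | 3B | 414D2B4B58 | NaN | NaN | NaN | 1710.0 | 150.0 | NaN | 933.0 | 51 | Population |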


In [3]:
# View column information
# info is a method: without the parentheses, `hares.info` only returns the
# bound method (and prints the whole DataFrame) instead of the column summary
hares.info()

In [4]:
# Return the number of unique values from each column
hares.nunique()

date           256
time           165
grid             5
trap           121
l_ear         1020
r_ear          963
sex             12
age             30
weight         266
hindft          90
notes          118
b_key         1026
session_id     113
study            4
dtype: int64
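
Columns with only a handful of unique values, such as `grid`, `sex`, and `study` in the counts above, are the easiest ones to inspect value by value. The following is a minimal sketch of that idea, not part of the original analysis: the small `demo` frame and the `max_unique` cutoff are made up for illustration, and with the real data `demo` would be replaced by `hares`.

```python
import pandas as pd

# Hypothetical stand-in for the hares data; values are illustrative only
demo = pd.DataFrame({
    'grid': ['bonrip', 'bonrip', 'other_grid', 'bonrip'],
    'sex': ['M', 'm', 'F', None],
    'weight': [1370.0, 1430.0, 1490.0, 1710.0],
})

max_unique = 3  # hypothetical cutoff for "few enough values to read by eye"

# nunique() ignores missing values by default
n_unique = demo.nunique()

# Keep only the names of the low-cardinality columns
low_card = n_unique[n_unique <= max_unique].index.tolist()

# Print the distinct values of each of those columns
for col in low_card:
    print(col, demo[col].unique())
```

Listing the distinct values this way tends to surface inconsistent codes (for example mixed upper and lower case) before any counting is done.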

## 4. Detecting messy values

| Code | Definition         |
|------|--------------------|
| M    | Male               |
| F    | Female             |
| M?   | Male not confirmed |

In [5]:
# Count of each unique sex value

hares['sex'].value_counts()

F     1161
M      730
f      556
m      515
?       40
F?      10
f        4
m        4
f?       3
M?       2
m?       2
pf       1
Name: sex, dtype: int64

In [6]:
# Count of each unique sex value, including missing values (NaN)
hares['sex'].value_counts(dropna=False)

F      1161
M       730
f       556
m       515
NaN     352
?        40
F?       10
f         4
m         4
f?        3
M?        2
m?        2
pf        1
Name: sex, dtype: int64
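
The counts above list `f` and `m` twice each, which suggests that some entries differ only by invisible characters such as a trailing space. A minimal sketch of how to check this, assuming a small made-up Series `sex` that stands in for `hares['sex']` (the values are illustrative, not the real data): wrapping each label in `repr()` makes stray whitespace visible, and `.str.strip()` removes it.

```python
import pandas as pd

# Hypothetical stand-in for hares['sex']; values are illustrative only
sex = pd.Series(['F', 'f', 'f ', 'm', 'm ', 'M?', None])

# Count each value, then show the labels through repr() so that
# 'f' and 'f ' can be told apart
counts = sex.value_counts(dropna=False)
labelled = {repr(label): int(n) for label, n in counts.items()}
print(labelled)

# Stripping surrounding whitespace merges 'f ' into 'f' and 'm ' into 'm'
stripped_counts = sex.str.strip().value_counts(dropna=False)
print(stripped_counts)
```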

In [13]:
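# Clean values: collapse the confirmed codes into 'female' and 'male'

# Equivalent to (hares.sex == 'F') | (hares.sex == 'f') | ..., written with isin().
# 'f ' and 'm ' (with a trailing space) also occur in the data.
condition = [hares['sex'].isin(['F', 'f', 'f ']),
             hares['sex'].isin(['M', 'm', 'm '])]
gender = ['female', 'male']
# Rows matching neither condition get the default; because the choices are
# strings, the default ends up as the string 'nan' in the output below
hares['sex_simple'] = np.select(condition, gender, default=np.nan)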

print(hares['sex_simple'].unique())

['nan' 'male' 'female']
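
The printed output above shows that the unmatched rows ended up as the string `'nan'` rather than a real missing value, because `np.select` had to fit the float default into an array of strings. Some NumPy versions may also reject that mix of types outright. One possible alternative, sketched here under assumptions and not the approach used in this notebook, is to normalize the codes with pandas string methods and map only the confirmed ones. The `hares_demo` frame and `sex_map` dictionary are hypothetical names with made-up values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the hares data; values are illustrative only
hares_demo = pd.DataFrame({
    'sex': ['F', 'f ', 'm', 'M?', '?', np.nan],
})

# Only confirmed codes are mapped; everything else is left unmapped
sex_map = {'f': 'female', 'm': 'male'}

# Strip whitespace, lowercase, then map. Values that are not keys of
# sex_map (uncertain codes such as 'm?' and missing entries) become NaN.
hares_demo['sex_simple'] = (
    hares_demo['sex'].str.strip().str.lower().map(sex_map)
)

print(hares_demo)
```

With this variant the uncertain and missing codes stay as true missing values, so methods such as `isna()` and `dropna()` treat them as expected.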


In [14]:
# Calculate mean weight for each simplified sex category

hares.groupby('sex_simple')['weight'].mean()

sex_simple
female    1365.164792
male      1349.935542
nan       1193.364055
Name: weight, dtype: float64
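
The result above has a third group labelled `nan` because `sex_simple` stores unmatched rows as the string `'nan'`. As a hedged sketch with made-up numbers (the `demo` frame below is hypothetical and its weights are not from the dataset), this is how `groupby` behaves when the column holds real missing values instead: missing keys are dropped by default, and `dropna=False` keeps them as their own group. Adding `count` next to `mean` also shows how many weights each average is based on.

```python
import numpy as np
import pandas as pd

# Hypothetical data; weights are illustrative only
demo = pd.DataFrame({
    'sex_simple': ['female', 'female', 'male', np.nan],
    'weight': [1400.0, 1300.0, 1500.0, 1000.0],
})

# Default behavior: rows whose group key is missing are left out
default_summary = demo.groupby('sex_simple')['weight'].agg(['count', 'mean'])
print(default_summary)

# dropna=False keeps the missing key as a separate group
full_summary = (
    demo.groupby('sex_simple', dropna=False)['weight'].agg(['count', 'mean'])
)
print(full_summary)
```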