# Snowshoe hare physical data in Bonanza Creek Experimental Forest: 1999-Present

Citation for publication:

Kielland, K., F.S. Chapin, R.W. Ruess, and Bonanza Creek LTER. 2017. Snowshoe hare physical data in Bonanza Creek Experimental Forest: 1999-Present ver 22. Environmental Data Initiative. https://doi.org/10.6073/pasta/03dce4856d79b91557d8e6ce2cbcdc14 (Accessed 2025-10-17).

## Data Description
This dataset comes from capture-recapture studies of snowshoe hares, conducted to study population dynamics from 1999 to 2017.

Date of access: 10/16/2025

Archive: https://edirepository.org/

![Snowshoe Hare (cute)](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg/1089px-SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg?20170313021652)
Copyright: Publicly available under Creative Commons CC0 1.0 Public Domain

ALAN SCHMIERER, Set 72157600401137773, ID 18988734889, Original title SNOWSHOE HARE (Lepus americanus) (5-28-2015) quoddy head, washington co, maine -01


In [2]:
# Import libraries
import pandas as pd
import numpy as np

In [3]:
# Read in hares data
hares = pd.read_csv("https://pasta.lternet.edu/package/data/eml/knb-lter-bnz/55/22/f01f5d71be949b8c700b6ecd1c42c701")

# Look at head of df
hares.head()

Unnamed: 0,date,time,grid,trap,l_ear,r_ear,sex,age,weight,hindft,notes,b_key,session_id,study
0,11/26/1998,,bonrip,1A,414D096A08,,,,1370.0,160.0,,917.0,51,Population
1,11/26/1998,,bonrip,2C,414D320671,,M,,1430.0,,,936.0,51,Population
2,11/26/1998,,bonrip,2D,414D103E3A,,M,,1430.0,,,921.0,51,Population
3,11/26/1998,,bonrip,2E,414D262D43,,,,1490.0,135.0,,931.0,51,Population
4,11/26/1998,,bonrip,3B,414D2B4B58,,,,1710.0,150.0,,933.0,51,Population


In [4]:
hares.dtypes

date           object
time           object
grid           object
trap           object
l_ear          object
r_ear          object
sex            object
age            object
weight        float64
hindft        float64
notes          object
b_key         float64
session_id      int64
study          object
dtype: object

In [5]:
hares.columns

Index(['date', 'time', 'grid', 'trap', 'l_ear', 'r_ear', 'sex', 'age',
       'weight', 'hindft', 'notes', 'b_key', 'session_id', 'study'],
      dtype='object')

In [7]:
hares.shape

(3380, 14)

In [8]:
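# Note: `unique` is a method; without parentheses pandas returns the bound
# method (and prints the whole Series) instead of the distinct trap labels
hares.trap.unique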

<bound method Series.unique of 0        1A
1        2C
2        2D
3        2E
4        3B
       ... 
3375    1b 
3376    4b 
3377     4b
3378     6d
3379     4b
Name: trap, Length: 3380, dtype: object>
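
The output above is the method object rather than the distinct values, because `unique` was referenced without being called. A minimal sketch of the intended call follows. The `traps` Series is a hypothetical sample with a few labels copied from the output above, not the full column, and the strip/lowercase step is an assumption about how trap labels should be compared.

```python
import pandas as pd

# Hypothetical sample: a few trap labels copied from the output above
traps = pd.Series(["1A", "2C", "1b ", "4b ", "4b", "6d"], name="trap")

# Calling the method (note the parentheses) returns the distinct values
raw_unique = traps.unique()

# Optional cleanup (an assumption): trim whitespace and lowercase
# so that "4b " and "4b" are counted as the same trap
clean_unique = traps.str.strip().str.lower().unique()

print(raw_unique)
print(clean_unique)
```

On the real data the same two calls would be `hares["trap"].unique()` and `hares["trap"].str.strip().str.lower().unique()`.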

In [10]:
hares.isna().sum()

date             0
time          3116
grid             0
trap            12
l_ear           48
r_ear          169
sex            352
age           2111
weight         535
hindft        1747
notes         3137
b_key           47
session_id       0
study          163
dtype: int64

In [5]:
# Check maximum weight value
hares["weight"].max()

2365.0

In [6]:
# Check minimum weight
hares["weight"].min()

0.0

In [7]:
# Check the minimum hind foot measurement
hares["hindft"].min()

60.0

In [8]:
# Check the maximum hind foot measurement
hares["hindft"].max()

160.0
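
The four separate min/max checks above could also be collected in a single call. This is a minimal sketch using `DataFrame.agg` on a hypothetical `sample` frame; its values are copied from rows displayed in this notebook and do not represent the full dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: weight/hindft values copied from rows shown in this notebook
sample = pd.DataFrame({
    "weight": [1370.0, 1430.0, 1490.0, 1710.0, 840.0],
    "hindft": [160.0, np.nan, 135.0, 150.0, 114.0],
})

# One table with the minimum and maximum of each column (NaN is skipped)
summary = sample[["weight", "hindft"]].agg(["min", "max"])
print(summary)
```

On the real data, the minimum weight of 0.0 reported above is not physically plausible for a live hare, so it is probably worth inspecting those rows before any analysis.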

In [10]:
# Look at categorical unique values
hares["sex"].unique()

array([nan, 'M', 'F', '?', 'F?', 'M?', 'pf', 'm', 'f', 'f?', 'm?', 'f ',
       'm '], dtype=object)

In [9]:
hares["notes"].unique()

array([nan, 'No right ear tag', 'Escapee', 'Mortality', 'Mortality ',
       'Old tag lost in L ear',
       'Bunny escaped before second ear tag was added',
       'Rabbit too bloody, released', 'R Front Foot Injured',
       'L Hind Leg Injured',
       'Left Front Foot Injured by Mink. Mink Still Around, Not Shy',
       'Injured Bunny, Released, No Tags', 'Died after release',
       'Dead in trap', 'Dead', 'non-pregnant',
       'pregnant (2 peanut sized babies)', 'pregnant', 'Pregnant',
       'Pregnant; last collar was chewed off',
       '149.074 recapture; collar loose, removed and replaced; non-pregnant',
       'previous collar was chewed off',
       '149.013 came off/removed; replaced',
       '149.033 recapture; collar loose, removed and replaced',
       'previous collar fell off',
       'collar previously chewed off (put back on the same bunny!)',
       'collar broke off, caught in cage', 'dead in trap',
       '149.754 recapture; no VHF signal, removed and replaced',

In [19]:

hares["age"].value_counts()

age
A            564
J            267
a            183
j            128
1/2/2013      21
1/4/2013      21
3/4/2013      18
1             12
U             11
?             10
a 3/4 yr.      4
2 yrs.         3
u              3
A 3/4          2
1.5            2
3.5 yrs.       2
2.25 yrs       2
1 yr.          2
1 yr           2
a 1 yr.        2
3.25 yrs.      1
2.5 yrs        1
3 yrs.         1
a 2 yrs.       1
J 3/4          1
2 yrs          1
A 1/2          1
a 1 yr         1
1.25           1
A 1.5          1
Name: count, dtype: int64

## Study question:
Is there a correlation between snowshoe hare weight and hind foot size?
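
One way to approach this question is a Pearson correlation between the `weight` and `hindft` columns. The sketch below only illustrates the mechanics: `sample` is a hypothetical frame built from a handful of rows displayed in this notebook, so the value it prints says nothing about the full dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: weight/hindft pairs copied from rows shown in this notebook
sample = pd.DataFrame({
    "weight": [1370.0, 1430.0, 1490.0, 1710.0, 840.0],
    "hindft": [160.0, np.nan, 135.0, 150.0, 114.0],
})

# Keep only captures where both measurements were recorded
paired = sample.dropna(subset=["weight", "hindft"])

# Pearson correlation coefficient between the two measurements
r = paired["weight"].corr(paired["hindft"])
print(f"n = {len(paired)}, r = {r:.3f}")
```

Applied to the real data, the same calls would use `hares` in place of `sample`.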

| Value | Description |
| ----- | ----------- |
| f     | female      |
| m     | male        |
| ?     | unconfirmed |
| p     | unknown     |

In [20]:
# How many times does each unique value in sex appear?
hares["sex"].value_counts()

sex
F     1161
M      730
f      556
m      515
?       40
F?      10
f        4
m        4
f?       3
M?       2
m?       2
pf       1
Name: count, dtype: int64

In [23]:
# Count each value again, this time including NAs
hares["sex"].value_counts(dropna=False)

# 352 NA values

sex
F      1161
M       730
f       556
m       515
NaN     352
?        40
F?       10
f         4
m         4
f?        3
M?        2
m?        2
pf        1
Name: count, dtype: int64

Exploring 'sex' as a variable:
- Values in the column do not exactly match the metadata: there are various combinations, capitalizations, and trailing spaces
- A possible cause is inconsistent data-collection training and a lack of standardization
- There are 4 duplicate rows in the data frame, possibly because the same hare was accidentally recaptured several times on the same day

In [30]:
hares[hares.duplicated()]

Unnamed: 0,date,time,grid,trap,l_ear,r_ear,sex,age,weight,hindft,notes,b_key,session_id,study
2893,7/1/2011,,bonbs,10a,,,,,,,juvenile,,23,Population
2894,7/1/2011,,bonbs,10a,,,,,,,juvenile,,23,Population
2895,7/1/2011,,bonbs,10a,,,,,,,juvenile,,23,Population
3071,9/11/2012,,bonbs,10d,b2834,b2835,f,j,840.0,114.0,,838.0,31,Population
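
By default `duplicated()` flags only the later copies of a repeated row, so the rows listed above are the repeats rather than the first occurrences. A minimal sketch of how to view every copy and how to drop the repeats follows. The `sample` frame is hypothetical and only mimics the pattern seen above, and whether the repeats should be dropped is an analysis choice, not something the source settles.

```python
import pandas as pd

# Hypothetical sample mimicking identical repeated capture records
sample = pd.DataFrame({
    "date": ["7/1/2011", "7/1/2011", "7/1/2011", "9/11/2012"],
    "grid": ["bonbs", "bonbs", "bonbs", "bonbs"],
    "trap": ["10a", "10a", "10a", "10d"],
    "notes": ["juvenile", "juvenile", "juvenile", None],
})

# keep=False flags every copy, including the first occurrence
all_copies = sample[sample.duplicated(keep=False)]

# drop_duplicates keeps the first occurrence of each repeated row
deduped = sample.drop_duplicates()

print(len(all_copies), len(deduped))
```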


Instructions for wrangling
1. Gather hares sex column
2. Use function `replace()` to reassign values in column to standard name
3. f and F = female, m and M = male, ? = unknown
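
The steps above can be sketched with `Series.replace()` as follows. This is one possible reading of the instructions, not necessarily the intended solution. The `sex` Series is a hypothetical sample of the kinds of raw values seen in the column, and two choices are assumptions: whitespace and case are normalised first, and uncertain codes such as `f?` are treated as unknown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of raw values like those seen in the sex column
sex = pd.Series(["M", "f", "F?", "m ", "?", "pf", np.nan], name="sex")

# Step 1: normalise whitespace and case so "m " and "M" are treated alike
normalised = sex.str.strip().str.lower()

# Steps 2-3: replace the standard codes with full names;
# anything that is not exactly "m" or "f" (including NaN) becomes "unknown"
sex_simple = (
    normalised.replace({"m": "male", "f": "female"})
    .where(normalised.isin(["m", "f"]), "unknown")
)

print(sex_simple.tolist())
```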

In [31]:
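# Set conditions to select from
# Note: 'm_' and 'f_' match no value in the column, so the trailing-space
# entries 'm ' and 'f ' fall through to the default ("unknown") below
conditions = [
    (hares['sex'].isin(['m', 'M', 'm_'])),
    (hares['sex'].isin(['f', 'F', 'f_']))
]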

# Set choices corresponding to array
choices = ['male', 'female']

# Otherwise, sex is "unknown"
default = 'unknown'

# Use `np.select()` to index through sex column and make a new col with outputs
hares['sex_simple'] = np.select(conditions, choices, default = default)

In [33]:
# Check that this worked and we only have our desired values
hares['sex_simple'].value_counts()

sex_simple
female     1717
male       1245
unknown     418
Name: count, dtype: int64

In [37]:
# Calculate mean weight by sex
# ("weight" is passed positionally here, so every numeric column is averaged)
hares.groupby("sex_simple").mean("weight")

sex_simple,weight,hindft,b_key,session_id
female,1366.920372,131.011161,482.617407,57.75597
male,1352.145553,133.38301,490.851406,49.228112
unknown,1176.511111,103.469697,615.119681,46.576555


Female hares had the highest mean weight (about 1,367 g), followed by males (about 1,352 g); hares of unknown sex averaged about 1,177 g.
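
In the call `mean("weight")` above, the string appears to be taken as the first positional argument of `mean` (`numeric_only`) rather than as a column selector, which would explain why all numeric columns show up in the output. A minimal sketch of selecting the column explicitly follows. The `sample` frame and its values are hypothetical, and the `count` column is added only to show how many non-missing weights each mean is based on.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the simplified sex column and a missing weight
sample = pd.DataFrame({
    "sex_simple": ["female", "female", "male", "male", "unknown"],
    "weight": [1400.0, 1300.0, 1430.0, np.nan, 840.0],
})

# Select the weight column before aggregating; NaN weights are skipped
weight_by_sex = sample.groupby("sex_simple")["weight"].agg(["mean", "count"])
print(weight_by_sex)
```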

In [None]:
# Streamlined workflow 

# Import libraries
import pandas as pd
import numpy as np

# Read in hares data
hares = pd.read_csv("https://pasta.lternet.edu/package/data/eml/knb-lter-bnz/55/22/f01f5d71be949b8c700b6ecd1c42c701")

# Number of rows and cols
hares.shape

# Count each value of sex, including NAs
hares["sex"].value_counts(dropna=False)


# Set conditions to select from ('m_' and 'f_' match no value in the column)
conditions = [
    (hares['sex'].isin(['m', 'M', 'm_'])),
    (hares['sex'].isin(['f', 'F', 'f_']))
]

# Set choices corresponding to array
choices = ['male', 'female']

# Otherwise, sex is "unknown"
default = 'unknown'

# Use `np.select()` to index through sex column and make a new col with outputs
hares['sex_simple'] = np.select(conditions, choices, default = default)

# Calculate mean weight by sex (select the weight column before averaging)
hares.groupby("sex_simple")["weight"].mean()