# Week 3 Discussion: Snowshoe hares at Bonanza Creek Experimental Forest

This dataset contains snowshoe hare physical data from Bonanza Creek Experimental Forest, 1999-present. 


Citation: 

Kielland, K., F.S. Chapin, R.W. Ruess, and Bonanza Creek LTER. 2017. Snowshoe hare physical data in Bonanza Creek Experimental Forest: 1999-Present ver 22. Environmental Data Initiative. https://doi.org/10.6073/pasta/03dce4856d79b91557d8e6ce2cbcdc14 (Accessed 2024-10-17).

[Link to snowshoe data](https://portal.edirepository.org/nis/mapbrowse?packageid=knb-lter-bnz.55.22)

![Image of a hare](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg/1452px-SNOWSHOE_HARE_%28Lepus_americanus%29_%285-28-2015%29_quoddy_head%2C_washington_co%2C_maine_-01_%2818988734889%29.jpg?20170313021652)

In [39]:
import pandas as pd
import numpy as np

In [6]:
url = "https://portal.edirepository.org/nis/dataviewer?packageid=knb-lter-bnz.55.22&entityid=f01f5d71be949b8c700b6ecd1c42c701"

hares = pd.read_csv(url)

In [8]:
hares.shape

(3380, 14)

In [10]:
hares.head()

Unnamed: 0,date,time,grid,trap,l_ear,r_ear,sex,age,weight,hindft,notes,b_key,session_id,study
0,11/26/1998,,bonrip,1A,414D096A08,,,,1370.0,160.0,,917.0,51,Population
1,11/26/1998,,bonrip,2C,414D320671,,M,,1430.0,,,936.0,51,Population
2,11/26/1998,,bonrip,2D,414D103E3A,,M,,1430.0,,,921.0,51,Population
3,11/26/1998,,bonrip,2E,414D262D43,,,,1490.0,135.0,,931.0,51,Population
4,11/26/1998,,bonrip,3B,414D2B4B58,,,,1710.0,150.0,,933.0,51,Population


In [14]:
hares.dtypes

date           object
time           object
grid           object
trap           object
l_ear          object
r_ear          object
sex            object
age            object
weight        float64
hindft        float64
notes          object
b_key         float64
session_id      int64
study          object
dtype: object

In [16]:
hares.isna().sum()

date             0
time          3116
grid             0
trap            12
l_ear           48
r_ear          169
sex            352
age           2111
weight         535
hindft        1747
notes         3137
b_key           47
session_id       0
study          163
dtype: int64
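Raw counts are easier to judge as shares of the table. With 3380 rows, the 3116 missing `time` values are about 92% of that column. A minimal sketch of the same calculation, using a small made-up frame (`toy` and its values are illustrative assumptions, not rows from the dataset):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the hares table; values are illustrative only
toy = pd.DataFrame({
    "sex": ["M", np.nan, "F", np.nan],
    "weight": [1430.0, 1370.0, np.nan, 1490.0],
    "grid": ["bonrip", "bonrip", "bonrip", "bonrip"],
})

# isna() gives booleans, so the column mean is the fraction missing
missing_share = toy.isna().mean().sort_values(ascending=False)
print(missing_share)
```

On the real data the equivalent call would be `hares.isna().mean()`.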

In [21]:
hares['hindft'].max()

160.0

In [23]:
hares['hindft'].min()

60.0
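The minimum and maximum can also be pulled in one call. This is a small sketch on a made-up Series (the name `toy_hindft` and its values are assumptions for illustration); missing values are skipped by default:

```python
import pandas as pd

# Illustrative hind foot lengths, including one missing value
toy_hindft = pd.Series([160.0, None, 135.0, 150.0, 60.0], name="hindft")

# agg with a list of function names returns one value per name
hindft_range = toy_hindft.agg(["min", "max"])
print(hindft_range)
```

On the real data this would be `hares['hindft'].agg(['min', 'max'])`.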

In [26]:
hares['sex'].unique()

array([nan, 'M', 'F', '?', 'F?', 'M?', 'pf', 'm', 'f', 'f?', 'm?', 'f ',
       'm '], dtype=object)

#### 4. Detecting messy values

| value      | definition |
| ----------- | ----------- |
|    |       |
|    |       |

In [31]:
hares['sex'].value_counts()

F     1161
M      730
f      556
m      515
?       40
F?      10
f        4
m        4
f?       3
M?       2
m?       2
pf       1
Name: sex, dtype: int64

In [33]:
hares['sex'].value_counts(dropna = False)

F      1161
M       730
f       556
m       515
NaN     352
?        40
F?       10
f         4
m         4
f?        3
M?        2
m?        2
pf        1
Name: sex, dtype: int64
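Two rows in the counts above print as `f` and `m` with a count of 4 each, even though `f` and `m` already appear higher up. The `unique()` output suggests why: `'f '` and `'m '` carry a trailing space. A sketch of how one might expose and collapse such variants, on a made-up Series (`toy_sex` is an assumption, not the real column):

```python
import numpy as np
import pandas as pd

# Illustrative codes mirroring the kinds of variants seen in hares['sex']
toy_sex = pd.Series(["F", "f", "f ", "M", "m ", "F?", "?", np.nan])

# repr() makes trailing whitespace visible
print([repr(v) for v in toy_sex.dropna().unique()])

# Strip whitespace and lowercase before counting, keeping NaN in the tally
normalized_counts = (
    toy_sex.str.strip().str.lower().value_counts(dropna=False)
)
print(normalized_counts)
```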

#### 6. Clean values

In [55]:
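hs = hares['sex']

# Conditions are checked in order; the first match picks the label.
# Note: 'f_' and 'm_' match no value listed by unique() above. The
# trailing-space codes 'f ' and 'm ' are what appear in the data, so
# those rows fall through to the default here.
condlist = [hs.isin(['F', 'f', 'f_']),
            hs.isin(['M', 'm', 'm_'])]
choicelist = ['female', 'male']

# Uncertain codes ('?', 'F?', 'pf', ...) and missing values get the default
hares['sex_simple'] = np.select(condlist, choicelist, default = np.nan)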

In [57]:
hares['sex_simple']

0        nan
1       male
2       male
3        nan
4        nan
        ... 
3375     nan
3376     nan
3377     nan
3378     nan
3379    male
Name: sex_simple, Length: 3380, dtype: object
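In the output above, the unmatched rows hold the string `nan` (the column is `object` dtype) rather than a true missing value, so `isna()` would not flag them. One possible alternative, sketched here on a made-up Series, is to normalize the codes and use `map`, which leaves anything not in the dictionary as a real `NaN`. The names `toy_sex`, `labels`, and `sex_clean` are illustrative assumptions, and this is not necessarily the approach intended for the exercise:

```python
import numpy as np
import pandas as pd

# Illustrative codes, including a trailing space and uncertain entries
toy_sex = pd.Series(["F", "f ", "m", "M?", np.nan, "pf"])

# Only unambiguous codes get a label; everything else becomes NaN
labels = {"f": "female", "m": "male"}
sex_clean = toy_sex.str.strip().str.lower().map(labels)
print(sex_clean)
```

Because stripping happens first, `'f '` is treated the same as `'f'`, while uncertain codes such as `'M?'` and `'pf'` stay missing.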

In [59]:
hares.groupby('sex_simple').weight.mean()

sex_simple
female    1366.920372
male      1352.145553
nan       1176.511111
Name: weight, dtype: float64
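The `nan` row above is a group keyed by the string `nan`. If the column held true missing values instead, `groupby` would drop that group by default; passing `dropna=False` keeps it. A sketch on made-up numbers (the frame `toy` and its weights are assumptions), which also reports group sizes so the means can be read in context:

```python
import numpy as np
import pandas as pd

# Illustrative data with a real NaN in the grouping column
toy = pd.DataFrame({
    "sex_simple": ["female", "female", "male", np.nan],
    "weight": [1400.0, 1300.0, 1350.0, 1200.0],
})

# Keep the missing-sex group and show mean weight alongside row counts
weight_summary = (
    toy.groupby("sex_simple", dropna=False)["weight"].agg(["mean", "count"])
)
print(weight_summary)
```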

