In [7]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('ggplot')

# IGBST Bear Capture Data Exploration

This notebook explores the bear capture data published in Table 1 of the IGBST annual reports. The abbreviations below are needed to read the data.

## Notation
### Locations
- **BDNF**: Beaverhead-Deerlodge National Forest
- **BLM**: Bureau of Land Management
- **BTNF**: Bridger-Teton National Forest
- **CTNF**: Caribou-Targhee National Forest
- **CGNF**: Custer Gallatin National Forest
- **Crk**: Creek
- **GTNP**: Grand Teton National Park
- **SNF**: Shoshone National Forest
- **YNP**: Yellowstone National Park
- **WRIR**: Wind River Reservation
- **Pr**: Private
- **ID, MT, WY**: State abbreviations for Idaho, Montana, and Wyoming respectively
  - So *Pr-WY* means private Wyoming


### Organizations
- **IDFG**: Idaho Department of Fish and Game
- **IGBST**: Interagency Grizzly Bear Study Team
- **GTNP**: Grand Teton National Park
- **MTFWP**: Montana Fish, Wildlife, and Parks
- **WS**: Wildlife Services
- **WYGFD**: Wyoming Game and Fish Department
  - The annual summaries abbreviate this agency inconsistently across years; the variants were standardized to WYGFD in Excel
- **WRIR**: Wind River Reservation
- **YNP**: Yellowstone National Park

## Data Exploration

In [13]:
bearCapture = pd.read_csv('bearCapture.csv') # Read CSV

bearCapture.dtypes # Show datatypes of columns 

Bear                object
Sex                 object
Age                 object
Date                object
General Location    object
Capture type        object
Release Site        object
Handler             object
Year                 int64
dtype: object

Note that the Date column is read as `object` (string); it should be converted to a datetime type.
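A minimal sketch of that conversion, using a few made-up rows in place of `bearCapture.csv`. It assumes the dates are stored as month/day/year strings, which the source does not confirm, so `format` may need adjusting for the real file. The column names are stripped first in case a header carries stray whitespace.

```python
import pandas as pd

# Hypothetical sample rows standing in for bearCapture.csv
sample = pd.DataFrame({
    "Bear": ["G1", "G2", "G3"],
    "Date ": ["5/12/2018", "10/3/2018", "not recorded"],  # header with a trailing space
    "Year": [2018, 2018, 2018],
})

# Remove leading/trailing whitespace from every column name
sample.columns = sample.columns.str.strip()

# Parse the strings; anything that does not match the assumed format becomes NaT
sample["Date"] = pd.to_datetime(sample["Date"], format="%m/%d/%Y", errors="coerce")

print(sample.dtypes)
```

Using `errors="coerce"` keeps the conversion from failing on an unexpected value; the resulting `NaT` rows can then be inspected with `sample[sample["Date"].isna()]`.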

In [9]:
bearCapture.columns # All columns in the table

Index(['Bear', 'Sex', 'Age', 'Date ', 'General Location', 'Capture type',
       'Release Site', 'Handler', 'Year'],
      dtype='object')

In [10]:
bearCapture.isna().sum() # Count of missing values in each column

Bear                  0
Sex                   0
Age                   0
Date                  0
General Location      1
Capture type        186
Release Site          0
Handler               1
Year                  0
dtype: int64

### Year Data

In [12]:
pd.DataFrame(bearCapture['Year'].value_counts()) # Bears captured per year

Year  Captures
2018       129
2021       123
2020       113
2010       111
2015       109
2016       108
2011       107
2012       104
2022       100
2017        99


In [None]:
bearCapture.hist() # Histogram of captures per year (Year is the only numeric column)

### General Location Data

In [None]:
pd.DataFrame(bearCapture['General Location'].value_counts()) # Count of all locations

In [None]:
pd.DataFrame(bearCapture["General Location"].str.contains("South Fork Shoshone").value_counts()) # Finding total number of records that contain South Fork Shoshone in General Location

Many distinct locations appear, and part of that spread comes from the land-unit abbreviations appended to a location name. South Fork Shoshone is an example: 59 records contain South Fork Shoshone in the general location, but only 41 of them are recorded as South Fork Shoshone, Pr-WY. This is important to note.
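One way to count a place regardless of its suffix is to split the location string at its last comma. The sketch below uses made-up rows; the column names `Place` and `Unit` are hypothetical, and it assumes the suffix always follows the final comma.

```python
import pandas as pd

# Hypothetical location strings in the style of the General Location column
locations = pd.Series([
    "South Fork Shoshone, Pr-WY",
    "South Fork Shoshone, SNF",
    "Gibbon River, YNP",
    "South Fork Shoshone",
    None,
], name="General Location")

# Text before the last comma is the place; text after it is the land unit
parts = locations.str.extract(r"^(?P<Place>.*?)(?:,\s*(?P<Unit>[^,]+))?$")
parts["Place"] = parts["Place"].str.strip()

# Counting on Place groups all South Fork Shoshone records together
print(parts["Place"].value_counts())
```

Rows without a comma keep their full text in `Place` and get a missing `Unit`, so no record is lost by the split.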

In [None]:
pd.DataFrame(bearCapture['General Location'].value_counts()).plot(kind='bar', figsize=(80,10))

### Handler Data

WYGFD accounts for by far the most captures. It sometimes collaborates with other organizations, but it appears to handle most of these captures independently.

**Note**: WTGF appears to be an error, since the abbreviation has no listed meaning. These records should be omitted.
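One way to omit those records is a boolean filter on the Handler column. This is a sketch on made-up values, not the actual data; the name `cleaned` is hypothetical.

```python
import pandas as pd

# Hypothetical handler values, including the unexplained WTGF code
handlers = pd.DataFrame({"Handler": ["WYGFD", "WTGF", "IGBST", None, "WYGFD"]})

# Keep every row whose handler is not in the list of codes to drop;
# rows with a missing handler are kept as well
cleaned = handlers[~handlers["Handler"].isin(["WTGF"])]

print(len(handlers) - len(cleaned), "row(s) dropped")
```

`isin` takes a list, so further codes can be added later if more unexplained abbreviations turn up.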

In [None]:
pd.DataFrame(bearCapture['Handler'].value_counts()) # Count of captures per handler

In [None]:
pd.DataFrame(bearCapture['Handler'].value_counts()).plot(kind='bar', figsize=(20,10)) # Visual Representation of handler captures

In [None]:
bearCapture[bearCapture['Handler']=='ws'] # Finding data that needs to be changed

The handler value ws needs to be changed to WS.
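A sketch of that fix on made-up values. The entry with trailing whitespace is an invented example of a variant that an exact comparison would miss; the real column may not contain one.

```python
import pandas as pd

# Hypothetical handler values with the lowercase variant mixed in
handlers = pd.Series(["WS", "ws", "WYGFD", "ws ", None], name="Handler")

# Strip stray whitespace, then map the known variant onto the canonical code
fixed = handlers.str.strip().replace({"ws": "WS"})

print(fixed.value_counts())
```

Mapping only the known variant, rather than upper-casing the whole column, leaves every other handler value exactly as recorded.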

In [None]:
bearCapture[bearCapture['Handler']=='YNP'] # Records when handler is YNP

#### Year Data with Handler

In [None]:
wygfdCapture = bearCapture[bearCapture["Handler"] == "WYGFD"] # Records with handler WYGFD

pd.DataFrame(wygfdCapture['Year'].value_counts()) # Count per year

#### Handler and General Location

In [None]:
gibbonRiverCapture = bearCapture[bearCapture["General Location"] == "Gibbon River, YNP"] # Records with General Location Gibbon River, YNP

pd.DataFrame(gibbonRiverCapture['Handler'].value_counts()) # Handler count for Gibbon River, YNP

In [None]:
cascadeCrkCapture = bearCapture[bearCapture["General Location"] == "Cascade Crk, YNP"] # Records with General Location Cascade Crk, YNP

pd.DataFrame(cascadeCrkCapture['Handler'].value_counts()) # Handler count for Cascade Crk, YNP

In [None]:
sForkShoshoneCapture = bearCapture[bearCapture["General Location"] == "South Fork Shoshone, Pr-WY"] # Records with General Location South Fork Shoshone Pr-WY

pd.DataFrame(sForkShoshoneCapture['Handler'].value_counts()) # Handler count for South Fork Shoshone Pr-WY

In [None]:
antelopeCrkCapture = bearCapture[bearCapture["General Location"] == "Antelope Crk, YNP"] # Records with General Location Antelope Crk, YNP

pd.DataFrame(antelopeCrkCapture['Handler'].value_counts()) # Handler count for Antelope Crk, YNP

In [None]:
stephensCrkCapture = bearCapture[bearCapture["General Location"] == "Stephens Crk, YNP"] # Records with General Location Stephens Crk, YNP

pd.DataFrame(stephensCrkCapture['Handler'].value_counts()) # Handler count for Stephens Crk, YNP

Interestingly, several of the general locations with the most captures were handled by IGFD rather than WYGFD, even though WYGFD accounts for the most captures overall. Keep in mind that the most frequented location, Gibbon River, YNP, accounts for just 62 of the captures.
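The per-location cells above can be condensed into a single location-by-handler table. The sketch below uses invented records, so the handler assignments in it are illustrative only.

```python
import pandas as pd

# Hypothetical records; the handler assignments here are invented for illustration
records = pd.DataFrame({
    "General Location": ["Gibbon River, YNP", "Gibbon River, YNP", "Cascade Crk, YNP",
                         "South Fork Shoshone, Pr-WY", "South Fork Shoshone, Pr-WY",
                         "Antelope Crk, YNP"],
    "Handler": ["IGBST", "IGBST", "IGBST", "WYGFD", "WYGFD", "YNP"],
})

# Keep only the two most frequent locations
top = records["General Location"].value_counts().head(2).index
subset = records[records["General Location"].isin(top)]

# Rows are locations, columns are handlers, cells are capture counts
table = pd.crosstab(subset["General Location"], subset["Handler"])
print(table)
```

On the real data, raising the `head(2)` cut-off would show the handler mix for as many of the busiest locations as needed in one view.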

### Release Site Data

We can see that many bears are released at the site where they were captured, but a substantial number are removed.

Also note that removal numbers are listed for bears removed between 2017 and 2022.
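To compare removals across years, the removal flag can be summarized per year. This sketch uses made-up rows; the value "On site" and the output column names are hypothetical.

```python
import pandas as pd

# Hypothetical records; "On site" is an invented release value
records = pd.DataFrame({
    "Year": [2017, 2017, 2018, 2018, 2018, 2018],
    "Release Site": ["On site", "Removed", "On site", "Removed", "Removed", "On site"],
})

# Flag removals; na=False treats a missing release site as "not removed"
records["Removed"] = records["Release Site"].str.contains("Removed", na=False)

# Captures, removals, and removal share for each year
summary = records.groupby("Year")["Removed"].agg(captures="count", removed="sum", share="mean")
print(summary)
```

Reporting the share alongside the raw count matters because the number of captures differs from year to year.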

In [None]:
pd.DataFrame(wygfdCapture['Release Site'].value_counts()) # Release site counts for WYGFD captures only

In [None]:
pd.DataFrame(bearCapture["Release Site"].str.contains("Removed").value_counts()) # Finding total number of records that contain Removed

In [None]:
removedCapture = bearCapture[bearCapture["Release Site"].str.contains("Removed")] # Records where bears were removed

pd.DataFrame(removedCapture['Year'].value_counts())

### Sex of Bear Data

We can see that the majority of the bears captured are male.
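To put a number on that majority, `value_counts` can return proportions instead of counts. The values below are made up and do not reflect the real split.

```python
import pandas as pd

# Hypothetical values in the style of the Sex column
sex = pd.Series(["Male", "Male", "Female", "Male", "Female"], name="Sex")

# normalize=True returns fractions of the total instead of raw counts
shares = sex.value_counts(normalize=True)
print(shares)
```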

In [None]:
pd.DataFrame(bearCapture['Sex'].value_counts()) # Count for sex of bears

In [None]:
maleCapture = bearCapture[bearCapture["Sex"].str.contains("Male")] # Records where the Sex is Male

pd.DataFrame(maleCapture['Year'].value_counts()) # Number of records where the Sex is Male each Year

In [None]:
pd.DataFrame(maleCapture['General Location'].value_counts()) # Number of records where the Sex is Male by General Location

In [None]:
femaleCapture = bearCapture[bearCapture["Sex"].str.contains("Female")] # Records where the Sex is Female

pd.DataFrame(femaleCapture['Year'].value_counts()) # Number of records where the Sex is Female each Year

In [None]:
pd.DataFrame(femaleCapture['General Location'].value_counts()) # Number of records where the Sex is Female by General Location

### Age Data

Note that for 1987-1997 some of the bears have an actual age listed rather than an age class.
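One way to make such a mixed column consistent is to convert the numeric entries to age classes and leave the existing labels alone. The sketch below uses made-up values, and the cut-offs (0 = COY, 1 = yearling, 2-4 = subadult, 5 and older = adult) are an assumption rather than something stated in the source.

```python
import pandas as pd

# Hypothetical Age values: a mix of class labels and actual ages in years
age = pd.Series(["Adult", "7", "Subadult", "1", "0", "3", "COY"], name="Age")

# Numeric strings become numbers; class labels become NaN
years = pd.to_numeric(age, errors="coerce")

# Assumed cut-offs (not taken from the source): 0 = COY, 1 = Yearling,
# 2-4 = Subadult, 5 and older = Adult
bins = [-1, 0, 1, 4, float("inf")]
labels = ["COY", "Yearling", "Subadult", "Adult"]
from_years = pd.cut(years, bins=bins, labels=labels).astype(object)

# Use the derived class where an age was given, otherwise keep the original label
age_class = from_years.where(years.notna(), age)
print(age_class.tolist())
```

Keeping `years` as its own series preserves the actual ages for any analysis that needs them.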

In [None]:
pd.DataFrame(bearCapture['Age'].value_counts()) # Count for age of bears

## Conclusions

I would be interested in splitting the date into month, day, and year to get statistics on bear captures per month. Bears with a listed age could also be classified as adult, subadult, yearling, or COY so that the Age column is consistent. Another thing to consider is splitting both the General Location and Release Site columns so that they hold only the main location, e.g. South Fork Shoshone instead of South Fork Shoshone, Pr-WY, with the Pr-WY suffix recorded in a separate column. Counting bear captures on private land would also be useful.
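As a starting point for two of these ideas, the sketch below counts captures per calendar month and flags private-land records. It runs on made-up rows, assumes month/day/year date strings, and assumes private land is always written as `Pr-` plus a two-letter state code after the last comma; "Hypothetical Crk" is an invented location.

```python
import pandas as pd

# Hypothetical records; dates are assumed to be month/day/year strings
records = pd.DataFrame({
    "Date": ["5/12/2018", "5/30/2019", "9/2/2019", "10/14/2020"],
    "General Location": ["South Fork Shoshone, Pr-WY", "Gibbon River, YNP",
                         "Hypothetical Crk, Pr-MT", "Cascade Crk, YNP"],
})

# Captures per calendar month (1 = January), pooled across years
dates = pd.to_datetime(records["Date"], format="%m/%d/%Y", errors="coerce")
per_month = dates.dt.month.value_counts().sort_index()

# Flag private land: the text after the last comma is Pr- plus a two-letter state code
records["Private"] = records["General Location"].str.contains(r",\s*Pr-[A-Z]{2}\s*$", na=False)

print(per_month)
print(records["Private"].sum(), "captures on private land")
```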