# Table of Contents
1. [Introduction](#Introduction)
1. [Set Up Python](#Set-Up-Python)
1. [Getting Data](#Getting-Data)
1. [Descriptive Stats](#Descriptive-Stats)
1. [Graphs](#Graphs)
    1. [_Sceloporus jarrovii_](#Sceloporus-jarrovii)
    2. [_Sceloporus virgatus_](#Sceloporus-virgatus)
    3. [_Urosaurus ornatus_](#Urosaurus-ornatus)

# Introduction
[Table of Contents](#Table-of-Contents)

During the field season there comes a point after which it is unlikely that we will capture any more individuals for the first time that season.  We see this point visually as a flattening of the curve when one plots the number of total captures against the days in the field.  The purpose of this notebook is to develop a statistical means of:
1. detecting the day at which that point was reached

and

2. estimating the population size at that point.
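
As a first, deliberately simple illustration of these two goals, the sketch below flags the first field day on which the average number of new captures over a short trailing window falls to or below a threshold, and reports the cumulative count on that day as a naive population figure. The function name `plateau_day`, the 3-day window, and the threshold of 1 new capture per day are all assumptions made for illustration, not the method this notebook settles on. The daily counts are the _S. jarrovii_ values tabulated under Descriptive Stats.

```python
from itertools import accumulate

def plateau_day(new_counts, window=3, threshold=1.0):
    """Return (field_day, cumulative_total) for the first day whose trailing
    mean of new captures is <= threshold, or None if no day qualifies.

    window and threshold are illustrative assumptions, not tuned values.
    """
    cumulative = list(accumulate(new_counts))
    for end in range(window, len(new_counts) + 1):
        trailing = new_counts[end - window:end]  # the last `window` days
        if sum(trailing) / window <= threshold:
            return end, cumulative[end - 1]      # field days are 1-based
    return None

# New S. jarrovii captures per field day, in date order
new_per_day = [17, 7, 2, 4, 1, 1, 0, 5, 3, 1]
result = plateau_day(new_per_day)
print(result)  # (7, 32)
```

On these counts the rule fires on field day 7 with 32 animals marked, yet five more new animals were caught on the following field day, so a rule this simple is easily fooled by a single slow day.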

# Set Up Python
[Table of Contents](#Table-of-Contents)

In [1]:
import pandas as pd
import os,glob,time
import plotly
import chart_studio.plotly as py
import plotly.graph_objs as go
import plotly.io as pio
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
# plotly.tools.set_config_file(world_readable=True)

pd.options.display.max_rows = 99999
pd.options.display.max_columns = 50

# Getting Data
[Table of Contents](#Table-of-Contents)


We use the following chunks to determine from which paths we will read data and to which paths we will write output files. We then read in a data set from the 2019 field season.

### Setting File Locations

In [2]:
deviceDict = {'dataBig':{'source':'S:/Chris/TailDemography/TailDemography/AZ Research/AZ 2019'
                         ,'log':'S:/Chris/TailDemography/TailDemography/AZ Research/AZ 2019'
                         ,'output':'S:/Chris/TailDemography/TailDemography/AZ Research/AZ 2019'},
              'silverSurfer':{'source':'C:\\Users\\craga_eowcrpe\\Google Drive\\AZ Research/AZ 2019'
                              ,'log':'C:\\Users\\craga_eowcrpe\\Google Drive\\AZ Research/AZ 2019'
                              ,'output':'C:\\Users\\craga_eowcrpe\\Google Drive\\AZ Research/AZ 2019'}
              ,'dataPers':{'source':'C:/Users/Christopher/Google Drive/AZ Research/AZ 2019'
                           ,'log': 'C:/Users/Christopher/Google Drive/AZ Research/AZ 2019'
                           ,'output':'C:/Users/Christopher/Google Drive/AZ Research/AZ 2019'}
             ,'gandolf':{'source':'C:/Users/craga/Google Drive/AZ Research/AZ 2019'
                           ,'log': 'C:/Users/craga/Google Drive/AZ Research/AZ 2019'
                           ,'output':'C:/Users/craga/Google Drive/AZ Research/AZ 2019'}}

### Choose Device

In [3]:
device = deviceDict['gandolf']
device

{'source': 'C:/Users/craga/Google Drive/AZ Research/AZ 2019',
 'log': 'C:/Users/craga/Google Drive/AZ Research/AZ 2019',
 'output': 'C:/Users/craga/Google Drive/AZ Research/AZ 2019'}
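
Choosing the device by hand means editing the cell on every machine. One possible alternative, sketched here under the assumption that exactly one configured `source` folder exists on the machine in use, is to pick the first device whose folder is present. The helper name `pick_device` and the demo dictionary are hypothetical and not part of the original notebook.

```python
import os
import tempfile

def pick_device(device_dict):
    """Return the name of the first device whose 'source' folder exists
    on this machine, or None if none of the configured folders is present."""
    for name, paths in device_dict.items():
        if os.path.isdir(paths['source']):
            return name
    return None

# Demo with a temporary folder standing in for a real data directory
with tempfile.TemporaryDirectory() as existing:
    demo = {
        'absent': {'source': os.path.join(existing, 'missing')},
        'present': {'source': existing},
    }
    chosen = pick_device(demo)

print(chosen)  # present
```

With the real dictionary this would be used as `device = deviceDict[pick_device(deviceDict)]`, after checking that the result is not `None`.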

In [4]:
os.listdir(device['source'])

['2018 Captures Cheat Sheet.csv',
 '2018 Captures Cheat Sheet.gsheet',
 'CC 2017 Lizards - 3viii17-figure example.xls',
 'CC Data 2019 - FINAL.xlsx',
 'desktop.ini',
 'Reciepts',
 'Sjtoes.csv',
 'Svtoes.csv',
 'Uotoes.csv']

### Read in Data
Here we read in the data and keep only the rows for <i>S. jarrovii</i>, along with rows that have no species recorded (such as the session entry in row 0 below).

In [5]:
sourcefile = device['source']+'/CC Data 2019 - FINAL.xlsx'
# Skip sheet rows 1-19 (0-indexed); row 0 is kept as the header
skipRows = list(range(1,20))
df = pd.read_excel(sourcefile,skiprows=skipRows, parse_dates=["Date"])
# Keep S. jarrovii ('Sj') rows plus rows with no species recorded
df = df.loc[df.Species.isin(['Sj'])|df.Species.isna()]
print("The sample dataset has {} rows of data.".format(df.shape[0]))
df.head()

The sample dataset has 402 rows of data.


Unnamed: 0,Species,Toes,Toes_string,Unnamed: 3,Date,Sex,SVL,TL,RTL,Autotomized,Mass,Paint Mark,Location,Meters,New/Recap,Painted,Sighting,Misc.,Vial,Time,Click Video,Search Party
0,,,,,2019-06-14,,,,,,,,bottom of site,,,,,"IN: 0928; w=0.8; t=23.4; h=21.3; clear, breezy...",,928.0,,George Middendorf; Christopher Agard
1,Sj,10-18,10-18,' 10-18,2019-06-14,f,75.0,77.0,20.0,1.0,11.5,w2b50c..t,10m ^ trail entrance to creek on R left of and...,-20.0,recap,yes,no,w50..t still visible from last year; not shed ...,,,,George Middendorf; Christopher Agard
2,Sj,5-15,5-15,' 5-15,2019-06-14,m,78.0,108.0,0.0,0.0,15.7,w1b,tree 5m ^ entrance,-25.0,new,yes,no,,19-01,,,George Middendorf; Christopher Agard
3,Sj,12-19,12-19,' 12-19,2019-06-14,f,76.0,112.0,0.0,0.0,11.0,w3b..t,tree at 1 falls,0.0,recap,yes,no,..t still visible from last yr; not shed since,,,,George Middendorf; Christopher Agard
4,Sj,7-18,7-18,' 7-18,2019-06-14,f,68.0,97.0,0.0,0.0,11.0,w4b,3m v top left wall 2.5m^stacked wall,27.0,recap,yes,no,looks gravid,,,,George Middendorf; Christopher Agard
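
The species filter combines two boolean masks: one for the wanted species code and one for rows with no species at all. The toy frame below is a sketch of that behaviour only; its toe codes and the `Sv` and `Uo` species codes are made-up assumptions, not rows from the 2019 workbook.

```python
import pandas as pd

# Hypothetical miniature of the capture sheet
toy = pd.DataFrame({
    'Species': ['Sj', None, 'Sv', 'Uo', 'Sj'],
    'Toes': ['10-18', None, '3-7', '2-9', '5-15'],
})

# True where the species is Sj OR where no species was recorded
keep = toy.Species.isin(['Sj']) | toy.Species.isna()
subset = toy.loc[keep]
print(subset)
```

Rows 0, 1, and 4 survive: the two `Sj` captures and the row without a species.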


# Descriptive Stats
[Table of Contents](#Table-of-Contents)

In [6]:
# Count the rows flagged as sightings for each species and day
nSightings = (df.loc[df.Sighting=='yes']
              .groupby(['Species','Date'])
              .Sighting.count().reset_index())
nSightings

Unnamed: 0,Species,Date,Sighting
0,Sj,2019-06-14,3
1,Sj,2019-06-15,9
2,Sj,2019-06-16,28
3,Sj,2019-06-17,25
4,Sj,2019-06-18,22
5,Sj,2019-06-19,19
6,Sj,2019-06-20,13
7,Sj,2019-06-21,17
8,Sj,2019-06-22,25
9,Sj,2019-06-24,15


In [32]:
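# Count the rows with Painted == 'yes' per species and day (renamed to New below),
# then attach the sighting counts; the outer merge keeps days found in either table
nCaptures = (df.loc[df.Painted=='yes']
             .groupby(['Species','Date'])
             .Painted.count().reset_index()
             .merge(nSightings,on=['Species','Date'],how='outer')
             .sort_values('Date'))
# A day missing from one of the two tables gets a count of 0
nCaptures.loc[:,['Painted','Sighting']] = nCaptures[['Painted','Sighting']].fillna(0)
# Running total of new captures, and the running total as of the previous day
nCaptures['cumulativeNew']=nCaptures.groupby('Species').Painted.cumsum()
nCaptures['nPrevious']=nCaptures.groupby('Species').cumulativeNew.shift(1)
# New captures on each day as a proportion of the running total
nCaptures['percNew']=(nCaptures.Painted/nCaptures.cumulativeNew)
nCaptures = nCaptures.rename(columns = {'Painted':'New'})
nCaptures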

Unnamed: 0,Species,Date,New,Sighting,cumulativeNew,nPrevious,percNew
0,Sj,2019-06-14,17.0,3,17.0,,1.0
1,Sj,2019-06-15,7.0,9,24.0,17.0,0.291667
2,Sj,2019-06-16,2.0,28,26.0,24.0,0.076923
3,Sj,2019-06-17,4.0,25,30.0,26.0,0.133333
4,Sj,2019-06-18,1.0,22,31.0,30.0,0.032258
5,Sj,2019-06-19,1.0,19,32.0,31.0,0.03125
10,Sj,2019-06-20,0.0,13,32.0,32.0,0.0
6,Sj,2019-06-21,5.0,17,37.0,32.0,0.135135
7,Sj,2019-06-22,3.0,25,40.0,37.0,0.075
8,Sj,2019-06-24,1.0,15,41.0,40.0,0.02439
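
The derived columns are plain running-total arithmetic. As a cross-check, the sketch below recomputes them in pure Python from the `New` counts in the table above; the lowercase variable names are illustrative stand-ins for the DataFrame columns.

```python
from itertools import accumulate

# New S. jarrovii captures per day, in date order
new = [17, 7, 2, 4, 1, 1, 0, 5, 3, 1]

cumulative_new = list(accumulate(new))        # running total (cumulativeNew)
n_previous = [None] + cumulative_new[:-1]     # running total one day earlier (nPrevious)
# Share of the running total contributed by each day (percNew);
# safe here because the running total is never zero
perc_new = [n / total for n, total in zip(new, cumulative_new)]

for row in zip(new, cumulative_new, n_previous, perc_new):
    print(*row)
```

For the second day this gives 7 / 24 ≈ 0.291667, matching the `percNew` value in the table.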


Now we create a day column that numbers the field days in date order. Because the row index is not in date order after the merge and sort above (note the row labelled 10), we count rows within each species rather than reusing the index.

In [33]:
nCaptures['fieldDay'] = nCaptures.groupby('Species').cumcount()+1
nCaptures

Unnamed: 0,Species,Date,New,Sighting,cumulativeNew,nPrevious,percNew,fieldDay
0,Sj,2019-06-14,17.0,3,17.0,,1.0,1
1,Sj,2019-06-15,7.0,9,24.0,17.0,0.291667,2
2,Sj,2019-06-16,2.0,28,26.0,24.0,0.076923,3
3,Sj,2019-06-17,4.0,25,30.0,26.0,0.133333,4
4,Sj,2019-06-18,1.0,22,31.0,30.0,0.032258,5
5,Sj,2019-06-19,1.0,19,32.0,31.0,0.03125,6
10,Sj,2019-06-20,0.0,13,32.0,32.0,0.0,7
6,Sj,2019-06-21,5.0,17,37.0,32.0,0.135135,8
7,Sj,2019-06-22,3.0,25,40.0,37.0,0.075,9
8,Sj,2019-06-24,1.0,15,41.0,40.0,0.02439,10
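
A day column can count either days actually spent in the field or calendar days since the first visit. The two differ here because no data were recorded for 2019-06-23. The sketch below shows both numberings for the dates in the table; the names `field_day` and `calendar_day` are illustrative, and which numbering suits the later analysis is left open.

```python
from datetime import date

# Survey dates from the table above (June 2019; the 23rd is absent)
survey_dates = [date(2019, 6, d) for d in (14, 15, 16, 17, 18, 19, 20, 21, 22, 24)]
ordered = sorted(survey_dates)

# Sequential numbering: 1 for the first day worked, 2 for the second, ...
field_day = {d: i for i, d in enumerate(ordered, start=1)}
# Calendar numbering: days elapsed since the first visit, plus 1
calendar_day = {d: (d - ordered[0]).days + 1 for d in ordered}

last = ordered[-1]
print(field_day[last], calendar_day[last])  # 10 11
```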


# Graphs
[Table of Contents](#Table-of-Contents)

Here we will visualize the point at which we had captured all lizards at the site.

In [34]:
year = df.loc[df.Date.notna()].Date.dt.year.unique()[0]
year

2019

## _Sceloporus jarrovii_

In [35]:
species ='Sj'
# Take one date-ordered subset so that every trace pairs x and y from the same rows
sj = nCaptures.loc[nCaptures.Species==species].sort_values('Date')
New = go.Scatter(x=sj.fieldDay, y=sj.New,
                 mode = 'lines+markers',  name = 'New')
Sighting = go.Scatter(x=sj.fieldDay, y=sj.Sighting,
                      mode = 'lines+markers',  name = 'Sightings')
Cumulative = go.Scatter(x=sj.fieldDay, y=sj.cumulativeNew,
                        mode = 'lines+markers',  name = 'Cumulative')

data = [New,Sighting,Cumulative]
# title = dict(text=..., font=...) replaces the deprecated titlefont attribute
layout = go.Layout(
    title = dict(
        text = 'Number of New {} Captures By fieldDay'.format(species),
        font = dict(size = 20)),
    xaxis = dict(
        dtick = 1,
        title = dict(
            text = 'fieldDay',
            font = dict(size = 18))),
    yaxis = dict(
        title = dict(
            text = 'Number of New Captures',
            font = dict(size = 18)),
        range=[0,sj.cumulativeNew.max()+5]))
fig = go.Figure(
        data = data,
        layout = layout)

iplot(fig,
         filename = 'Number of New {} Captures By fieldDay for {}.html'.format(species, year))