# Table of Contents
1. [Introduction](#Introduction)
2. [Setting Up Environment](#Setting-Up-Environment)
2. [Functions](#Functions)
3. [Getting Data](#Getting-Data)
3. [Descriptive Analysis](#Descriptive-Analysis)
    1. [Independent Variables](#Independent-Variables)
    2. [Dependent Variables](#Dependent-Variables)
4. [Inferential Analysis](#Inferential-Analysis)
    1. [_Sceloporus jarrovii_](#Sceloporus-jarrovii)
    1. [_Sceloporus virgatus_](#Sceloporus-virgatus)
5. [Conclusions](#Conclusions)
6. [Discussion](#Discussion)

## Introduction 

[Back to TOC](#Table-of-Contents)

## Setting Up Environment

[Back to TOC](#Table-of-Contents)

In [2]:
import pandas as pd
import numpy as np
import scipy.stats as ss
import pingouin as pg
import os, glob, logging
from summary_functions import *
import plotly
import chart_studio.plotly as py
import plotly.figure_factory as ff
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

init_notebook_mode(connected=True)
pd.options.display.max_columns = 50
pd.options.display.max_columns = 100
pd.options.display.max_colwidth = 1500


# Functions

[Back to TOC](#Table-of-Contents)

In [3]:
def report(row, threshcol,threshold = 0.05):
    if row[threshcol]<=threshold:
        res = print("{} and {} are significantly different (diff = {}) with an effect size of {} (effect type {})"\
       .format(row['A'], row['B'],row['diff'],row['efsize'],row['eftype']))
    else:
        res="{} and {} are not significantly different (p = {}).".format(row['A'],row['B'],row[threshcol])
    return res


# Getting Data
[Back to TOC](#Table-of-Contents)

In [4]:
df = pd.read_csv('TRT 2008-2016 analysis.csv')

In [5]:
print('There are {} rows in the file and {} columns.\n{}'.format(df.shape[0],df.shape[1],[col for col in df.columns]))

There are 144 rows in the file and 15 columns.
['year', 'Treatment', 'paintmark', 'TRTmin', 'Species', 'sex', 'SVL', 'TL', 'RTL', 'RemTL', 'RemMass', 'propRemTL', 'Treatment2', 'Treatment3', 'TRTminLog']


## Descriptive Analysis 
1. [Independent Variables](#Independent-Variables)
2. [Dependent Variables](#Dependent-Variables)

[Back to TOC](#Table-of-Contents)

### Independent Variables 

The independent variables include 'year', 'Treatment', 'paintmark', 'Species', 'sex', 'SVL', 'TL', 'RTL', 'RemTL', 'RemMass', 'propRemTL', and 'Treatment2'.

[Back to Descriptive Analysis](#Descriptive-Analysis); [Back to TOC](#Table-of-Contents)

In [6]:
df.groupby(['Species','Treatment2']).TRTmin.count()

Species  Treatment2
sj       25%           29
         50%           10
         75%           13
         Intact        23
sv       25%           26
         50%           15
         75%           14
         Intact        14
Name: TRTmin, dtype: int64

In [7]:
df.groupby(['Species','Treatment3']).TRTmin.count()

Species  Treatment3
sj       25%           29
         >=50%         23
         Intact        23
sv       25%           26
         >=50%         29
         Intact        14
Name: TRTmin, dtype: int64

We will use Treatment3 as our indipendent variable.

In [8]:
IV = ['year', 'SVL', 'TL', 'RTL', 'RemTL', 'RemMass', 'propRemTL']
for v in IV:
    print("\n\nReport for '{}':\nDESCRIBE: \n{}\nREVIEW: \n{} "\
          .format(v,df.groupby('Species')[v].apply(distribution),df.groupby('Species')[v].apply(review)))



Report for 'year':
DESCRIBE: 
            n  minimum  maximum  median  siqr         mean     stdev
Species                                                             
sj      0  75     2007     2016  2011.0   4.5  2011.440000  4.156532
sv      0  69     2008     2016  2012.0   1.5  2012.753623  2.068004
REVIEW: 
Species
sj                (Unique types include the following: {<class 'int'>}, Unique values include:{2016, 2015, 2011, 2007}, OK)
sv    (Unique types include the following: {<class 'int'>}, Unique values include:{2016, 2008, 2010, 2011, 2012, 2015}, OK)
Name: year, dtype: object 


Report for 'SVL':
DESCRIBE: 
            n  minimum  maximum  median  siqr       mean     stdev
Species                                                           
sj      0  75     56.0     92.0    76.0  7.25  75.306667  9.122253
sv      0  69     40.0     66.0    53.0  3.00  53.550725  5.454454
REVIEW: 
Species
sj    (Unique types include the following: {<class 'float'>}, Unique values include:


Invalid value encountered in percentile



### Dependent Variables 

[Back to Descriptive Analysis](#Descriptive-Analysis); [Back to TOC](#Table-of-Contents)

In [9]:
DV = ['TRTmin','TRTminLog']
for v in DV:
    print("\n\nReport for '{}':\nDESCRIBE: \n{}\nREVIEW: \n{} "\
          .format(v,df.groupby('Species')[v].apply(distribution),df.groupby('Species')[v].apply(review)))



Report for 'TRTmin':
DESCRIBE: 
            n  minimum  maximum  median    siqr      mean     stdev
Species                                                            
sj      0  75     0.13     0.71    0.29  0.0675  0.289200  0.120805
sv      0  69     0.03     0.51    0.26  0.0700  0.258986  0.100968
REVIEW: 
Species
sj                                  (Unique types include the following: {<class 'float'>}, Unique values include:{0.13, 0.32, 0.36, 0.29, 0.31, 0.24, 0.42, 0.19, 0.25, 0.7, 0.34, 0.26, 0.18, 0.43, 0.68, 0.28, 0.2, 0.17, 0.21, 0.15, 0.23, 0.16, 0.46, 0.71, 0.30000000000000004, 0.14, 0.4, 0.41, 0.33}, OK)
sv    (Unique types include the following: {<class 'float'>}, Unique values include:{0.19, 0.07, 0.24, 0.25, 0.27, 0.28, 0.4, 0.3, 0.26, 0.43, 0.34, 0.51, 0.18, 0.11, 0.36, 0.2, 0.37, 0.29, 0.15, 0.22, 0.17, 0.09, 0.16, 0.23, 0.21, 0.03, 0.38, 0.30000000000000004, 0.14, 0.31, 0.08, 0.32, 0.49, 0.33}, OK)
Name: TRTmin, dtype: object 


Report for 'TRTminLog':
DESCRIBE: 

## Inferential Analysis

Here we compare the Terrestrial Righting times of

1. [_Sceloporus jarrovii_](#Sceloporus-jarrovii)
1. [_Sceloporus virgatus_](#Sceloporus-virgatus)

For _S. jarrovii_ we will use TRTminLog since TRTmin times were skewed.  For _S. virgatus_ we will use TRTmin.

[Back to TOC](#Table-of-Contents)

### _Sceloporus jarrovii_

[Back to Inferential Analysis](#Inferential-Analysis); [Back to TOC](#Table-of-Contents)

In [11]:
aovSj = pg.anova(data=df.loc[df.Species=='sj'], dv='TRTminLog', between='Treatment3', detailed=True)
aovSj

Unnamed: 0,Source,SS,DF,MS,F,p-unc,np2
0,Treatment3,1.916,2,0.958,7.297,0.00130162,0.169
1,Within,9.454,72,0.131,-,-,-


In [11]:
aovSj = pg.anova(data=df.loc[df.Species=='sj'], dv='TRTminLog', between='Treatment3', detailed=True)
aovSj

Unnamed: 0,Source,SS,DF,MS,F,p-unc,np2
0,Treatment3,1.916,2,0.958,7.297,0.00130162,0.169
1,Within,9.454,72,0.131,-,-,-


#### Post-Hoc Analysis

In [12]:
pt = pg.pairwise_tukey(dv='TRTminLog', between='Treatment3', data=df.loc[df.Species=='sj'])
pt.apply(report,threshcol='p-tukey',axis=1)

25% and Intact are significantly different (diff = 0.38613652697286915) with an effect size of 1.051 (effect type hedges)


0        25% and >=50% are not significantly different (p = 0.1579456314670472).
1                                                                           None
2    >=50% and Intact are not significantly different (p = 0.14744025845554232).
dtype: object

In [11]:
## need to fix the x-axis labels

SjI = go.Box(y=df.loc[(df.Species=='sj')&(df.Treatment3=='Intact')].TRTmin, name='Intact')
Sj25 = go.Box(y=df.loc[(df.Species=='sj')&(df.Treatment3=='25%')].TRTmin, name='25%')
Sj50 = go.Box(y=df.loc[(df.Species=='sj')&(df.Treatment3=='50%')].TRTmin, name='50%')
Sj75 = go.Box(y=df.loc[(df.Species=='sj')&(df.Treatment3=='75%')].TRTmin, name='75%')
data=[SjI,Sj25,Sj50,Sj75]
layout = go.Layout(
    title = 'BoxPlot of Terrestrial Righting Times by Treatment',
    titlefont = dict(
        size = 20),
    xaxis= dict(
    ),
    yaxis = dict(
        title = 'Righting Times (s)',
        titlefont = dict(
            size = 18)))

fig = go.Figure(
        data = data,
        layout = layout)
py.iplot(fig, filename = 'BoxPlot of Sceloporus jarrovii Terrestrial Righting Times by Treatment.html')



### _Sceloporus virgatus_

[Back to Inferential Analysis](#Inferential-Analysis); [Back to TOC](#Table-of-Contents)

In [12]:
df.loc[df.Species =='sv'].Treatment3.unique()

array(['Intact', '50%', '25%', '75%'], dtype=object)

In [13]:
aovSv = pg.anova(data=df.loc[df.Species=='sv'], dv='TRTminLog', between='Treatment3', detailed=True)
aovSv

Unnamed: 0,Source,SS,DF,MS,F,p-unc,np2
0,Treatment2,1.418,3,0.473,1.996,0.123288,0.084
1,Within,15.397,65,0.237,-,-,-


In [14]:
## need to fix the x-axis labels

SvI = go.Box(y=df.loc[(df.Species=='sv')&(df.Treatment3=='Intact')].TRTmin, name='Intact')
Sv25 = go.Box(y=df.loc[(df.Species=='sv')&(df.Treatment3=='25%')].TRTmin, name='25%')
Sv50 = go.Box(y=df.loc[(df.Species=='sv')&(df.Treatment3=='50%')].TRTmin, name='50%')
Sv75 = go.Box(y=df.loc[(df.Species=='sv')&(df.Treatment3=='75%')].TRTmin, name='75%')
data=[SvI,Sv25,Sv50,Sv75]
layout = go.Layout(
    title = 'BoxPlot of Terrestrial Righting Times by Treatment',
    titlefont = dict(
        size = 20),
    xaxis= dict(
    ),
    yaxis = dict(
        title = 'Righting Times (s)',
        titlefont = dict(
            size = 18)))

fig = go.Figure(
        data = data,
        layout = layout)
py.iplot(fig, filename = 'BoxPlot of Sceloporus virgatus Terrestrial Righting Times by Treatment.html')



## Conclusion

[Back to TOC](#Table-of-Contents)

## Discussion

[Back to TOC](#Table-of-Contents)