# BIOTAS Project Notebook

This BIOTAS project notebook takes the end user through the data import and false positive reduction phases of a telemetry project.  

# Project Set Up
## Import Modules

In [None]:
import os
import sys
import sqlite3
import pandas as pd

Identify the BIOTAS directory and import BIOTAS

In [None]:
sys.path.append(r"C:\Users\knebiolo\OneDrive - Kleinschmidt Associates, Inc\Software\biotas")
import biotas

## Set up Workspaces
First, identify the project directory and database name

In [None]:
proj_dir = r"C:\Users\knebiolo\Desktop\Nuyakuk BIOTAS"
db_name = 'nuya_test.db'

Next, define the paths to the database and the workspaces that we will use later

In [None]:
db_dir = os.path.join(proj_dir,'Data',db_name)
scratch_ws = os.path.join(proj_dir,'Output','Scratch')
input_ws = os.path.join(proj_dir,'Data')
output_ws = os.path.join(proj_dir,'Output')
figure_ws = os.path.join(proj_dir,'Figures')
training_ws = os.path.join(proj_dir,'Data','Training_Files')
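
The workspace paths above are only strings, so a missing folder will not surface until a later cell tries to read or write there. The following is a minimal sketch of a pre-flight check that creates any missing folder. It assumes the folder layout used in the cell above, and `demo_proj_dir` is a hypothetical temporary directory standing in for `proj_dir`. Whether `biotas.createTrainDB` creates these folders itself is not shown in this notebook, so treat this as an optional safeguard.

```python
import os
import tempfile

# Hypothetical project root for illustration; in the notebook this would be proj_dir.
demo_proj_dir = tempfile.mkdtemp()

# Same folder layout as the workspace cell above.
workspaces = [
    os.path.join(demo_proj_dir, 'Data'),
    os.path.join(demo_proj_dir, 'Data', 'Training_Files'),
    os.path.join(demo_proj_dir, 'Output'),
    os.path.join(demo_proj_dir, 'Output', 'Scratch'),
    os.path.join(demo_proj_dir, 'Figures'),
]

# Create each folder if it does not exist yet; existing folders are left alone.
for ws in workspaces:
    os.makedirs(ws, exist_ok=True)

# Report anything that is still missing.
missing = [ws for ws in workspaces if not os.path.isdir(ws)]
print("Missing workspaces:", missing)
```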

## Create BIOTAS Project

**If you are connecting to a previous project, do not run the next cell.**

In [None]:
biotas.createTrainDB(proj_dir, db_name)

## Finish Setting Up your BIOTAS Project

Identify import parameters.  **If you have already created a BIOTAS project, do not run the next two cells**

In [None]:
# number of detections (+/-) in the PDH
det = 5

# duration (minutes) used in noise ratio calculation 
duration = 1

biotas.setAlgorithmParameters(det,duration,db_dir)

Import base data

In [None]:
# import data to Python
tblMasterTag = pd.read_csv(os.path.join(input_ws,'tblMasterTag.csv'))
tblMasterReceiver = pd.read_csv(os.path.join(input_ws,'tblMasterReceiver.csv'))
tblNodes = pd.read_csv(os.path.join(input_ws,'tblNodes.csv'))

# write data to SQLite
biotas.studyDataImport(tblMasterTag,db_dir,'tblMasterTag')
biotas.studyDataImport(tblMasterReceiver,db_dir,'tblMasterReceiver')
biotas.studyDataImport(tblNodes,db_dir,'tblNodes')

# clean up 
del tblMasterTag, tblMasterReceiver, tblNodes
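
Later cells query tblMasterTag by `FreqCode` and `TagType`, so it can help to confirm those columns exist before writing the table to SQLite. The sketch below checks a CSV header against a list of required column names. The helper `missing_columns` and the sample CSV text are hypothetical and are not part of BIOTAS; only the two column names come from the queries in this notebook.

```python
import csv
import io

def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file))
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]

# Hypothetical file contents standing in for tblMasterTag.csv.
sample = io.StringIO("FreqCode,Species\n164.480 25,Sockeye\n")

# FreqCode and TagType are the tblMasterTag columns queried later in this notebook.
absent = missing_columns(sample, ['FreqCode', 'TagType'])
print("Missing columns:", absent)
```

To check a real file, open it with `open(os.path.join(input_ws, 'tblMasterTag.csv'), newline='')` and pass the file object in place of `sample`.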

# Import Raw Telemetry Data and Create Training Data

The following cells import raw telemetry data and train the algorithm, one receiver at a time.  Re-run this section of cells for each receiver in your study.

The first thing we need to do is identify the site name exactly as it appears in tblMasterReceiver and the receiver type.

In [None]:
site = 10
recType = 'orion'

Then, create an antenna-to-receiver dictionary.  Radio telemetry receivers allow for more than one antenna on different channels.  This dictionary identifies the channel number associated with each site.

In [None]:
ant_to_rec_dict = {'1':site}

Next, place raw telemetry data into the 'Training_Files' folder in your BIOTAS project directory and run the following cell.  Note the scanTime and channels arguments.  For Lotek receivers, leave both at their default value of 1.  For Orion receivers, enter the values used in your study if they differ from those below.  Scan time is how long the receiver monitors a channel before switching to the next, while channels is the number of channels or bands the receiver cycles through.

In [None]:
biotas.telemDataImport(site,
                       recType,
                       training_ws,
                       db_dir,
                       scanTime = 1,
                       channels = 1,
                       ant_to_rec_dict = ant_to_rec_dict)
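
To see why scanTime and channels matter, consider how much of the time a scanning receiver actually listens on any one channel. The sketch below is plain arithmetic based on the definitions above and does not show how BIOTAS uses these arguments internally. The helper `scan_cycle` is hypothetical, and the example values are illustrative only, in whatever time unit your receiver reports scan time.

```python
def scan_cycle(scan_time, channels):
    """Return (cycle length, fraction of time spent on one channel).

    scan_time: how long the receiver monitors a channel before switching
    channels:  number of channels the receiver cycles through
    """
    if scan_time <= 0 or channels < 1:
        raise ValueError("scan_time must be positive and channels at least 1")
    cycle = scan_time * channels      # time to visit every channel once
    fraction = scan_time / cycle      # share of the cycle spent on one channel
    return cycle, fraction

# Illustrative values: 2 time units per channel, 3 channels.
cycle, fraction = scan_cycle(scan_time=2, channels=3)
print(cycle, round(fraction, 3))
```

With a single channel the receiver listens continuously, which matches the defaults of 1 used in the cell above.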

The following cell creates a list of unique individuals to iterate over, creates training data objects for each individual, trains the algorithm, and updates the project database.

In [None]:
for i in ant_to_rec_dict:
    conn = sqlite3.connect(db_dir)
    c = conn.cursor()
    sql ='''SELECT tblRaw.FreqCode FROM tblRaw
            LEFT JOIN tblMasterTag ON tblRaw.FreqCode = tblMasterTag.FreqCode
            WHERE recID == '%s'
            AND TagType IS NOT 'Beacon'
            AND TagType IS NOT 'Test';'''%(ant_to_rec_dict[i])
    histories = pd.read_sql_query(sql,con = conn).FreqCode.unique()
    c.close()
    conn.close()

    print ("There are %s fish to iterate through" %(len(histories)))
    print ("Creating training objects for every fish at site %s"%(site))
    
    # create a training data object for each fish and train naive Bayes.
    for j in histories:
        train_dat = biotas.training_data(j,
                                         ant_to_rec_dict[i],
                                         db_dir,
                                         scratch_ws)
        biotas.calc_train_params_map(train_dat)
    print ("Telemetry Parameters Quantified, appending data to project database")
    
    # append data and summarize
    biotas.trainDatAppend(scratch_ws,db_dir)
    train_stats = biotas.training_results(recType,
                                          db_dir,
                                          figure_ws,
                                          ant_to_rec_dict[i])
    # visualize results
    train_stats.train_stats()
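
The query in the cell above combines a `LEFT JOIN` with `IS NOT` filters, which has a consequence worth knowing: a code detected at the receiver but absent from tblMasterTag has a NULL `TagType`, and NULL passes both `IS NOT` tests, so that code is kept. The sketch below reproduces the same query on a small in-memory database to show this. The table and column names come from the notebook, while the rows are made up for illustration. It also binds the receiver ID with a `?` placeholder instead of string formatting, which is an alternative approach and not what the cell above does.

```python
import sqlite3

# In-memory database with made-up rows; table and column names follow the notebook.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblRaw (FreqCode TEXT, recID TEXT);
    CREATE TABLE tblMasterTag (FreqCode TEXT, TagType TEXT);
    INSERT INTO tblRaw VALUES
        ('164.480 25', 'T1'), ('164.480 25', 'T1'),
        ('164.480 30', 'T1'), ('164.480 99', 'T1'),
        ('164.480 25', 'T2');
    INSERT INTO tblMasterTag VALUES
        ('164.480 25', 'Study'), ('164.480 30', 'Beacon');
""")

# Same filter as the training cell, with the receiver ID bound as a parameter.
sql = """SELECT tblRaw.FreqCode FROM tblRaw
         LEFT JOIN tblMasterTag ON tblRaw.FreqCode = tblMasterTag.FreqCode
         WHERE recID = ?
         AND TagType IS NOT 'Beacon'
         AND TagType IS NOT 'Test';"""
rows = conn.execute(sql, ("T1",)).fetchall()
conn.close()

# Unique codes at receiver T1: the beacon is dropped, the unlisted code is kept.
codes = sorted({row[0] for row in rows})
print(codes)
```

Here `164.480 99` is not in tblMasterTag, yet it survives the filter alongside the study tag `164.480 25`.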

# Classify 

The following cells classify a single receiver.  These cells will need to be repeated for each site in your study.

First, identify some important parameters: the site, the receiver type (recType), and the list of receivers to draw training data from for the final classification.

In [None]:
site = 'T5'
recType = 'orion'
rec_list = ['T1','T2','T3','T4','T5']

The next cell creates the à la carte likelihood model.  The end user constructs a classifier by listing the BIOTAS features to include.  Choose from the following list:

conRecLength, consDet, hitRatio, noiseRatio, seriesHit, power, lagDiff

In [None]:
fields = ['conRecLength','hitRatio','power','lagDiff']
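
Feature names are case sensitive strings, so a typo in the list is easy to miss. The sketch below checks a feature list against the seven names given above before any classification is run. The helper `unknown_fields` and the `demo_fields` list are hypothetical and are not part of BIOTAS.

```python
# The seven feature names listed above.
ALLOWED_FIELDS = ('conRecLength', 'consDet', 'hitRatio', 'noiseRatio',
                  'seriesHit', 'power', 'lagDiff')

def unknown_fields(fields):
    """Return the entries of `fields` that are not recognized feature names."""
    return [f for f in fields if f not in ALLOWED_FIELDS]

# Hypothetical list with a deliberate typo ('lagdiff' instead of 'lagDiff').
demo_fields = ['conRecLength', 'hitRatio', 'power', 'lagdiff']
bad = unknown_fields(demo_fields)
print("Unrecognized features:", bad)
```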

Next, indicate whether or not you wish to use an informed prior.

In [None]:
prior = True
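
To illustrate what the prior changes, the sketch below applies Bayes' rule to a single detection with made-up likelihoods. It is a generic naive Bayes illustration, not the BIOTAS implementation: how BIOTAS derives its informed prior is not shown in this notebook, and the function `posterior_true` and all numbers here are assumptions for the example.

```python
def posterior_true(likelihood_true, likelihood_false, prior_true=0.5):
    """Posterior probability that a detection is true, by Bayes' rule.

    likelihood_true:  P(observed features | detection is true)
    likelihood_false: P(observed features | detection is false positive)
    prior_true:       P(detection is true) before looking at the features
    """
    numerator = likelihood_true * prior_true
    evidence = numerator + likelihood_false * (1.0 - prior_true)
    return numerator / evidence

# Uninformed prior: true and false positive are equally likely beforehand.
uninformed = posterior_true(0.02, 0.01)

# Informed prior (assumed value): only 20% of training detections were true.
informed = posterior_true(0.02, 0.01, prior_true=0.2)

print(round(uninformed, 3), round(informed, 3))
```

The same evidence supports a true detection under the uninformed prior but not under the informed one, which is why the choice of prior can change the classification at noisy sites.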

The next cell creates a list of unique IDs to iterate through at this site

In [None]:
# connect to the project database defined in the workspace set up
conn = sqlite3.connect(db_dir)
c = conn.cursor()
# all detections recorded at this site
sql = "SELECT FreqCode FROM tblRaw WHERE recID == '%s';"%(site)
histories = pd.read_sql(sql,con = conn)
# study tags only
tags = pd.read_sql('''SELECT FreqCode, TagType 
                   FROM tblMasterTag 
                   WHERE TagType == 'Study' ''', con = conn)
histories = histories.merge(right = tags, 
                            how = 'left',
                            left_on = 'FreqCode', 
                            right_on = 'FreqCode')
# keep the unique codes that belong to study tags
histories = histories[histories.TagType == 'Study'].FreqCode.unique()
c.close()
conn.close()
print ("There are %s fish to iterate through at site %s" %(len(histories),site))

Then, we create a training dataset for this round of classification

In [None]:
# rec_list is passed by keyword so it is not read as the reclassification iteration
train = biotas.create_training_data(site,
                                    db_dir,
                                    rec_list = rec_list)

Then we perform the initial classification of each unique individual and add data to the project database.

In [None]:
for i in histories:
    class_dat = biotas.classify_data(i,
                                     site,
                                     fields,
                                     db_dir,
                                     scratch_ws,
                                     training_data=train,
                                     informed_prior = prior)
    biotas.calc_class_params_map(class_dat)   
print ("Detections classified!")
biotas.classDatAppend(site, scratch_ws, db_dir)

Visualize results of the initial classification. 

In [None]:
class_stats = biotas.classification_results(recType,
                                            db_dir,
                                            figure_ws,
                                            rec_list=[site])
class_stats.classify_stats()

## Reclassification

If the results of the initial classification are not satisfactory, the following cells perform a reclassification routine.  First, indicate which iteration number you are on.  **Note that the iteration number always starts at 2.**

In [None]:
class_iter = 2

Next, create a training dataset for this round of classification

In [None]:
train = biotas.create_training_data(site,
                                    db_dir,
                                    class_iter,
                                    rec_list)

The next cell iterates over each individual at this site and reclassifies it

In [None]:
for i in histories:
    class_dat = biotas.classify_data(i,
                                     site,
                                     fields,
                                     db_dir,
                                     scratch_ws,
                                     train,
                                     informed_prior = prior,
                                     reclass_iter=class_iter)
    biotas.classify(class_dat)
print ("Detections classified!") 
biotas.classDatAppend(site,scratch_ws,db_dir,reclass_iter = class_iter)

Now, visualize the results of the reclassification.

In [None]:
class_stats = biotas.classification_results(recType,
                                            db_dir,
                                            figure_ws,
                                            rec_list=[site])
class_stats.classify_stats()