# README - Calibration Chamber
* This python Notebook is a first-draft script to use with the calibration chamber to test and compare AudioMoths
* The basic idea is that to test how well a recorder is functioning (for example, after being a bit beat up in the field by water or other factors), we want to be able to play the same audio clip for each AudioMoth and examine the recorded files. Comparing them to some standard will allow us to decide if the Moth is recording well enough to go back into the field.
* Ultimately, this should be a semi-automated process: Audiomoths are placed into chamber and test sound is played after relevant information about them is recorded (tasks that must be done by a lab member), but the software should handle the data analysis and output. Eventually, it would be nice to have it save the calibration data somewhere like a big Google Drive document with all of the recorders.

## Instructions for use:
* Configure 1-12 AudioMoths (AMs) on Medium Gain, 32kHz Sample Rate, 0s Sleep Duration, and 3600s (1 hour) Recording Duration using the AudioMoth software.
* Record SD card-AM relationships and any notes in the file *moths_toTest.csv*
* Set all AMs to Default mode and place in chamber with microphones facing up. Close flap and play test recording from MP3 player starting at a known time (write it down!)
  * What I used to note the universal time: https://time.is/UTC
* After recording is finishied (~40s), remove AMs and set all to USB/OFF mode, then remove SD cards and mount cards to your Mac.
* Update the *csv_name* and *playback_start* variables at the bottom of this script, and then run the final panel of the Notebook (shift-enter) to read SD cards, grab relevant parts of the audio info, and run and print analytics. Data is also saved to *csv_name* for posterity, so make sure to name in a descriptive way if that's data we care to keep.
  * (You might have to run the above panels first to load the functions on a new computer that hasn't run the script before)
  
## Tasks to work on moving forward (in no particular order):
* Improve this script or the README for readability/usability, if necessary
* Improve the physical chamber, if necessary
* Examine recordings from "good" and "bad" Moths, and determine what kinds of analyses we can automate to output a reliable guess about the quality of the Moth
  * ie median volume, the quietest sounds they can pick up, if there's any pops/echoes/ringing/screeching sounds in certain frequencies -- how do we tell how a/functional a Moth is?
  * trial and error, looking at spectrograms, editing the test audio clip (changing the volume output, actual sounds that make it up, the spacing of sounds,..?), and examining different subsets of recordings will probably be necessary to nail this down
* Run many new AudioMoths through the chamber to get a baseline of what a "good" Moth's recordings are like, and save the output data in a way that is easy to compare to unknown-quality Moths that will be tested in the future when they return from the field.
  * Note that it is easy to read and write a lot of information to CSV files using Python. It might make sense for this task to just save as much information as possible so we have a larger set to compare future files to.
  * This task could be done before/after/or in conjuction with the last task to meet both goals.


In [35]:
import pandas as pd
from tinytag import TinyTag
import os
import re
import scipy.io.wavfile as wave
import math
from statistics import mean, median


### UTC_timeDiff(a,b)
* Takes two colon-separated UTC time stamp strings, and returns the difference (a - b) in seconds
* (ie a = '17:46:53', b = '17:45:11' ->> 102 seconds)

In [36]:
def UTC_timeDiff(a, b): 
    a_hour = a[0:2]
    a_minute = a[3:5]
    a_second = a[6:]
    
    a_parts = list(map(int, a.split(':')))
    b_parts = list(map(int, b.split(':')))

#     for i in range(len(a_parts)):
#         print(int(a_parts[i]))
#         print(int(b_parts[i]))
        
    s_diff = a_parts[2] - b_parts[2]
    m_diff = a_parts[1] - b_parts[1]
    h_diff = a_parts[0] - b_parts[0]

    difference_in_seconds = 60*60*h_diff + 60*m_diff + s_diff
    return(difference_in_seconds)

### splitCalls(file, t0)
* Takes a AM-recorded file and an offset time t0 (how long after the recording begins that the test clip begins),
* Returns three matrices of the sound data containing information about each of the three different bird types


In [37]:
def splitCalls(file, t0):
    sample_rate, data = wave.read(file)
    
# TODO: switch to DataFrame implementation for efficiency, at some point    
#     birds = pd.DataFrame([['chickadee', 187, 7784], ['wren', 8740, 20879],['pileated', 22053, 30390]],
#                          columns=['Bird', 'start', 'end']) #original test clip timings
    
    birds = {"chickadee": [0, 8518], # bounding times (in ms) in original file
             "wren": [9960, 22484],
             "pileated": [24597, 33074]}

    for bird in birds.values():
        for i in range(2):
            bird[i] = (bird[i]+t0)

    BCCH = data[math.floor(birds["chickadee"][0]): math.ceil(birds["chickadee"][1]+1)]
    CAWR = data[math.floor(birds["wren"][0]): math.ceil(birds["wren"][1]+1)]
    PIWO = data[math.floor(birds["pileated"][0]): math.ceil(birds["pileated"][1] +1)]
    
    return BCCH, CAWR, PIWO #BCCH = black-capped chickadee, CAWR = carolina wren, PIWO = pileated woodpecker

# getSpecs(file,t0):
* Split a recorded file down to just the relevant information and run some analytics on it.
* So far I just take the mean and max (but only use the mean) of each birdcall segment.
* We will probably want to come up with some new metrics to compare recording quality moving forward.

In [38]:
def getSpecs(file, t0):
    BCCH, CAWR, PIWO = splitCalls(file, t0)
    
#     print(BCCH, CAWR, PIWO)
    BCCH_max = round(max(BCCH),3)
    BCCH_mean = round(mean(map(abs,BCCH)),3)
    CAWR_max = round(max(CAWR),3)
    CAWR_mean = round(mean(map(abs,CAWR)),3)
    PIWO_max = round(max(PIWO),3)
    PIWO_mean = round(mean(map(abs,PIWO)),3)
    
    return [BCCH_max, BCCH_mean, CAWR_max, CAWR_mean, PIWO_max, PIWO_mean] 

### calibrate(sd_prefix, sd_mount, playback_start, csv_name)
* loads .wav file recordings from SD cards, calculates delay based on user-input playback time, and builds a DataFrame with relevant information.

In [61]:
def calibrate(sd_prefix, sd_mount, playback_start, csv_name):
    moth_info = pd.read_csv('moths_toTest.csv', dtype=str)
    moth_info['SD'] = 'MSD-' + moth_info['SD'] #make df values match SD naming scheme
    print("Moth_info: ")
    print(moth_info)

    calib_output = pd.DataFrame(columns=['AM', 'SD', 'BCCH_mean', 'CAWR_mean', 'PIWO_mean', 'notes'])

    disks = os.listdir(path=sd_mount)
    print(disks)
    count = 0

    for disk in disks:
        if bool(re.search(sd_prefix, disk)):  # iterate thru SD cards
            print(f'\n{disk}')

            files = os.listdir(path=f'{sd_mount}/{disk}')
            for file in files:
                if not file.startswith('.'): #ignore hidden files
                    f = TinyTag.get(f'{sd_mount}/{disk}/{file}')
                    recording_start = f.comment[12:20]
                    delay = UTC_timeDiff(playback_start, recording_start)
                    print(f'{playback_start} - {recording_start} = {delay}\n')                
                    BCCH_max, BCCH_mean, CAWR_max, CAWR_mean, PIWO_max, PIWO_mean = getSpecs(f'{sd_mount}/{disk}/{file}', 1000*delay)
                    print(moth_info[moth_info['SD'] == disk])
                    thismoth = moth_info[moth_info['SD'] == disk]
                    print(thismoth)
                    am = thismoth['AM'].iloc[0]
                    notes = thismoth['notes'].iloc[0]

                    calib_output.loc[count] = [f'M10-{am}', disk, BCCH_mean, CAWR_mean, PIWO_mean, notes]
        count = count + 1
    calib_output.to_csv(csv_name)
    print(calib_output)
                

### Running the script
* Update *csv_name* and *playback_start* variables, then run this panel to run the calibration and print out results

In [63]:
sd_prefix = 'MSD'
sd_mount = '/Volumes'

csv_name = "05-07-19-calib_output.csv"
playback_start = '18:17:30'#'18:17:15'      

calibrate(sd_prefix, sd_mount, playback_start, csv_name)


Moth_info: 
     AM        SD      notes
0  0390  MSD-0369  To Deploy
['lacie', 'seagate2', 'seagate3', 'Macintosh HD', 'MSD-0369', 'seagate1']

MSD-0369
18:17:30 - 18:17:01 = 29

     AM        SD      notes
0  0390  MSD-0369  To Deploy
     AM        SD      notes
0  0390  MSD-0369  To Deploy
         AM        SD BCCH_mean CAWR_mean PIWO_mean      notes
4  M10-0390  MSD-0369      3374      2405      3353  To Deploy
