# AudioMoth WAV File Processing

This notebook generates frequency and spectrogram charts from AudioMoth WAV files and saves them as JPEG files for further analysis.
It can also perform basic classification: for each configured frequency range and amplitude threshold, a file is moved into
the matching subdirectory when the amplitude in that frequency range exceeds the threshold.
During processing, the frequency and spectrogram plots are also output into the notebook.

The overall purpose is to find the files of interest with strong evidence of bat passes or birds, so that a subset of a large
collection of files can be analysed further. For bats, for example, this can be used to choose a set of files to upload to the BTO Acoustic Pipeline,
saving bandwidth and system resources compared with submitting the much larger full set.
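
The classification described above rests on one measurement: the largest spectral magnitude found inside a frequency band. The sketch below illustrates that idea with a naive DFT from the standard library. It is not the code in `utilities`: the function name `band_peak`, the normalisation by the sample count, and the synthetic 45 kHz tone are assumptions made for illustration, so the magnitudes are not on the same scale as the thresholds used in this notebook. Real processing would normally use an FFT library on the full recording.

```python
import cmath
import math

def band_peak(samples, rate, f_lo, f_hi):
    """Return (peak_magnitude, peak_frequency_hz) for bins within [f_lo, f_hi].

    Hypothetical helper: a naive DFT, suitable only for short signals.
    """
    n = len(samples)
    best_mag, best_freq = 0.0, None
    for k in range(n // 2 + 1):
        freq = k * rate / n  # centre frequency of bin k
        if freq < f_lo or freq > f_hi:
            continue
        # DFT coefficient for bin k, normalised by the number of samples
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        mag = abs(coeff) / n
        if mag > best_mag:
            best_mag, best_freq = mag, freq
    return best_mag, best_freq

# Synthetic 45 kHz tone sampled at 192 kHz (assumed values for the demo)
rate = 192000
n = 384
tone = [1000 * math.sin(2 * math.pi * 45000 * i / rate) for i in range(n)]

bat_peak, bat_freq = band_peak(tone, rate, 40000, 60000)
bird_peak, _ = band_peak(tone, rate, 1000, 12000)
print('Bat band peak:', round(bat_peak, 1), 'at', bat_freq, 'Hz')
print('Bird band peak:', round(bird_peak, 1))
```

The tone produces a strong peak in the bat band and almost nothing in the bird band, which is the contrast the thresholds are meant to detect.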

## How to use
- Set required frequency ranges / thresholds and run the config section
- Set the file folder path and / or run the folder path section and choose with the file browser
- Run the processing section and wait for completion
- Processed files will be moved to sub-folders with JPEG files for frequency / spectral analysis alongside
- Make sure the drive has sufficient storage for the generated JPEG files, up to 1 MB per WAV file processed
- For unclassified files, change thresholds and re-run on those as required
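
The storage requirement in the steps above can be checked before starting a large batch. This is a minimal sketch, assuming the upper bound of 1 MB of JPEG output per WAV file quoted above. The helper names `estimated_chart_bytes` and `has_room_for_charts` are hypothetical and are not part of `utilities`.

```python
import shutil

# Assumption: up to 1 MB of JPEG output per WAV file, as noted above
BYTES_PER_WAV = 1024 * 1024

def estimated_chart_bytes(wav_count, bytes_per_wav=BYTES_PER_WAV):
    """Worst-case space needed for the generated charts."""
    return wav_count * bytes_per_wav

def has_room_for_charts(folder, wav_count):
    """True if the drive holding `folder` has enough free space."""
    free = shutil.disk_usage(folder).free
    return free >= estimated_chart_bytes(wav_count)

needed = estimated_chart_bytes(500)
print('Worst case for 500 files (bytes):', needed)
print('Enough room on current drive:', has_room_for_charts('.', 500))
```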

## Frequency Ranges and Thresholds for Classification

- AudioMoth WAV files will be moved into classification folders for bats, birds, etc.
- The 'freqRanges' entry name is used for the folder
- The 'freqRanges' items are processed in order; the first matching classification wins!
- Unclassified files are moved to a subfolder "unclassified"
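
The "first classification wins" rule can be sketched as follows. This is an illustration only: `classify` and `example_ranges` are hypothetical names, and the per-range peak values are supplied directly rather than measured from a WAV file. It relies on Python dictionaries preserving insertion order (Python 3.7+).

```python
def classify(band_peaks, freq_ranges):
    """Return the name of the first range whose peak exceeds its threshold."""
    for name, (_start_hz, _end_hz, threshold) in freq_ranges.items():
        if band_peaks.get(name, 0) > threshold:
            return name
    return 'unclassified'

example_ranges = {
    'bats': [40000, 60000, 400],
    'birds': [1000, 12000, 100],
    'birds_quiet': [1000, 12000, 65],
}

# Exceeds both the bat and bird thresholds: the earlier entry wins
loud_both = classify({'bats': 450, 'birds': 300, 'birds_quiet': 300}, example_ranges)
# Too quiet for 'birds' but above the 'birds_quiet' threshold
quiet_bird = classify({'bats': 10, 'birds': 80, 'birds_quiet': 80}, example_ranges)
# Below every threshold
silent = classify({'bats': 10, 'birds': 20, 'birds_quiet': 20}, example_ranges)
print(loud_both, quiet_bird, silent)
```

Because order matters, list the most specific or highest-priority ranges first and the more permissive ones (such as a lower-threshold variant of the same band) afterwards.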

In [None]:
freqRanges = {
    # Bats 40 kHz to 60 kHz, threshold 400
    'bats': [40000,60000,400],

    # Birds 1 kHz to 12 kHz, threshold 100
    'birds':[1000,12000,100],

    # Quiet birds 1 kHz to 12 kHz, threshold 65
    'birds_quiet':[1000,12000,65]
}
print('Frequency ranges and thresholds: ', freqRanges)

# Limit the number of files whose spectrogram / frequency charts are shown in the notebook, to avoid overloading it with a large batch
max_files_to_plot = 20

## Set or Choose Folder Path

Run this block to choose a path via the browse button, or set it directly.

Note: Processing will create sub-folders in the path where the WAV files reside and move the files around. Keep a master / backup copy!
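
One way to keep the backup copy mentioned in the note is to duplicate the folder before processing. This is a minimal sketch using `shutil.copytree`. The helper `backup_recordings`, the timestamped folder name, and the temporary demo folders are assumptions for illustration and are not part of `utilities`.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def backup_recordings(source_dir, backup_root):
    """Copy the whole recordings folder into a new timestamped folder."""
    source = Path(source_dir)
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    target = Path(backup_root) / f'{source.name}_backup_{stamp}'
    shutil.copytree(source, target)  # creates target; fails if it already exists
    return target

# Demonstration on a throwaway folder rather than real recordings
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'recordings'
    src.mkdir()
    (src / '20200101_000000.WAV').write_bytes(b'RIFF')
    copy = backup_recordings(src, Path(tmp) / 'backups')
    copied_names = sorted(p.name for p in copy.iterdir())
    print('Backed up:', copied_names)
```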

In [None]:
# Shared code
from utilities import *

# ** Set path to files folder here **
path = '/Users/bill/Documents/AudioMoth/test/'

# Browse for path 
file_dialog = browse_for_path(path)

## File Processing

Main script that processes all AudioMoth files. Execute this after setting up the classification config and the file path.

In [None]:
# Main processing script...
import logging
import os
import shutil

# Shared code
from utilities import *

# Update path from the dialog; keep the path set above if nothing was chosen
path = file_dialog.selected or path

# Put plots inline in the notebook
%matplotlib inline

# Will be initialised from freqRanges during processing
freqRangeOfInterest = []
fileOfInterestThreshold = 300.0

# Set up logging to a file in the path
initialise_logging(path)

# Get the files to process
files = find_and_sort_files_on_path(path)

# Stats counter
stats = initialise_processing_stats(files)

# Create candidates dirs and stats for each freq range
for freqRangeKey in freqRanges:
    # dirs
    candidate_dir = os.path.join(path, str(freqRangeKey))
    if not os.path.isdir(candidate_dir):
        print('Creating candidate dir: ', candidate_dir)
        os.mkdir(candidate_dir)

    # Stats init
    stats[str(freqRangeKey)] = 0
    stats['Max'+str(freqRangeKey)] = 0
    stats['Min'+str(freqRangeKey)] = 0
    stats['FreqMax'+str(freqRangeKey)] = 0
        
# Create unclassified dir. Unclassified files are moved into here.
unclassified_dir = os.path.join(path, 'unclassified')
if not os.path.isdir(unclassified_dir):
    print('Creating unclassified files dir: ', unclassified_dir)
    os.mkdir(unclassified_dir)

# Iterate all the found WAV files in the directory
do_plot = True
for index, singfile in enumerate(files):

    # Limit number of files plotted in the notebook
    if index >= max_files_to_plot:
        do_plot = False
         
    # Full path to the file
    file_full_path = os.path.join(path, singfile)
    
    # Spectrogram and frequency chart file paths
    spectrogram_path = file_full_path + '_spec.jpg'
    freq_chart_path = file_full_path + '_freq.jpg'
    
    # Get comment and recorded time, voltage, temp
    comment, recorded_time, battery, temperature = get_comment_and_data(file_full_path)

    # Update stats on temp / voltage
    if float(temperature) > stats['MaxTemp']:
        stats['MaxTemp'] = float(temperature)
    if float(temperature) < stats['MinTemp'] or stats['MinTemp'] == 0.0:
        stats['MinTemp'] = float(temperature)
    if float(battery) > stats['MaxVolts']:
        stats['MaxVolts'] = float(battery)
    if float(battery) < stats['MinVolts'] or stats['MinVolts'] == 0.0:
        stats['MinVolts'] = float(battery)

    # Print some useful info about the files...
    print('--------------------------------------------------------------------------------------------------')
    log_and_print('File: '+file_full_path)
    log_and_print('Spectrogram:'+spectrogram_path)
    log_and_print('Details: '+comment)
    log_and_print('Recorded time: '+recorded_time)
    log_and_print('Battery voltage: '+battery)
    log_and_print('Temperature: '+temperature)
    
    # Get the WAV file data and sample rate
    data, rate = get_wav_info(file_full_path)
    log_and_print('Sample rate: '+str(rate))

    # Plot the spectrogram. Ignore the returned item.. 
    plot_spectrogram(data, rate, singfile, recorded_time, do_plot, spectrogram_path)   

    # Now classify into folders according to frequency ranges configured
    fileClassified = False
    doFreqPlot = True
    for freqRangeKey in freqRanges:

        # May already be classified...
        if not fileClassified:
            log_and_print('** Checking for : '+str(freqRangeKey))

            freqRangeOfInterest = [freqRanges[freqRangeKey][0],freqRanges[freqRangeKey][1]]
            fileOfInterestThreshold = freqRanges[freqRangeKey][2]

            log_and_print('Start frequency : '+str(freqRangeOfInterest[0]))
            log_and_print('End frequency : '+str(freqRangeOfInterest[1]))
            log_and_print('Threshold : '+str(fileOfInterestThreshold))
        
            # Get frequencies and plot if not done
            max_freq, max_freq_val, finterest_max_val, max_freq_range = plot_get_freqs(data, rate, singfile, recorded_time, (doFreqPlot and do_plot), freqRangeOfInterest[0], freqRangeOfInterest[1], freq_chart_path)
            
            # Done freq plot so turn that off for next check..
            doFreqPlot = False
            
            # Log some stats
            log_and_print('Max freq (kHz): '+str(max_freq/1000))
            log_and_print('Max freq value: '+str(max_freq_val))
            log_and_print('Freq of interest range (kHz): '+str(freqRangeOfInterest[0]/1000)+','+str(freqRangeOfInterest[1]/1000))
            log_and_print('Max value in range of interest: '+str(finterest_max_val))
            log_and_print('Max frequency in range of interest (kHz): '+str(max_freq_range/1000))
            stats['FreqMax'+str(freqRangeKey)] = round(max_freq_range/1000,2)
            
            if finterest_max_val > fileOfInterestThreshold:
                log_and_print('** File: '+singfile+' looks interesting for '+str(freqRangeKey)+'!! Moving to candidates dir: '+str(freqRangeKey))
                logging.info('CANDIDATE file: '+singfile)
                candidate_dir = os.path.join(path, str(freqRangeKey))
                shutil.move(file_full_path, os.path.join(candidate_dir, singfile))
                shutil.move(spectrogram_path, os.path.join(candidate_dir, singfile)+'_spec.jpg')
                shutil.move(freq_chart_path, os.path.join(candidate_dir, singfile)+'_freq.jpg')
                stats[str(freqRangeKey)]+=1

                if finterest_max_val > stats['Max'+str(freqRangeKey)]:
                    stats['Max'+str(freqRangeKey)] = finterest_max_val

                if finterest_max_val < stats['Min'+str(freqRangeKey)] or stats['Min'+str(freqRangeKey)] == 0:
                    stats['Min'+str(freqRangeKey)] = finterest_max_val 

                fileClassified = True

    # If the file was not classified, move it into the unclassified dir
    if not fileClassified:
        print('** File: '+singfile+' moving to UNCLASSIFIED dir.') 
        logging.info('Non-candidate file: '+singfile)
        shutil.move(file_full_path, os.path.join(unclassified_dir, singfile))
        shutil.move(spectrogram_path, os.path.join(unclassified_dir, singfile)+'_spec.jpg')
        shutil.move(freq_chart_path, os.path.join(unclassified_dir, singfile)+'_freq.jpg')
        stats['Unclassified']+=1

        # Note: finterest_max_val holds the value from the last frequency range checked
        if finterest_max_val > stats['MaxUnclassified']:
            stats['MaxUnclassified'] = finterest_max_val

        if finterest_max_val < stats['MinUnclassified'] or stats['MinUnclassified'] == 0:
            stats['MinUnclassified'] = finterest_max_val   

# End of processing 
log_and_print(stats)
logging.shutdown()    
print("** Finished **")
        
        