# hybrid deep finch

### Part 1: Train classifiers with an increasing number of samples to estimate accuracy
1. In the cell below, replace the dummy directory name with the name of the directory produced by running 'train_deepfinch.m' in Matlab.
**This directory should contain 'deepfinch_train_results' in the name. For example:**
```python
TRAIN_RESULTS_DIR = os.path.normpath("C:\\DATA\\bk40bl61_prelesion\\deepfinch_train_results_11-23-16_15-46")
```
2. Also change `LABELSET` to the syllable labels you used for the bird of interest.
3. Use the arrow keys to highlight the cell (when highlighted, it will have a green box around it), then press 'Ctrl-Enter' to run it.
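
Before launching the long run, the two settings edited in steps 1 and 2 can be sanity-checked. The sketch below is not part of the notebook: `check_settings` is a hypothetical helper, and it assumes each syllable label is a single character, as the `LABELSET` string in this notebook suggests. It uses `ntpath` (the Windows flavor of `os.path`) so the result is the same on any operating system.

```python
import ntpath  # Windows flavor of os.path; behaves the same on every OS


def check_settings(train_results_dir, labelset):
    """Hypothetical pre-flight check for the settings edited in steps 1 and 2.

    Returns the normalized directory path, or raises ValueError.
    """
    path = ntpath.normpath(train_results_dir)
    # The directory name should mention 'deepfinch_train_results' (see step 1).
    if 'deepfinch_train_results' not in ntpath.basename(path):
        raise ValueError("directory name lacks 'deepfinch_train_results': " + path)
    # Assumption: one character per syllable label, so repeats are likely typos.
    duplicates = sorted({c for c in labelset if labelset.count(c) > 1})
    if duplicates:
        raise ValueError("duplicate labels in LABELSET: " + "".join(duplicates))
    return path


# Forward slashes are converted to backslashes by the Windows normpath.
print(check_settings("C:/DATA/bk40bl61_prelesion/deepfinch_train_results_11-23-16_15-46",
                     "abcdefghijklmnqsvz"))
```

If either check fails, the `ValueError` message names the setting to fix.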

Note: the script takes a while to run. It is probably best to start it at the end of the day and let it run overnight on a computer you are not using for data collection.

In [None]:
import os
#os.path.normpath cleans up the slashes in each path.
#On Windows, either escape backslashes as below or use forward slashes;
#normpath converts forward slashes to backslashes there.
HOME_DIR = os.path.normpath("C:\\Users\\jnmcgre\\Documents\\hybrid-deep-finch")
TRAIN_RESULTS_DIR = os.path.normpath("C:\\DATA\\bk40bl61_prelesion\\deepfinch_train_results_11-23-16_15-46")
LABELSET = 'abcdefghijklmnqsvz'

In [None]:
%cd $HOME_DIR
%run test_svmrbf_knn $TRAIN_RESULTS_DIR $LABELSET

### Part 2: Plot estimated accuracy. Decide which classifier you prefer.
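
The plot in this part reduces per-replicate accuracies to a mean and standard deviation at each training-set size. The sketch that follows shows that reduction and one simple way to compare the two classifiers. It assumes the accuracy arrays are shaped (replicates, training-set sizes), as the `[:, ind]` indexing in the plotting cell suggests. The numbers are made up for illustration, and `summarize` is a hypothetical helper, not a function from `test_svmrbf_knn`.

```python
import numpy as np


def summarize(acc):
    """Mean and std. dev. across replicates (rows) for each training-set size."""
    acc = np.asarray(acc, dtype=float)
    # np.std defaults to the population standard deviation (ddof=0).
    return acc.mean(axis=0), acc.std(axis=0)


# Made-up accuracies (%): 3 replicates (rows) x 2 training-set sizes (columns).
num_samples = [100, 200]
svm_acc = [[90.0, 95.0], [92.0, 96.0], [91.0, 97.0]]
knn_acc = [[88.0, 97.0], [89.0, 98.0], [90.0, 99.0]]

svm_mn, svm_std = summarize(svm_acc)
knn_mn, knn_std = summarize(knn_acc)

# Report which classifier has the higher mean accuracy at each size.
for n, s, k in zip(num_samples, svm_mn, knn_mn):
    better = 'SVM-RBF' if s > k else 'k-NN'
    print(n, 'samples:', better, 'has the higher mean accuracy')
```

When the means are close, the spread across replicates is worth weighing as well, since a classifier with a smaller standard deviation is less sensitive to which samples happened to be drawn.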

In [None]:
%cd $HOME_DIR
#plot graph
import shelve
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

#from annote_funcs import AnnoteFinder

#build the path with os.path.join so the separator matches the operating system
results_shelve = os.path.join(TRAIN_RESULTS_DIR, 'train', 'svmrbf_knn_results_summary.db')
with shelve.open(results_shelve) as results:
    NUM_SAMPLES_TO_TRAIN_WITH = results['NUM_SAMPLES_TO_TRAIN_WITH']
    REPLICATES = results['REPLICATES']
    svm_test_avg_acc_mn = results['svm_test_avg_acc_mn']
    svm_test_avg_acc_std = results['svm_test_avg_acc_std']
    svm_test_avg_acc = results['svm_test_avg_acc']
    knn_test_avg_acc_mn = results['knn_test_avg_acc_mn']
    knn_test_avg_acc_std = results['knn_test_avg_acc_std']
    knn_test_avg_acc = results['knn_test_avg_acc']
    annotes = results['annotes'] #annotations, shown when hovering over data points
    
#to pre-allocate arrays below, need:
rows = len(NUM_SAMPLES_TO_TRAIN_WITH)
cols = len(REPLICATES)

fig, (ax1, ax2, ax3) = plt.subplots(3, sharex=True, sharey=True)
fig.set_size_inches(14,10)

#plot svm and k-NN as lines on one axis for direct comparison between the two
ax1.errorbar(NUM_SAMPLES_TO_TRAIN_WITH,
             svm_test_avg_acc_mn,
             yerr=svm_test_avg_acc_std,
             fmt='-k',label='SVM-RBF')
ax1.errorbar(NUM_SAMPLES_TO_TRAIN_WITH,
             knn_test_avg_acc_mn,
             yerr=knn_test_avg_acc_std,
             fmt='-b',label='k-NN')
#ax.set_ylim([80,100])
ax1.set_xticks(NUM_SAMPLES_TO_TRAIN_WITH)
ax1.set_xlabel('number of samples used')
ax1.set_ylabel('Average accuracy across labels\n (mean and std. dev.)')
ax1.legend(loc=4)

#then plot data points on separate axes so user can hover over them
#and get replicate + number of samples corresponding to best score
ax2_xvals = []
for ind,x_tick in enumerate(NUM_SAMPLES_TO_TRAIN_WITH):
    y = svm_test_avg_acc[:,ind]
    #jitter x around the tick so replicates don't overlap; keep the jittered
    #values for the call to AnnoteFinder
    #TODO: make sure the jitter is random without replacement
    x = np.random.normal(x_tick,0.08, size=len(y))
    ax2_xvals.append(x.tolist())
    ax2.plot(x, y, 'k.', alpha=0.3, markersize=8)
ax2_xvals = np.asarray(ax2_xvals)
ax2.set_xticks(NUM_SAMPLES_TO_TRAIN_WITH)
ax2.set_xlabel('number of samples used')
ax2.set_ylabel('Average accuracy across labels')

ax3_xvals = []
for ind,x_tick in enumerate(NUM_SAMPLES_TO_TRAIN_WITH):
    y = knn_test_avg_acc[:,ind]
    x = np.random.normal(x_tick,0.08, size=len(y))
    ax3_xvals.append(x.tolist())
    ax3.plot(x, y, 'b.', alpha=0.3, markersize=8)
ax3_xvals = np.asarray(ax3_xvals)
ax3.set_xticks(NUM_SAMPLES_TO_TRAIN_WITH)
ax3.set_xlabel('number of samples used')
ax3.set_ylabel('Average accuracy across labels')  

In [None]:
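#AnnoteFinder comes from this project's annote_funcs module (the import above is commented out)
from annote_funcs import AnnoteFinder

class connect_axis(object):
    """Pass each mouse click to the AnnoteFinder of the axes under the cursor."""
    def __init__(self,ax2,ax3,ax2_af,ax3_af):
        self.ax2 = ax2
        self.ax3 = ax3
        self.ax2_af = ax2_af
        self.ax3_af = ax3_af

    def __call__(self,event):
        if event.inaxes==self.ax2:
            self.ax2_af(event)
        elif event.inaxes==self.ax3:
            self.ax3_af(event)

#flatten so each jittered x pairs with its accuracy;
#assumes annotes is ordered the same way and that AnnoteFinder instances can be called with an event
ax2_af = AnnoteFinder(ax2_xvals.ravel(), svm_test_avg_acc.T.ravel(), annotes, ax=ax2)
ax3_af = AnnoteFinder(ax3_xvals.ravel(), knn_test_avg_acc.T.ravel(), annotes, ax=ax3)
#clicks only register with an interactive backend, not with %matplotlib inline
fig.canvas.mpl_connect('button_press_event', connect_axis(ax2,ax3,ax2_af,ax3_af))
plt.show()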

### Part 3: Classify unlabeled syllables

1. Before running the script in the cell below, you must run 'classify_deepfinch.m' in Matlab. That script generates 'autolabel .not.mat' files for each song with unlabeled syllables, and saves a list of the directories containing songs to autolabel in the directory produced by 'train_deepfinch.m'.
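
Because the cell below depends on output from 'classify_deepfinch.m', it can help to confirm that the expected files exist first. This is a minimal sketch, assuming only that the generated files end in `.not.mat` as described above. `find_notmat_files` is a hypothetical helper, and the directory passed to it is whichever song directory you want to check.

```python
import os


def find_notmat_files(song_dir):
    """Return sorted names of files in song_dir that end with '.not.mat'."""
    return sorted(name for name in os.listdir(song_dir)
                  if name.endswith('.not.mat')
                  and os.path.isfile(os.path.join(song_dir, name)))


# Check the current directory; replace '.' with a song directory of your own.
found = find_notmat_files('.')
print(len(found), "'.not.mat' file(s) found")
```

An empty result for a directory that should contain autolabeled songs suggests the Matlab step has not been run there yet.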

In [None]:
%cd $HOME_DIR
%run classify $TRAIN_DIR

In [None]:
TRAIN_DIR