# Creating a dataset

First, we need a dataset of labeled programs that Autoplan can classify. We'll use an example Java dataset of FizzBuzz implementations. In this case, we consider two program structures:
* **Separate**, where the "FizzBuzz" case is handled by its own branch, distinct from "Fizz" and "Buzz", and
* **Combined**, where the "FizzBuzz" output is produced by combining the "Fizz" and "Buzz" cases.

I wrote six programs (three of each kind) below. We define a `programs` array with the Java source, and a `labels` array with the category of each program.

In [1]:
from autoplan.labels import Labels

class FizzbuzzLabels(Labels):
    Separate = 0
    Combined = 1
    
programs = [
    '''
class Main {    
    public static void main(String[] args) {
      for (int i = 0; i < 100; ++i) {
        if (i % 15 == 0) {
          System.out.println("FizzBuzz");
        } else if (i % 3 == 0) {
          System.out.println("Fizz");
        } else if (i % 5 == 0) {
          System.out.println("Buzz");
        }
      }
    }
}
''',
    
    '''
class Main {    
    public static void main(String[] args) {
      int i = 0; 
      while (i < 100) {
        if (i % 15 == 0) {
          System.out.println("FizzBuzz");
        } else if (i % 3 == 0) {
          System.out.println("Fizz");
        } else if (i % 5 == 0) {
          System.out.println("Buzz");
        }
        ++i;
      }
    }
}
''',   
    
        
    '''
class Main {    
    public static void main(String[] args) {
      int i = 0; 
      while (true) {
        if (i % 15 == 0) {
          System.out.println("FizzBuzz");
        } else if (i % 3 == 0) {
          System.out.println("Fizz");
        } else if (i % 5 == 0) {
          System.out.println("Buzz");
        }
        ++i;
        if (i >= 100) { break; }
      }
    }
}
''',
    
    '''
class Main {    
    public static void main(String[] args) {
      for (int i = 0; i < 100; ++i) {
        if (i % 3 == 0) {
          System.out.print("Fizz");
        }  
        if (i % 5 == 0) {
          System.out.print("Buzz");
        }
        if (i % 3 == 0 || i % 5 == 0) {
          System.out.print("\n");
        }
      }
    }
}
''',
    
    '''
class Main {    
    public static void main(String[] args) {
      for (int k = 0; k < 100; ++k) {
        int mod3 = k % 3;
        int mod5 = k % 5;
        if (mod3 == 0) {
          System.out.print("Fizz");
        }  
        if (mod5 == 0) {
          System.out.print("Buzz");
        }
        if (mod3 == 0 || mod5 == 0) {
          System.out.print("\n");
        }
      }
    }
}
''',
    
   '''
class Main {    
    public static void main(String[] args) {
      int i = 0;
      while (i < 100) {
        if (i % 3 == 0 || i % 5 == 0) {
          if (i % 3 == 0) {
            System.out.print("Fizz");
          }
          if (i % 5 == 0) {
            System.out.print("Buzz");        
          }
          System.out.print("\n");
        }  
        i += 1;
      }
    }
}
'''
]

labels = [
    FizzbuzzLabels.Separate, FizzbuzzLabels.Separate, FizzbuzzLabels.Separate, 
    FizzbuzzLabels.Combined, FizzbuzzLabels.Combined, FizzbuzzLabels.Combined
]
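
Because `programs` and `labels` are parallel lists, it is easy to get them out of step when adding examples. The following is a minimal sketch of a sanity check, not part of Autoplan: `DemoLabels`, `demo_programs`, and `check_dataset` are hypothetical stand-ins so that the snippet runs on its own, and the placeholder strings take the place of real Java sources.

```python
from collections import Counter
from enum import Enum


class DemoLabels(Enum):
    # Hypothetical stand-in for FizzbuzzLabels so the sketch is self-contained.
    Separate = 0
    Combined = 1


# Placeholder sources; in the notebook these would be the Java programs.
demo_programs = ['// program %d' % i for i in range(6)]
demo_labels = [DemoLabels.Separate] * 3 + [DemoLabels.Combined] * 3


def check_dataset(programs, labels):
    """Verify that every program has a label and count examples per class."""
    if len(programs) != len(labels):
        raise ValueError(
            f'{len(programs)} programs but {len(labels)} labels')
    return Counter(label.name for label in labels)


counts = check_dataset(demo_programs, demo_labels)
print(counts)
```

A heavily skewed count would be a hint to add more examples of the rarer class before training.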

Once our programs and labels are defined, we can turn them into a dataset object using `build_labeled_dataset`. We also need to pass in a parser that understands the syntax of programs in our language. Autoplan has a few built-in parsers (Java, OCaml, and Pyret). If you want to use Autoplan for an unsupported language, please open an issue on our GitHub.

In [2]:
from autoplan.dataset import build_labeled_dataset
from autoplan.parser import JavaParser

parser = JavaParser()
dataset = build_labeled_dataset(FizzbuzzLabels, programs, labels, parser)

# Training a classifier

The next step is to train one of the classifiers on your dataset. We have two classifiers: a nearest-neighbors classifier and a recurrent neural network. We will train each classifier on the dataset and then test it on a sample program.

In [3]:
test_program = '''
class Main {    
    public static void main(String[] args) {
      for (int i = 0; i < 100; ++i) {
        if (i % 3 == 0 || i % 5 == 0) {
          if (i % 3 == 0) {
            System.out.print("Fizz");
          }
          if (i % 5 == 0) {
            System.out.print("Buzz");        
          }
          System.out.print("\n");
        }  
      }
    }
}
'''

In [4]:
from autoplan.neighbors import TokenNNClassifier

# Nearest-neighbors does not need training, so we can construct it and we're ready
# Note that you can also use the TreeNNClassifier in some cases
knn = TokenNNClassifier(dataset)

# Classify returns the predicted label of the program
knn.classify(test_program)

<FizzbuzzLabels.Combined: 1>
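
To give some intuition for what a token-based nearest-neighbors classifier does, here is a minimal self-contained sketch. It is not Autoplan's implementation: the regex tokenizer, the use of edit distance over token sequences, and the names `tokenize`, `edit_distance`, and `nearest_label` are all assumptions chosen for illustration, and the training snippets are shortened fragments rather than full Java programs.

```python
import re

# Assumed tokenizer: identifiers, integers, '==', or any other single symbol.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|==|\S')


def tokenize(source):
    return TOKEN_RE.findall(source)


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_label(train, query):
    """Return the label of the training program closest to the query."""
    q = tokenize(query)
    _, label = min(train, key=lambda pair: edit_distance(tokenize(pair[0]), q))
    return label


train = [
    ('if (i % 15 == 0) print("FizzBuzz");', 'Separate'),
    ('if (i % 3 == 0) print("Fizz"); if (i % 5 == 0) print("Buzz");', 'Combined'),
]
query = 'if (k % 3 == 0) print("Fizz"); if (k % 5 == 0) print("Buzz");'

print(nearest_label(train, query))  # prints "Combined"
```

The query differs from the second training fragment only in the variable name, so its token sequence is two substitutions away from it and much further from the first fragment.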

In [5]:
from autoplan.trainer import ClassifierTrainer
from autoplan.dataset import RandomSplit

from torch import nn
import torch

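# The recurrent neural network uses PyTorch, so we initialize it with some PyTorch parameters.
# Use a GPU when one is available and fall back to the CPU otherwise.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')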
model_opts = {
    'model': nn.LSTM,
    'hidden_size': 64,
    'embedding_size': 32
}

# We have to explicitly train the network, and then load the weights of the best run.
rnn = ClassifierTrainer(dataset, device=device, model_opts=model_opts, split=RandomSplit(dataset))
_ = rnn.train_and_load_best(epochs=60)

In [6]:
rnn.classify(test_program)

<FizzbuzzLabels.Combined: 1>

# Evaluating a classifier

Running the classifier on a particular program is useful, but you probably also want to know how effective the classifier is in general. For that, we have facilities to cross-validate each classifier on the labeled dataset. 

Below, each classifier is trained on roughly 2/3 of the dataset and evaluated on the remaining 1/3 (`test_frac = 0.34`). This process is repeated 20 times (`folds`), and the mean accuracy is reported along with its standard deviation.
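
The procedure described above can be sketched in plain Python. This is an illustration of repeated random train/test splitting under stated assumptions, not Autoplan's `crossval` code: `repeated_holdout`, `fit_majority`, and the toy examples are hypothetical, and the majority-class baseline stands in for a real classifier.

```python
import random
import statistics


def repeated_holdout(examples, fit, folds=20, test_frac=0.34, seed=0):
    """Train on a random subset and test on the rest, `folds` times."""
    rng = random.Random(seed)
    n_test = max(1, round(len(examples) * test_frac))
    accuracies = []
    for _ in range(folds):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        predict = fit(train)
        correct = sum(predict(x) == y for x, y in test)
        accuracies.append(correct / n_test)
    return accuracies


def fit_majority(train):
    """Baseline that always predicts the most common training label."""
    labels = [y for _, y in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


# Six toy examples, three per class, mirroring the size of the dataset above.
toy_examples = [(i, 'Separate') for i in range(3)] + \
               [(i, 'Combined') for i in range(3, 6)]

accs = repeated_holdout(toy_examples, fit_majority)
print(f'mean = {statistics.mean(accs):.2f}, std = {statistics.pstdev(accs):.2f}')
```

With six examples and `test_frac = 0.34`, each fold tests on only two programs, so a single fold's accuracy can only be 0, 0.5, or 1. That coarseness is one reason to expect a large standard deviation on a dataset this small.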

In [7]:
import numpy as np

test_frac = 0.34
folds = 20

dist_mtx = knn.compute_distance_matrix(knn.programs)
confusion_mtxs = knn.crossval(dist_mtx, k=1, folds=folds, test_frac=test_frac)[0]
accuracies = np.array([m.accuracy for m in confusion_mtxs])

print(f'Mean accuracy for nearest-neighbors is {accuracies.mean():.02f} (σ = {accuracies.std():.02f})')

Mean accuracy for nearest-neighbors is 0.47 (σ = 0.29)


In [8]:
from autoplan.trainer import ClassifierTrainer

torch.manual_seed(0)
cval_results = ClassifierTrainer.crossval(
    dataset, split=RandomSplit(dataset), epochs=60, model_opts=model_opts, folds=folds, test_frac=test_frac,  
    device=device, progress=True)
accuracies = np.array(cval_results['accuracy'])

print(f'Mean accuracy for recurrent neural network is {accuracies.mean():.02f} (σ = {accuracies.std():.02f})')

Mean accuracy for recurrent neural network is 0.62 (σ = 0.35)


# Improving a classifier

Some classifiers like nearest-neighbors can't easily be improved, unless you come up with another way to compare programs. However, neural networks have many parameters that can be tweaked to improve performance. Below we show an example of using the [hyperopt](http://hyperopt.github.io/hyperopt/) library to select the best parameters for the neural network.

In [9]:
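from hyperopt import hp, fmin, tpe, Trials

# Candidate recurrent architectures; hp.choice picks one of them per trial.
model_types = [nn.GRU, nn.LSTM]

# Sizes are sampled as base-2 exponents (5 through 10) and converted to
# powers of two (32 through 1024) in hp_opts_to_model_opts below.
opts_space = {
    'model': hp.choice('model', model_types),
    'hidden_size': hp.quniform('hidden_size', 5, 10, 1),
    'embedding_size': hp.quniform('embedding_size', 5, 10, 1),
}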

def hp_opts_to_model_opts(hp_opts):
    return {
        'model': hp_opts['model'],
        'hidden_size': 2 ** int(hp_opts['hidden_size']),
        'embedding_size': 2 ** int(hp_opts['embedding_size']),
    }

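def objective(hp_opts):
    # Reset the seed so every trial is evaluated under the same random conditions.
    torch.manual_seed(0)
    model_opts = hp_opts_to_model_opts(hp_opts)
    # Use half as many folds as before to keep each trial reasonably fast.
    cval_results = ClassifierTrainer.crossval(
        dataset, split=RandomSplit(dataset), epochs=60, model_opts=model_opts, folds=folds // 2, test_frac=test_frac,
        device=device,
        progress=False)
    # hyperopt minimizes the objective, so return the error rate (1 - mean accuracy).
    return 1. - np.array(cval_results['accuracy']).mean()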

trials = Trials()
best_params = fmin(objective, opts_space, algo=tpe.suggest, max_evals=30, trials=trials)

100%|██████████| 30/30 [04:43<00:00, 11.69s/trial, best loss: 0.0] 
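
To put the 30 evaluations in context, the sketch below enumerates the search space in plain Python. It relies on `hp.quniform(label, low, high, q)` returning values rounded to multiples of `q`, so the exponents here are the integers 5 through 10. The strings `'GRU'` and `'LSTM'` are stand-ins for `nn.GRU` and `nn.LSTM` so the snippet runs without PyTorch, and the name `grid` is introduced only for illustration.

```python
from itertools import product

# Stand-ins for nn.GRU and nn.LSTM so this sketch has no dependencies.
model_names = ['GRU', 'LSTM']

# Exponents 5..10 map to sizes 32, 64, 128, 256, 512, 1024.
sizes = [2 ** exponent for exponent in range(5, 11)]

# Every (model, hidden_size, embedding_size) combination in the space.
grid = [
    {'model': model, 'hidden_size': hidden, 'embedding_size': embedding}
    for model, hidden, embedding in product(model_names, sizes, sizes)
]

print(sizes)      # [32, 64, 128, 256, 512, 1024]
print(len(grid))  # 72
```

With 72 possible configurations, `max_evals=30` covers fewer than half of them, so the search relies on the TPE algorithm to focus on promising regions instead of trying everything.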


In [10]:
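# These are the best model opts!
# Note: for hp.choice parameters, fmin returns the index of the selected option,
# so 'model' below is an index into model_types (0 means nn.GRU).
best_model_opts = hp_opts_to_model_opts(best_params)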
print('The best model options are:')
print(best_model_opts)
print()

losses = np.array(trials.losses())
print('The highest average accuracy was: ', 1. - losses.min())

The best model options are:
{'model': 0, 'hidden_size': 1024, 'embedding_size': 32}

The highest average accuracy was:  1.0
