Framework Customization

(Bill) Yuchen Lin edited this page Jul 24, 2019 · 11 revisions

The AlpacaTag framework can be customized in two places: the back-end model (Model Customization) and the active learning methods (AL Customization).

Model Customization

You can build your own model by stacking the modules provided in the project.

Where can I find the code for the Back-end Models?

AlpacaTag:
  - alpaca_client
  + alpaca_server:
      + pytorchAPI
          - active_learning
          + models:
              - cnn_bilstm_crf.py
              - cnn_cnn_crf.py
                ...
          + modules:
              - baseRNN.py
              - CharEncoderCNN.py
              - DecoderCRF.py
                ...
  - annotation

Build your own model by stacking the existing modules, then put the model file into the models folder.
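As a rough illustration of what "stacking" means here, the sketch below chains a character encoder, a word encoder, and a decoder into one PyTorch model. All class names, constructor arguments, and tensor shapes are hypothetical stand-ins and are not the signatures of the modules in the modules folder; check baseRNN.py, CharEncoderCNN.py, and DecoderCRF.py for the real interfaces. A plain linear layer is used in place of a CRF decoder to keep the example short.

```python
import torch
import torch.nn as nn


class CharEncoder(nn.Module):
    """Hypothetical stand-in for a character-level CNN encoder."""

    def __init__(self, char_vocab_size, char_dim, out_dim):
        super().__init__()
        self.embedding = nn.Embedding(char_vocab_size, char_dim)
        self.conv = nn.Conv1d(char_dim, out_dim, kernel_size=3, padding=1)

    def forward(self, chars):
        # chars: (num_words, word_len) -> (num_words, out_dim)
        embedded = self.embedding(chars).transpose(1, 2)
        # Max-pool over the character positions of each word.
        return self.conv(embedded).max(dim=2).values


class WordEncoder(nn.Module):
    """Hypothetical stand-in for a word-level BiLSTM encoder."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim,
                            batch_first=True, bidirectional=True)

    def forward(self, features):
        output, _ = self.lstm(features)
        return output  # (batch, seq_len, 2 * hidden_dim)


class StackedTagger(nn.Module):
    """Stacks the pieces: char encoder -> word encoder -> decoder."""

    def __init__(self, word_vocab_size, word_dim, word_hidden,
                 char_vocab_size, char_dim, char_out, num_tags):
        super().__init__()
        self.word_embedding = nn.Embedding(word_vocab_size, word_dim)
        self.char_encoder = CharEncoder(char_vocab_size, char_dim, char_out)
        self.word_encoder = WordEncoder(word_dim + char_out, word_hidden)
        # Linear decoder used here instead of a CRF (simplification).
        self.decoder = nn.Linear(2 * word_hidden, num_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        batch, seq_len, word_len = chars.shape
        char_features = self.char_encoder(
            chars.reshape(batch * seq_len, word_len)
        ).reshape(batch, seq_len, -1)
        features = torch.cat([self.word_embedding(words), char_features], dim=2)
        return self.decoder(self.word_encoder(features))


model = StackedTagger(word_vocab_size=100, word_dim=16, word_hidden=8,
                      char_vocab_size=30, char_dim=4, char_out=6, num_tags=5)
words = torch.randint(0, 100, (2, 7))      # 2 sentences, 7 words each
chars = torch.randint(0, 30, (2, 7, 5))    # 5 characters per word
tag_scores = model(words, chars)           # one score per tag per word
```

Each stage only needs to agree with its neighbours on tensor shapes, which is what makes swapping one module for another (for example, a CNN word encoder instead of the BiLSTM) a local change.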

How should I customize the back-end models?

In wrapper.py, the model is created as follows:

self.model = CNN_BiLSTM_CRF(self.p.word_vocab_size,
                            self.word_embedding_dim,
                            self.word_lstm_size,
                            self.p.char_vocab_size,
                            self.char_embedding_dim,
                            self.char_lstm_size,
                            self.p._label_vocab.vocab,
                            pretrained=embeddings)

Change the model class name to the name of your own model class, keeping the arguments as they are:

self.model = CUSTOMIZED_MODEL(self.p.word_vocab_size,
                              self.word_embedding_dim,
                              self.word_lstm_size,
                              self.p.char_vocab_size,
                              self.char_embedding_dim,
                              self.char_lstm_size,
                              self.p._label_vocab.vocab,
                              pretrained=embeddings)
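Because wrapper.py passes the same arguments to whichever class is named, a custom model has to accept seven positional arguments plus the `pretrained` keyword. The skeleton below is a minimal sketch of a compatible constructor. The positional parameter names are assumptions inferred from the call above (only `pretrained` is named there), the label vocabulary is assumed to be a mapping from tag to index, and a real model would subclass `torch.nn.Module` rather than being a plain class.

```python
class CUSTOMIZED_MODEL:
    """Skeleton only: replace the body with your own stack of modules."""

    def __init__(self, word_vocab_size, word_embedding_dim, word_lstm_size,
                 char_vocab_size, char_embedding_dim, char_lstm_size,
                 label_vocab, pretrained=None):
        # Parameter names are hypothetical; the order mirrors the wrapper.py call.
        self.word_vocab_size = word_vocab_size
        self.word_embedding_dim = word_embedding_dim
        self.word_lstm_size = word_lstm_size
        self.char_vocab_size = char_vocab_size
        self.char_embedding_dim = char_embedding_dim
        self.char_lstm_size = char_lstm_size
        # Assumed to be a tag -> index mapping, so its length is the tag count.
        self.num_labels = len(label_vocab)
        self.pretrained = pretrained


# Same call shape as in wrapper.py, with made-up values for illustration.
model = CUSTOMIZED_MODEL(5000, 100, 200, 80, 25, 25,
                         {"O": 0, "B-PER": 1, "I-PER": 2},
                         pretrained=None)
```

If your model needs fewer or different hyperparameters, the argument list in wrapper.py has to be changed along with the class name.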

Active Learning Customization

Where is the code for the Active Learning methods?

AlpacaTag:
  - alpaca_client
  + alpaca_server:
      + pytorchAPI
          + active_learning:
              - acquisition.py
          - models
          - modules
  - annotation

The Acquisition class provides the active learning methods as member functions.

How can I create my own active learning method?

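import numpy as np

def customize_active_learning(self, dataset, model, num_instances, batch_size=50):
    # Switch the model to evaluation mode before scoring.
    model.train(False)

    # Instances already in the training set keep a score of +inf, so they sort last.
    probs = np.ones(len(dataset)) * float('Inf')
    # Instances that have not been sampled yet, and their positions in the dataset.
    new_dataset = [datapoint for j, datapoint in enumerate(dataset) if j not in self.train_index]
    new_datapoints = [j for j in range(len(dataset)) if j not in self.train_index]
    data_batches = create_batches(new_dataset, batch_size=batch_size, str_words=True, tag_padded=False)

    probscores = []
    for data in data_batches:
        scores = customize_method(data)  # replace this with your own scoring method
        probscores.extend(scores)

    probs[new_datapoints] = np.array(probscores)
    # Ascending sort: the instances with the lowest scores are selected first.
    test_indices = np.argsort(probs)
    # Never request more instances than remain unsampled (avoids an IndexError).
    num_instances = min(num_instances, len(new_datapoints))
    cur_indices = set()
    i = 0
    self.return_index = []
    self.return_score = []
    while len(cur_indices) < num_instances:
        cur_indices.add(test_indices[i])
        self.return_index.append(test_indices[i])
        self.return_score.append(probs[test_indices[i]])
        i += 1
    # Record the newly selected instances as sampled.
    self.train_index.update(cur_indices)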

The Acquisition class automatically tracks which instances have already been sampled and which have not. To reuse this bookkeeping, follow the code above and implement your own scoring function (customize_method) that returns one score per instance. Instances with the lowest scores are selected first.
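As one possible shape for such a scoring function, the sketch below scores each sentence by the length-normalized log-probability of its predicted tags, so that the least confident sentences get the lowest scores and are picked first by the ascending sort. The input format is an assumption: each sentence is given as a non-empty list of per-token probabilities (each greater than 0), which you would have to obtain from your own model. The function names are hypothetical and are not part of AlpacaTag.

```python
import math


def least_confidence_scores(batch_token_probs):
    """Return one score per sentence; a lower score means lower confidence.

    batch_token_probs: list of sentences, each a non-empty list of the
    probabilities (> 0) the model assigned to its predicted tags.
    """
    scores = []
    for token_probs in batch_token_probs:
        log_prob = sum(math.log(p) for p in token_probs)
        # Normalize by length so long sentences are not penalized for length alone.
        scores.append(log_prob / len(token_probs))
    return scores


def select_lowest(scores, num_instances):
    """Indices of the lowest-scoring instances, mirroring an ascending argsort."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:num_instances]


# Three sentences with made-up per-token probabilities.
batch = [
    [0.90, 0.95, 0.99],        # fairly confident
    [0.50, 0.60],              # uncertain
    [0.99, 0.99, 0.98, 0.97],  # very confident
]
scores = least_confidence_scores(batch)
picked = select_lowest(scores, 2)  # the uncertain sentence comes first
```

Any other criterion works the same way, as long as it returns one number per instance and smaller numbers mean "annotate this sooner".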