<a href="https://colab.research.google.com/github/anjapago/AnalyzeAccountability/blob/master/Classifier_with_BERT_Policy2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Binary Classifier BERT on TF Hub

Bidirectional Encoder Representations from Transformers (BERT) is a neural network architecture designed by Google researchers. It is a state-of-the-art approach for NLP tasks, including text classification, translation, summarization, and question answering.

BERT has been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, and in an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe.

[Finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a classifier to detect accountability in news articles using BERT in TensorFlow with TF Hub. Code was adapted from [this colab notebook](https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb).

In [0]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In addition to the standard libraries we imported above, we'll need to install BERT's Python package.

In [14]:
!pip install bert-tensorflow



In [0]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

Below, we'll set an output directory location to store our model output and checkpoints. We are running this code in Google's hosted Colab, so the directory won't persist after the Colab session ends.

Set DO_DELETE to rewrite the OUTPUT_DIR if it exists. Otherwise, Tensorflow will load existing model checkpoints from that directory (if they exist).

In [16]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
DO_DELETE = True #@param {type:"boolean"}
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: OUTPUT_DIR_NAME *****


# Data

Load the dataset of news excerpts annotated with the accountability label. The code below loads the data from xlsx files, formats it as a pandas data frame, and splits it into test and training sets.

In [17]:
from tensorflow import keras
import os
import re
import nltk
from nltk import sent_tokenize
nltk.download('punkt')
from sklearn.model_selection import train_test_split

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [0]:
filenames = [filename for filename in os.listdir() if 'xlsx' in filename]

DATA_COLUMN = 'excerpt'
LABEL_COLUMN = 'label'
label_list = [0, 1]
max_sent = 5
label = 'policy'
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128

In [19]:
train_list = []
test_list = []
for file_name in filenames:
  data = pd.read_excel(file_name, sheet_name='Dedoose Excerpts Export')
  data = data.dropna(axis=0)

  # get relevant columns:
  label_cols = [l for l in data.columns if label in l.lower()]
  excerpt_col = [l for l in data.columns if DATA_COLUMN in l.lower()][0]
  data_subcols = data.loc[:, label_cols+[excerpt_col]]

  for colname in label_cols:
    data_subcols = data_subcols.astype({colname: int})
    id0 = [val not in [0, 1] for val in data_subcols.loc[:, colname]]
    data_subcols.loc[id0, colname] = 0

  #print(data_subcols.shape)
  print(label_cols)

  # filter out rows that do not have any policy subtype label
  label_ids = data_subcols.loc[:, label_cols].sum(axis=1) > 1 
  df_label = data_subcols.loc[label_ids,:]
  #print(df_label.shape)

  # filter out long excerpts

  short_ex_ids = [len(sent_tokenize(sent))>max_sent for sent in df_label.loc[:, excerpt_col]]
  df_label_short = df_label.loc[short_ex_ids, :]
  print(df_label_short.shape)

  # split into train and test dfs
  train, test = train_test_split(df_label_short, test_size=0.25, random_state=42)
  #print(train.shape)
  #print(test.shape)
  train_list.append(train)
  test_list.append(test)

['Code: Policy Applied', 'Code: Policy\\Advocacy by others Applied', 'Code: Policy\\Advocacy by victims families Applied', 'Code: Policy\\Guns Applied', 'Code: Policy\\Immigration Applied', 'Code: Policy\\Information Sharing Applied', 'Code: Policy\\Mental Health Applied', 'Code: Policy\\Other Applied', 'Code: Policy\\Practice Applied']
(56, 10)
['Code: Policy Applied', 'Code: Policy\\Advocacy by others Applied', 'Code: Policy\\Advocacy by victims families Applied', 'Code: Policy\\Guns Applied', 'Code: Policy\\Immigration Applied', 'Code: Policy\\Information Sharing Applied', 'Code: Policy\\Mental Health Applied', 'Code: Policy\\Other Applied', 'Code: Policy\\Practice Applied']
(59, 10)
['Code: Policy Applied', 'Code: Policy\\Advocacy by others Applied', 'Code: Policy\\Advocacy by victims families Applied', 'Code: Policy\\Guns Applied', 'Code: Policy\\Immigration Applied', 'Code: Policy\\Information Sharing Applied', 'Code: Policy\\Mental Health Applied', 'Code: Policy\\Other Applied',

In [0]:
# transform columns for all data frames to be the same
col_dict = {
    # Raw strings keep the backslashes in the Dedoose column names literal
    'OtherAdv': ['POLICY_OtherAdv', r'Code: Policy\Advocacy by others Applied'],
    'VictimAdv': ['POLICY_VictimAdv', r'Code: Policy\Advocacy by victims families Applied'],
    'Guns': ['POLICY_Guns', r'Code: Policy\Guns Applied'],
    'InfoSharing': ['POLICY_InfoSharing', r'Code: Policy\Information Sharing Applied'],
    'MentalHealth': ['POLICY_MentalHealth', r'Code: Policy\Mental Health Applied'],
    'Other': ['POLICY_Other', r'Code: Policy\Other Applied'],
    'Practice': ['POLICY_Practice', r'Code: Policy\Practice Applied'],
    'Immigration': [r'Code: Policy\Immigration Applied']
}

LABEL_COLUMNS = list(col_dict.keys())


In [21]:
def merge_dfs(df_list):
  merged_df = pd.DataFrame(columns = list(col_dict.keys())+[DATA_COLUMN])

  for df in df_list:
    df_renamed = pd.DataFrame(columns = col_dict.keys(), index = df.index)
    print(df.shape)

    #renamed ex col
    df_renamed[DATA_COLUMN] = df.loc[:,[l for l in df.columns if DATA_COLUMN in l.lower()][0]]

    # make each dict in the list to have the columns in col_dict
    for new_colname in col_dict.keys():
      #check if df has subtype:
      col = [colname for colname in df.columns if colname in col_dict[new_colname]]
      if len(col) ==0:
        df_renamed[new_colname] = 0
      else:
        df_renamed[new_colname] = df.loc[:, col]
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    merged_df = pd.concat([merged_df, df_renamed], ignore_index=True)
  print(merged_df.shape)
  return merged_df

test_merged = merge_dfs(test_list)
train_merged = merge_dfs(train_list)

(14, 10)
(15, 10)
(13, 10)
(140, 9)
(68, 9)
(4, 9)
(13, 10)
(267, 9)
(42, 10)
(44, 10)
(37, 10)
(420, 9)
(204, 9)
(12, 9)
(39, 10)
(798, 9)


In [22]:
train_merged.loc[:, col_dict.keys()].head()

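|   | OtherAdv | VictimAdv | Guns | InfoSharing | MentalHealth | Other | Practice | Immigration |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 4 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |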


In [23]:
train_merged.OtherAdv.unique()

array([0, 1], dtype=object)

The cells above show the first few rows of the merged training set and confirm that each label column is binary.

# Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps: we create `InputExample`s using the constructor provided in the BERT library, then convert them to features with `convert_examples_to_features`. Both calls appear in the training loop later in this notebook. The `InputExample` fields are:

- `text_a` is the text we want to classify, which in this case is the `excerpt` column of our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 or 1.
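
To make the target format concrete, here is a minimal sketch of what the conversion to features produces for a single-sentence example. It is not the library's `convert_examples_to_features`: `TOY_VOCAB` and `to_features` are hypothetical names, and the token ids are invented for illustration (the real ids come from the vocab file of the BERT module).

```python
# Toy vocabulary with made-up ids; the real one is read from BERT's vocab file.
TOY_VOCAB = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2, "gun": 3, "laws": 4, "changed": 5}


def to_features(tokens, max_seq_length, vocab=TOY_VOCAB):
    """Build input_ids, input_mask and segment_ids for one single-sentence example."""
    # Reserve two positions for [CLS] and [SEP], truncating if necessary.
    tokens = tokens[: max_seq_length - 2]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]

    input_ids = [vocab[t] for t in tokens]
    input_mask = [1] * len(input_ids)   # 1 marks a real token
    segment_ids = [0] * len(input_ids)  # all zeros because there is no text_b

    # Zero-pad every list up to max_seq_length.
    pad_len = max_seq_length - len(input_ids)
    input_ids += [0] * pad_len
    input_mask += [0] * pad_len
    segment_ids += [0] * pad_len
    return input_ids, input_mask, segment_ids


input_ids, input_mask, segment_ids = to_features(["gun", "laws", "changed"], 8)
print(input_ids)
print(input_mask)
print(segment_ids)
```

With `text_b` left blank, `segment_ids` stays all zeros, and `input_mask` tells the model which positions are padding.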

In [26]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0825 19:24:59.525245 140165842966400 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.



In [27]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']
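
The split of "tokenizer" into `token` and `##izer` comes from WordPiece, which matches the longest vocabulary entry it can find and marks continuation pieces with `##`. The following is a minimal sketch of that greedy longest-match-first idea, assuming a tiny hand-made vocabulary; `wordpiece` and `toy_vocab` are hypothetical names, and the real tokenizer also handles lowercasing, punctuation splitting, and a maximum word length.

```python
def wordpiece(word, vocab, unk_token="[UNK]"):
    """Split one lowercased word into word pieces by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first and shrink until it is in the vocab.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece matched, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical vocabulary, just large enough for the demonstration.
toy_vocab = {"token", "##izer", "bert", "example"}
print(wordpiece("tokenizer", toy_vocab))
print(wordpiece("bert", toy_vocab))
```

A word that cannot be covered by vocabulary pieces is mapped to a single unknown token in this sketch.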

# Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our accountability detection task. This strategy of using a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).

In [0]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # beta for L2 regularizer
  beta = 0.1
  
  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for accountability data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    regularizer = tf.nn.l2_loss(output_weights)
    #loss = tf.reduce_mean(per_example_loss + beta*regularizer)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)
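
The new layer in `create_model` is a single linear map followed by a log-softmax and a cross-entropy loss. As a rough illustration of that arithmetic for one example, here is a plain-Python sketch without TensorFlow. The names `log_softmax` and `head_forward` and all numbers are made up for the example, and dropout is left out.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def head_forward(pooled, weights, bias, label):
    """Linear layer + log-softmax + cross-entropy for a single example.

    pooled:  hidden_size floats (stand-in for BERT's pooled_output)
    weights: num_labels rows of hidden_size floats
    bias:    num_labels floats
    """
    # logits = pooled @ weights^T + bias
    logits = [sum(w * h for w, h in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    # With a one-hot label, -sum(one_hot * log_probs) picks out a single term.
    loss = -log_probs[label]
    return loss, predicted, log_probs


# Made-up values with hidden_size=3 and num_labels=2.
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.0], [0.0, -0.2, 0.3]]
bias = [0.0, 0.0]
loss, predicted, log_probs = head_forward(pooled, weights, bias, label=1)
print(loss, predicted, log_probs)
```

Averaging this per-example loss over a batch gives the scalar `loss` that the optimizer minimizes.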


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [0]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for TPUEstimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        #f1_score = tf.contrib.metrics.f1_score(
        #    label_ids,
        #    predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            #"f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [0]:
# Compute train and warmup steps from batch size
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate 
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)
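
To see how these hyperparameters turn into step counts, the sketch below repeats the arithmetic used later in the training loop and pairs it with a simplified learning-rate schedule: linear warmup followed by linear decay to zero. `count_steps` and `learning_rate_at` are hypothetical helper names, the schedule is an approximation of what `bert.optimization.create_optimizer` sets up rather than a copy of it, and 798 is the row count printed for the merged training frame.

```python
def count_steps(num_examples, batch_size, num_epochs, warmup_proportion):
    """Total optimizer steps and the number of those spent warming up."""
    num_train_steps = int(num_examples / batch_size * num_epochs)
    num_warmup_steps = int(num_train_steps * warmup_proportion)
    return num_train_steps, num_warmup_steps


def learning_rate_at(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup from 0 to init_lr scale, then linear decay to 0."""
    if step < num_warmup_steps:
        return init_lr * step / num_warmup_steps
    return init_lr * (1.0 - step / num_train_steps)


# 798 training rows, BATCH_SIZE=32, NUM_TRAIN_EPOCHS=3.0, WARMUP_PROPORTION=0.1
num_train_steps, num_warmup_steps = count_steps(798, 32, 3.0, 0.1)
print(num_train_steps, num_warmup_steps)

for step in (0, 3, 7, 40, 74):
    print(step, learning_rate_at(step, 2e-5, num_train_steps, num_warmup_steps))
```

The resulting 74 training steps agree with the `global_step` of 74 reported in the evaluation output.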

In [49]:
# Run for each label
import warnings
warnings.filterwarnings("ignore")

# [-1:] selects only the last label (Immigration); iterate over LABEL_COLUMNS to run every label
for LABEL_COLUMN in LABEL_COLUMNS[-1:]:
    print("**************"+str(LABEL_COLUMN)+"**********************")
    print("train num label=1: "+str(sum(train_merged[LABEL_COLUMN])))
    print("train percent = 1: "+str(sum(train_merged[LABEL_COLUMN])/len(train_merged[LABEL_COLUMN])))
    
    print("test num label=1: "+str(sum(test_merged[LABEL_COLUMN])))
    print("test percent = 1: "+str(sum(test_merged[LABEL_COLUMN])/len(test_merged[LABEL_COLUMN])))
          
    # Use the InputExample class from BERT's run_classifier code to create examples from the data
    train_InputExamples = train_merged.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                       text_a = x[DATA_COLUMN], 
                                                                       text_b = None, 
                                                                       label = x[LABEL_COLUMN]), axis = 1)

    test_InputExamples = test_merged.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                       text_a = x[DATA_COLUMN], 
                                                                       text_b = None, 
                                                                       label = x[LABEL_COLUMN]), axis = 1)

    # We'll set sequences to be at most 128 tokens long.
    MAX_SEQ_LENGTH = 128
    # Convert our train and test features to InputFeatures that BERT understands.
    train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, 
                                                                      label_list, 
                                                                      MAX_SEQ_LENGTH,
                                                                      tokenizer)
    test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples,
                                                                     label_list,
                                                                     MAX_SEQ_LENGTH,
                                                                     tokenizer)

    # Compute # train and warmup steps from batch size
    num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
    num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

    model_fn = model_fn_builder(
      num_labels=len(label_list),
      learning_rate=LEARNING_RATE,
      num_train_steps=num_train_steps,
      num_warmup_steps=num_warmup_steps)

    estimator = tf.estimator.Estimator(
      model_fn=model_fn,
      config=run_config,
      params={"batch_size": BATCH_SIZE})

    # Create an input function for training. drop_remainder = True for using TPUs.
    train_input_fn = bert.run_classifier.input_fn_builder(
        features=train_features,
        seq_length=MAX_SEQ_LENGTH,
        is_training=True,
        drop_remainder=False)

    print(f'Beginning Training!')
    current_time = datetime.now()
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
    print("Training took time ", datetime.now() - current_time)

    test_input_fn = run_classifier.input_fn_builder(
        features=test_features,
        seq_length=MAX_SEQ_LENGTH,
        is_training=False,
        drop_remainder=False)

    print("test results:")
    test_result = estimator.evaluate(input_fn=test_input_fn, steps=None)
    print(test_result)
    

    test_train_input_fn = run_classifier.input_fn_builder(
        features=train_features,
        seq_length=MAX_SEQ_LENGTH,
        is_training=False,
        drop_remainder=False)
    print("train results:")
    train_result = estimator.evaluate(input_fn=test_train_input_fn, steps=None)
    print(train_result)

**************Immigration**********************
train num label=1: 10
train percent = 1: 0.012531328320802004
test num label=1: 2
test percent = 1: 0.00749063670411985
Beginning Training!
Training took time  0:00:00.005886
test results:
{'auc': 0.67358506, 'eval_accuracy': 0.35205993, 'false_negatives': 0.0, 'false_positives': 173.0, 'loss': 0.8831202, 'precision': 0.011428571, 'recall': 1.0, 'true_negatives': 92.0, 'true_positives': 2.0, 'global_step': 74}
train results:
{'auc': 0.60431474, 'eval_accuracy': 0.41353384, 'false_negatives': 2.0, 'false_positives': 466.0, 'loss': 0.87404495, 'precision': 0.016877636, 'recall': 0.8, 'true_negatives': 322.0, 'true_positives': 8.0, 'global_step': 74}
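
Because the F1 metric is commented out in `model_fn`, it can be recovered from the confusion counts that are reported. The sketch below does this for the Immigration test results printed above; `summarize` is a hypothetical helper name and the counts are copied from that output.

```python
def summarize(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts from the Immigration test results above.
test_metrics = summarize(tp=2, fp=173, tn=92, fn=0)
for name, value in test_metrics.items():
    print(name, round(value, 4))
```

The accuracy and precision computed this way reproduce the reported `eval_accuracy` and `precision`. Recall is perfect only because nearly every excerpt is predicted positive, which the low F1 makes visible.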


## Results from each label individually

**************OtherAdv**********************
train num label=1: 324
train percent = 1: 0.40601503759398494
test num label=1: 101
test percent = 1: 0.3782771535580524
Beginning Training!
Training took time  0:00:00.003997
{'auc': 0.65659666, 'eval_accuracy': 0.60674155, 'false_negatives': 14.0, 'false_positives': 91.0, 'loss': 0.68718004, **'precision': 0.48876405, 'recall': 0.8613861**, 'true_negatives': 75.0, 'true_positives': 87.0, 'global_step': 74}
{'auc': 0.7173973, 'eval_accuracy': 0.68922305, 'false_negatives': 43.0, 'false_positives': 205.0, 'loss': 0.6079148, 'precision': 0.5781893, 'recall': 0.86728394, 'true_negatives': 269.0, 'true_positives': 281.0, 'global_step': 74}

**************VictimAdv**********************
train num label=1: 40
train percent = 1: 0.05012531328320802
test num label=1: 11
test percent = 1: 0.04119850187265917
Beginning Training!
Training took time  0:00:00.004482
{'auc': 0.49201, 'eval_accuracy': 0.35955057, 'false_negatives': 4.0, 'false_positives': 167.0, 'loss': 0.8812283, **'precision': 0.040229887, 'recall': 0.6363636**, 'true_negatives': 89.0, 'true_positives': 7.0, 'global_step': 74}
{'auc': 0.60976255, 'eval_accuracy': 0.4385965, 'false_negatives': 8.0, 'false_positives': 440.0, 'loss': 0.86696815, 'precision': 0.06779661, 'recall': 0.8, 'true_negatives': 318.0, 'true_positives': 32.0, 'global_step': 74}

**************Guns**********************
train num label=1: 558
train percent = 1: 0.6992481203007519
test num label=1: 187
test percent = 1: 0.700374531835206
Beginning Training!
Training took time  0:00:00.004301
{'auc': 0.67790776, 'eval_accuracy': 0.70411986, 'false_negatives': 48.0, 'false_positives': 31.0, 'loss': 0.58773005, **'precision': 0.81764704, 'recall': 0.7433155**, 'true_negatives': 49.0, 'true_positives': 139.0, 'global_step': 74}
{'auc': 0.640793, 'eval_accuracy': 0.6553885, 'false_negatives': 180.0, 'false_positives': 95.0, 'loss': 0.6048415, 'precision': 0.79915434, 'recall': 0.67741936, 'true_negatives': 145.0, 'true_positives': 378.0, 'global_step': 74}

**************InfoSharing**********************
train num label=1: 41
train percent = 1: 0.05137844611528822
test num label=1: 10
test percent = 1: 0.03745318352059925
Beginning Training!
Training took time  0:00:00.004473
{'auc': 0.388716, 'eval_accuracy': 0.37827715, 'false_negatives': 6.0, 'false_positives': 160.0, 'loss': 0.89093673, **'precision': 0.024390243, 'recall': 0.4**, 'true_negatives': 97.0, 'true_positives': 4.0, 'global_step': 74}
{'auc': 0.5099398, 'eval_accuracy': 0.39849624, 'false_negatives': 15.0, 'false_positives': 465.0, 'loss': 0.8835133, 'precision': 0.052953158, 'recall': 0.63414633, 'true_negatives': 292.0, 'true_positives': 26.0, 'global_step': 74}

**************MentalHealth**********************
train num label=1: 174
train percent = 1: 0.21804511278195488
test num label=1: 59
test percent = 1: 0.2209737827715356
Beginning Training!
Training took time  0:00:00.004239
{'auc': 0.34888363, 'eval_accuracy': 0.3071161, 'false_negatives': 34.0, 'false_positives': 151.0, 'loss': 0.89569867, **'precision': 0.14204545, 'recall': 0.42372882**, 'true_negatives': 57.0, 'true_positives': 25.0, 'global_step': 74}
{'auc': 0.32230884, 'eval_accuracy': 0.32581455, 'false_negatives': 119.0, 'false_positives': 419.0, 'loss': 0.9035461, 'precision': 0.116033755, 'recall': 0.31609195, 'true_negatives': 205.0, 'true_positives': 55.0, 'global_step': 74}

**************Other**********************
train num label=1: 78
train percent = 1: 0.09774436090225563
test num label=1: 30
test percent = 1: 0.11235955056179775
Beginning Training!
Training took time  0:00:00.004054
{'auc': 0.38143462, 'eval_accuracy': 0.3670412, 'false_negatives': 18.0, 'false_positives': 151.0, 'loss': 0.8866387, **'precision': 0.073619634, 'recall': 0.4**, 'true_negatives': 86.0, 'true_positives': 12.0, 'global_step': 74}
{'auc': 0.4758013, 'eval_accuracy': 0.4047619, 'false_negatives': 34.0, 'false_positives': 441.0, 'loss': 0.87344176, 'precision': 0.09072165, 'recall': 0.5641026, 'true_negatives': 279.0, 'true_positives': 44.0, 'global_step': 74}

**************Practice**********************
train num label=1: 47
train percent = 1: 0.05889724310776942
test num label=1: 20
test percent = 1: 0.0749063670411985
Beginning Training!
Training took time  0:00:00.003798
{'auc': 0.26396763, 'eval_accuracy': 0.31835207, 'false_negatives': 16.0, 'false_positives': 166.0, 'loss': 0.9134971, **'precision': 0.023529412, 'recall': 0.2**, 'true_negatives': 81.0, 'true_positives': 4.0, 'global_step': 74}
{'auc': 0.27883393, 'eval_accuracy': 0.35588974, 'false_negatives': 38.0, 'false_positives': 476.0, 'loss': 0.90788925, 'precision': 0.018556701, 'recall': 0.19148937, 'true_negatives': 275.0, 'true_positives': 9.0, 'global_step': 74}

**************Immigration**********************
train num label=1: 10
train percent = 1: 0.012531328320802004
test num label=1: 2
test percent = 1: 0.00749063670411985
Beginning Training!
Training took time  0:00:00.004241
{'auc': 0.6811322, 'eval_accuracy': 0.3670412, 'false_negatives': 0.0, 'false_positives': 169.0, 'loss': 0.87873244, **'precision': 0.0116959065, 'recall': 1.0**, 'true_negatives': 96.0, 'true_positives': 2.0, 'global_step': 74}
{'auc': 0.54860413, 'eval_accuracy': 0.4010025, 'false_negatives': 3.0, 'false_positives': 475.0, 'loss': 0.87476, 'precision': 0.014522822, 'recall': 0.7, 'true_negatives': 313.0, 'true_positives': 7.0, 'global_step': 74}