<a href="https://colab.research.google.com/github/marquesarthur/vanilla-bert-vs-huggingface/blob/main/hugging_face_keras_bert.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Based on 



1.   https://towardsdatascience.com/hugging-face-transformers-fine-tuning-distilbert-for-binary-classification-tasks-490f1d192379
2.   https://www.analyticsvidhya.com/blog/2020/07/transfer-learning-for-nlp-fine-tuning-bert-for-text-classification/
3.   https://huggingface.co/transformers/training.html#fine-tuning-with-keras




**problem statement:**


*   a developer has to inspect an **artifact X**
*   Within the artifact, only a portion of the text is relevant to **input task Y**
*   We ought to build a model that establishes relationships between **Y** and **sentences x ∈ X** 
*  The model must determine: **is x relevant to task Y**




<br>

___

*Example of a task and an annotated artifact:*

<br>

[<img src="https://i.imgur.com/Zj1317H.jpg">](https://i.imgur.com/Zj1317H.jpg)




* The coloured sentences are sentences annotated as relevant to the input task. 
* The warmer the color, the more annotators selected that portion of the text. 
* For simplicity, we process the data and used sentences 

<br>

___

*Ultimately, our data is a tuple representing:*


*   **text** = artifact sentence

*   **question** = task description

*   **source** = URL of the artifact

*   **category_index** = whether sentence is relevant [or not] for the input task

*   **weights** = number of participants who annotated sentence as relevant


<br>

___



In [1]:
# @title Install dependencies

# !pip install transformers
# %tensorflow_version 2.x

In [2]:
# !pip install scikit-learn tqdm pandas python-Levenshtein path colorama matplotlib seaborn

In [3]:
# !pip install python-Levenshtein

In [4]:
# @title Download git repo
# !git clone https://github.com/marquesarthur/vanilla-bert-vs-huggingface.git

In [5]:
# %cd vanilla-bert-vs-huggingface
# !git pull
# !ls -l

In [6]:
# @title Import data as JSON
import itertools
import json
import logging
import os
import sys
import random
from pathlib import Path

from Levenshtein import ratio
from colorama import Fore, Style

logger = logging.getLogger()
logger.level = logging.DEBUG
stream_handler = logging.StreamHandler(sys.stdout)
logger.addHandler(stream_handler)

from ds_android import get_input_for_BERT

raw_data = get_input_for_BERT()

print('Sample entry from data:')
print(json.dumps(raw_data[0], indent=4, sort_keys=True))

[31m39 [33m129 [0m https://developer.android.com/training/permissions/requesting
[31m14 [33m21 [0m https://stackoverflow.com/questions/5233543
[31m4 [33m34 [0m https://github.com/morenoh149/react-native-contacts/issues/516
[31m27 [33m63 [0m https://guides.codepath.com/android/Understanding-App-Permissions
[31m9 [33m161 [0m https://www.avg.com/en/signal/guide-to-android-app-permissions-how-to-use-them-smartly
[31m9 [33m15 [0m https://developer.android.com/training/volley/request
[31m14 [33m65 [0m https://stackoverflow.com/questions/28504524
[31m20 [33m59 [0m https://medium.com/@JasonCromer/android-asynctask-http-request-tutorial-6b429d833e28
[31m5 [33m97 [0m https://www.twilio.com/blog/5-ways-to-make-http-requests-in-java
[31m4 [33m12 [0m https://stackoverflow.com/questions/33241952
[31m6 [33m33 [0m https://github.com/realm/realm-java/issues/776
[31m3 [33m17 [0m https://stackoverflow.com/questions/8712652
[31m8 [33m59 [0m https://dzone.com/articles

[31m4 [33m54 [0m https://developer.android.com/training/gestures/scroll
[31m4 [33m16 [0m https://stackoverflow.com/questions/39588322
[31m20 [33m196 [0m https://developer.android.com/training/dependency-injection/dagger-android
[31m6 [33m44 [0m https://stackoverflow.com/questions/57235136
[31m24 [33m121 [0m https://guides.codepath.com/android/dependency-injection-with-dagger-2
Sample entry from data:
{
    "category_index": 1,
    "question": "Permission Denial when trying to access contacts in Android",
    "source": "https://developer.android.com/training/permissions/requesting",
    "text": "Every Android app runs in a limited-access sandbox.",
    "weights": 1
}


In [7]:
from collections import Counter, defaultdict

cnt = Counter([d['category_index'] for d in raw_data])

total = sum(cnt.values())

labels_cnt = [cnt[0] / float(total), cnt[1] / float(total)]
print('label distribution')
print('')
print('not-relevant -- {:.0f}%'.format(labels_cnt[0] * 100))
print('RELEVANT ------ {:.0f}%'.format(labels_cnt[1] * 100))

label distribution

not-relevant -- 87%
RELEVANT ------ 13%


In [8]:
seframes = {}
with open('seframes.json') as input_file:
    seframes = json.load(input_file)

In [9]:
def has_meaningful_frame(text):    
    meaning_frames = [
        'Using', 'Being_obligated', 'Required_event', 'Causation', 'Attempt', 'Execution'
    ]
    
    if text in seframes:
        text_labels = seframes[text]
        if any([elem in meaning_frames for elem in text_labels]):
            return True
    
        
    return False

In [10]:
fold_results = dict()
# if os.path.isfile('bert_ds_android_w_frames.json'):
#     logger.info(Fore.YELLOW + "Loading data from cache" + Style.RESET_ALL)
#     with open('bert_ds_android.json') as input_file:
#         fold_results = json.load(input_file)

In [11]:
# @title Set environment variables

model_id = 'bert-base-uncased'
# model_id = 'distilbert-base-uncased'

import os
import contextlib
import tensorflow as tf
import os
import codecs
import numpy as np
import math
import json

import numpy as np
import pandas as pd

from collections import defaultdict, Counter
from tqdm import tqdm

USE_TPU = False
os.environ['TF_KERAS'] = '1'

# @title Initialize TPU Strategy
if USE_TPU:
    TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)
    tf.contrib.distribute.initialize_tpu_system(resolver)
    strategy = tf.contrib.distribute.TPUStrategy(resolver)

# sklearn libs
import sklearn
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score
from sklearn.metrics import precision_recall_fscore_support
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
from sklearn.metrics import classification_report

# Tensorflow Imports
import tensorflow as tf
from tensorflow.python import keras
import tensorflow.keras.backend as K
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import initializers


# Hugging face imports
from transformers import AutoTokenizer
from transformers import TFDistilBertForSequenceClassification, TFBertForSequenceClassification
from transformers import TFDistilBertModel, DistilBertConfig
from transformers import DistilBertTokenizerFast, BertTokenizerFast

Falling back to TensorFlow client; we recommended you install the Cloud TPU client directly with pip install cloud-tpu-client.


In [12]:
# @title Model parameters

# Bert Model Constants
SEQ_LEN = 64 # 128
BATCH_SIZE = 64 # 64 32 larger batch size causes OOM errors
EPOCHS = 10 # 3 4
LR = 1e-5 # 2e-5

# 3e-4, 1e-4, 5e-5, 3e-5
# My own constants
# USE_FRAME_FILTERING = False
# UNDERSAMPLING = True
# N_UNDERSAMPLING = 2 # ratio of how many samples from 0-class, to 1-class, e.g.: 2:1
# USE_DS_SYNTHETIC = False

USE_FRAME_FILTERING = True
MATCH_FRAME_FROM_TASK = False
USE_PYRAMID = True

UNDERSAMPLING = True
N_UNDERSAMPLING = 2 # ratio of how many samples from 0-class, to 1-class, e.g.: 2:1
USE_DS_SYNTHETIC = False
MIN_W = 3

In [13]:
# @title JSON to dataframe helper functions
def undersample_df(df, n_times=3):
    class_0,class_1 = df.category_index.value_counts()
    c0 = df[df['category_index'] == 0]
    c1 = df[df['category_index'] == 1]
    df_0 = c0.sample(int(n_times * class_1))
    
    undersampled_df = pd.concat([df_0, c1],axis=0)
    return undersampled_df

def get_ds_synthetic_data(min_w=MIN_W):
    short_task = {
      "bugzilla": """How to query bugs using the custom fields with the Bugzilla REST API?""",
      "databases": """Which technology should be adopted for the database layer abstraction: Object/Relational Mapping (ORM) or a Java Database Connectivity API (JDBC)?""",
      "gpmdpu": """Can I bind the cmd key to the GPMDPU shortcuts?""",
      "lucene": """How does Lucene compute similarity scores for the BM25 similarity?""",
      "networking": """Which technology should be adopted for the notification system, Server-Sent Events (SSE) or WebSockets?""",
    }

    with open('relevance_corpus.json') as ipf:
        aux = json.load(ipf)
        raw_data = defaultdict(list)
        for d in aux:
            if d['task'] == 'yargs':
                continue

            raw_data['text'].append(d['text'])
            raw_data['question'].append(short_task[d['task']])
            raw_data['source'].append(d['source'])
            raw_data['category_index'].append(1 if d['weight'] > min_w else 0)
            raw_data['weights'].append(d['weight'] if d['weight'] > min_w else 0)
 
        data = pd.DataFrame.from_dict(raw_data)
        data = undersample_df(data, n_times=1)
        data = data.sample(frac=1).reset_index(drop=True)
      
    return data

def get_class_weights(y, smooth_factor=0, upper_bound=5.0):
    """
    Returns the weights for each class based on the frequencies of the samples
    :param smooth_factor: factor that smooths extremely uneven weights
    :param y: list of true labels (the labels must be hashable)
    :return: dictionary with the weight for each class
    """
    counter = Counter(y)

    if smooth_factor > 0:
        p = max(counter.values()) * smooth_factor
        for k in counter.keys():
            counter[k] += p

    majority = max(counter.values())

    clazz = {cls: float(majority / count) for cls, count in counter.items()}
    result = {}
    for key, value in clazz.items():
        if value > upper_bound:
            value = upper_bound
        
        result[key] = value
    return result
    
    
def add_raw_data(result, data, use_pyramid=False):
    s = data['source']
    if 'docs.oracle' in s or 'developer.android' in s:
        source_type = 'api'
    elif 'stackoverflow.com' in s:
        source_type = 'so'
    elif 'github.com' in s:
        source_type = 'git'
    else:
        source_type = 'misc'
    
    if use_pyramid:
        pyramid = data['category_index']
    else:
        pyramid = 1 if data['weights'] > 1 else 0        
    
    result['text'].append(data['text'])
    result['question'].append(data['question'])
    result['source'].append(data['source'])
    result['category_index'].append(pyramid)
    result['weights'].append(data['weights'])
    result['source_type'].append(source_type)    
    


In [14]:
# @title Tokenizer

print(model_id)
if model_id == 'distilbert-base-uncased':
    tokenizer = DistilBertTokenizerFast.from_pretrained(model_id, cache_dir='/home/msarthur/scratch', local_files_only=True)
else:
    tokenizer = BertTokenizerFast.from_pretrained(model_id, cache_dir='/home/msarthur/scratch', local_files_only=True)

bert-base-uncased


In [15]:
tokenizer

PreTrainedTokenizerFast(name_or_path='bert-base-uncased', vocab_size=30522, model_max_len=512, is_fast=True, padding_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'})

In [16]:
# @title data encoder

def _encode(tokenizer, dataframe, max_length=SEQ_LEN):
    
    seq_a = dataframe['text'].tolist()
    seq_b = dataframe['question'].tolist()
    
    return tokenizer(seq_a, seq_b, truncation=True, padding=True, max_length=max_length)

def to_one_hot_encoding(data, nb_classes = 2):
    targets = np.array([data]).reshape(-1)
    one_hot_targets = np.eye(nb_classes)[targets]
    return one_hot_targets    

In [17]:
# @title Metrics & Logging functions

from sklearn.metrics import classification_report

recommendation_metrics = defaultdict(list)
prediction_metrics = defaultdict(list)
api_metrics = defaultdict(list)
so_metrics = defaultdict(list)
git_metrics = defaultdict(list)
misc_metrics = defaultdict(list)

classification_report_lst = []
log_examples_lst = []
source_lst = []
venn_diagram_set = []

def aggregate_macro_metrics(store_at, precision, recall, fscore):   
    store_at['precision'].append(precision)
    store_at['recall'].append(recall)
    store_at['fscore'].append(fscore)
    
    
def aggregate_macro_source_metrics(precision, recall, fscore, source):
    s = source
    if 'docs.oracle' in s or 'developer.android' in s:
        aggregate_macro_metrics(api_metrics, precision, recall, fscore)
    elif 'stackoverflow.com' in s:
        aggregate_macro_metrics(so_metrics, precision, recall, fscore)
    elif 'github.com' in s:
        aggregate_macro_metrics(git_metrics, precision, recall, fscore)        
    elif  'github.com' not in s and 'docs.oracle' not in s and 'developer.android' not in s and 'stackoverflow.com' not in s:
        aggregate_macro_metrics(misc_metrics, precision, recall, fscore)
    

def aggregate_recommendation_metrics(store_at, k, precision_at_k, pyramid_precision_at_k):
    store_at['k'].append(k)
    store_at['precision'].append(precision_at_k)
    store_at['∆ precision'].append(pyramid_precision_at_k)
    
def aggregate_report_metrics(clz_report):
    relevant_label = str(1)
    if relevant_label in clz_report:
        for _key in ['precision', 'recall']:
            if _key in clz_report[relevant_label]:
                clz_report_lst[_key].append(clz_report[relevant_label][_key])    
                
def log_examples(task_title, source, text, pweights, y_predict, y_probs, k=10):
    # get the predicted prob at every index
    idx_probs = [(idx, y_predict[idx], y_probs[idx]) for idx, _ in enumerate(y_predict)]
    
    # filter probs for all indexes predicted as relevant  
    idx_probs = list(filter(lambda k: k[1] == 1, idx_probs))
    
    most_probable = sorted(idx_probs, key=lambda i: i[2], reverse=True)
    
    result = [idx for idx, _, _ in most_probable][:k]
    
    for idx in result:
        log_examples_lst.append((
            source, 
            task_title,
            pweights[idx],
            y_predict[idx],
            y_probs[idx],
            text[idx]
        ))
        
def log_venn_diagram(y_true, y_predicted, text):
    cnt = 0
    try:
        for _true, _predict, _t in zip(y_true, y_predicted, text):
            if _true == 1 and _predict == 1:
                cnt += 1
                venn_diagram_set.append(_t)
    except Exception as ex:
        logger.info(str(ex))
    logger.info(Fore.RED + str(cnt) + Style.RESET_ALL + " entries logged")

    
def avg_macro_metric_for(data):
    __precision = data['precision']
    __recall = data['recall']
    __fscore = data['fscore']

    return np.mean(__precision), np.mean(__recall), np.mean(__fscore)        

In [18]:
#@title Training procedures

def get_train_val_test(task_uid, size=0.9, undersample=False, aug=True, undersample_n=3):
    if not isinstance(task_uid, list):
        task_uid = [task_uid]
        
    train_data_raw = defaultdict(list)
    test_data_raw = defaultdict(list)
    
    for _data in tqdm(CORPUS):
        if _data['question'] in task_uid:
            add_raw_data(test_data_raw, _data, use_pyramid=USE_PYRAMID)
        else:
            add_raw_data(train_data_raw, _data, use_pyramid=USE_PYRAMID)
    
    train_val = pd.DataFrame.from_dict(train_data_raw)
    test = pd.DataFrame.from_dict(test_data_raw)
    
    # https://stackoverflow.com/questions/29576430/shuffle-dataframe-rows
    #  randomize rows....    
    train_val = train_val.sample(frac=1).reset_index(drop=True)
    test = test.sample(frac=1).reset_index(drop=True)
    
    if undersample:
        train_val = undersample_df(train_val, n_times=undersample_n)
        train_val = train_val.sample(frac=1).reset_index(drop=True)
        
    if aug:
        train_val = pd.concat([train_val, get_ds_synthetic_data()],axis=0)
        train_val = train_val.sample(frac=1).reset_index(drop=True)
    
    weights = get_class_weights(train_val['category_index'].tolist())
    
    train, val = train_test_split(
        train_val, 
        stratify=train_val['category_index'].tolist(), 
        train_size=size
    )
    
    return train, val, test, weights        

In [19]:
from itertools import combinations, product


def get_most_common_frame_relationships(df_train):
    frame_task_pairs = []
    df_filtered = df_train[df_train['category_index'] == 1]
    for __task, __text in zip(df_filtered['question'].tolist(), df_filtered['text'].tolist()):

        task_labels, text_labels = [], []
        if __task in seframes:
            task_labels = seframes[__task]

        if __text in seframes:
            text_labels = seframes[__text]

        if task_labels and text_labels:
            all_pairs = list(product(task_labels, text_labels))
            frame_task_pairs += all_pairs

    most_common_frame_relationships = [pair for pair, cnt in Counter(frame_task_pairs).most_common(100)]
    return most_common_frame_relationships

def has_common_task_frame(task_title, text, most_common_frame_relationships):
    task_labels, text_labels = [], []
    if task_title in seframes:
        task_labels = seframes[task_title]
    else:
        return False

    if text in seframes:
        text_labels = seframes[text]
    else:
        return False
        
        
    all_pairs = list(product(task_labels, text_labels))
    has_frame_match = any([elem in most_common_frame_relationships for elem in all_pairs])
    
    return has_frame_match
    
    
    

In [20]:
def update_predictions(task_title, text, y_true, y_predict, y_probs, relevant_class=1, max_pred_values=10):
    y_true_prime = []
    y_predict_prime = []
        
    
    # update probs after k = 10, same as in eval_model 
    aux = [(idx, prob) for idx, prob in enumerate(y_probs)]
    
    cnt = 0
    for idx, prob in sorted(aux, key=lambda k: k[1], reverse=True):
        y_true_prime.append(y_true[idx])
        _t = text[idx]
        
        cnt += 1
        if cnt > max_pred_values:
            y_predict_prime.append(y_predict[idx])
        else:
            if has_meaningful_frame(_t):
                y_predict_prime.append(max(y_predict[idx], relevant_class))
            else:
                y_predict_prime.append(y_predict[idx])
                
    
    return y_true_prime, y_predict_prime

In [21]:
def update_predictions_with_task(task_title, text, y_true, y_predict, y_probs, task_filter, relevant_class=1, max_pred_values=10):
    y_true_prime = []
    y_predict_prime = []
        
    
    # update probs after k = 10, same as in eval_model 
    aux = [(idx, prob) for idx, prob in enumerate(y_probs)]
    max_pred_values = max(int(len(text) * 0.15), 10)
    
    cnt = 0
    for idx, prob in sorted(aux, key=lambda k: k[1], reverse=True):
        y_true_prime.append(y_true[idx])
        _t = text[idx]
        
        cnt += 1
        if cnt > max_pred_values:
            y_predict_prime.append(y_predict[idx])
        else:
            if has_common_task_frame(task_title, _t, task_filter):
                y_predict_prime.append(max(y_predict[idx], relevant_class))
            else:
                y_predict_prime.append(y_predict[idx])
                
    
    return y_true_prime, y_predict_prime

In [22]:
# @title Testing procedures

# https://medium.com/geekculture/hugging-face-distilbert-tensorflow-for-custom-text-classification-1ad4a49e26a7
def eval_model(model, test_data, max_pred_values=10):
    preds = model.predict(test_data.batch(1)).logits  
    
    #transform to array with probabilities
    res = tf.nn.softmax(preds, axis=1).numpy()      

    y_predict, y_probs = res.argmax(axis=-1), res[:, 1]
    aux = [(idx, prob) for idx, prob in enumerate(y_probs)]
    
    max_pred_values = max(int(len(y_predict) * 0.15), 10)
    
    cnt = 0
    for idx, prob in sorted(aux, key=lambda k: k[1], reverse=True):
#         if cnt < max_pred_values:
#             logger.info(f"DEBUG : {y_predict[idx]} {round(prob, 4)}")
        
        cnt += 1
        if cnt > max_pred_values:
            y_predict[idx] = 0
            
    
    return y_predict, y_probs
    

def test_model(source, df_test, model, tokenizer, pos_filter=False, task_filter=None):
    
    df_source = df_test[df_test["source"] == source]   
    task_title = df_source['question'].tolist()[0]
    text = df_source['text'].tolist()
    pweights = df_source['weights'].tolist()
    
    # Encode X_test
    test_encodings = _encode(tokenizer, df_source)
    test_labels = df_source['category_index'].tolist()
    
    test_dataset = tf.data.Dataset.from_tensor_slices((
        dict(test_encodings),
        test_labels
    ))
    
    y_true = [y.numpy() for x, y in test_dataset]
    
    if any([k == 1 for k in y_true]): # means that this source has at least one annotated sentence
        y_predict, y_probs = eval_model(model, test_dataset)

    
        if task_filter:
            y_true, y_predict = update_predictions_with_task(task_title, text, y_true, y_predict, y_probs, task_filter)
            
        if pos_filter:
            y_true, y_predict = update_predictions(task_title, text, y_true, y_predict, y_probs)


        if len(y_true) > 0 and len(y_predict) > 0:
            accuracy = accuracy_score(y_true, y_predict)
            macro_f1 = f1_score(y_true, y_predict, average='macro')

            classification_report_lst.append(classification_report(y_true, y_predict))
            aggregate_report_metrics(classification_report(y_true, y_predict, output_dict=True))


            logger.info("-" * 20)    

            logger.info("Y")
            logger.info("[0s] {} [1s] {}".format(
                len(list(filter(lambda k: k== 0, y_true))),
                len(list(filter(lambda k: k== 1, y_true)))
            ))


            logger.info("predicted")
            logger.info("[0s] {} [1s] {}".format(
                len(list(filter(lambda k: k== 0, y_predict))),
                len(list(filter(lambda k: k== 1, y_predict)))
            ))

            logger.info("-" * 20)

            logger.info("Accuracy: {:.4f}".format(accuracy))
            logger.info("macro_f1: {:.4f}".format(macro_f1))

            precision, recall, fscore, _ = precision_recall_fscore_support(y_true, y_predict, average='macro')

            aggregate_macro_metrics(prediction_metrics, precision, recall, fscore)
            aggregate_macro_source_metrics(precision, recall, fscore, source)

            logger.info("Precision: {:.4f}".format(precision))
            logger.info("Recall: {:.4f}".format(recall))
            logger.info("F1: {:.4f}".format(fscore))

            log_examples(task_title, source, text, pweights, y_predict, y_probs, k=10)
            log_venn_diagram(y_true, y_predict, text)
            source_lst.append(source)

In [23]:
def add_idx_fold_results(idx_split, store_at):
    if idx_split not in store_at:
        store_at[idx_split] = dict()
        store_at[idx_split]['run_cnt'] = 0
        store_at[idx_split]['overall'] = defaultdict(list)
        store_at[idx_split]['api'] = defaultdict(list)
        store_at[idx_split]['so'] = defaultdict(list)
        store_at[idx_split]['git'] = defaultdict(list)
        store_at[idx_split]['misc'] = defaultdict(list)
    
    store_at[idx_split]['run_cnt'] += 1
    
    _precision, _recall, _f1score = avg_macro_metric_for(prediction_metrics)
    store_at[idx_split]['overall']['precision'].append(_precision)
    store_at[idx_split]['overall']['recall'].append(_recall)
    store_at[idx_split]['overall']['fscore'].append(_f1score)  
    
    _precision, _recall, _f1score = avg_macro_metric_for(api_metrics)
    store_at[idx_split]['api']['precision'].append(_precision)
    store_at[idx_split]['api']['recall'].append(_recall)
    store_at[idx_split]['api']['fscore'].append(_f1score)  
    
    _precision, _recall, _f1score = avg_macro_metric_for(so_metrics)
    store_at[idx_split]['so']['precision'].append(_precision)
    store_at[idx_split]['so']['recall'].append(_recall)
    store_at[idx_split]['so']['fscore'].append(_f1score)  
    
    _precision, _recall, _f1score = avg_macro_metric_for(git_metrics)
    store_at[idx_split]['git']['precision'].append(_precision)
    store_at[idx_split]['git']['recall'].append(_recall)
    store_at[idx_split]['git']['fscore'].append(_f1score)  
    
    _precision, _recall, _f1score = avg_macro_metric_for(misc_metrics)
    store_at[idx_split]['misc']['precision'].append(_precision)
    store_at[idx_split]['misc']['recall'].append(_recall)
    store_at[idx_split]['misc']['fscore'].append(_f1score)  

In [24]:
# model = TFBertForSequenceClassification.from_pretrained(model_id, cache_dir='/home/msarthur/scratch', local_files_only=True)

In [25]:
# @title 10-fold cross validation WIP
CORPUS = raw_data

all_tasks = sorted(list(set([d['question'] for d in raw_data])))
rseed = 20210343
random.seed(rseed)
random.shuffle(all_tasks)

from sklearn.model_selection import KFold


file_handler = logging.FileHandler('/home/msarthur/scratch/LOG-bert_ds_android.ans')
file_handler.setLevel(logging.DEBUG)
logger.addHandler(file_handler)


n_splits = 10
kf = KFold(n_splits=n_splits, random_state=rseed)
np_tasks_arr = np.array(all_tasks)



idx_split = 0
for train_index, test_index in kf.split(np_tasks_arr):

    idx_split = str(idx_split)
    eval_fold = True
    # 10 runs per fold to avoid reporting peek results in a given fold
    if idx_split in fold_results and fold_results[idx_split]['run_cnt'] >= 10:
        logger.info(Fore.RED + f"Fold {idx_split} FULLY TESTED" + Style.RESET_ALL)
        eval_fold = False


    if eval_fold:
        # <------------------------------------------------------------------------- EVAL VARIABLES
        recommendation_metrics = defaultdict(list)
        prediction_metrics = defaultdict(list)
        api_metrics = defaultdict(list)
        so_metrics = defaultdict(list)
        git_metrics = defaultdict(list)
        misc_metrics = defaultdict(list)
        random_prediction_metrics = defaultdict(list)
        clz_report_lst = defaultdict(list)

        classification_report_lst = []
        log_examples_lst = []
        source_lst = []
        venn_diagram_set = []
        # <------------------------------------------------------------------------- EVAL VARIABLES


        test_tasks_lst = np_tasks_arr[test_index].tolist()

        logger.info("")
        logger.info(Fore.RED + f"Fold {idx_split}" + Style.RESET_ALL)
        logger.info('\n'.join(test_tasks_lst))

        # <------------------------------------------------------------------------- INPUT
        df_train, df_val, df_test, weights = get_train_val_test(
            test_tasks_lst,
            aug=USE_DS_SYNTHETIC,
            undersample=UNDERSAMPLING, 
            undersample_n=N_UNDERSAMPLING
        )
        # <------------------------------------------------------------------------- INPUT

        logger.info('-' * 10)
        logger.info(Fore.RED + 'train'+ Style.RESET_ALL)
        logger.info(str(df_train.category_index.value_counts()))
        logger.info("")

        logger.info(Fore.RED + 'test'+ Style.RESET_ALL)
        logger.info(str(df_test.category_index.value_counts()))
        logger.info("")

        logger.info(Fore.RED + 'weights'+ Style.RESET_ALL)
        logger.info(str(weights))
        logger.info('-' * 10)


        # Encode X_train
        train_encodings = _encode(tokenizer, df_train)
        train_labels = df_train['category_index'].tolist()

        # Encode X_valid
        val_encodings = _encode(tokenizer, df_val)
        val_labels = df_val['category_index'].tolist()


        # https://huggingface.co/transformers/custom_datasets.html
        train_dataset = tf.data.Dataset.from_tensor_slices((
            dict(train_encodings),
            train_labels
        ))

        val_dataset = tf.data.Dataset.from_tensor_slices((
            dict(val_encodings),
            val_labels
        ))


        if model_id == 'distilbert-base-uncased':
            model = TFDistilBertForSequenceClassification.from_pretrained(
                model_id, cache_dir='/home/msarthur/scratch'
            )
        else:
            model = TFBertForSequenceClassification.from_pretrained(
                model_id, cache_dir='/home/msarthur/scratch', local_files_only=True
            )

        # freeze all the parameters
        # for param in model.parameters():
        #   param.requires_grad = False


        optimizer = tf.keras.optimizers.Adam(learning_rate=LR)
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

        METRICS = [
            tf.keras.metrics.SparseCategoricalAccuracy()
        ]

        early_stopper = tf.keras.callbacks.EarlyStopping(
            monitor='val_loss', mode='min', patience=4, 
            verbose=1, restore_best_weights=True
        )

        # https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ModelCheckpoint
        checkpoint_filepath = '/home/msarthur/scratch/best_model'

        mc = tf.keras.callbacks.ModelCheckpoint(
            checkpoint_filepath, 
            monitor='val_loss', mode='min', verbose=1, 
            save_best_only=True,
            save_weights_only=True
        )

        model.compile(
            optimizer=optimizer,
            loss=loss_fn,
            metrics=METRICS
        )

        # https://discuss.huggingface.co/t/how-to-dealing-with-data-imbalance/393/3
        # https://wandb.ai/ayush-thakur/huggingface/reports/Early-Stopping-in-HuggingFace-Examples--Vmlldzo0MzE2MTM
        model.fit(
            train_dataset.shuffle(1000).batch(BATCH_SIZE), 
            epochs=EPOCHS, 
            batch_size=BATCH_SIZE,
            class_weight=weights,
            validation_data=val_dataset.shuffle(1000).batch(BATCH_SIZE),
            callbacks=[early_stopper, mc]
        )

        model.load_weights(checkpoint_filepath)
        
        most_common_frame_relationships = None
        if MATCH_FRAME_FROM_TASK:
            most_common_frame_relationships = get_most_common_frame_relationships(df_train)

        logger.info("")
        logger.info(Fore.RED + f"Testing model" + Style.RESET_ALL)
        for source in df_test["source"].unique():
            df_source = df_test[df_test["source"] == source]   
            logger.info(source)
            test_model(source, df_source, model, tokenizer, pos_filter=USE_FRAME_FILTERING, task_filter=most_common_frame_relationships)

        add_idx_fold_results(idx_split, fold_results)
        if 'venn_diagram_set' not in fold_results:
            fold_results['venn_diagram_set'] = []

        fold_results['venn_diagram_set'] += venn_diagram_set
        fold_results['venn_diagram_set'] = list(set(fold_results['venn_diagram_set']))


        _precision, _recall, _f1score = avg_macro_metric_for(prediction_metrics)

        logger.info("")
        logger.info(Fore.YELLOW + "Model metrics" + Style.RESET_ALL)
        logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
        logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
        logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)




        log_sources_data = [api_metrics, so_metrics, git_metrics, misc_metrics]
        log_sources_ids = ['api_metrics', 'so_metrics', 'git_metrics', 'misc_metrics']

        for _id, __data in zip(log_sources_ids, log_sources_data):
            _precision, _recall, _f1score = avg_macro_metric_for(__data)

            logger.info("")
            logger.info(Fore.YELLOW + f"{_id}" + Style.RESET_ALL)
            logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
            logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
            logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)


    idx_split = int(idx_split)
    idx_split += 1
    logger.info(f"next {idx_split}")
#     break
#     if idx_split >= 2:
#         logger.info(f"breaking at {idx_split}")
#         break


[31mFold 0[0m
how can i get the value of text view in recyclerview item?
Hide MarkerView when nothing selected
How to check programmatically whether app is running in debug mode or not?
JSONObject parse dictionary objects
Want to add drawable icons insteadof colorful dots


100%|██████████| 7916/7916 [00:00<00:00, 861273.94it/s]

----------
[31mtrain[0m
0    1656
1     828
Name: category_index, dtype: int64

[31mtest[0m
0    664
1     71
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
AutoGraph could not transform <bound method Socket.send of <zmq.sugar.socket.Socket object at 0x2b872c29a3d0>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
The parameter `return_dict` cannot be set in graph mode an

100%|██████████| 7916/7916 [00:00<00:00, 750092.86it/s]

----------
[31mtrain[0m
0    1613
1     806
Name: category_index, dtype: int64

[31mtest[0m
0    622
1     95
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.64660, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.64660 to 0.58518, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

  _warn_prf(average, modifier, msg_start, len(result))
100%|██████████| 7916/7916 [00:00<00:00, 352602.51it/s]

----------
[31mtrain[0m
0    1459
1     730
Name: category_index, dtype: int64

[31mtest[0m
0    1178
1     180
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.68853, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.68853 to 0.68054, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

[31m6[0m entries logged
https://developer.android.com/training/keyboard-input/commands
--------------------
Y
[0s] 11 [1s] 3
predicted
[0s] 4 [1s] 10
--------------------
Accuracy: 0.3571
macro_f1: 0.3538
Precision: 0.4750
Recall: 0.4697
F1: 0.3538
[31m2[0m entries logged
https://github.com/morenoh149/react-native-contacts/issues/516
--------------------
Y
[0s] 30 [1s] 4
predicted
[0s] 29 [1s] 5
--------------------
Accuracy: 0.7353
macro_f1: 0.4237
Precision: 0.4310
Recall: 0.4167
F1: 0.4237
[31m0[0m entries logged
https://developer.android.com/training/safetynet/recaptcha
--------------------
Y
[0s] 34 [1s] 20
predicted
[0s] 44 [1s] 10
--------------------
Accuracy: 0.5926
macro_f1: 0.4923
Precision: 0.5182
Recall: 0.5118
F1: 0.4923
[31m4[0m entries logged
https://stackoverflow.com/questions/35357919
--------------------
Y
[0s] 41 [1s] 12
predicted
[0s] 47 [1s] 6
--------------------
Accuracy: 0.7358
macro_f1: 0.5316
Precision: 0.5603
Recall: 0.5346
F1: 0.5316
[31m2[0m entr

100%|██████████| 7916/7916 [00:00<00:00, 825143.16it/s]

----------
[31mtrain[0m
0    1596
1     798
Name: category_index, dtype: int64

[31mtest[0m
0    714
1    104
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.62396, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.62396 to 0.56062, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

  out=out, **kwargs)
  ret = ret.dtype.type(ret / rcount)
100%|██████████| 7916/7916 [00:00<00:00, 845461.29it/s]

----------
[31mtrain[0m
0    1710
1     855
Name: category_index, dtype: int64

[31mtest[0m
0    235
1     41
Name: category_index, dtype: int64

[31mweights[0m
{1: 2.0, 0: 1.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.68552, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.68552 to 0.67903, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

100%|██████████| 7916/7916 [00:00<00:00, 776185.49it/s]

----------
[31mtrain[0m
0    1577
1     788
Name: category_index, dtype: int64

[31mtest[0m
0    752
1    115
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.69975, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.69975 to 0.63259, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

Accuracy: 0.7708
macro_f1: 0.5107
Precision: 0.5237
Recall: 0.5667
F1: 0.5107
[31m1[0m entries logged
https://developer.android.com/guide/navigation/navigation-swipe-view-2
--------------------
Y
[0s] 16 [1s] 3
predicted
[0s] 10 [1s] 9
--------------------
Accuracy: 0.5789
macro_f1: 0.5128
Precision: 0.5611
Recall: 0.6146
F1: 0.5128
[31m2[0m entries logged
https://developer.android.com/guide/navigation/navigation-custom-back
--------------------
Y
[0s] 16 [1s] 17
predicted
[0s] 23 [1s] 10
--------------------
Accuracy: 0.5455
macro_f1: 0.5299
Precision: 0.5609
Recall: 0.5515
F1: 0.5299
[31m6[0m entries logged

[33mModel metrics[0m
precision: [31m0.567[0m
recall:    [31m0.599[0m
f1-score:  [31m0.555[0m

[33mapi_metrics[0m
precision: [31m0.540[0m
recall:    [31m0.561[0m
f1-score:  [31m0.520[0m

[33mso_metrics[0m
precision: [31m0.583[0m
recall:    [31m0.658[0m
f1-score:  [31m0.586[0m

[33mgit_metrics[0m
precision: [31mnan[0m
recall:    [31mnan[0m
f1-sco

  out=out, **kwargs)
  ret = ret.dtype.type(ret / rcount)
100%|██████████| 7916/7916 [00:00<00:00, 773455.18it/s]

----------
[31mtrain[0m
0    1492
1     746
Name: category_index, dtype: int64

[31mtest[0m
0    1119
1     162
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.70491, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.70491 to 0.62071, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

--------------------
Y
[0s] 16 [1s] 16
predicted
[0s] 22 [1s] 10
--------------------
Accuracy: 0.5625
macro_f1: 0.5466
Precision: 0.5727
Recall: 0.5625
F1: 0.5466
[31m6[0m entries logged
https://stackoverflow.com/questions/29738510
--------------------
Y
[0s] 19 [1s] 4
predicted
[0s] 21 [1s] 2
--------------------
Accuracy: 0.8261
macro_f1: 0.6167
Precision: 0.6786
Recall: 0.5987
F1: 0.6167
[31m1[0m entries logged
https://stackoverflow.com/questions/6442054
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
--------------------
Y
[0s] 12 [1s] 9
predicted
[0s] 11 [1s] 10
--------------------
Accuracy: 0.5714
macro_f1: 0.5675
Precision: 0.5682
Recall: 0.5694
F1: 0.5675
[31m5[0m entries logged
https://git

100%|██████████| 7916/7916 [00:00<00:00, 859690.59it/s]

----------
[31mtrain[0m
0    1634
1     817
Name: category_index, dtype: int64

[31mtest[0m
0    815
1     83
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.65287, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.65287 to 0.57749, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep


[33mModel metrics[0m
precision: [31m0.546[0m
recall:    [31m0.556[0m
f1-score:  [31m0.535[0m

[33mapi_metrics[0m
precision: [31m0.533[0m
recall:    [31m0.535[0m
f1-score:  [31m0.522[0m

[33mso_metrics[0m
precision: [31m0.535[0m
recall:    [31m0.565[0m
f1-score:  [31m0.521[0m

[33mgit_metrics[0m
precision: [31mnan[0m
recall:    [31mnan[0m
f1-score:  [31mnan[0m

[33mmisc_metrics[0m
precision: [31m0.592[0m
recall:    [31m0.582[0m
f1-score:  [31m0.586[0m
next 8

[31mFold 8[0m
SeekTo Position of cutted song not working
Android Gallery with pinch zoom
Wait for 2 async REST calls to result in success or error
how  to set Screenshot frame size


  _warn_prf(average, modifier, msg_start, len(result))
  out=out, **kwargs)
  ret = ret.dtype.type(ret / rcount)
100%|██████████| 7916/7916 [00:00<00:00, 818108.38it/s]

----------
[31mtrain[0m
0    1685
1     842
Name: category_index, dtype: int64

[31mtest[0m
0    333
1     55
Name: category_index, dtype: int64

[31mweights[0m
{1: 2.0, 0: 1.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.62975, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.62975 to 0.58528, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

100%|██████████| 7916/7916 [00:00<00:00, 788648.70it/s]

----------
[31mtrain[0m
0    1631
1     815
Name: category_index, dtype: int64

[31mtest[0m
0    493
1     85
Name: category_index, dtype: int64

[31mweights[0m
{0: 1.0, 1: 2.0}
----------



All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.

Epoch 00001: val_loss improved from inf to 0.67497, saving model to /home/msarthur/scratch/best_model
Epoch 2/10
Epoch 00002: val_loss improved from 0.67497 to 0.62412, saving model to /home/msarthur/scratch/best_model
Epoch 3/10
Ep

In [26]:
# for source in df_test["source"].unique():
#     df_source = df_test[df_test["source"] == source]   
#     logger.info(source)
#     test_model(source, df_source, model, tokenizer, pos_filter=True)
    

In [27]:
__precision, __recall, __fscore = [], [], []

for key_i, value in fold_results.items():
    if isinstance(value, dict):
        for key_j, __data in value.items():
            if key_j == 'overall':
                logger.info(Fore.YELLOW + f"{key_i}" + Style.RESET_ALL)
                logger.info("precision: " + Fore.RED +
                            "{:.3f}".format(np.mean(__data['precision'])) + Style.RESET_ALL +
                           f" {str([round(x, 2) for x in __data['precision']])}")
                logger.info("recall:    " + Fore.RED +
                            "{:.3f}".format(np.mean(__data['recall'])) + Style.RESET_ALL+
                           f" {str([round(x, 2) for x in __data['recall']])}")
                logger.info("f1-score:  " + 
                            Fore.RED + "{:.3f}".format(np.mean(__data['fscore'])) + Style.RESET_ALL+
                           f" {str([round(x, 2) for x in __data['fscore']])}")
                
                __precision += __data['precision']
                __recall += __data['recall']
                __fscore += __data['fscore']
                
__precision = [x for x in __precision if str(x) != 'nan']
__recall = [x for x in __recall if str(x) != 'nan']
__fscore = [x for x in __fscore if str(x) != 'nan']


logger.info("\n")
logger.info(Fore.RED + "AGGREGATED METRICS" + Style.RESET_ALL)
logger.info("\nprecision: " + Fore.RED + "{:.3f}".format(np.mean(__precision)) + Style.RESET_ALL)
logger.info("recall:    " + Fore.RED + "{:.3f}".format(np.mean(__recall)) + Style.RESET_ALL)
logger.info("f1-score:  " +  Fore.RED + "{:.3f}".format(np.mean(__fscore)) + Style.RESET_ALL)

[33m0[0m
precision: [31m0.520[0m [0.52]
recall:    [31m0.516[0m [0.52]
f1-score:  [31m0.510[0m [0.51]
[33m1[0m
precision: [31m0.626[0m [0.63]
recall:    [31m0.670[0m [0.67]
f1-score:  [31m0.627[0m [0.63]
[33m2[0m
precision: [31m0.544[0m [0.54]
recall:    [31m0.524[0m [0.52]
f1-score:  [31m0.511[0m [0.51]
[33m3[0m
precision: [31m0.497[0m [0.5]
recall:    [31m0.522[0m [0.52]
f1-score:  [31m0.494[0m [0.49]
[33m4[0m
precision: [31m0.580[0m [0.58]
recall:    [31m0.615[0m [0.62]
f1-score:  [31m0.576[0m [0.58]
[33m5[0m
precision: [31m0.567[0m [0.57]
recall:    [31m0.599[0m [0.6]
f1-score:  [31m0.555[0m [0.55]
[33m6[0m
precision: [31m0.582[0m [0.58]
recall:    [31m0.596[0m [0.6]
f1-score:  [31m0.561[0m [0.56]
[33m7[0m
precision: [31m0.546[0m [0.55]
recall:    [31m0.556[0m [0.56]
f1-score:  [31m0.535[0m [0.53]
[33m8[0m
precision: [31m0.540[0m [0.54]
recall:    [31m0.576[0m [0.58]
f1-score:  [31m0.542[0m [0.54]
[33m9[0m
pr

In [28]:
logger.info(Fore.YELLOW + "Caching results" + Style.RESET_ALL)
with open('bert_ds_android_best_config.json', 'w') as fo:
    json.dump(fold_results, fo, indent=4)

[33mCaching results[0m


In [29]:
fold_results.keys()

dict_keys(['0', 'venn_diagram_set', '1', '2', '3', '4', '5', '6', '7', '8', '9'])

In [30]:
# cnt = 0
# for source in df_test["source"].unique():
#     df_source = df_test[df_test["source"] == source]   
#     logger.info(source)
#     test_model(source, df_source, model, tokenizer, pos_filter=True)
#     cnt += 1
#     if cnt >= 5:
#         break

In [31]:
#@title Metrics report
# logger.info(json.dumps(fold_results, indent=4, sort_keys=True))

In [32]:
# _precision, _recall, _f1score = avg_macro_metric_for(prediction_metrics)

# logger.info("")
# logger.info(Fore.YELLOW + "Model metrics" + Style.RESET_ALL)
# logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
# logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
# logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)


# _precision, _recall, _f1score = avg_macro_metric_for(api_metrics)

# logger.info("")
# logger.info(Fore.YELLOW + "API metrics" + Style.RESET_ALL)
# logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
# logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
# logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)

# _precision, _recall, _f1score = avg_macro_metric_for(so_metrics)

# logger.info("")
# logger.info(Fore.YELLOW + "SO metrics" + Style.RESET_ALL)
# logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
# logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
# logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)

# _precision, _recall, _f1score = avg_macro_metric_for(git_metrics)

# logger.info("")
# logger.info(Fore.YELLOW + "GIT metrics" + Style.RESET_ALL)
# logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
# logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
# logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)

# _precision, _recall, _f1score = avg_macro_metric_for(misc_metrics)

# logger.info("")
# logger.info(Fore.YELLOW + "MISC metrics" + Style.RESET_ALL)
# logger.info("precision: " + Fore.RED + "{:.3f}".format(_precision) + Style.RESET_ALL)
# logger.info("recall:    " + Fore.RED + "{:.3f}".format(_recall) + Style.RESET_ALL)
# logger.info("f1-score:  " + Fore.RED + "{:.3f}".format(_f1score) + Style.RESET_ALL)

In [33]:
def examples_per_source_type(source_type='misc', n_samples=None):
    _sources = list(set([x[0] for x in log_examples_lst]))

    _template = "[w={}]" + Fore.RED + "[y={}]" + Fore.YELLOW + "[p={:.4f}]" + Style.RESET_ALL + " {}"

    idx = 0
    for s in _sources:
        examples_in_source = []
        if source_type == 'api' and ('docs.oracle' in s or 'developer.android' in s):
            examples_in_source = list(filter(lambda k: k[0] == s, log_examples_lst))
            task_title = examples_in_source[0][1]
            idx += 1
        elif source_type == 'so' and ('stackoverflow.com' in s):
            examples_in_source = list(filter(lambda k: k[0] == s, log_examples_lst))
            task_title = examples_in_source[0][1]            
            idx += 1
        elif source_type == 'git' and ('github.com' in s):
            examples_in_source = list(filter(lambda k: k[0] == s, log_examples_lst))
            task_title = examples_in_source[0][1]
            idx += 1
        elif source_type == 'misc' and 'github.com' not in s and 'docs.oracle' not in s and 'developer.android' not in s and 'stackoverflow.com' not in s:
            examples_in_source = list(filter(lambda k: k[0] == s, log_examples_lst))
            task_title = examples_in_source[0][1]
            idx += 1
        if not examples_in_source:
            continue
        logger.info('')
        logger.info(Fore.RED + f"{task_title}" + Style.RESET_ALL)    
        logger.info(s)
        logger.info('')

        for _, _, pweights, y_predict, y_probs, text in examples_in_source:
            logger.info(_template.format(pweights, y_predict, y_probs, text))
            logger.info('')
        logger.info('-' * 20)
      
        if n_samples and idx >= n_samples:
            break
    

In [34]:
#@title Sample prediction outputs for API sources

logger.info(Fore.RED + "API" + Style.RESET_ALL)
examples_per_source_type(source_type='api', n_samples=8)

[31mAPI[0m

[31mHilt: How to prevent Hilt from picking dependency from a library?[0m
https://developer.android.com/training/dependency-injection/hilt-android

[w=0][31m[y=1][33m[p=0.7922][0m For example, as you might need the Context class from either the application or the activity, Hilt provides the @ApplicationContext and @ActivityContext qualifiers.

[w=0][31m[y=1][33m[p=0.7892][0m The following example demonstrates how to scope a binding to a component in a Hilt module.

[w=0][31m[y=1][33m[p=0.7829][0m However, in most cases it is best to use Hilt to manage all of your usage of Dagger on Android.

[w=0][31m[y=1][33m[p=0.7786][0m Instead, provide Hilt with the binding information by creating an abstract function annotated with @Binds inside a Hilt module.

[w=0][31m[y=1][33m[p=0.7770][0m Hilt automatically generates and provides the following:

[w=0][31m[y=1][33m[p=0.7759][0m One way to provide binding information to Hilt is constructor injection.

[w=0][31m[

In [35]:
#@title Sample prediction outputs for GIT sources

logger.info(Fore.RED + "GIT" + Style.RESET_ALL)
examples_per_source_type(source_type='git', n_samples=8)

[31mGIT[0m

[31mHilt: How to prevent Hilt from picking dependency from a library?[0m
https://github.com/google/dagger/issues/1991

[w=0][31m[y=1][33m[p=0.7481][0m Hilt gradle plugin doesn't pick classes from custom android sdk-addon

[w=1][31m[y=1][33m[p=0.5822][0m In your case, you can add Retrofit and OkHttp dependency in app's build.gradle.

[w=0][31m[y=1][33m[p=0.5753][0m The workaround in the issue mentioned by Dany was working the last time I checked it.

[w=0][31m[y=1][33m[p=0.5584][0m We are aware of the implications this causes, such as leaking classes into other Gradle modules and possibly build performance impact with regards to compile avoidance.

[w=0][31m[y=1][33m[p=0.5574][0m I was surprised by this since I haven't really run into errors like this on my real project.

[w=0][31m[y=1][33m[p=0.5287][0m If you have build variants, this approach makes it easy to have different features in different variants.

[w=0][31m[y=1][33m[p=0.5101][0m In my samp

In [36]:
#@title Sample prediction outputs for SO sources

logger.info(Fore.RED + "SO" + Style.RESET_ALL)
examples_per_source_type(source_type='so', n_samples=8)

[31mSO[0m

[31mAndroid SQLite performance in complex queries[0m
https://stackoverflow.com/questions/4015026

[w=3][31m[y=1][33m[p=0.7519][0m If you have more complex queries that can't make use of any indexes that you might create, you can de-normalize your schema, structuring your data in such a way that the queries are simpler and can be answered using indexes.

[w=0][31m[y=1][33m[p=0.7025][0m I dropped this into my ContentProvider.query -LRB- -RRB- and now I can see exactly how all the queries are getting performed.

[w=0][31m[y=1][33m[p=0.7002][0m You can have indexes that contain multiple columns -LRB- to assist queries with multiple predicates -RRB-.

[w=0][31m[y=1][33m[p=0.6660][0m Only one index will be used on any given query.

[w=2][31m[y=1][33m[p=0.6300][0m Here's a bit of code to get EXPLAIN QUERY PLAN results into Android logcat from a running Android app.

[w=3][31m[y=1][33m[p=0.5909][0m If you have a lot of string / text type data, consider creating

In [37]:
#@title Sample prediction outputs for MISC sources

logger.info(Fore.RED + "MISC" + Style.RESET_ALL)
examples_per_source_type(source_type='misc', n_samples=8)

[31mMISC[0m

[31mAndroid App Retrieve Data from Server but in a Secure way[0m
https://medium.com/mindorks/how-to-pass-large-data-between-server-and-client-android-securely-345fed551651

[w=0][31m[y=1][33m[p=0.6295][0m How to pass large data between server and client ( android ) securely?Using RSA and AES ( Hybrid ) encryption techniqueMayank Mohan UpadhyayFollowJun 14, 2017 · 5 min readHello guys, Most of the times, we pass sensitive data from our Android app to our server.

[w=1][31m[y=1][33m[p=0.5248][0m Client will use this passcode to encrypt user's email ID and send to the server.

[w=0][31m[y=1][33m[p=0.4767][0m You can't ship the passcode in your app.

[w=0][31m[y=1][33m[p=0.4020][0m Also, it is impractical to use asymmetric encryption because 99 % of the times the data that you'd want to transfer would be of more than 128 bytes in size !

[w=0][31m[y=1][33m[p=0.3129][0m 1 public key and 1 private key.

[w=0][31m[y=1][33m[p=0.2585][0m What to do?Hybrid solu

In [38]:
logger.info(Fore.RED + f"{len(fold_results['venn_diagram_set'])} entries VENN SET" + Style.RESET_ALL)
for _t in fold_results['venn_diagram_set']:
    logger.info(_t)

[31m317 entries VENN SET[0m

It also helps simplify refactoring, since you can focus on what modules to build rather than focusing on the order in which they need to be created.
My answer builds on that from Kevin Wong, here as a one-liner using CollectionUtils from spring and a Java 8 lambda expression.
I've seen this cause a crash on Android 5.1.
The example code above will access the first, back-facing camera on a device with more than one camera.
The model in this case is Business and for our application, let's suppose we just need the name, phone, and image of the business which are all provided by the Search API.
To handle an individual key press, implement onKeyDown ( ) or onKeyUp ( ) as appropriate.
I spent some time thinking about it.
Encodes this object as a compact JSON string, such as:
bt when i load the list, it load wrong list items time to time.
There have been reports that this value is not 100 % reliable from Eclipse-based builds, though I personally have not encount

An Intent is an object that provides runtime binding between separate components, such as two activities.
However, I'm somewhat new to Jackson, so perhaps I'm missing something here.
This nested fragment is known as a child fragment.
If you run the app and tap the button on the first activity, the second activity starts but is empty.
Get news and tips by email Subscribe
Please see LINK for more info.
Java 7
If you try any of this out then there is a chance that you will do everything correct and yet your action bar will not show.
It is also working on a real device and I tested it in Panasonic P81.
custom event icon/add small icon to event · Issue # 181 · SundeepK/CompactCalendarView · GitHub
the final action in the ondraw method is the drawing of the bitmap.
Defining the Adapter Next, we need to define the adapter to describe the process of converting the Java object to a View ( in the getView method ).
- proc: only means that only annotation processing is done, without any subsequent

Returns the value mapped by name if it exists and is a boolean or can be coerced to a boolean, or fallback otherwise.
Notice that there's a FrameLayout with the id of @ + id/child _ fragment_container in which the child fragment will be inserted.
Returns the API response in a JsonObject.
I dropped this into my ContentProvider.query -LRB- -RRB- and now I can see exactly how all the queries are getting performed.
Wont help you in this case.
with the api 23, permission <uses-permission android:name="android.pemission.READ_CONTACTS"/> dont work, change the api level in the emulator for api 22 -LRB- lollipop -RRB- or lower
The system passes in the user response to the permission dialog, as well as the request code that you defined, as shown in the following code snippet:
Connecting the PagerAdapter and the ViewPager Open MainActivity.kt and add the following line at the top to declare your MoviesPagerAdapter:
If you launch a foreground service while the activity is visible, and the user the

An audio app should provide the ability to balance its output volume with other apps that might be playing on the same stream.
Set the audio encoder using setAudioEncoder ( ).
Anyone with HTTP POST knowledge could put random data inside of the g-recaptcha-response form field, and foll your site to make it think that this field was provided by the google widget.
Next, we define the parent component:
If the holder holds the view you want, you can reuse it.
A practical guide to using Hilt with Kotlin
The JSON structure was quite complex, with multiple levels and even an array.
If an uncaught exception is thrown by the finalize method, the exception is ignored and finalization of that object terminates.
David TruongJan 9, 2019 · 2 min read
1 - Enable touch in the chart
More information LINK
So in order to get a scoped provider in a module, you need to specify the scope for your module's provider method.
after that, you can easily start recording anywhere you want
the zip is attached at the