# TF-IDF score: Kullback–Leibler divergence

The plan is to take the sampling distributions of term frequency-inverse document frequency (TF-IDF) scores for human and synthetic text and use them to build a function that converts a TF-IDF score into a Kullback-Leibler divergence (KLD) score. See the figure below from the [Wikipedia article on KLD](https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence).

The workflow is as follows:
1. Get kernel density estimates of the TF-IDF score distributions for human and synthetic text fragments in the training data.
2. Calculate the KLD between the human and synthetic TF-IDF score distributions.
3. Get a kernel density estimate of the KLD.
4. Use the probability density function from the KLD kernel density estimate to calculate a KLD score for each text fragment in the training and testing data.
5. Add the KLD score as a new feature.

The above will be done individually for each fragment length bin and for the combined data. This way, the KLD score feature in each bin captures the TF-IDF score distribution for text fragments in that specific length regime, rather than for the whole dataset.
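The workflow above can be sketched for a single bin as follows. This is only an illustration under stated assumptions, not the project's implementation: the scores are randomly generated stand-ins, the names `padding`, `step` and `kld_score` are hypothetical, the synthetic density is assumed to be the reference distribution P, and linear interpolation over the grid is swapped in for the second kernel density estimate in step 3, since the way the KLD density is fitted is not shown here.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Made-up stand-ins for the TF-IDF scores of training fragments (not real data)
human_scores = rng.normal(loc=0.5, scale=1.0, size=500)
synthetic_scores = rng.normal(loc=-0.5, scale=1.0, size=500)

# Step 1: kernel density estimates of the two score distributions
human_kde = gaussian_kde(human_scores)
synthetic_kde = gaussian_kde(synthetic_scores)

# Evaluation grid spanning both samples, padded slightly at each end
# (padding and step are hypothetical names chosen for this sketch)
padding, step = 0.1, 0.01
low = min(human_scores.min(), synthetic_scores.min()) - padding
high = max(human_scores.max(), synthetic_scores.max()) + padding
x = np.arange(low, high, step)

# Step 2: pointwise KLD contribution p(x) * log(p(x) / q(x)).
# Assumption: p is the synthetic density and q is the human density.
eps = 1e-12  # avoids log(0) and division by zero in the tails
p = synthetic_kde(x) + eps
q = human_kde(x) + eps
kld = p * np.log(p / q)

# Steps 3-4: map any TF-IDF score to a KLD value. Linear interpolation
# stands in for the second kernel density estimate in this sketch.
def kld_score(scores):
    return np.interp(scores, x, kld)

# Step 5: the result would be stored as a new feature column
fragment_scores = np.array([-1.0, 0.0, 1.0])
fragment_kld = kld_score(fragment_scores)
```

With this choice of P and Q, scores where synthetic text is more likely than human text get a positive value and scores where human text is more likely get a negative one. Repeating the same steps per length bin gives each bin its own mapping.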

In [2]:
from IPython.display import Image
Image(url = 'https://raw.githubusercontent.com/gperdrizet/llm_detector/benchmarking/benchmarking/notebooks/images/KL-Gauss-Example.png')

## 1. Run set-up

In [None]:
# Change working directory to parent so we can import as we would from main.py
%cd ..

# Do the imports
import configuration as config
import functions.tf_idf_score as tf_idf_funcs
import functions.kullback_leibler_divergence as kld_funcs

In [None]:
# The dataset we want to bin - omit the file extension, it will be 
# added appropriately for the input and output files
dataset_name = 'falcon-7b_scores_v2_10-1000_words'

# Input file path
input_file = f'{config.DATA_PATH}/{dataset_name}.h5'

# Option to sample 10% of the data for rapid testing and development
sample = False

## 2. TF-IDF score
Before calculating a Kullback-Leibler divergence score for the TF-IDF score, we need to calculate the TF-IDF score itself for each fragment.

The TF-IDF score created for this project scores each text fragment using term TF-IDF values derived from the human and synthetic text fragments in the training data. The TF-IDF score is a product-normalized difference, calculated as:

$$ (\text{human} - \text{synthetic}) \times (\text{human} + \text{synthetic}) $$

Here, human and synthetic are the mean per-term TF-IDF values for a given text fragment, where the term TF-IDF values were derived from the human or the synthetic text in the training dataset, respectively.
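As a worked illustration of the formula, the sketch below scores one fragment against two small term lookup tables. Everything in it is hypothetical: the table contents are made up, `mean_tfidf` and `fragment_tf_idf_score` are names invented for this example, and skipping terms that are missing from a table is an assumption rather than the project's documented behavior.

```python
# Hypothetical term -> TF-IDF lookup tables; the values are made up
human_lut = {'the': 0.1, 'cat': 0.6, 'sat': 0.4}
synthetic_lut = {'the': 0.1, 'cat': 0.2, 'sat': 0.3}

def mean_tfidf(terms, lut):
    """Average TF-IDF over the fragment's terms found in the lookup table."""
    # Assumption: terms absent from the table are skipped
    values = [lut[term] for term in terms if term in lut]
    return sum(values) / len(values) if values else 0.0

def fragment_tf_idf_score(text, human_lut, synthetic_lut):
    """Product of the difference and the sum of the two mean TF-IDF values."""
    terms = text.lower().split()
    human = mean_tfidf(terms, human_lut)
    synthetic = mean_tfidf(terms, synthetic_lut)
    return (human - synthetic) * (human + synthetic)

score = fragment_tf_idf_score('The cat sat', human_lut, synthetic_lut)
print(round(score, 4))  # approximately 0.0944
```

Because the product of the difference and the sum equals the difference of squares, the score is positive when the human mean is larger and negative when the synthetic mean is larger.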

Let's parallelize the calculation over the text fragment length bins and add the resulting features to the data in our HDF5 dataset.

In [4]:
# Run the TF-IDF score calculation for each fragment length bin
tf_idf_funcs.tf_idf_score(
        hdf5_file = input_file,
        score_sample = sample
)


Worker 0 - 853 fragments in bin_100
Worker 1 - 808 fragments in bin_150
Worker 2 - 700 fragments in bin_200
Worker 3 - 600 fragments in bin_250
Worker 4 - 519 fragments in bin_300
Worker 5 - 417 fragments in bin_350
Worker 6 - 257 fragments in bin_400
Worker 7 - 115 fragments in bin_450
Worker 8 - 44 fragments in bin_500
Worker 9 - 32 fragments in bin_600
Worker 10 - 2394 fragments in combined
Worker 9 - get_term_tf_idf() error: empty vocabulary; perhaps the documents only contain stop words
Worker 9 - tf_idf_score_text_fragments() error: local variable 'tfidf_luts' referenced before assignment

bin_100 training features:

<class 'pandas.core.frame.DataFrame'>
Index: 85 entries, 6856 to 2801
Data columns (total 11 columns):
 #   Column                                    Non-Null Count  Dtype  
---  ------                                    --------------  -----  
 0   Fragment length (words)                   85 non-null     int64  
 1   Fragment length (tokens)                  85 no

## 3. TF-IDF Kullback-Leibler divergence score

In [5]:
# Run the Kullback-Leibler score calculation on the TF-IDF score
kld_funcs.kullback_leibler_score(
        feature_name = 'TF-IDF score',
        hdf5_file = input_file,
        padding = 0.1,
        sample_frequency = 0.001,
        score_sample = sample
)


Worker 0 - 85 fragments in bin_100
Worker 1 - 81 fragments in bin_150
Worker 2 - 70 fragments in bin_200
Worker 3 - 60 fragments in bin_250
Worker 1 - adding Kullback-Leibler score to training features
Worker 4 - get_pr_score_kdes() error: `dataset` input should have multiple elements.
Worker 4 - get_kld() error: local variable 'human_feature_kde' referenced before assignment
Worker 4 - get_kld_kde() error: local variable 'feature_kld' referenced before assignment
Worker 4 - adding Kullback-Leibler score to training features
Worker 4 - add_kld_score() error: local variable 'kld_kde' referenced before assignment
Worker 4 - 52 fragments in bin_300
Worker 5 - 42 fragments in bin_350
Worker 2 - adding Kullback-Leibler score to training features
Worker 5 - get_pr_score_kdes() error: `dataset` input should have multiple elements.
Worker 5 - get_kld() error: local variable 'human_feature_kde' referenced before assignment
Worker 5 - get_kld_kde() error: local variable 'feature_kld' referenced