This is the code for the paper 
# 🔥  HOTTER: Hierarchical Optimal Topic Transport with Explanatory Context Representations 🔥 

## 💥 What is HOTTER for?

 HOTTER is a document meta-distance using contextual embeddings from the language model BERT in combination with an LDA topic model. 
 
 HOTTER is an extension of the [research](https://proceedings.neurips.cc/paper/2019/file/8b5040a8a5baf3e0e67386c2e3a9b903-Paper.pdf) and [code](https://github.com/IBM/HOTT) by Yurochkin et al. on the Hierarchical Optimal Topic Transport (HOTT).

## 💥 Content

With this notebook, you can reproduce the results reported in the [HOTTER paper](https://aclanthology.org/2021.findings-emnlp.418/). You will go through the following steps:

1. Downloading the Datasets
2. Further Pre-Training a BERT Model on the Datasets
3. Extract Embeddings from the BERT Model
4. Preprocessing, Topic Modeling and Contextual Embedding Aggregation
5. HOTT(ER) Distance +  Baseline Calculation





## 🔥 1. Downloading the Datasets

First, we download the datasets from Kusner et al. from this [dropbox](https://www.dropbox.com/sh/nf532hddgdt68ix/AABGLUiPRyXv6UL2YAcHmAFqa?dl=0) to reproduce the process in which the HOTT metric was calculated. In addition, we also need the [raw texts](https://www.dropbox.com/sh/f44z3nt3i5279yt/AACHBs4qiISGPdBjB_aEgDVMa?dl=0) of the datasets which have not been preprocessed yet for a) fine-tuning the BERT model and b) extracting the contextual word embeddings.

In [1]:
# datasets used by HOTT
!mkdir data
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AACtd7NIdxXUfrxSvP-OUci4a/20ng2_500-emd_tr_te.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AABNO4w-1T6ozVLCxrRNjCgGa/amazon-emd_tr_te_split.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AAArRFEToUmSJg8G9v120rBQa/bbcsport-emd_tr_te_split.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AABLYdubbTlj5-VgiCNKTTopa/classic-emd_tr_te_split.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AAD31LaD0o04z7YSdxsfmZIca/ohsumed-emd_tr_te_ix.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AAD6gtc0gtB4n7zWwwnuPqrYa/r8-emd_tr_te3.mat"
!wget -P data "https://www.dropbox.com/sh/nf532hddgdt68ix/AABg21VwvpVlEXkAGwf4c3bxa/twitter-emd_tr_te_split.mat"


# corresponding raw documents of these datasets to fine-tune BERT and to extract contextual embeddings
!mkdir data/raw_data
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AABd4kiMVyKheozC57f_P7aVa/20ng-test-all-terms.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AADXmEvK3nLmTnAiOuQS4oS9a/20ng-train-all-terms.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AABarLseJpwx_0VbP8d2pbNVa/all_amazon_by_line.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AAD0yJEOwF158GbkB57fIS98a/all_classic.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AAAWKcvgigwi6FRCZronseB2a/bbsport_by_sentence.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AACXiE1JCHqHzyBrhmIphi_Wa/test_ohsumed_by_sentence.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AACvA-yfzsAilFyeorHdTIz4a/train_ohsumed_by_sentence.txt"
!wget -P data/raw_data "https://www.dropbox.com/sh/f44z3nt3i5279yt/AADTtNIktX-CqBYR2xR0LAPea/all_twitter_by_line.txt"
# last time we tested the download, the reuters data was not available
!wget -P data/raw_data "https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-train-all-terms.txt"
!wget -P data/raw_data "https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-test-all-terms.txt"


mkdir: cannot create directory ‘data’: File exists
--2021-11-11 23:45:06--  https://www.dropbox.com/sh/nf532hddgdt68ix/AACtd7NIdxXUfrxSvP-OUci4a/20ng2_500-emd_tr_te.mat
Resolving www.dropbox.com (www.dropbox.com)... 2620:100:6027:18::a27d:4812, 162.125.66.18
Connecting to www.dropbox.com (www.dropbox.com)|2620:100:6027:18::a27d:4812|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /sh/raw/nf532hddgdt68ix/AACtd7NIdxXUfrxSvP-OUci4a/20ng2_500-emd_tr_te.mat [following]
--2021-11-11 23:45:06--  https://www.dropbox.com/sh/raw/nf532hddgdt68ix/AACtd7NIdxXUfrxSvP-OUci4a/20ng2_500-emd_tr_te.mat
Reusing existing connection to [www.dropbox.com]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://ucf72b4e5401e281e56a3eb83fd2.dl.dropboxusercontent.com/cd/0/inline/BZz6CGBu4RjT1MqbnppuZgbWipCF454ynTqRlcsEVSpRC8tRKAkjyz8BzAjHwDIdsTN_aBS3kgyIBH_ocFhae82mphU8S6Q89PAToRlOlHklGzT2twqLmvkvSBFhk0YtdmwKHbTu3fNh5PSeGxIgVvSa/file# [following]
--2021

HTTP request sent, awaiting response... 200 OK
Length: 99661349 (95M) [application/octet-stream]
Saving to: ‘data/bbcsport-emd_tr_te_split.mat’


2021-11-11 23:49:14 (7,80 MB/s) - ‘data/bbcsport-emd_tr_te_split.mat’ saved [99661349/99661349]

--2021-11-11 23:49:14--  https://www.dropbox.com/sh/nf532hddgdt68ix/AABLYdubbTlj5-VgiCNKTTopa/classic-emd_tr_te_split.mat
Resolving www.dropbox.com (www.dropbox.com)... 2620:100:6027:18::a27d:4812, 162.125.66.18
Connecting to www.dropbox.com (www.dropbox.com)|2620:100:6027:18::a27d:4812|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /sh/raw/nf532hddgdt68ix/AABLYdubbTlj5-VgiCNKTTopa/classic-emd_tr_te_split.mat [following]
--2021-11-11 23:49:15--  https://www.dropbox.com/sh/raw/nf532hddgdt68ix/AABLYdubbTlj5-VgiCNKTTopa/classic-emd_tr_te_split.mat
Reusing existing connection to [www.dropbox.com]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://ucb8d7ecfa3c18b81101905abfdd.dl.dropboxu

HTTP request sent, awaiting response... 301 Moved Permanently
Location: /sh/raw/nf532hddgdt68ix/AABg21VwvpVlEXkAGwf4c3bxa/twitter-emd_tr_te_split.mat [following]
--2021-11-11 23:53:34--  https://www.dropbox.com/sh/raw/nf532hddgdt68ix/AABg21VwvpVlEXkAGwf4c3bxa/twitter-emd_tr_te_split.mat
Reusing existing connection to [www.dropbox.com]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://ucbf4191c6cb329113e75c8f84ae.dl.dropboxusercontent.com/cd/0/inline/BZxxBe85tJC6EU4J65RKF_dm9PkJ6ku0aiyLPmY3SKTgs4bjTpBv5gjofTC8hZKpT9oaPGqjppCJlY4unemHm4sNwbXPKGlESMqkw-N8ZKxKRyrhE0Ymar9EXivpkQKZqcXiCCmW-LMU8HM1OMD1xhgh/file# [following]
--2021-11-11 23:53:35--  https://ucbf4191c6cb329113e75c8f84ae.dl.dropboxusercontent.com/cd/0/inline/BZxxBe85tJC6EU4J65RKF_dm9PkJ6ku0aiyLPmY3SKTgs4bjTpBv5gjofTC8hZKpT9oaPGqjppCJlY4unemHm4sNwbXPKGlESMqkw-N8ZKxKRyrhE0Ymar9EXivpkQKZqcXiCCmW-LMU8HM1OMD1xhgh/file
Resolving ucbf4191c6cb329113e75c8f84ae.dl.dropboxusercontent.com (ucbf4191c6cb329113e75c8f84ae

--2021-11-11 23:53:51--  https://www.dropbox.com/sh/f44z3nt3i5279yt/AAD0yJEOwF158GbkB57fIS98a/all_classic.txt
Resolving www.dropbox.com (www.dropbox.com)... 2620:100:6027:18::a27d:4812, 162.125.72.18
Connecting to www.dropbox.com (www.dropbox.com)|2620:100:6027:18::a27d:4812|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /sh/raw/f44z3nt3i5279yt/AAD0yJEOwF158GbkB57fIS98a/all_classic.txt [following]
--2021-11-11 23:55:04--  https://www.dropbox.com/sh/raw/f44z3nt3i5279yt/AAD0yJEOwF158GbkB57fIS98a/all_classic.txt
Reusing existing connection to [www.dropbox.com]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc368305e9c7e9d35d3e13e358f5.dl.dropboxusercontent.com/cd/0/inline/BZw0GWdMSqt_lB0dDVAXxwdlhLf6UDKoh49lPmdKcw41ztT56rDzrq8vTNbVzgVUzf-W_iOM--KN5tPDG7OHSpU2Iab4h_VSuF5POcWKd9isjVRQAYGKqvA7HGte5GbPhGvBi0f7eaJ2wHVu6Hsppd2k/file# [following]
--2021-11-11 23:55:05--  https://uc368305e9c7e9d35d3e13e358f5.dl.dropboxuserconte

Resolving uc1dca98fb6fd38e5a2394558764.dl.dropboxusercontent.com (uc1dca98fb6fd38e5a2394558764.dl.dropboxusercontent.com)... 2620:100:6027:15::a27d:480f, 162.125.72.15
Connecting to uc1dca98fb6fd38e5a2394558764.dl.dropboxusercontent.com (uc1dca98fb6fd38e5a2394558764.dl.dropboxusercontent.com)|2620:100:6027:15::a27d:480f|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 277393 (271K) [text/plain]
Saving to: ‘data/raw_data/all_twitter_by_line.txt’


2021-11-11 23:55:16 (751 KB/s) - ‘data/raw_data/all_twitter_by_line.txt’ saved [277393/277393]

--2021-11-11 23:55:16--  https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-train-all-terms.txt
Resolving www.cs.umb.edu (www.cs.umb.edu)... 158.121.106.224
Connecting to www.cs.umb.edu (www.cs.umb.edu)|158.121.106.224|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2021-11-11 23:55:16 ERROR 404: Not Found.

--2021-11-11 23:55:17--  https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-test-al

We also clone the BERT repository

In [2]:
! git clone https://github.com/google-research/bert

Cloning into 'bert'...
remote: Enumerating objects: 340, done.[K
remote: Total 340 (delta 0), reused 0 (delta 0), pack-reused 340[K
Receiving objects: 100% (340/340), 328.28 KiB | 7.14 MiB/s, done.
Resolving deltas: 100% (182/182), done.


## 🔥 2. Further Pre-Training a BERT Model on the Datasets

This step of adjusting the BERT language model to the dataset is optional as it slightly improves results.

In [1]:
!git clone https://github.com/deepset-ai/FARM.git
%cd FARM
!pip install -r requirements.txt
!pip install --editable .
%cd ..
!pip install sentence_splitter

fatal: destination path 'FARM' already exists and is not an empty directory.
/home/sabine/Desktop/hotter_code/FARM
Collecting Werkzeug==0.16.1
  Using cached Werkzeug-0.16.1-py2.py3-none-any.whl (327 kB)


Installing collected packages: Werkzeug
  Attempting uninstall: Werkzeug
    Found existing installation: Werkzeug 1.0.1
    Uninstalling Werkzeug-1.0.1:
      Successfully uninstalled Werkzeug-1.0.1
Successfully installed Werkzeug-0.16.1
Obtaining file:///home/sabine/Desktop/hotter_code/FARM




Installing collected packages: farm
  Attempting uninstall: farm
    Found existing installation: farm 0.8.1-snapshot
    Uninstalling farm-0.8.1-snapshot:
      Successfully uninstalled farm-0.8.1-snapshot
  Running setup.py develop for farm
Successfully installed farm
/home/sabine/Desktop/hotter_code


In [1]:
import os
import logging
from pathlib import Path
import torch
from farm.data_handler.data_silo import DataSilo
from farm.data_handler.processor import BertStyleLMProcessor
from farm.modeling.adaptive_model import AdaptiveModel
from farm.modeling.language_model import LanguageModel
from farm.modeling.prediction_head import BertLMHead, NextSentenceHead
from farm.modeling.tokenization import Tokenizer
from farm.train import Trainer
from farm.modeling.optimization import initialize_optimizer
import re
from FARM.farm.utils import set_all_seeds, initialize_device_settings
import csv
from sentence_splitter import SentenceSplitter, split_text_into_sentences
import sys
csv.field_size_limit(sys.maxsize)
import traceback





11/12/2021 01:45:29 - INFO - farm.modeling.prediction_head -   Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .


### 💥 Select Dataset

Set the dataset you want to fine-tune a BERT model on. When the raw data is split by sentence, set the by_sent flag to True.


In [3]:
raw="./data/raw_data/bbsport_by_sentence.txt"
by_sent=True

In [10]:
def lm_finetuning():
    logging.basicConfig(
        format="%(asctime)s - %(levelname)s - %(name)s -   %(message)s",
        datefmt="%m/%d/%Y %H:%M:%S",
        level=logging.INFO,
    )

    set_all_seeds(seed=42)

    ##########################
    ########## Settings
    ##########################
    device, n_gpu = initialize_device_settings(use_cuda=False)
    n_epochs = 1
    batch_size = 32
    evaluate_every = 30
    lang_model = "bert-base-uncased"
    do_lower_case = False
    next_sent_pred_style = "bert-style"

    # 1.Create a tokenizer
    tokenizer = Tokenizer.load(
        pretrained_model_name_or_path=lang_model, do_lower_case=do_lower_case
    )

    # 2. Create a DataProcessor that handles all the conversion from raw text into a pytorch Dataset
    processor = BertStyleLMProcessor(
        data_dir=Path("."),
        train_filename="train.txt",
        test_filename="test.txt",
        dev_filename="dev.txt",
        tokenizer=tokenizer,
        max_seq_len=512,
        max_docs=3000000, # We have set max_docs to 20 to speed up data processing
        next_sent_pred=False,
        next_sent_pred_style=next_sent_pred_style
    )

    # 3. Create a DataSilo that loads several datasets (train/dev/test), provides DataLoaders for them and calculates a few descriptive statistics of our datasets
    data_silo = DataSilo(processor=processor, batch_size=batch_size ,caching=True)#, max_multiprocessing_chunksize=3)

    # 4. Create an AdaptiveModel
    # a) which consists of a pretrained language model as a basis
    language_model = LanguageModel.load(lang_model)
    # b) and *two* prediction heads on top that are suited for our task => Language Model finetuning
    lm_prediction_head = BertLMHead.load(lang_model)
    next_sentence_head = NextSentenceHead.load(lang_model)

    model = AdaptiveModel(
        language_model=language_model,
        prediction_heads=[lm_prediction_head, next_sentence_head],
        embeds_dropout_prob=0.1,
        lm_output_types=["per_token", "per_sequence"],
        device=device,
    )

    # 5. Create an optimizer
    model, optimizer, lr_schedule = initialize_optimizer(
        model=model,
        learning_rate=2e-5,
        device=device,
        n_batches=len(data_silo.loaders["train"]),
        n_epochs=n_epochs,
        grad_acc_steps=2,
    )

    # 6. Feed everything to the Trainer, which keeps care of growing our model into powerful plant and evaluates it from time to time
    trainer = Trainer.create_or_load_checkpoint(      
        model=model,
        optimizer=optimizer,
        data_silo=data_silo,
        epochs=n_epochs,
        n_gpu=n_gpu,
        lr_schedule=lr_schedule,
        evaluate_every=evaluate_every,
        device=device,
        grad_acc_steps=2,
        checkpoint_on_sigterm=True,
        checkpoint_every=200,
        checkpoint_root_dir=Path("."),
        resume_from_checkpoint="latest")
    # 7. Let it grow! Watch the tracked metrics live on the public mlflow server: https:/public-mlflow.deepset.ai
   # 
  
    trainer.train()
    # 8. Hooray! You have a model. Store it:
    save_dir = Path(".")
    model.save(save_dir)
    processor.save(save_dir)

### 💥 Further pre-train
Now that the model is specified, we can run the code for training the model on our chosen dataset. In case you get an out of memory error, you may want to set the above batch size to a smaller value or switch entirely to CPU. Hint: For this you have to specify use_cuda=False above and torch.device("cpu") below. You can also entirely skip this step in case you do not want to use a further pre-trained BERT model.

In [11]:
try:
        
        splitter = SentenceSplitter(language='en')
        data=[]
        txt_file=open(raw, "r", encoding='latin1')
        print("going through all data...")
        if not by_sent:
            for i, line in enumerate(txt_file.readlines()):
                if i%100==0:
                    print(i)
                data1=[]
                cols = line.split('\t')
                doc=(re.sub("[\r\n]+", "",cols[1]))
                sentences= splitter.split(text=doc)
                for text in sentences:
                    n = 511
                    
                    words = iter(text.split())
                    lines, current = [], next(words)
                    for word in words:
                        if len(current) + 1 + len(word) > n:
                            lines.append(current)
                            current = word
                        else:
                            current += " " + word
                    lines.append(current)  
                           
                    data1.append(("\n").join(lines) )           
                data.append(("\n").join(data1) )
                            
        else:
            sentences=[]
            doc_count=0
            for i, line in enumerate(txt_file.readlines()):
                cols = line.split('\t')
                if doc_count==int(cols[1]):
                    sentences.append(re.sub("[\r\n]+", "",cols[2]))
                else:
                    data1=[]
                    for v, text in enumerate(sentences):
                            
                            #print(v)
                            #print(text)
                            n = 511
                            if not re.match("\s+", text) :
                                words = iter(text.split())
                                
                                lines, current = [], next(words)
                                for word in words:
                                    if len(current) + 1 + len(word) > n:
                                        lines.append(current)
                                        current = word
                                    else:
                                        current += " " + word
                                lines.append(current)  
                                       
                                data1.append(("\n").join(lines) )
                    
                    data.append(("\n").join(data1))
                    sentences=[]
                    sentences.append(cols[2])
                    doc_count+=1  
      
        print("data sample")
        print(len(data))
        traininstances=int(len(data)*0.7)
        devinstances=int(len(data)*0.2)+traininstances
        text=("\n\n").join(data[:traininstances])
        devtext=("\n\n").join(data[traininstances:devinstances])
        testtext=("\n\n").join(data[devinstances:])
        print(len(text))
        with open("train.txt", "w", encoding="utf-8") as trainfile: 
            trainfile.write(text)
        with open("dev.txt", "w", encoding="utf-8") as devfile:
            devfile.write(devtext)
        with open("test.txt", "w", encoding="utf-8") as testfile:
            testfile.write(testtext)
        print(str("Cuda status: " +str(torch.cuda.is_available())))
        device = torch.device("cpu")#"cuda" if torch.cuda.is_available() else "cpu") 
        print("Devices available: {}".format(device))
        lm_finetuning()
except Exception as e:
        print(e)
        print(traceback.print_exc())


## 🔥 3. Extract Embeddings from the BERT Model
Here you either insert the model which you further pre-trained or standard models, such as bert-base-uncased. By adjusting the gpu, seq_length and num_processes parameters, you can ensure that the model runs efficiently on your machine.

In [6]:

import numpy as np
import re
import pickle as pkl
from farm.infer import Inferencer
logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)

#location of the fine-tuned model
ft="."


def map_raw(raw, by_sent):
    # this is for utf-8 encoding
    em=[]
    try:
        if by_sent:
            #raw_dict={}
            #cont_emb={}
            num_lines=sum(1 for line in open(raw, encoding="latin1"))
            z=open(raw, "r", encoding='latin1')
            doc_count=0
            sentences=[]
            #inferenced_model = Inferencer.load(ft, extraction_strategy="per_token", extraction_layer=-1, task_type="embeddings", gpu=False, max_seq_len=512, num_processes=5, disable_tqdm=True)
            inferenced_model = Inferencer.load("bert-base-uncased", extraction_strategy="per_token", extraction_layer=-1, task_type="embeddings", gpu=False, max_seq_len=512, num_processes=1, disable_tqdm=True)

            # the commented code below is for inputs that have sentences as 
            # the second column and where the document id is the first column 
            # (we start counting at 0)
            for i, line in enumerate(list(z.readlines())):
                articles=[]
                cols = line.split('\t')
                if doc_count==int(cols[1]):
                    sentences.append(re.sub("[\r\n]+", "",cols[2]))
                else:
                    #sentences=[" ".join(sentences)]
                    for s in sentences:
                        articles.append({'text': s})
                    result = inferenced_model.inference_from_dicts(articles)
                    #cont_emb[doc_count]=result
                    em.append(result)
                   
        
                    #print(result)
                    #input()
                    #print(cont_emb[doc_count])
                    print(str(str(i)+ " / " +str(num_lines)))
                    sentences=[]
                    sentences.append(cols[2])
                    doc_count+=1  
            #to get the last line too
            else:
                articles=[]
                sentences.append(cols[2])
                for s in sentences:
                    articles.append({'text': s})

                result = inferenced_model.inference_from_dicts(articles)
                em.append(result)
                print(doc_count)
                sentences=[]
            z.close()
            return em
                
            z.close()
        else:
            splitter = SentenceSplitter(language='en')
            num_lines=sum(1 for line in open(raw, encoding="latin1"))
            z=open(raw, "r", encoding='latin1')
           
            #inferenced_model = Inferencer.load(ft, extraction_strategy="per_token", extraction_layer=-1, task_type="embeddings", gpu=False, max_seq_len=512, num_processes=5, disable_tqdm=True)
            inferenced_model = Inferencer.load("bert-base-uncased", extraction_strategy="per_token", extraction_layer=-1, task_type="embeddings", gpu=False, max_seq_len=512, num_processes=1, disable_tqdm=True)

            for i, line in enumerate(list(z.readlines())):
             
                articles=[]
                cols = line.split('\t')
                doc=(re.sub("[\r\n]+", "",cols[1]))
                sentences= splitter.split(text=doc)
                for text in sentences:
                    
                    articles.append({'text': text})
                    
                    result = inferenced_model.inference_from_dicts(articles)
                  
                    em.append(result)
                   
                    print(str(str(i)+ " / " +str(num_lines)))
                    
            #to get the last line too
            else:
                articles=[]
                sentences= splitter.split(text=doc)
                for text in sentences:
                    articles.append({'text': text})
                    
                    result = inferenced_model.inference_from_dicts(articles)
                  
                    em.append(result)
                   
                    print(str(str(i)+ " / " +str(num_lines)))
            z.close()
            return em
                
            
    except Exception as E:
            print(E)
            print(traceback.print_exc())
            


if not os.path.exists(ft+"/%s_emb.pkl"%raw):
    
    embs=map_raw(ft+"/%s"%raw,by_sent)
    with open(ft+"/%s_emb.pkl"%os.path.splitext(raw)[0],"wb") as handle:
        pkl.dump(embs, handle)
    print("Done with embedding file.")


11/12/2021 01:47:43 - INFO - farm.utils -   Using device: CPU 
11/12/2021 01:47:43 - INFO - farm.utils -   Number of GPUs: 0
11/12/2021 01:47:43 - INFO - farm.utils -   Distributed Training: False
11/12/2021 01:47:43 - INFO - farm.utils -   Automatic Mixed Precision: None
11/12/2021 01:47:43 - INFO - farm.modeling.language_model -   
11/12/2021 01:47:43 - INFO - farm.modeling.language_model -   LOADING MODEL
11/12/2021 01:47:43 - INFO - farm.modeling.language_model -   Could not find bert-base-uncased locally.
11/12/2021 01:47:43 - INFO - farm.modeling.language_model -   Looking on Transformers Model Hub (in local cache and online)...
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.

0 / 12941


11/12/2021 01:47:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:47:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   greene crossed the line just number seconds behind gatlin  who won in number seconds in one of the closest and fastest sprints of all time 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11006, 4625, 1996, 2240, 2074, 2193, 3823, 2369, 11721, 19646, 2378, 2040, 2180, 1999, 2193, 3823, 1999, 2028, 1997, 1996, 7541, 1998, 7915, 9043, 2015, 1997, 2035, 2051,

1 / 12941


11/12/2021 01:48:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it was also agreed that a programme to  de mystify  the issue to athletes  the public and the media was a priority 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 2036, 3530, 2008, 1037, 4746, 2000, 2139, 2026, 16643, 12031, 1996, 3277, 2000, 7576, 1996, 2270, 1998, 1996, 2865, 2001, 1037, 9470, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

2 / 12941


11/12/2021 01:48:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   sweden s carolina kluft  the olympic heptathlon champion  and slovenia s jolanda ceplak had winning performances  too 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4701, 1055, 3792, 1047, 7630, 6199, 1996, 4386, 2002, 22799, 2705, 7811, 3410, 1998, 10307, 1055, 8183, 24448, 8292, 24759, 4817, 2018, 3045, 4616, 2205, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

3 / 12941


11/12/2021 01:48:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i am happy even if i know that maurice is a long way from being at his peak at the start of the season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2572, 3407, 2130, 2065, 1045, 2113, 2008, 7994, 2003, 1037, 2146, 2126, 2013, 2108, 2012, 2010, 4672, 2012, 1996, 2707, 1997, 1996, 2161, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

4 / 12941


11/12/2021 01:48:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: o sullivan commits to dublin racesonia o sullivan will seek to regain her title at the bupa great ireland run on 9 april in dublin 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 1051, 7624, 27791, 2000, 5772, 3837, 12488, 1051, 7624, 2097, 6148, 2000, 12452, 2014, 2516, 2012, 1996, 20934, 4502, 2307, 3163, 2448, 2006, 1023, 2258, 1999, 5772, 102, 0, 0, 0,

5 / 12941


11/12/2021 01:48:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: her coach aston moore told the times   we re not looking at any sooner than 2006  not as a triple jumper 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2014, 2873, 14327, 5405, 2409, 1996, 2335, 2057, 2128, 2025, 2559, 2012, 2151, 10076, 2084, 2294, 2025, 2004, 1037, 6420, 21097, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

6 / 12941


11/12/2021 01:48:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: that leap   37cm short of brazilian winner jadel gregorio s effort   was good enough to qualify for the european indoor championships 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 11679, 4261, 27487, 2460, 1997, 6142, 3453, 12323, 2140, 16973, 3695, 1055, 3947, 2001, 2204, 2438, 2000, 7515, 2005, 1996, 2647, 7169, 3219, 102, 0, 0, 0, 0, 0, 0, 0, 0, 

7 / 12941


11/12/2021 01:48:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i m looking forward to competing against britain s best sprinters and i m sure the 60 metres will be one of the most exciting races of the evening 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 2559, 2830, 2000, 6637, 2114, 3725, 1055, 2190, 19938, 2015, 1998, 1045, 1049, 2469, 1996, 3438, 3620, 2097, 2022, 2028, 1997, 1996, 2087, 10990, 3837,

8 / 12941


11/12/2021 01:48:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: in her absence  the gb team won bronze in brussels 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 2014, 6438, 1996, 16351, 2136, 2180, 4421, 1999, 9371, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

9 / 12941


11/12/2021 01:48:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  his ability is undoubted but all his best performances seem to happen in domestic meetings 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2010, 3754, 2003, 25672, 12083, 3064, 2021, 2035, 2010, 2190, 4616, 4025, 2000, 4148, 1999, 4968, 6295, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

10 / 12941


11/12/2021 01:48:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i am ready and prepared to represent my country 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2572, 3201, 1998, 4810, 2000, 5050, 2026, 2406, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

11 / 12941


11/12/2021 01:48:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:48:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: mcilroy now lives in windsor and feels his career has been transformed by the no nonsense leadership style of former army sergeant lester 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11338, 4014, 13238, 2085, 3268, 1999, 10064, 1998, 5683, 2010, 2476, 2038, 2042, 8590, 2011, 1996, 2053, 14652, 4105, 2806, 1997, 2280, 2390, 6722, 14131, 102, 0, 0, 0, 0, 0

12 / 12941


11/12/2021 01:49:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: uk athletics chief david moorcroft said   the athens experience can now be extended to more major championships 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2866, 6482, 2708, 2585, 16808, 14716, 2056, 1996, 7571, 3325, 2064, 2085, 2022, 3668, 2000, 2062, 2350, 3219, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

13 / 12941


11/12/2021 01:49:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they were set to learn their fate by the end of february  but late evidence from them has pushed the date back 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2020, 2275, 2000, 4553, 2037, 6580, 2011, 1996, 2203, 1997, 2337, 2021, 2397, 3350, 2013, 2068, 2038, 3724, 1996, 3058, 2067, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

14 / 12941


11/12/2021 01:49:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they were also alleged to have avoided tests in tel aviv and chicago before the games 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2020, 2036, 6884, 2000, 2031, 9511, 5852, 1999, 10093, 12724, 1998, 3190, 2077, 1996, 2399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

15 / 12941


11/12/2021 01:49:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the world junior champion clocked number seconds to finish well clear of fellow american bershawn jackson in arkansas 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2088, 3502, 3410, 5119, 2098, 2193, 3823, 2000, 3926, 2092, 3154, 1997, 3507, 2137, 2022, 2869, 14238, 2078, 4027, 1999, 6751, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

16 / 12941


11/12/2021 01:49:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: london hope over chepkemeilondon marathon organisers are hoping that banned athlete susan chepkemei will still take part in this year s race on 17 april 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2414, 3246, 2058, 18178, 2361, 3489, 26432, 7811, 5280, 8589, 22933, 2869, 2024, 5327, 2008, 7917, 8258, 6294, 18178, 2361, 3489, 26432, 2097, 2145, 2202, 21

17 / 12941


11/12/2021 01:49:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: edwards himself kept idowu off top spot at the manchester games 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7380, 2370, 2921, 8909, 5004, 2226, 2125, 2327, 3962, 2012, 1996, 5087, 2399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

18 / 12941


11/12/2021 01:49:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: she failed to run in the event  citing a leg injury 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2016, 3478, 2000, 2448, 1999, 1996, 2724, 8951, 1037, 4190, 4544, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

19 / 12941


11/12/2021 01:49:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   holmes ran a tactically perfect race to finish clear of france s hind dehiba and russia s svetlana cherkasova 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9106, 2743, 1037, 8608, 2135, 3819, 2679, 2000, 3926, 3154, 1997, 2605, 1055, 17666, 2139, 4048, 3676, 1998, 3607, 1055, 17917, 3388, 16695, 24188, 13716, 7103, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

20 / 12941


11/12/2021 01:49:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but kenteris  lawyer gregory ioannidis told bbc sport earlier this week he was confident the sprinters would be cleared of the charges of failing to give information on their location and refusing to submit to testing 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 5982, 11124, 2015, 5160, 7296, 22834, 11639, 28173, 2015, 2409, 4035, 4368, 3041, 2023,

21 / 12941


11/12/2021 01:49:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: race director matthew turnbull said   susan will add even more strength in depth to the world class line up 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2679, 2472, 5487, 26439, 2056, 6294, 2097, 5587, 2130, 2062, 3997, 1999, 5995, 2000, 1996, 2088, 2465, 2240, 2039, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

22 / 12941


11/12/2021 01:49:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: el guerrouj could meet bekele in march as the ethiopian is the defending world cross country champion over both the long and short courses 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3449, 19739, 2121, 22494, 3501, 2071, 3113, 2022, 11705, 2063, 1999, 2233, 2004, 1996, 15101, 2003, 1996, 6984, 2088, 2892, 2406, 3410, 2058, 2119, 1996, 2146, 1998, 2460, 

23 / 12941


11/12/2021 01:49:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:49:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as well as being hit with the ban  collins was stripped of her 2003 world and us indoor 200m titles 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2092, 2004, 2108, 2718, 2007, 1996, 7221, 6868, 2001, 10040, 1997, 2014, 2494, 2088, 1998, 2149, 7169, 3263, 2213, 4486, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

24 / 12941


11/12/2021 01:50:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the briton  made a dame in the new year honours list for taking 800m and 1 500m gold  won vital votes from the public  press and eaa member federations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 28101, 2239, 2081, 1037, 8214, 1999, 1996, 2047, 2095, 8762, 2862, 2005, 2635, 5385, 2213, 1998, 1015, 3156, 2213, 2751, 2180, 8995, 4494, 2013, 1996, 22

25 / 12941


11/12/2021 01:50:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the newport based athlete and team mates jason gardener  marlon devonish and mark lewis francis were rewarded with mbes in the new year honours list 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 9464, 2241, 8258, 1998, 2136, 14711, 4463, 19785, 25861, 7614, 4509, 1998, 2928, 4572, 4557, 2020, 14610, 2007, 20301, 2015, 1999, 1996, 2047, 2095, 8762, 2

26 / 12941


11/12/2021 01:50:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  everybody knows how much i enjoy competing in britain 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7955, 4282, 2129, 2172, 1045, 5959, 6637, 1999, 3725, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

27 / 12941


11/12/2021 01:50:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as only death can  it put the year s athletics happenings in a sharp perspective 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2069, 2331, 2064, 2009, 2404, 1996, 2095, 1055, 6482, 6230, 2015, 1999, 1037, 4629, 7339, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

28 / 12941


11/12/2021 01:50:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if i d had a half decent mark it might have motivated me more  but i won t be racing 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 1045, 1040, 2018, 1037, 2431, 11519, 2928, 2009, 2453, 2031, 12774, 2033, 2062, 2021, 1045, 2180, 1056, 2022, 3868, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

29 / 12941


11/12/2021 01:50:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: compatriots mulugeta wondimu  abiyote abate and markos geneti  the world indoor bronze medallist over 3000m  will race against bekele on 18 february 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4012, 4502, 18886, 12868, 14163, 7630, 18150, 2050, 2180, 22172, 2226, 11113, 28008, 12184, 19557, 2618, 1998, 28003, 2015, 4962, 3775, 1996, 2088, 7169, 4421, 28

30 / 12941


11/12/2021 01:50:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it added that kenteris and thanou had been  provisionally suspended pending the resolution of their cases  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2794, 2008, 5982, 11124, 2015, 1998, 2084, 7140, 2018, 2042, 10864, 2135, 6731, 14223, 1996, 5813, 1997, 2037, 3572, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

31 / 12941


11/12/2021 01:50:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and if even the smallest thing doesn t feel right when you are preparing to race a marathon  10 miles down the road it will hit you like a brick wall 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2065, 2130, 1996, 10479, 2518, 2987, 1056, 2514, 2157, 2043, 2017, 2024, 8225, 2000, 2679, 1037, 8589, 2184, 2661, 2091, 1996, 2346, 2009, 2097, 2718, 2017

32 / 12941


11/12/2021 01:50:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:50:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   so far a total of 13 athletes have been sanctioned for violations involving drugs associated with the balco doping scandal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 2521, 1037, 2561, 1997, 2410, 7576, 2031, 2042, 14755, 2005, 13302, 5994, 5850, 3378, 2007, 1996, 28352, 3597, 23799, 9446, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

33 / 12941


11/12/2021 01:51:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  conte is someone who is under federal indictment and has a record of issuing contradictory  inconsistent statements 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9530, 2618, 2003, 2619, 2040, 2003, 2104, 2976, 24265, 1998, 2038, 1037, 2501, 1997, 15089, 27894, 20316, 8635, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

34 / 12941


11/12/2021 01:51:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the only time you see british sprinters getting upset or riled is when there is a debate as to which one is better than the other   he claimed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2069, 2051, 2017, 2156, 2329, 19938, 2015, 2893, 6314, 2030, 15544, 3709, 2003, 2043, 2045, 2003, 1037, 5981, 2004, 2000, 2029, 2028, 2003, 2488, 2084, 1996, 206

35 / 12941


11/12/2021 01:51:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the reasons she was forced to pull out in athens   the niggling injuries  her lack of energy and the oppressive conditions   weren t at play here 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4436, 2016, 2001, 3140, 2000, 4139, 2041, 1999, 7571, 1996, 9152, 13871, 2989, 6441, 2014, 3768, 1997, 2943, 1998, 1996, 28558, 3785, 4694, 1056, 2012, 2377, 2

36 / 12941


11/12/2021 01:51:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   however  under international olympic committee  ioc  rules  athletes can only be stripped of their medals if caught within three years of the event 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 2104, 2248, 4386, 2837, 25941, 3513, 7576, 2064, 2069, 2022, 10040, 1997, 2037, 6665, 2065, 3236, 2306, 2093, 2086, 1997, 1996, 2724, 102, 0, 0, 0, 0, 0, 0

37 / 12941


11/12/2021 01:51:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   radcliffe described her decision to enter the new york marathon as  impulsive  but she is certain to have a tick list of personal goals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 22603, 2649, 2014, 3247, 2000, 4607, 1996, 2047, 2259, 8589, 2004, 17727, 23004, 2021, 2016, 2003, 3056, 2000, 2031, 1037, 16356, 2862, 1997, 3167, 3289, 102, 0, 0, 0, 0, 0,

38 / 12941


11/12/2021 01:51:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hayes  27  set an olympic record of number in winning the 100m hurdles 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 10192, 2676, 2275, 2019, 4386, 2501, 1997, 2193, 1999, 3045, 1996, 2531, 2213, 18608, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

39 / 12941


11/12/2021 01:51:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   radcliffe decided only recently to run in the race and many had doubted whether she had sufficiently recovered from her olympic ordeal just 11 weeks ago 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 22603, 2787, 2069, 3728, 2000, 2448, 1999, 1996, 2679, 1998, 2116, 2018, 12979, 3251, 2016, 2018, 12949, 6757, 2013, 2014, 4386, 23304, 2074, 2340, 3134, 32

40 / 12941


11/12/2021 01:51:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:51:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   collins  who has worked with javelin thrower steve backley in the past  started his career as a royal marine before becoming a pe teacher 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6868, 2040, 2038, 2499, 2007, 23426, 5466, 2121, 3889, 2067, 3051, 1999, 1996, 2627, 2318, 2010, 2476, 2004, 1037, 2548, 3884, 2077, 3352, 1037, 21877, 3836, 102, 0, 0, 0,

41 / 12941


11/12/2021 01:52:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: she was subsequently handed a two year ban in may this year and has admitted taking the stimulant modafinil 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2016, 2001, 3525, 4375, 1037, 2048, 2095, 7221, 1999, 2089, 2023, 2095, 1998, 2038, 4914, 2635, 1996, 2358, 5714, 7068, 3372, 16913, 10354, 5498, 2140, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

42 / 12941


11/12/2021 01:52:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a firm decision on whether the trial takes place is expected in january 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 3813, 3247, 2006, 3251, 1996, 3979, 3138, 2173, 2003, 3517, 1999, 2254, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

43 / 12941


11/12/2021 01:52:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the current system does not detect many of the substances being abused by athletes 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2783, 2291, 2515, 2025, 11487, 2116, 1997, 1996, 13978, 2108, 16999, 2011, 7576, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

44 / 12941


11/12/2021 01:52:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: part of the investigation has centred on whether they staged the crash 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2112, 1997, 1996, 4812, 2038, 16441, 2006, 3251, 2027, 9813, 1996, 5823, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

45 / 12941


11/12/2021 01:52:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: kenteris  lawyer gregory ioannidis told bbc sport   we refute both charges as unsubstantiated and illogical 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5982, 11124, 2015, 5160, 7296, 22834, 11639, 28173, 2015, 2409, 4035, 4368, 2057, 25416, 10421, 2119, 5571, 2004, 4895, 6342, 5910, 5794, 10711, 3064, 1998, 5665, 20734, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

46 / 12941


11/12/2021 01:52:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if it is felt there is case to answer  it would be for its national governing body  usa track and field  to take the appropriate disciplinary action   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2009, 2003, 2371, 2045, 2003, 2553, 2000, 3437, 2009, 2052, 2022, 2005, 2049, 2120, 8677, 2303, 3915, 2650, 1998, 2492, 2000, 2202, 1996, 6413, 

47 / 12941


11/12/2021 01:52:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but kenteris and thanou then went on to skip tests in tel aviv and chicago  when they decided to fly back to greece early 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 5982, 11124, 2015, 1998, 2084, 7140, 2059, 2253, 2006, 2000, 13558, 5852, 1999, 10093, 12724, 1998, 3190, 2043, 2027, 2787, 2000, 4875, 2067, 2000, 5483, 2220, 102, 0, 0, 0, 0, 0, 0, 

48 / 12941


11/12/2021 01:52:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the sprinters both sent written explanations to the iaaf  which have been taken into account 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 19938, 2015, 2119, 2741, 2517, 17959, 2000, 1996, 21259, 2029, 2031, 2042, 2579, 2046, 4070, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

49 / 12941


11/12/2021 01:52:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: she will also run in the grand prix in birmingham in february and may defend her indoor aaa 800m title in sheffield earlier that month 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2016, 2097, 2036, 2448, 1999, 1996, 2882, 5431, 1999, 6484, 1999, 2337, 1998, 2089, 6985, 2014, 7169, 13360, 5385, 2213, 2516, 1999, 8533, 3041, 2008, 3204, 102, 0, 0, 0, 0, 0,

50 / 12941


11/12/2021 01:52:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he was third in london in 2002 in his first serious attempt at the marathon 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2001, 2353, 1999, 2414, 1999, 2526, 1999, 2010, 2034, 3809, 3535, 2012, 1996, 8589, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

51 / 12941


11/12/2021 01:52:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:52:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  after five months we finally had the chance to give explanations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2044, 2274, 2706, 2057, 2633, 2018, 1996, 3382, 2000, 2507, 17959, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

52 / 12941


11/12/2021 01:53:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:53:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the double olympic champion said   i am very disappointed that i have been forced to withdraw 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3313, 4386, 3410, 2056, 1045, 2572, 2200, 9364, 2008, 1045, 2031, 2042, 3140, 2000, 10632, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

53 / 12941


11/12/2021 01:53:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:53:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: sotherton has also displayed promise  with a new high jump personal best in sheffield at the combined norwich union european trials and aaa championships  and a second place in the long jump behind jade johnson 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 12399, 2669, 2038, 2036, 6913, 4872, 2007, 1037, 2047, 2152, 5376, 3167, 2190, 1999, 8533, 201

54 / 12941


11/12/2021 01:53:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:53:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: after her olympic glory she emphatically denied she planned to retire 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2044, 2014, 4386, 8294, 2016, 7861, 21890, 25084, 6380, 2016, 3740, 2000, 11036, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

55 / 12941


11/12/2021 01:53:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:53:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   pole vaultermade a winning return to major competition after a drugs ban 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6536, 11632, 2121, 21565, 1037, 3045, 2709, 2000, 2350, 2971, 2044, 1037, 5850, 7221, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

56 / 12941


11/12/2021 01:53:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:53:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a couple of other interesting topics to look out for are the citizenship issues surrounding mark findlay and rabah yusuf 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 3232, 1997, 2060, 5875, 7832, 2000, 2298, 2041, 2005, 2024, 1996, 9068, 3314, 4193, 2928, 2424, 8485, 1998, 10958, 24206, 23495, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

57 / 12941


11/12/2021 01:54:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: cairns came in ahead of paul rowan and allan bogle in the men s race 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 21731, 2234, 1999, 3805, 1997, 2703, 14596, 1998, 8926, 22132, 2571, 1999, 1996, 2273, 1055, 2679, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

58 / 12941


11/12/2021 01:54:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fraser became the fastest british woman over the distance this season when she qualified for the final in number seconds   though that time is outside the european standard 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9443, 2150, 1996, 7915, 2329, 2450, 2058, 1996, 3292, 2023, 2161, 2043, 2016, 4591, 2005, 1996, 2345, 1999, 2193, 3823, 2295, 2008, 2051, 

59 / 12941


11/12/2021 01:54:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: willowfield s paul rowan will go in as individual favourite but annadale could have a tough job holding on to their team title as andrew dunwoody and noel pollock are unlikely to run 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11940, 3790, 1055, 2703, 14596, 2097, 2175, 1999, 2004, 3265, 8837, 2021, 4698, 5634, 2071, 2031, 1037, 7823, 3105, 3173, 2006, 

60 / 12941


11/12/2021 01:54:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   meanwhile  ceplak will be looking to follow up last saturday s win in boston with a fast time and victory in friday s night of athletics in erfurt  germany 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5564, 8292, 24759, 4817, 2097, 2022, 2559, 2000, 3582, 2039, 2197, 5095, 1055, 2663, 1999, 3731, 2007, 1037, 3435, 2051, 1998, 3377, 1999, 5958, 1055, 23

61 / 12941


11/12/2021 01:54:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the northern ireland runner set a new personal best of one minute  number seconds   a time good enough to qualify for the european indoor championships 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2642, 3163, 5479, 2275, 1037, 2047, 3167, 2190, 1997, 2028, 3371, 2193, 3823, 1037, 2051, 2204, 2438, 2000, 7515, 2005, 1996, 2647, 7169, 3219, 102, 0, 0

62 / 12941


11/12/2021 01:54:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i love the atmosphere  crowds and course and know it will always be a great quality race 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2293, 1996, 7224, 12783, 1998, 2607, 1998, 2113, 2009, 2097, 2467, 2022, 1037, 2307, 3737, 2679, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

63 / 12941


11/12/2021 01:54:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hungarian compatriots annus and fazelas both refused to give urine samples while russian korzhanenko tested positive for the steroid stanozolol 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5588, 4012, 4502, 18886, 12868, 5754, 2271, 1998, 6904, 12638, 3022, 2119, 4188, 2000, 2507, 17996, 8168, 2096, 2845, 12849, 15378, 28006, 16107, 7718, 3893, 2005, 199

64 / 12941


11/12/2021 01:54:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:54:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it can happen  i don t see why not   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2064, 4148, 1045, 2123, 1056, 2156, 2339, 2025, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

65 / 12941


11/12/2021 01:55:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:55:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: n morgan  birchfield harriers   c tomlinson  newham and essex beagles  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1050, 5253, 16421, 3790, 5292, 27051, 2015, 1039, 3419, 25051, 2047, 3511, 1998, 8862, 26892, 17125, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

66 / 12941


11/12/2021 01:55:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:55:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  won the men s event with a season s best of numberm  taking the scalp of world indoor champion savante stringfellow of the usa 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2180, 1996, 2273, 1055, 2724, 2007, 1037, 2161, 1055, 2190, 1997, 2193, 2213, 2635, 1996, 21065, 1997, 2088, 7169, 3410, 28350, 10111, 5164, 23510, 5004, 1997, 1996, 3915, 102, 0, 0, 

67 / 12941


11/12/2021 01:55:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:55:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: jason has a lot of experience indoors but for some reason he is struggling to maintain his pace through to the finish 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4463, 2038, 1037, 2843, 1997, 3325, 24274, 2021, 2005, 2070, 3114, 2002, 2003, 8084, 2000, 5441, 2010, 6393, 2083, 2000, 1996, 3926, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

68 / 12941


11/12/2021 01:55:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:55:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the birmingham athlete  who clocked a season s best of number seconds over 60m in birmingham last week  also prefers to focus his attentions on next month s european indoor championships 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 6484, 8258, 2040, 5119, 2098, 1037, 2161, 1055, 2190, 1997, 2193, 3823, 2058, 3438, 2213, 1999, 6484, 2197, 2733, 2036

69 / 12941


11/12/2021 01:55:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:55:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we talked about the sydney olympics and that made the time go past more quickly 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 5720, 2055, 1996, 3994, 3783, 1998, 2008, 2081, 1996, 2051, 2175, 2627, 2062, 2855, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

70 / 12941


11/12/2021 01:56:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but if i m doing this kind of thing  then i will have to see how it goes 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2065, 1045, 1049, 2725, 2023, 2785, 1997, 2518, 2059, 1045, 2097, 2031, 2000, 2156, 2129, 2009, 3632, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

71 / 12941


11/12/2021 01:56:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  because of previous injuries i don t even run up hills in training 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2138, 1997, 3025, 6441, 1045, 2123, 1056, 2130, 2448, 2039, 4564, 1999, 2731, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

72 / 12941


11/12/2021 01:56:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: his times of number and number seconds were well short of american maurice greene s 60m world record of numbersecs from 1998 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2010, 2335, 1997, 2193, 1998, 2193, 3823, 2020, 2092, 2460, 1997, 2137, 7994, 11006, 1055, 3438, 2213, 2088, 2501, 1997, 3616, 8586, 2015, 2013, 2687, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

73 / 12941


11/12/2021 01:56:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the morpeth harrier led from the end of the first lap and ended mike skinner and andrew baddeley s hopes with a surge in the lasp lap 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 22822, 22327, 2232, 5292, 27051, 2419, 2013, 1996, 2203, 1997, 1996, 2034, 5001, 1998, 3092, 3505, 17451, 1998, 4080, 2919, 9247, 3240, 1055, 8069, 2007, 1037, 12058, 1999

74 / 12941


11/12/2021 01:56:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: yelling takes cardiff hat trickeuropean cross country champion hayley yelling completed a hat trick of wins in the reebok cardiff cross challenge in bute park on sunday afternoon 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 13175, 3138, 10149, 6045, 7577, 11236, 17635, 2319, 2892, 2406, 3410, 10974, 3051, 13175, 2949, 1037, 6045, 7577, 1997, 5222, 1999,

75 / 12941


11/12/2021 01:56:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the scot  who led gb to world cross country bronze earlier this year  moved away from the field with ines monteiro halfway into the numberkm race 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 8040, 4140, 2040, 2419, 16351, 2000, 2088, 2892, 2406, 4421, 3041, 2023, 2095, 2333, 2185, 2013, 1996, 2492, 2007, 1999, 2229, 10125, 9711, 8576, 2046, 1996, 2

76 / 12941


11/12/2021 01:56:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they withdrew from the athens games after missing a drugs test at the olympic village on 12 august 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 6780, 2013, 1996, 7571, 2399, 2044, 4394, 1037, 5850, 3231, 2012, 1996, 4386, 2352, 2006, 2260, 2257, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

77 / 12941


11/12/2021 01:56:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the athletes are technically free to compete while the iaaf reviews its response to the decision to clear kenteris and thanou 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 7576, 2024, 10892, 2489, 2000, 5566, 2096, 1996, 21259, 4391, 2049, 3433, 2000, 1996, 3247, 2000, 3154, 5982, 11124, 2015, 1998, 2084, 7140, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

78 / 12941


11/12/2021 01:56:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   the statement continued   both athletes  cases will be refered to arbitration before the cas 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4861, 2506, 2119, 7576, 3572, 2097, 2022, 6523, 2098, 2000, 18010, 2077, 1996, 25222, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

79 / 12941


11/12/2021 01:56:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he has suffered greatly throughout this ordeal that has exposed both himself and his family to enormous pressures 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2038, 4265, 6551, 2802, 2023, 23304, 2008, 2038, 6086, 2119, 2370, 1998, 2010, 2155, 2000, 8216, 15399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

80 / 12941


11/12/2021 01:56:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: masai s fellow kenyan alice timbilil finished third 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16137, 4886, 1055, 3507, 20428, 5650, 5199, 14454, 4014, 2736, 2353, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

81 / 12941


11/12/2021 01:56:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:56:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: holmes will make her first track appearance on home soil since winning double olympic gold in january s norwich union international in glasgow 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9106, 2097, 2191, 2014, 2034, 2650, 3311, 2006, 2188, 5800, 2144, 3045, 3313, 4386, 2751, 1999, 2254, 1055, 12634, 2586, 2248, 1999, 6785, 102, 0, 0, 0, 0, 0, 0, 0, 0, 

82 / 12941


11/12/2021 01:57:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  her record speaks for herself and there are few other women distance runners who would dare compare their pedigree with tulu s   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2014, 2501, 8847, 2005, 2841, 1998, 2045, 2024, 2261, 2060, 2308, 3292, 7190, 2040, 2052, 8108, 12826, 2037, 21877, 4305, 28637, 2007, 10722, 7630, 1055, 2002, 2794, 102, 0

83 / 12941


11/12/2021 01:57:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   vivancos slashed his personal best to equal the spanish record with a time of numbersecs while kronberg and dorival clocked numbersecs and numbersecs respectively 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 20022, 15305, 2015, 23587, 2010, 3167, 2190, 2000, 5020, 1996, 3009, 2501, 2007, 1037, 2051, 1997, 3616, 8586, 2015, 2096, 1047, 4948, 4059, 1998,

84 / 12941


11/12/2021 01:57:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: keska  who has made a successful return from a long term injury lay off  contests the men s 12km race on 20 march  while 16 year old hickey goes in the junior men s 8km on the same day 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 17710, 8337, 2040, 2038, 2081, 1037, 3144, 2709, 2013, 1037, 2146, 2744, 4544, 3913, 2125, 15795, 1996, 2273, 1055, 2260, 2228

85 / 12941


11/12/2021 01:57:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he and three others were indicted in february by a federal grand jury for a variety of alleged offences 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 1998, 2093, 2500, 2020, 21801, 1999, 2337, 2011, 1037, 2976, 2882, 6467, 2005, 1037, 3528, 1997, 6884, 18421, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

86 / 12941


11/12/2021 01:57:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the iaaf   which said it was  very surprised  by the decision of the greek tribunal   is deciding whether to appeal against the decision at the court of arbitration for sport 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 21259, 2029, 2056, 2009, 2001, 2200, 4527, 2011, 1996, 3247, 1997, 1996, 3306, 12152, 2003, 10561, 3251, 2000, 5574, 2114, 1996, 3

87 / 12941


11/12/2021 01:57:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  of course this award is very special  but for me nothing will ever take away winning an olympic gold medal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1997, 2607, 2023, 2400, 2003, 2200, 2569, 2021, 2005, 2033, 2498, 2097, 2412, 2202, 2185, 3045, 2019, 4386, 2751, 3101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

88 / 12941


11/12/2021 01:57:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: balco has been accused by the united states anti doping agency  usada  of being the source of the banned steroid thg and modafinil 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 28352, 3597, 2038, 2042, 5496, 2011, 1996, 2142, 2163, 3424, 23799, 4034, 3915, 2850, 1997, 2108, 1996, 3120, 1997, 1996, 7917, 26261, 22943, 16215, 2290, 1998, 16913, 10354, 5498,

89 / 12941


11/12/2021 01:57:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he is a good person 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2003, 1037, 2204, 2711, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

90 / 12941


11/12/2021 01:57:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:57:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: costin aims for comeback in 2006jamie costin should be paralysed 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 3465, 2378, 8704, 2005, 12845, 1999, 2294, 3900, 9856, 3465, 2378, 2323, 2022, 11498, 2135, 6924, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

91 / 12941


11/12/2021 01:58:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: that is my goal and whatever i do between now and then will be geared to making the final 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 2003, 2026, 3125, 1998, 3649, 1045, 2079, 2090, 2085, 1998, 2059, 2097, 2022, 23636, 2000, 2437, 1996, 2345, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

92 / 12941


11/12/2021 01:58:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but he was beaten to the line on the relay anchor leg by lewis francis 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2002, 2001, 7854, 2000, 1996, 2240, 2006, 1996, 8846, 8133, 4190, 2011, 4572, 4557, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

93 / 12941


11/12/2021 01:58:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: kluft playing down record chancesweden s carolina kluft fears jackie joyner kersee s world record heptathlon points total of 7291 set at the 1988 olympics may never be surpassed 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 1047, 7630, 6199, 2652, 2091, 2501, 9592, 15557, 2368, 1055, 3792, 1047, 7630, 6199, 10069, 9901, 6569, 3678, 17710, 22573, 2063, 10

94 / 12941


11/12/2021 01:58:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i have enjoy working with her and wish her well 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2031, 5959, 2551, 2007, 2014, 1998, 4299, 2014, 2092, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

95 / 12941


11/12/2021 01:58:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   the 22 year old goes head to head with relay team mate jason gardener for a place in great britain s european championships squad at the trials in sheffield 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2570, 2095, 2214, 3632, 2132, 2000, 2132, 2007, 8846, 2136, 6775, 4463, 19785, 2005, 1037, 2173, 1999, 2307, 3725, 1055, 2647, 3219, 4686, 2012, 1

96 / 12941


11/12/2021 01:58:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but governing body uk athletics  uka  said the trials in sheffield were never in her plans 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 8677, 2303, 2866, 6482, 2866, 2050, 2056, 1996, 7012, 1999, 8533, 2020, 2196, 1999, 2014, 3488, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

97 / 12941


11/12/2021 01:58:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: when you have got a unique talent like she has  maybe it would work to try again 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2043, 2017, 2031, 2288, 1037, 4310, 5848, 2066, 2016, 2038, 2672, 2009, 2052, 2147, 2000, 3046, 2153, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

98 / 12941


11/12/2021 01:58:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the scheme also coincides with plans to introduce a new uk ranking system 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 5679, 2036, 19680, 2015, 2007, 3488, 2000, 8970, 1037, 2047, 2866, 5464, 2291, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

99 / 12941


11/12/2021 01:58:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: vettori and brendon mccullum  20  halted their decline by sharing a stand of 62 before mccullum gave a return catch to symonds  who finished with 3 41 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 29525, 29469, 1998, 7987, 10497, 2239, 23680, 18083, 2819, 2322, 12705, 2037, 6689, 2011, 6631, 1037, 3233, 1997, 5786, 2077, 23680, 18083, 2819, 2435, 1037, 27

100 / 12941


11/12/2021 01:58:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:58:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and he middles another down to long off 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2002, 2690, 2015, 2178, 2091, 2000, 2146, 2125, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

101 / 12941


11/12/2021 01:59:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: stuart matsikenyeri  barney rogers  hamilton masakadza  alester maregwede  brendan taylor  tatenda taibu  captain   gavin ewing  sean williams  tinashe panyangara  prosper utseya  christopher mpofu 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6990, 22281, 17339, 4890, 11124, 15377, 7369, 5226, 16137, 11905, 2094, 4143, 15669, 6238, 11941, 2290, 15557, 20

102 / 12941


11/12/2021 01:59:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he said   a lot has changed since that series 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2056, 1037, 2843, 2038, 2904, 2144, 2008, 2186, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

103 / 12941


11/12/2021 01:59:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he has the right attitude 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2038, 1996, 2157, 7729, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

104 / 12941


11/12/2021 01:59:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: iftikhar ali and tariq butt 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 3775, 22510, 4862, 1998, 16985, 18515, 10007, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

105 / 12941


11/12/2021 01:59:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he s got a lot of talent and i am sure he cannot be held back 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 1055, 2288, 1037, 2843, 1997, 5848, 1998, 1045, 2572, 2469, 2002, 3685, 2022, 2218, 2067, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

106 / 12941


11/12/2021 01:59:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   australia captain ricky ponting refuses to admit there is animosity between the sides despite bracewell s disquiet 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2660, 2952, 11184, 21179, 2075, 10220, 2000, 6449, 2045, 2003, 2019, 16339, 17759, 2090, 1996, 3903, 2750, 17180, 4381, 1055, 4487, 2015, 15549, 3388, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

107 / 12941


11/12/2021 01:59:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 01:59:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ahmedabad was the scene of religious riots in 2002 and pakistan s cricketers had been wary of playing there 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 27249, 2001, 1996, 3496, 1997, 3412, 12925, 1999, 2526, 1998, 4501, 1055, 9490, 2015, 2018, 2042, 15705, 1997, 2652, 2045, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

108 / 12941


11/12/2021 02:00:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: lance hamilton replaces tuffey for the fourth odi in wellington on tuesday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9993, 5226, 20736, 10722, 24513, 2005, 1996, 2959, 21045, 1999, 8409, 2006, 9857, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

109 / 12941


11/12/2021 02:00:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pakistan arrive in delhi on 28 february for their first tour to india in six years  which includes three tests and six one day games 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4501, 7180, 1999, 6768, 2006, 2654, 2337, 2005, 2037, 2034, 2778, 2000, 2634, 1999, 2416, 2086, 2029, 2950, 2093, 5852, 1998, 2416, 2028, 2154, 2399, 102, 0, 0, 0, 0, 0, 0, 0, 0,

110 / 12941


11/12/2021 02:00:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: this is a new start for me and i am just itching to get out on the field   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2023, 2003, 1037, 2047, 2707, 2005, 2033, 1998, 1045, 2572, 2074, 2009, 8450, 2000, 2131, 2041, 2006, 1996, 2492, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

111 / 12941


11/12/2021 02:00:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they set an even pace  notching half centuries from successive deliveries  but still looked under par when kallis was run out by a brilliant direct throw from paul collingwood at point 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2275, 2019, 2130, 6393, 18624, 2075, 2431, 4693, 2013, 11165, 23534, 2021, 2145, 2246, 2104, 11968, 2043, 10556, 21711, 

112 / 12941


11/12/2021 02:00:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   in theory there would be no major conflict of interests this summer as zimbabwe s tour to south africa ends in mid march and they are not in action again until september  when they play host to new zealand 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 3399, 2045, 2052, 2022, 2053, 2350, 4736, 1997, 5426, 2023, 2621, 2004, 11399, 1055, 2778, 2000, 

113 / 12941


11/12/2021 02:00:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: new zealand trail 2 0 and need to win in auckland to keep the five match series alive 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2047, 3414, 4446, 1016, 1014, 1998, 2342, 2000, 2663, 1999, 8666, 2000, 2562, 1996, 2274, 2674, 2186, 4142, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

114 / 12941


11/12/2021 02:00:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: van jaarsveld s domestic side in south africa  the centurion based titans  said the batsman may play for them in future 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3158, 14855, 11650, 15985, 2094, 1055, 4968, 2217, 1999, 2148, 3088, 1996, 9358, 9496, 2239, 2241, 13785, 2056, 1996, 13953, 2089, 2377, 2005, 2068, 1999, 2925, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0

115 / 12941


11/12/2021 02:00:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: jayasuriya set to join somersetsomerset are expected to announce later on thursday that sri lankan batsman sanath jayasuriya will join the county for the start of the 2005 season 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 24120, 26210, 8717, 2275, 2000, 3693, 9198, 14045, 22573, 2102, 2024, 3517, 2000, 14970, 2101, 2006, 9432, 2008, 5185, 16159, 13953

116 / 12941


11/12/2021 02:00:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:00:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: streak made 68 bu the home side won the third and final one day international by five wickets at port elizabeth 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9039, 2081, 6273, 20934, 1996, 2188, 2217, 2180, 1996, 2353, 1998, 2345, 2028, 2154, 2248, 2011, 2274, 10370, 2012, 3417, 3870, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

117 / 12941


11/12/2021 02:01:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: australia were given a boost when matthew hayden overcame a pleurisy scare to be available for all matches 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2660, 2020, 2445, 1037, 12992, 2043, 5487, 13872, 26463, 1037, 20228, 11236, 2483, 2100, 12665, 2000, 2022, 2800, 2005, 2035, 3503, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

118 / 12941


11/12/2021 02:01:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but he would have to dislodge rob key and mark butcher from the waiting list  and ian bell has also staked a claim for a place 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2002, 2052, 2031, 2000, 4487, 14540, 7716, 3351, 6487, 3145, 1998, 2928, 14998, 2013, 1996, 3403, 2862, 1998, 4775, 4330, 2038, 2036, 8406, 2094, 1037, 4366, 2005, 1037, 2173, 10

119 / 12941


11/12/2021 02:01:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: jayasuriya  35  had rejected an offer to play for the scottish saltires 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 24120, 26210, 8717, 3486, 2018, 5837, 2019, 3749, 2000, 2377, 2005, 1996, 4104, 5474, 7442, 2015, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

120 / 12941


11/12/2021 02:01:16 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:16 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pakistan batsmen make edgy starttour match  dharamsala  day one  stumps  pakistan 165 5 v indian board president s xiat stumps on day one of three  the tourists were 165 5 in dharamsala  with only 45 overs possible because of rain 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 4501, 12236, 3549, 2191, 3968, 6292, 2707, 21163, 2674, 28144, 5400, 5244, 7911

121 / 12941


11/12/2021 02:01:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: akhtar will miss the test series because of hamstring trouble  but is hoping to return for the one dayers 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 17712, 22893, 2099, 2097, 3335, 1996, 3231, 2186, 2138, 1997, 10654, 3367, 4892, 4390, 2021, 2003, 5327, 2000, 2709, 2005, 1996, 2028, 2154, 2545, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

122 / 12941


11/12/2021 02:01:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   sachin tendulkar  who will bat at number four with ganguly at five  also received the support of his skipper despite concerns over fitness and form 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 17266, 10606, 7166, 5313, 6673, 2040, 2097, 7151, 2012, 2193, 2176, 2007, 6080, 5313, 2100, 2012, 2274, 2036, 2363, 1996, 2490, 1997, 2010, 23249, 2750, 5936, 20

123 / 12941


11/12/2021 02:01:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: elliott said he was confident he had bowled with a similar intensity to that of test matches during the camera test 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9899, 2056, 2002, 2001, 9657, 2002, 2018, 19831, 2007, 1037, 2714, 8015, 2000, 2008, 1997, 3231, 3503, 2076, 1996, 4950, 3231, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

124 / 12941


11/12/2021 02:01:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: inzamam beseeched his side to make up for the 2004 defeat to india on home soil  where they lost 2 1 in the tests and 3 2 in the one day internationals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 20722, 3286, 2022, 19763, 7690, 2010, 2217, 2000, 2191, 2039, 2005, 1996, 2432, 4154, 2000, 2634, 2006, 2188, 5800, 2073, 2027, 2439, 1016, 1015, 1999, 1

125 / 12941


11/12/2021 02:01:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:01:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: zimbabwe  who gave a debut to teenager sean williams  chose to field first after skipper tatenda taibu won the toss 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11399, 2040, 2435, 1037, 2834, 2000, 10563, 5977, 3766, 4900, 2000, 2492, 2034, 2044, 23249, 9902, 8943, 13843, 8569, 2180, 1996, 10055, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

126 / 12941


11/12/2021 02:02:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: opening batsman justin langer and leg spinner shane warne will join up with them on monday 7 march 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3098, 13953, 6796, 21395, 2099, 1998, 4190, 6714, 3678, 8683, 11582, 2063, 2097, 3693, 2039, 2007, 2068, 2006, 6928, 1021, 2233, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

127 / 12941


11/12/2021 02:02:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: tuesday s practice session had to be cancelled because of rain and the forecast is gloomy until the weekend 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9857, 1055, 3218, 5219, 2018, 2000, 2022, 8014, 2138, 1997, 4542, 1998, 1996, 19939, 2003, 24067, 2100, 2127, 1996, 5353, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

128 / 12941


11/12/2021 02:02:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: in recent years he worked in the club s commercial department before taking on the presidency 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 3522, 2086, 2002, 2499, 1999, 1996, 2252, 1055, 3293, 2533, 2077, 2635, 2006, 1996, 8798, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

129 / 12941


11/12/2021 02:02:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: streak return could lift zimbabweone day international  port elizabeth  south africa v zimbabwe   starts wednesday 1230 gmttrailing 2 0 in the one day series  news of streak s return has given the tourists a lift 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 9039, 2709, 2071, 6336, 11399, 5643, 2154, 2248, 3417, 3870, 2148, 3088, 1058, 11399, 4627, 9317,

130 / 12941


11/12/2021 02:02:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  our seam attack had done particularly well during the pakistan tour last year  and we will select the final eleven after having a look at the wickets in mohali  which has had a record of a seaming track 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2256, 25180, 2886, 2018, 2589, 3391, 2092, 2076, 1996, 4501, 2778, 2197, 2095, 1998, 2057, 2097, 7276, 1996

131 / 12941


11/12/2021 02:02:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we can make a comparison to the video from the match when he was reported   things such as ball speed  arm rotation  position of the body and technique   he explained 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2064, 2191, 1037, 7831, 2000, 1996, 2678, 2013, 1996, 2674, 2043, 2002, 2001, 2988, 2477, 2107, 2004, 3608, 3177, 2849, 9963, 2597, 1997,

132 / 12941


11/12/2021 02:02:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the ground is 1 300 feet above sea level but pakistan do not expect conditions to be a problem 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2598, 2003, 1015, 3998, 2519, 2682, 2712, 2504, 2021, 4501, 2079, 2025, 5987, 3785, 2000, 2022, 1037, 3291, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

133 / 12941


11/12/2021 02:02:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:02:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he was part of the first windies team to defeat england in a test series in 1950  before quitting the first class game to focus on his legal career 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2001, 2112, 1997, 1996, 2034, 3612, 3111, 2136, 2000, 4154, 2563, 1999, 1037, 3231, 2186, 1999, 3925, 2077, 8046, 3436, 1996, 2034, 2465, 2208, 2000, 3579, 2

134 / 12941


11/12/2021 02:03:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  obviously there s an element of concern at not having played at this level for a while  but it s not like i haven t been playing cricket 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5525, 2045, 1055, 2019, 5783, 1997, 5142, 2012, 2025, 2383, 2209, 2012, 2023, 2504, 2005, 1037, 2096, 2021, 2009, 1055, 2025, 2066, 1045, 4033, 1056, 2042, 2652, 4533, 102, 

135 / 12941


11/12/2021 02:03:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the pakistan squad is due to arrive in india on monday to play three tests and six one day internationals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4501, 4686, 2003, 2349, 2000, 7180, 1999, 2634, 2006, 6928, 2000, 2377, 2093, 5852, 1998, 2416, 2028, 2154, 27340, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

136 / 12941


11/12/2021 02:03:16 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:16 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: c h gayle  w w hinds  r r sarwan  b c lara  capt  s chanderpaul  r l powell  d j j bravo  c o browne  wkt  i d r bradshaw  r d king  p t collins 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1039, 1044, 5637, 2571, 1059, 1059, 17666, 2015, 1054, 1054, 18906, 7447, 1038, 1039, 13679, 14408, 1055, 9212, 4063, 4502, 5313, 1054, 1048, 8997, 1040, 1046, 1046, 

137 / 12941


11/12/2021 02:03:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   bangladesh have won only nine of their 106 matches since making their one day debut in 1986 and clinched their maiden test series against zimbabwe earlier in january 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7269, 2031, 2180, 2069, 3157, 1997, 2037, 10114, 3503, 2144, 2437, 2037, 2028, 2154, 2834, 1999, 3069, 1998, 18311, 2037, 10494, 3231, 2186, 21

138 / 12941


11/12/2021 02:03:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: flintoff batted in the nets but will not attempt to bowl until tuesday and could play solely as a batsman 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 13493, 7245, 12822, 1999, 1996, 16996, 2021, 2097, 2025, 3535, 2000, 4605, 2127, 9857, 1998, 2071, 2377, 9578, 2004, 1037, 13953, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

139 / 12941


11/12/2021 02:03:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  sourav ganguly  ind capt   sanath jayasuriya  sl   virender sehwag  ind   rahul dravid  ind   yousuf youhana  pak   kumar sangakkara  sl   abdul razzaq  pak   chaminda vaas  sl   zaheer khan  ind   anil kumble  ind   muttiah muralitharan  sl  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 14768, 11431, 6080, 5313, 2100, 27427, 14408, 2624, 8988, 24120, 26

140 / 12941


11/12/2021 02:03:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we ll see how he goes in the nets and hopefully he ll come through unscathed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2222, 2156, 2129, 2002, 3632, 1999, 1996, 16996, 1998, 11504, 2002, 2222, 2272, 2083, 4895, 15782, 23816, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

141 / 12941


11/12/2021 02:03:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:03:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he said   we re allowed the odd blip here and there 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2056, 2057, 2128, 3039, 1996, 5976, 1038, 15000, 2182, 1998, 2045, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

142 / 12941


11/12/2021 02:04:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and soon india s last line of defence was dismantled when glenn mcgrath and first slip warne combined to oust kaif for 55   a second successive test match fifty 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2574, 2634, 1055, 2197, 2240, 1997, 4721, 2001, 17293, 2043, 9465, 23220, 1998, 2034, 7540, 11582, 2063, 4117, 2000, 15068, 3367, 11928, 2546, 2

143 / 12941


11/12/2021 02:04:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: jason gillespie  lbw to zaheer  and kasprowicz  bowled by agit agarkar  both fell cheaply  leaving mcgrath as clarke s last hope of a second test century 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4463, 21067, 6053, 2860, 2000, 23564, 21030, 2099, 1998, 10556, 13102, 10524, 18682, 19831, 2011, 12943, 4183, 12943, 17007, 2906, 2119, 3062, 10036, 2135, 2

144 / 12941


11/12/2021 02:04:16 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:16 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: that s a pleasing thing   it s what you need in a good side   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 1055, 1037, 24820, 2518, 2009, 1055, 2054, 2017, 2342, 1999, 1037, 2204, 2217, 2002, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

145 / 12941


11/12/2021 02:04:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i think it s been brought in through pressure from sri lanka and murali s supporters   boycott told bbc sport 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2228, 2009, 1055, 2042, 2716, 1999, 2083, 3778, 2013, 5185, 7252, 1998, 15533, 2072, 1055, 6793, 17757, 2409, 4035, 4368, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

146 / 12941


11/12/2021 02:04:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the remainder of the party will link up with the other members of the test squad in johannesburg  in preparation for the first test  starting in port elizabeth on 17 december 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 6893, 1997, 1996, 2283, 2097, 4957, 2039, 2007, 1996, 2060, 2372, 1997, 1996, 3231, 4686, 1999, 15976, 1999, 7547, 2005, 1996, 203

147 / 12941


11/12/2021 02:04:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wilson has been included by the kiwis to face a world xi in a one day series in aid of the tsunami victims 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4267, 2038, 2042, 2443, 2011, 1996, 11382, 9148, 2015, 2000, 2227, 1037, 2088, 8418, 1999, 1037, 2028, 2154, 2186, 1999, 4681, 1997, 1996, 19267, 5694, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

148 / 12941


11/12/2021 02:04:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:04:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: having seemed set to guide his team home  bell s composed innings ended 26 runs short of the victory target when he edged his 115th delivery to taibu 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2383, 2790, 2275, 2000, 5009, 2010, 2136, 2188, 4330, 1055, 3605, 7202, 3092, 2656, 3216, 2460, 1997, 1996, 3377, 4539, 2043, 2002, 13011, 2010, 10630, 2705, 695

149 / 12941


11/12/2021 02:05:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we ve been moving on for a number of years now  narrowly missing out on the world cup but this is the next best thing   he told bbc sport 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2310, 2042, 3048, 2006, 2005, 1037, 2193, 1997, 2086, 2085, 11866, 4394, 2041, 2006, 1996, 2088, 2452, 2021, 2023, 2003, 1996, 2279, 2190, 2518, 2002, 2409, 4035, 436

150 / 12941


11/12/2021 02:05:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: boje missed the recent tour of india because of fears he would be called in for questioning by indian police over match fixing allegations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8945, 6460, 4771, 1996, 3522, 2778, 1997, 2634, 2138, 1997, 10069, 2002, 2052, 2022, 2170, 1999, 2005, 11242, 2011, 2796, 2610, 2058, 2674, 15887, 9989, 102, 0, 0, 0, 0, 0,

151 / 12941


11/12/2021 02:05:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: nicky boje  who underwent minor surgery to remove a growth in his neck  will have to pass a fitness test before the match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 20158, 8945, 6460, 2040, 9601, 3576, 5970, 2000, 6366, 1037, 3930, 1999, 2010, 3300, 2097, 2031, 2000, 3413, 1037, 10516, 3231, 2077, 1996, 2674, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

152 / 12941


11/12/2021 02:05:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pakistan opted to leave out pacemen shoaib akhtar and mohammad sami 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4501, 12132, 2000, 2681, 2041, 6393, 3549, 26822, 4886, 2497, 17712, 22893, 2099, 1998, 12050, 17015, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

153 / 12941


11/12/2021 02:05:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: at the time of his injury  muralitharan held the world record of 532 test wickets  which has since been surpassed by australia s shane warne 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2012, 1996, 2051, 1997, 2010, 4544, 15533, 26054, 5521, 2218, 1996, 2088, 2501, 1997, 5187, 2475, 3231, 10370, 2029, 2038, 2144, 2042, 15602, 2011, 2660, 1055, 8683, 1158

154 / 12941


11/12/2021 02:05:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: m hayden  a gilchrist  wkt   r ponting  capt   d lehmann  d martyn  a symonds  m clarke  b hogg  j gillespie  b lee  g mcgrath 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1049, 13872, 1037, 13097, 26654, 1059, 25509, 1054, 21179, 2075, 14408, 1040, 28444, 2078, 1040, 12578, 2078, 1037, 25353, 11442, 2015, 1049, 8359, 1038, 27589, 2290, 1046, 21067, 1038

155 / 12941


11/12/2021 02:05:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i always respect the opposition and bangladesh are no exception 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2467, 4847, 1996, 4559, 1998, 7269, 2024, 2053, 6453, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

156 / 12941


11/12/2021 02:05:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and the team have now given the green light for the games scheduled for chittagong  which will stage the second test and first one day international 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 1996, 2136, 2031, 2085, 2445, 1996, 2665, 2422, 2005, 1996, 2399, 5115, 2005, 9610, 5946, 17036, 2029, 2097, 2754, 1996, 2117, 3231, 1998, 2034, 2028, 2154,

157 / 12941


11/12/2021 02:05:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they did well to run australia close in sydney after being reduced to 86 6  with chris cairns  kyle milss and daniel vettori showing the depth of their batting talent 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2106, 2092, 2000, 2448, 2660, 2485, 1999, 3994, 2044, 2108, 4359, 2000, 6564, 1020, 2007, 3782, 21731, 7648, 23689, 4757, 1998, 3817, 2952

158 / 12941


11/12/2021 02:05:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ashraful ended unbeaten on 60  having hit six fours and faced 135 balls 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6683, 27528, 5313, 3092, 20458, 2006, 3438, 2383, 2718, 2416, 23817, 1998, 4320, 11502, 7395, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

159 / 12941


11/12/2021 02:05:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:05:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: nafis iqbal and rajin saleh were also ajudged lbw by umpire jeremy lloyds off consecutive balls in pathan s fifth over 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6583, 8873, 2015, 28111, 1998, 11948, 2378, 5096, 2232, 2020, 2036, 19128, 27066, 6053, 2860, 2011, 20887, 7441, 6746, 2015, 2125, 5486, 7395, 1999, 4130, 2319, 1055, 3587, 2058, 102, 0, 0, 0,

160 / 12941


11/12/2021 02:06:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: australia will enter the pakistan game full of confidence after beating new zealand 2 0 in their recent test series 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2660, 2097, 4607, 1996, 4501, 2208, 2440, 1997, 7023, 2044, 6012, 2047, 3414, 1016, 1014, 1999, 2037, 3522, 3231, 2186, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

161 / 12941


11/12/2021 02:06:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it might look a bit juicy but will probably play pretty good 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2453, 2298, 1037, 2978, 28900, 2021, 2097, 2763, 2377, 3492, 2204, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

162 / 12941


11/12/2021 02:06:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: having pulled charl willoughby through mid wicket for four and clipped him away to reach three figures  vaughan chased a wide one from the left armer end edged to keeper mark boucher 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2383, 2766, 25869, 2140, 24919, 2083, 3054, 12937, 2005, 2176, 1998, 20144, 2032, 2185, 2000, 3362, 2093, 4481, 14461, 13303, 10

163 / 12941


11/12/2021 02:06:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: s ganguly  capt   v sehwag  g gambhir  s tendulkar  r dravid  m kaif  d karthik  wkt   i pathan  a kumble  harbhajan singh  z khan 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1055, 6080, 5313, 2100, 14408, 1058, 7367, 18663, 2290, 1043, 11721, 14905, 11961, 1055, 7166, 5313, 6673, 1054, 2852, 18891, 2094, 1049, 11928, 2546, 1040, 10556, 15265, 5480, 105

164 / 12941


11/12/2021 02:06:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   richardson began his career as a spin bowler but gained a reputation as a patient accumulator of runs  scoring four test centuries 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9482, 2211, 2010, 2476, 2004, 1037, 6714, 14999, 2021, 4227, 1037, 5891, 2004, 1037, 5776, 16222, 2819, 20350, 1997, 3216, 4577, 2176, 3231, 4693, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0,

165 / 12941


11/12/2021 02:06:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we lacked intensity and we need to rise to the level south africa will be at   thorpe commented 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 10858, 8015, 1998, 2057, 2342, 2000, 4125, 2000, 1996, 2504, 2148, 3088, 2097, 2022, 2012, 20249, 7034, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

166 / 12941


11/12/2021 02:06:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but he then perished when holing out to third man off jones 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2002, 2059, 23181, 2043, 7570, 2989, 2041, 2000, 2353, 2158, 2125, 3557, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

167 / 12941


11/12/2021 02:06:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:06:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bskyb lands england deallive coverage of england s home test matches will no longer be available on terrestrial tv from 2006 onwards 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 18667, 4801, 2497, 4915, 2563, 3066, 3669, 3726, 6325, 1997, 2563, 1055, 2188, 3231, 3503, 2097, 2053, 2936, 2022, 2800, 2006, 12350, 2694, 2013, 2294, 9921, 102, 0, 0, 0, 0, 0,

168 / 12941


11/12/2021 02:07:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but langer stood firm throughout the day to wrest back the initiative  displaying rare flair in his 21st test century and fourth against pakistan 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 21395, 2099, 2768, 3813, 2802, 1996, 2154, 2000, 23277, 4355, 2067, 1996, 6349, 14962, 4678, 22012, 1999, 2010, 7398, 3231, 2301, 1998, 2959, 2114, 4501, 102, 

169 / 12941


11/12/2021 02:07:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fast bowler dale steyn is poised to make his test debut  but all rounder jacques kallis is still troubled by an ankle injury and is expected to play as a batsman only 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3435, 14999, 8512, 26261, 6038, 2003, 22303, 2000, 2191, 2010, 3231, 2834, 2021, 2035, 2461, 2121, 7445, 10556, 21711, 2003, 2145, 11587, 2011, 

170 / 12941


11/12/2021 02:07:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the squad has been reduced to 12 for the first three games of the one day series to allow as many players as possible to play in domestic first class competition 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4686, 2038, 2042, 4359, 2000, 2260, 2005, 1996, 2034, 2093, 2399, 1997, 1996, 2028, 2154, 2186, 2000, 3499, 2004, 2116, 2867, 2004, 2825, 2000,

171 / 12941


11/12/2021 02:07:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: south africa were severely limited by having five right arm seamers at their disposal and no spinner and when the score rattled along to 152 0 shortly after tea they looked in deep trouble 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2148, 3088, 2020, 8949, 3132, 2011, 2383, 2274, 2157, 2849, 25180, 2545, 2012, 2037, 13148, 1998, 2053, 6714, 3678, 1998, 

172 / 12941


11/12/2021 02:07:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pakistan s cause was not helped by the absence of skipper inzamam ul haq and paceman shoaib akhtar for most of the afternoon 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4501, 1055, 3426, 2001, 2025, 3271, 2011, 1996, 6438, 1997, 23249, 1999, 20722, 3286, 17359, 5292, 4160, 1998, 6393, 2386, 26822, 4886, 2497, 17712, 22893, 2099, 2005, 2087, 1997, 1996, 

173 / 12941


11/12/2021 02:07:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it took the introduction of giles to oust rudolph  caught at slip  for a nicely crafted 29  but smith  33  and jacques kallis  10  comfortably saw south africa out of the red and to the safety of stumps 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2165, 1996, 4955, 1997, 13287, 2000, 15068, 3367, 18466, 3236, 2012, 7540, 2005, 1037, 19957, 19275, 2

174 / 12941


11/12/2021 02:07:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:07:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: england last held the ashes in 1987 but have won eight tests in a row in the past year and stewart said   they re going from strength to strength 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2563, 2197, 2218, 1996, 11289, 1999, 3055, 2021, 2031, 2180, 2809, 5852, 1999, 1037, 5216, 1999, 1996, 2627, 2095, 1998, 5954, 2056, 2027, 2128, 2183, 2013, 3997, 20

175 / 12941


11/12/2021 02:08:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:08:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: aleem dar and mahbubur rahman 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 15669, 6633, 18243, 1998, 5003, 2232, 8569, 8569, 2099, 14364, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

176 / 12941


11/12/2021 02:08:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:08:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it will be the first competition they have taken part in since winning the icc champions trophy in england three months ago 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2097, 2022, 1996, 2034, 2971, 2027, 2031, 2579, 2112, 1999, 2144, 3045, 1996, 16461, 3966, 5384, 1999, 2563, 2093, 2706, 3283, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

177 / 12941


11/12/2021 02:08:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:08:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: england hopeful over jonesengland are hopeful that simon jones will be fit for the boxing day test against south africa despite going down with a stomach bug 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2563, 17772, 2058, 3557, 13159, 3122, 2024, 17772, 2008, 4079, 3557, 2097, 2022, 4906, 2005, 1996, 8362, 2154, 3231, 2114, 2148, 3088, 2750, 2183, 2091,

178 / 12941


11/12/2021 02:08:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:08:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: boje finds some early spin with his third delivery and catches the edge of vaughan s bat as the england captain pushes forward  but de villiers spills the chance and the ball falls to safety 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8945, 6460, 4858, 2070, 2220, 6714, 2007, 2010, 2353, 6959, 1998, 11269, 1996, 3341, 1997, 14461, 1055, 7151, 2004, 1996

179 / 12941


11/12/2021 02:08:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:08:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: andrew symonds  promoted to number four after five ducks from his previous six digs  helped australia recover from 38 4 but was caught behind off wavell hinds for 31 off 49 balls 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4080, 25353, 11442, 2015, 3755, 2000, 2193, 2176, 2044, 2274, 14875, 2013, 2010, 3025, 2416, 10667, 2015, 3271, 2660, 8980, 2013, 42

180 / 12941


11/12/2021 02:09:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: michael kasprowicz claimed the only wicket to fall before lunch when he had farhat brilliantly caught by ponting at slip for 20 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2745, 10556, 13102, 10524, 18682, 3555, 1996, 2069, 12937, 2000, 2991, 2077, 6265, 2043, 2002, 2018, 2521, 12707, 8235, 2135, 3236, 2011, 21179, 2075, 2012, 7540, 2005, 2322, 102, 0, 

181 / 12941


11/12/2021 02:09:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pollock was the only one to bother england s rampant openers  beating the outside edge of both batsmen 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 25218, 2001, 1996, 2069, 2028, 2000, 8572, 2563, 1055, 25883, 16181, 2015, 6012, 1996, 2648, 3341, 1997, 2119, 12236, 3549, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

182 / 12941


11/12/2021 02:09:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: graeme smith  captain   herschelle gibbs  jacques rudolph  jacques kallis  martin van jaarsveld  hashim amla  ab de villiers  shaun pollock  nicky boje  makhaya ntini  dale steyn 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 21840, 3044, 2952, 5106, 15721, 2571, 15659, 7445, 18466, 7445, 10556, 21711, 3235, 3158, 14855, 11650, 15985, 2094, 23325, 5714, 25

183 / 12941


11/12/2021 02:09:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   speed said   the impact of the tsunami on sri lanka has been devastating 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3177, 2056, 1996, 4254, 1997, 1996, 19267, 2006, 5185, 7252, 2038, 2042, 14886, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

184 / 12941


11/12/2021 02:09:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he passed 4 000 test runs as australia added 169 during the afternoon and reached his 13th century in spectacular style with two sixes off mohammad asif to reach 92 and then two successive fours off shahid afridi 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2979, 1018, 2199, 3231, 3216, 2004, 2660, 2794, 18582, 2076, 1996, 5027, 1998, 2584, 2010, 6

185 / 12941


11/12/2021 02:09:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: australia complete sweepthird test  sydney  day four  result   australia 568   62 1 bt pakistan 304   325pakistan  continuing on 67 1 overnight  posted 325 all out in their second innings and the home team quickly scored the 62 runs needed for victory 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2660, 3143, 11740, 15222, 4103, 3231, 3994, 2154, 2176, 27

186 / 12941


11/12/2021 02:09:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but asim kamal  partnered by mohammad asif  who scored his first runs in test cricket  put on 55 for the last wicket to set australia a meagre target for victory 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2004, 5714, 21911, 12404, 2011, 12050, 2004, 10128, 2040, 3195, 2010, 2034, 3216, 1999, 3231, 4533, 2404, 2006, 4583, 2005, 1996, 2197, 12937, 

187 / 12941


11/12/2021 02:09:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:09:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: g smith  capt   a bacher  n boje  m boucher  wkt   ab de villiers  h gibbs  a hall  j kallis  j kemp  c langeveldt  a nel  m ntini  a prince  s pollock  j rudolph 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1043, 3044, 14408, 1037, 10384, 2121, 1050, 8945, 6460, 1049, 8945, 22368, 1059, 25509, 11113, 2139, 25333, 1044, 15659, 1037, 2534, 1046, 10556, 21

188 / 12941


11/12/2021 02:10:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: mashud s departure signalled an end to the niceties as rafique and mortaza tore into tiring bowlers 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16137, 6979, 2094, 1055, 6712, 4742, 3709, 2019, 2203, 2000, 1996, 3835, 7368, 2004, 7148, 7413, 1998, 22294, 10936, 2050, 9538, 2046, 14841, 4892, 14999, 2015, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

189 / 12941


11/12/2021 02:10:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ghai was accused of failing to account for sponsorship money  resolve player disputes or hold regular elections 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1043, 10932, 2001, 5496, 1997, 7989, 2000, 4070, 2005, 12026, 2769, 10663, 2447, 11936, 2030, 2907, 3180, 3864, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

190 / 12941


11/12/2021 02:10:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it strikes me the umpires have had a poor test both on and off the field all match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 9326, 2033, 1996, 20887, 2015, 2031, 2018, 1037, 3532, 3231, 2119, 2006, 1998, 2125, 1996, 2492, 2035, 2674, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

191 / 12941


11/12/2021 02:10:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ab devilliers joined gibbs  who was now strangely becalmed and only 65 runs were scored in the afternoon session  but after devilliers hooked hoggard and was caught at long leg for 19  gibbs reached his 14th century in almost five and a half hours 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11113, 6548, 14355, 2015, 2587, 15659, 2040, 2001, 2085, 13939,

192 / 12941


11/12/2021 02:10:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: avishka gunawardene and kaushal lokuarachchi were both the subject of an official disciplinary inquiry after allegations of drunken misconduct 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 20704, 4509, 2912, 3282, 10830, 18246, 2063, 1998, 10556, 20668, 2389, 13660, 6692, 22648, 16257, 4048, 2020, 2119, 1996, 3395, 1997, 2019, 2880, 17972, 9934, 2044, 998

193 / 12941


11/12/2021 02:10:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we re under a lot more pressure than england  but we re determined to draw the series   kallis told bbc sport wales 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2128, 2104, 1037, 2843, 2062, 3778, 2084, 2563, 2021, 2057, 2128, 4340, 2000, 4009, 1996, 2186, 10556, 21711, 2409, 4035, 4368, 3575, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

194 / 12941


11/12/2021 02:10:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   kirsten too recalls some positive images of cronje 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11382, 19020, 2205, 17722, 2070, 3893, 4871, 1997, 13675, 2239, 6460, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

195 / 12941


11/12/2021 02:10:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if people back home cannot understand that  that s tough   he told the daily telegraph newspaper 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2111, 2067, 2188, 3685, 3305, 2008, 2008, 1055, 7823, 2002, 2409, 1996, 3679, 10013, 3780, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

196 / 12941


11/12/2021 02:10:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:10:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i hope my assessment about the team is incorrect and if they prove me wrong  i will be the happiest man on this planet 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 3246, 2026, 7667, 2055, 1996, 2136, 2003, 16542, 1998, 2065, 2027, 6011, 2033, 3308, 1045, 2097, 2022, 1996, 5292, 9397, 10458, 2158, 2006, 2023, 4774, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

197 / 12941


11/12/2021 02:11:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  glenn was outstanding and we just did enough to win   ponting said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9465, 2001, 5151, 1998, 2057, 2074, 2106, 2438, 2000, 2663, 21179, 2075, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

198 / 12941


11/12/2021 02:11:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   shoaib s absence will be a huge blow to pakistan s chances of avenging last year s home test series defeat by india 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 26822, 4886, 2497, 1055, 6438, 2097, 2022, 1037, 4121, 6271, 2000, 4501, 1055, 9592, 1997, 13642, 22373, 2197, 2095, 1055, 2188, 3231, 2186, 4154, 2011, 2634, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

199 / 12941


11/12/2021 02:11:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: sunil gavaskar heads the icc committee which has recommended the proposal in a bid to inject interest into the one day game which the former india captain says has become  predictable  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3103, 4014, 11721, 12044, 6673, 4641, 1996, 16461, 2837, 2029, 2038, 6749, 1996, 6378, 1999, 1037, 7226, 2000, 1999, 20614, 30

200 / 12941


11/12/2021 02:11:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: an expert panel comprising aravinda de silva  angus fraser  michael holding  tony lewis  tim may and david richardson found that most modern bowlers broke the rules in some way 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2019, 6739, 5997, 9605, 19027, 6371, 2850, 2139, 11183, 13682, 9443, 2745, 3173, 4116, 4572, 5199, 2089, 1998, 2585, 9482, 2179, 2008,

201 / 12941


11/12/2021 02:11:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  there is nothing good about twenty20 cricket 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2003, 2498, 2204, 2055, 22240, 4533, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

202 / 12941


11/12/2021 02:11:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pakistan  to host next asia cup pakistan are to host their first asia cup one day tournament next march  according to pakistan cricket board chairman shaharyar khan 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 4501, 2000, 3677, 2279, 4021, 2452, 4501, 2024, 2000, 3677, 2037, 2034, 4021, 2452, 2028, 2154, 2977, 2279, 2233, 2429, 2000, 4501, 4533, 2604, 3

203 / 12941


11/12/2021 02:11:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: all eight teams will face each other under a league format  with the top four then progressing to the semi finals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2035, 2809, 2780, 2097, 2227, 2169, 2060, 2104, 1037, 2223, 4289, 2007, 1996, 2327, 2176, 2059, 27673, 2000, 1996, 4100, 4399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

204 / 12941


11/12/2021 02:11:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  further  we have witnessed the abysmal performance of the usa cricket team at the icc champions trophy 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2582, 2057, 2031, 9741, 1996, 11113, 7274, 9067, 2836, 1997, 1996, 3915, 4533, 2136, 2012, 1996, 16461, 3966, 5384, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

205 / 12941


11/12/2021 02:11:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the first to go was marcus trescothick who was run out thanks to a misjudgement by andrew strauss 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2034, 2000, 2175, 2001, 6647, 24403, 24310, 16066, 2243, 2040, 2001, 2448, 2041, 4283, 2000, 1037, 28616, 9103, 11818, 3672, 2011, 4080, 16423, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

206 / 12941


11/12/2021 02:11:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:11:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the boys were terrified   we were all terrified   but they were  also  amazingly calm   said jones 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3337, 2020, 10215, 2057, 2020, 2035, 10215, 2021, 2027, 2020, 2036, 29350, 5475, 2056, 3557, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

207 / 12941


11/12/2021 02:12:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: his introduction to the senior side came on the controversial tour to zimbabwe  where he showed his potential with an innings of 77 not out off 76 balls in the second match as england achieved a 4 0 clean sweep 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2010, 4955, 2000, 1996, 3026, 2217, 2234, 2006, 1996, 6801, 2778, 2000, 11399, 2073, 2002, 3662, 201

208 / 12941


11/12/2021 02:12:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: both sides were penalised six runs for slow over rates 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2119, 3903, 2020, 18476, 5084, 2416, 3216, 2005, 4030, 2058, 6165, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

209 / 12941


11/12/2021 02:12:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: five wickets then fell for the addition of just 34 runs as matsikenyeri  the sixth bowler tried  had a major impact 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2274, 10370, 2059, 3062, 2005, 1996, 2804, 1997, 2074, 4090, 3216, 2004, 22281, 17339, 4890, 11124, 1996, 4369, 14999, 2699, 2018, 1037, 2350, 4254, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

210 / 12941


11/12/2021 02:12:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but whatever the department of culture  media and sport can offer  the biggest factor for the icc   tax breaks   remains something that only the treasury can rule on 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 3649, 1996, 2533, 1997, 3226, 2865, 1998, 4368, 2064, 3749, 1996, 5221, 5387, 2005, 1996, 16461, 4171, 7807, 3464, 2242, 2008, 2069, 1996, 

211 / 12941


11/12/2021 02:12:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  a decision about the venues for the re scheduled tests should be made within the next two weeks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 3247, 2055, 1996, 9356, 2005, 1996, 2128, 5115, 5852, 2323, 2022, 2081, 2306, 1996, 2279, 2048, 3134, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

212 / 12941


11/12/2021 02:12:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: muttiah muralitharan captured his wicket in overall figures of 3 59 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 14163, 6916, 4430, 15533, 26054, 5521, 4110, 2010, 12937, 1999, 3452, 4481, 1997, 1017, 5354, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

213 / 12941


11/12/2021 02:12:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it s the best day in my life 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 1996, 2190, 2154, 1999, 2026, 2166, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

214 / 12941


11/12/2021 02:12:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

215 / 12941


11/12/2021 02:12:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:12:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we all knew how the tour was scheduled 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2035, 2354, 2129, 1996, 2778, 2001, 5115, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

216 / 12941


11/12/2021 02:13:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   law could in theory play for england s test side but he is now 36 and his selection could be a backward step 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2375, 2071, 1999, 3399, 2377, 2005, 2563, 1055, 3231, 2217, 2021, 2002, 2003, 2085, 4029, 1998, 2010, 4989, 2071, 2022, 1037, 8848, 3357, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

217 / 12941


11/12/2021 02:13:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if the injury does not recover in time  england face the nightmare of trying to choose a balanced team 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 1996, 4544, 2515, 2025, 8980, 1999, 2051, 2563, 2227, 1996, 10103, 1997, 2667, 2000, 5454, 1037, 12042, 2136, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

218 / 12941


11/12/2021 02:13:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he is still recuperating from shoulder surgery 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2003, 2145, 28667, 6279, 6906, 3436, 2013, 3244, 5970, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

219 / 12941


11/12/2021 02:13:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   john bracewell  coach of the new zealand team  said his players had been deeply moved by the disaster and their thoughts were with the sri lankan squad 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2198, 17180, 4381, 2873, 1997, 1996, 2047, 3414, 2136, 2056, 2010, 2867, 2018, 2042, 6171, 2333, 2011, 1996, 7071, 1998, 2037, 4301, 2020, 2007, 1996, 5185, 

220 / 12941


11/12/2021 02:13:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: however  the scottish saltires  who have finished bottom in two seasons in division two  will be removed from the competition after 2005 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 1996, 4104, 5474, 7442, 2015, 2040, 2031, 2736, 3953, 1999, 2048, 3692, 1999, 2407, 2048, 2097, 2022, 3718, 2013, 1996, 2971, 2044, 2384, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0

221 / 12941


11/12/2021 02:13:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we need to be playing more games 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2342, 2000, 2022, 2652, 2062, 2399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

222 / 12941


11/12/2021 02:13:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  woolmer is a highly regarded coach and he needs to back his players and give them confidence at all times 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 12121, 5017, 2003, 1037, 3811, 5240, 2873, 1998, 2002, 3791, 2000, 2067, 2010, 2867, 1998, 2507, 2068, 7023, 2012, 2035, 2335, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

223 / 12941


11/12/2021 02:13:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: in fairness i think everton have missed a couple of players and got some young players out 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 26935, 1045, 2228, 18022, 2031, 4771, 1037, 3232, 1997, 2867, 1998, 2288, 2070, 2402, 2867, 2041, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

224 / 12941


11/12/2021 02:13:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:13:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   even without van nistelrooy  united made it 13 wins in 15 league games with a 2 0 derby victory at manchester city on sunday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2130, 2302, 3158, 9152, 13473, 20974, 9541, 2100, 2142, 2081, 2009, 2410, 5222, 1999, 2321, 2223, 2399, 2007, 1037, 1016, 1014, 7350, 3377, 2012, 5087, 2103, 2006, 4465, 102, 0, 0, 0, 

225 / 12941


11/12/2021 02:14:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he is a competitive player but a fair player and i know how upset he is by what has happened 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2003, 1037, 6975, 2447, 2021, 1037, 4189, 2447, 1998, 1045, 2113, 2129, 6314, 2002, 2003, 2011, 2054, 2038, 3047, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

226 / 12941


11/12/2021 02:14:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  my agent has spoken with the club and it will be resolved soon 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2026, 4005, 2038, 5287, 2007, 1996, 2252, 1998, 2009, 2097, 2022, 10395, 2574, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

227 / 12941


11/12/2021 02:14:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  so i would support the return of the home internationals   the only problem would be fitting them in to the fixture schedule 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 1045, 2052, 2490, 1996, 2709, 1997, 1996, 2188, 27340, 1996, 2069, 3291, 2052, 2022, 11414, 2068, 1999, 2000, 1996, 15083, 6134, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

228 / 12941


11/12/2021 02:14:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i feel that the national players are playing with a new spirit as i saw them play against belgium  egypt won 4 0 on wednesday  and i simply want to add to their success 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2514, 2008, 1996, 2120, 2867, 2024, 2652, 2007, 1037, 2047, 4382, 2004, 1045, 2387, 2068, 2377, 2114, 5706, 5279, 2180, 1018, 1014, 200

229 / 12941


11/12/2021 02:14:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   man city  james  mills  bradley wright phillips 83   dunne  distin  thatcher  shaun wright phillips  barton  macken 68   sibierski  mcmanaman  musampa  fowler 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2158, 2103, 2508, 6341, 8981, 6119, 8109, 6640, 26553, 4487, 16643, 2078, 21127, 16845, 6119, 8109, 12975, 11349, 2368, 6273, 9033, 11283, 27472, 2072

230 / 12941


11/12/2021 02:14:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  rafa has still time in front of him to build an even better team  maybe he s a little bit behind  right now    he told bbc radio five live 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7148, 2050, 2038, 2145, 2051, 1999, 2392, 1997, 2032, 2000, 3857, 2019, 2130, 2488, 2136, 2672, 2002, 1055, 1037, 2210, 2978, 2369, 2157, 2085, 2002, 2409, 4035, 2557, 227

231 / 12941


11/12/2021 02:14:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: his injury is very painful  so he is out 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2010, 4544, 2003, 2200, 9145, 2061, 2002, 2003, 2041, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

232 / 12941


11/12/2021 02:14:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:14:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: despite their frustration  chelsea began to dominate midfield without seriously threatening to break liverpool s well organised defence 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2750, 2037, 9135, 9295, 2211, 2000, 16083, 23071, 2302, 5667, 8701, 2000, 3338, 6220, 1055, 2092, 7362, 4721, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

233 / 12941


11/12/2021 02:15:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:15:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bolton keeper jussi jaaskelainen had to make two saves in quick succession midway through the first half   keeping out shearer s low shot and dyer s close range header   but that was the only goalmouth action of note 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 12118, 10684, 18414, 18719, 14855, 19895, 10581, 21820, 2018, 2000, 2191, 2048, 13169, 1999, 4

234 / 12941


11/12/2021 02:15:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:15:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but for  charlton goalkeeper  dean kiely  who made three tremendous saves  we could have scored five or six 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2005, 17821, 9653, 4670, 20963, 2100, 2040, 2081, 2093, 14388, 13169, 2057, 2071, 2031, 3195, 2274, 2030, 2416, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

235 / 12941


11/12/2021 02:15:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:15:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   anderson  diamond 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5143, 6323, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

236 / 12941


11/12/2021 02:15:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:15:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bryn halliwell was the busier keeper early on  saving from bellamy  chris sutton and juninho 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 19904, 2534, 2072, 4381, 2001, 1996, 3902, 3771, 10684, 2220, 2006, 7494, 2013, 25544, 3782, 11175, 1998, 12022, 29344, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

237 / 12941


11/12/2021 02:15:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:15:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: lee miller scored inside the opening 60 seconds  heading over colin meldrum and into the net from a jamie mcallister free kick 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3389, 4679, 3195, 2503, 1996, 3098, 3438, 3823, 5825, 2058, 6972, 11463, 21884, 1998, 2046, 1996, 5658, 2013, 1037, 6175, 22432, 21711, 3334, 2489, 5926, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0

238 / 12941


11/12/2021 02:16:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:16:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i put everything right with arjen s foot the last time i saw him 12 days ago 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2404, 2673, 2157, 2007, 12098, 6460, 2078, 1055, 3329, 1996, 2197, 2051, 1045, 2387, 2032, 2260, 2420, 3283, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

239 / 12941


11/12/2021 02:16:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:16:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we will have to wait and see  but i won t cry about injuries because we will have 11 players to play on tuesday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2097, 2031, 2000, 3524, 1998, 2156, 2021, 1045, 2180, 1056, 5390, 2055, 6441, 2138, 2057, 2097, 2031, 2340, 2867, 2000, 2377, 2006, 9857, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

240 / 12941


11/12/2021 02:16:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:16:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it is suggested we had a deal tied up last summer 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2003, 4081, 2057, 2018, 1037, 3066, 5079, 2039, 2197, 2621, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

241 / 12941


11/12/2021 02:16:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:16:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  downing is another one making a great season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 22501, 2003, 2178, 2028, 2437, 1037, 2307, 2161, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

242 / 12941


11/12/2021 02:16:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:16:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i hope that it will now take me six to eight weeks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 3246, 2008, 2009, 2097, 2085, 2202, 2033, 2416, 2000, 2809, 3134, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

243 / 12941


11/12/2021 02:17:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i have told john i will be around for the next european tournament  by then i will be 35 so hopefully i will still be okay 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2031, 2409, 2198, 1045, 2097, 2022, 2105, 2005, 1996, 2279, 2647, 2977, 2011, 2059, 1045, 2097, 2022, 3486, 2061, 11504, 1045, 2097, 2145, 2022, 3100, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0

244 / 12941


11/12/2021 02:17:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 34 year old dutch midfielder is out of contract in the summer and  although his age may count against him  he feels he can play on for another season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4090, 2095, 2214, 3803, 8850, 2003, 2041, 1997, 3206, 1999, 1996, 2621, 1998, 2348, 2010, 2287, 2089, 4175, 2114, 2032, 2002, 5683, 2002, 2064, 2377, 2

245 / 12941


11/12/2021 02:17:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: sociedad say they will pay rangers Â£150 000  with an option to buy the serbia   montenegro international 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 27084, 6340, 4215, 2360, 2027, 2097, 3477, 7181, 1037, 29646, 16068, 2692, 2199, 2007, 2019, 5724, 2000, 4965, 1996, 7238, 13018, 2248, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

246 / 12941


11/12/2021 02:17:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: no amount of money could replace his obvious love of the club and determination to succeed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2053, 3815, 1997, 2769, 2071, 5672, 2010, 5793, 2293, 1997, 1996, 2252, 1998, 9128, 2000, 9510, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

247 / 12941


11/12/2021 02:17:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the only problem is that the game itself can often be a farce 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2069, 3291, 2003, 2008, 1996, 2208, 2993, 2064, 2411, 2022, 1037, 2521, 3401, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

248 / 12941


11/12/2021 02:17:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fernando torres gave athletico an ideal start with a goal in the first minute 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9158, 13101, 2435, 5188, 2080, 2019, 7812, 2707, 2007, 1037, 3125, 1999, 1996, 2034, 3371, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

249 / 12941


11/12/2021 02:17:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:17:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we know he s ambitious and nobody can argue with that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2113, 2002, 1055, 12479, 1998, 6343, 2064, 7475, 2007, 2008, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

250 / 12941


11/12/2021 02:18:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:18:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  there are no easy ties in the fa cup and i m sure nobody is counting on one 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2024, 2053, 3733, 7208, 1999, 1996, 6904, 2452, 1998, 1045, 1049, 2469, 6343, 2003, 10320, 2006, 2028, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

251 / 12941


11/12/2021 02:18:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:18:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 25 year old was only called into the squad on sunday night as cover following the enforced withdrawal of upson  who has a hamstring injury 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2423, 2095, 2214, 2001, 2069, 2170, 2046, 1996, 4686, 2006, 4465, 2305, 2004, 3104, 2206, 1996, 16348, 10534, 1997, 11139, 2239, 2040, 2038, 1037, 10654, 3367, 48

252 / 12941


11/12/2021 02:18:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:18:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he will still need the approval of major shareholders john magnier and jp mcmanus  who own number  of the club to succeed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2097, 2145, 2342, 1996, 6226, 1997, 2350, 15337, 2198, 23848, 14862, 1998, 16545, 11338, 2386, 2271, 2040, 2219, 2193, 1997, 1996, 2252, 2000, 9510, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

253 / 12941


11/12/2021 02:18:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:18:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: chelsea have yet to officially confirm or deny the meeting  which would be in breach of premier league rule k3 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9295, 2031, 2664, 2000, 3985, 12210, 2030, 9772, 1996, 3116, 2029, 2052, 2022, 1999, 12510, 1997, 4239, 2223, 3627, 1047, 2509, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

254 / 12941


11/12/2021 02:18:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:18:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i think all the boys are behind ian mccall   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2228, 2035, 1996, 3337, 2024, 2369, 4775, 25790, 2002, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

255 / 12941


11/12/2021 02:19:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   kewell has not played since december 19 and misses out on international duty this week  with australia facing south africa in durban on wednesday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 17710, 4381, 2038, 2025, 2209, 2144, 2285, 2539, 1998, 22182, 2041, 2006, 2248, 4611, 2023, 2733, 2007, 2660, 5307, 2148, 3088, 1999, 25040, 2006, 9317, 102, 0, 0,

256 / 12941


11/12/2021 02:19:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   a 25 man squad will spend the next three days based at the mottram hall hotel in cheshire and will train at manchester united s nearby carrington complex 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 2423, 2158, 4686, 2097, 5247, 1996, 2279, 2093, 2420, 2241, 2012, 1996, 9587, 4779, 6444, 2534, 3309, 1999, 13789, 1998, 2097, 3345, 2012, 5087, 2142

257 / 12941


11/12/2021 02:19:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: coyne  burnley   jones  wolves   roberts  wrexham   collins  sunderland   edwards  wolves   gabbidon  cardiff   page  cardiff   partridge  motherwell   ricketts  swansea   roberts  tranmere   weston  cardiff   davies  tottenham   fletcher  west ham   giggs  man utd   koumas  west brom   robinson  sunderland   savage  blackburn   williams  west ham   bellamy  

258 / 12941


11/12/2021 02:19:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   robbie has an agreement with larne that he can negotiate with interested clubs 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 12289, 2038, 2019, 3820, 2007, 2474, 12119, 2008, 2002, 2064, 13676, 2007, 4699, 4184, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

259 / 12941


11/12/2021 02:19:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the ifa upheld its original decision to throw newry out of the cup following the andy crawford registration row 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2065, 2050, 16813, 2049, 2434, 3247, 2000, 5466, 2047, 2854, 2041, 1997, 1996, 2452, 2206, 1996, 5557, 10554, 8819, 5216, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

260 / 12941


11/12/2021 02:19:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: in their last meeting  the irish beat italy in the 1994 world cup finals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 2037, 2197, 3116, 1996, 3493, 3786, 3304, 1999, 1996, 2807, 2088, 2452, 4399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

261 / 12941


11/12/2021 02:19:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it would certainly be good to avoid a play off  but on the back of a couple of good results i don t see why we can t win the group 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2052, 5121, 2022, 2204, 2000, 4468, 1037, 2377, 2125, 2021, 2006, 1996, 2067, 1997, 1037, 3232, 1997, 2204, 3463, 1045, 2123, 1056, 2156, 2339, 2057, 2064, 1056, 2663, 1996, 

262 / 12941


11/12/2021 02:19:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: august 17   faroe islands v cyprus 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2257, 2459, 2521, 8913, 3470, 1058, 9719, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

263 / 12941


11/12/2021 02:19:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bnei sakhnin are the first arab side ever to play in european competition and will play english premiership side newcastle united in the first round 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 24869, 7416, 7842, 10023, 11483, 2024, 1996, 2034, 5424, 2217, 2412, 2000, 2377, 1999, 2647, 2971, 1998, 2097, 2377, 2394, 11264, 2217, 8142, 2142, 1999, 1996, 20

264 / 12941


11/12/2021 02:19:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a bayern spokesman said on monday that the decision not to take hashemian to israel had been motivated only by his physical condition 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 21350, 14056, 2056, 2006, 6928, 2008, 1996, 3247, 2025, 2000, 2202, 23325, 17577, 2078, 2000, 3956, 2018, 2042, 12774, 2069, 2011, 2010, 3558, 4650, 102, 0, 0, 0, 0, 0, 0,

265 / 12941


11/12/2021 02:19:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: patrick vieira has apparently threatened some of our players and things like that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4754, 20098, 7895, 2038, 4593, 5561, 2070, 1997, 2256, 2867, 1998, 2477, 2066, 2008, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

266 / 12941


11/12/2021 02:19:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:19:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  that s not good enough for a striker at a club like this 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 1055, 2025, 2204, 2438, 2005, 1037, 11854, 2012, 1037, 2252, 2066, 2023, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

267 / 12941


11/12/2021 02:20:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wright phillips raced down the left and crossed to fowler but city s lone man up front  left free by terry s slip  contrived to head wide when it seemed a breakthrough was certain 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6119, 8109, 8255, 2091, 1996, 2187, 1998, 4625, 2000, 14990, 2021, 2103, 1055, 10459, 2158, 2039, 2392, 2187, 2489, 2011, 6609, 105

268 / 12941


11/12/2021 02:20:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: uefa delegate thomas giordano added   the only unusual thing that happened as far as we are concerned is that chelsea failed to present themselves in the press conference 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6663, 11849, 2726, 21025, 8551, 6761, 2794, 1996, 2069, 5866, 2518, 2008, 3047, 2004, 2521, 2004, 2057, 2024, 4986, 2003, 2008, 9295, 3478, 

269 / 12941


11/12/2021 02:20:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if you only think about winning the next game  you don t know what the draw will be 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2017, 2069, 2228, 2055, 3045, 1996, 2279, 2208, 2017, 2123, 1056, 2113, 2054, 1996, 4009, 2097, 2022, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

270 / 12941


11/12/2021 02:20:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: however  he said he was delighted with the way his time in spain was going and dismissed criticism of his decision to join real 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 2002, 2056, 2002, 2001, 15936, 2007, 1996, 2126, 2010, 2051, 1999, 3577, 2001, 2183, 1998, 7219, 6256, 1997, 2010, 3247, 2000, 3693, 2613, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

271 / 12941


11/12/2021 02:20:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  edelman added that it was pointless having a brand new stadium if the team did not match the surroundings 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3968, 23830, 2794, 2008, 2009, 2001, 23100, 2383, 1037, 4435, 2047, 3346, 2065, 1996, 2136, 2106, 2025, 2674, 1996, 11301, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

272 / 12941


11/12/2021 02:20:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  in the first half he did really well and did everything you want from a wide player 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 1996, 2034, 2431, 2002, 2106, 2428, 2092, 1998, 2106, 2673, 2017, 2215, 2013, 1037, 2898, 2447, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

273 / 12941


11/12/2021 02:20:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i would have liked to play in holland   that would have been a little bit special to me 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2052, 2031, 4669, 2000, 2377, 1999, 7935, 2008, 2052, 2031, 2042, 1037, 2210, 2978, 2569, 2000, 2033, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

274 / 12941


11/12/2021 02:20:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it was an old firm return for barry ferguson as mcleish stuck by the side that thumped four goals past hibernian 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 2019, 2214, 3813, 2709, 2005, 6287, 11262, 2004, 11338, 23057, 4095, 5881, 2011, 1996, 2217, 2008, 28963, 2176, 3289, 2627, 7632, 5677, 11148, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

275 / 12941


11/12/2021 02:20:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:20:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but people have got to take into account why he was incensed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2111, 2031, 2288, 2000, 2202, 2046, 4070, 2339, 2002, 2001, 28647, 2094, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

276 / 12941


11/12/2021 02:21:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   united fans had declared saturday as  cantona day  and had planned to wear masks  that were popular during the frenchman s time as a player at the old trafford club 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2142, 4599, 2018, 4161, 5095, 2004, 8770, 2050, 2154, 1998, 2018, 3740, 2000, 4929, 15806, 2008, 2020, 2759, 2076, 1996, 26529, 1055, 2051, 2004

277 / 12941


11/12/2021 02:21:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: michels had recently undergone heart surgery and dutch football federation  knvb  spokesman frank huizinga said   he was one of the best coaches we had in history 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8709, 2015, 2018, 3728, 17215, 2540, 5970, 1998, 3803, 2374, 4657, 14161, 26493, 14056, 3581, 17504, 6774, 2050, 2056, 2002, 2001, 2028, 1997, 1996,

278 / 12941


11/12/2021 02:21:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he conceded the win over forest  which included goals from noe pamarot and mido  was not pretty to watch 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 15848, 1996, 2663, 2058, 3224, 2029, 2443, 3289, 2013, 2053, 2063, 14089, 10464, 2102, 1998, 3054, 2080, 2001, 2025, 3492, 2000, 3422, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

279 / 12941


11/12/2021 02:21:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: victory took real to within six points of leaders barcelona and owen is confident real can close the gap 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3377, 2165, 2613, 2000, 2306, 2416, 2685, 1997, 4177, 7623, 1998, 7291, 2003, 9657, 2613, 2064, 2485, 1996, 6578, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

280 / 12941


11/12/2021 02:21:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we have an opportunity to win this cup this year  no question about that   he declared 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2031, 2019, 4495, 2000, 2663, 2023, 2452, 2023, 2095, 2053, 3160, 2055, 2008, 2002, 4161, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

281 / 12941


11/12/2021 02:21:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   rush was appointed at the end of august following the departure of former liverpool team mate mark wright  who guided chester to the conference title last season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5481, 2001, 2805, 2012, 1996, 2203, 1997, 2257, 2206, 1996, 6712, 1997, 2280, 6220, 2136, 6775, 2928, 6119, 2040, 8546, 8812, 2000, 1996, 3034, 251

282 / 12941


11/12/2021 02:21:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   butragueno  meanwhile  was angry at being impersonated by the radio disc jockey 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 29181, 24997, 2080, 5564, 2001, 4854, 2012, 2108, 17727, 18617, 4383, 2011, 1996, 2557, 5860, 13989, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

283 / 12941


11/12/2021 02:21:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

284 / 12941


11/12/2021 02:21:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:21:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: milburn scored 200 league and cup goals between 1946 and 1957  while shearer currently has 187 goals to his name 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 23689, 8022, 3195, 3263, 2223, 1998, 2452, 3289, 2090, 3918, 1998, 3890, 2096, 18330, 2121, 2747, 2038, 19446, 3289, 2000, 2010, 2171, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

285 / 12941


11/12/2021 02:22:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the players are really down in the dressing room 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2867, 2024, 2428, 2091, 1999, 1996, 11225, 2282, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

286 / 12941


11/12/2021 02:22:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if a club want to sell you  there is nothing you can do 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 1037, 2252, 2215, 2000, 5271, 2017, 2045, 2003, 2498, 2017, 2064, 2079, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

287 / 12941


11/12/2021 02:22:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but he denied his club was suffering a dip in form which league rivals arsenal and manchester united could exploit 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2002, 6380, 2010, 2252, 2001, 6114, 1037, 16510, 1999, 2433, 2029, 2223, 9169, 9433, 1998, 5087, 2142, 2071, 18077, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

288 / 12941


11/12/2021 02:22:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the bigger discussions around the winter break should be to do with the nature of football today  the needs of football players and the way the premiership has developed  rather than one or two matches in the champions league in february 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 7046, 10287, 2105, 1996, 3467, 3338, 2323, 2022, 2000, 2079, 2007,

289 / 12941


11/12/2021 02:22:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: things began well but the spanish champions extended their winless streak to six after losing to racing santander last weekend 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2477, 2211, 2092, 2021, 1996, 3009, 3966, 3668, 2037, 2663, 3238, 9039, 2000, 2416, 2044, 3974, 2000, 3868, 4203, 11563, 2197, 5353, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

290 / 12941


11/12/2021 02:22:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i won t be able to tell you whether he will need an operation until maybe next week 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2180, 1056, 2022, 2583, 2000, 2425, 2017, 3251, 2002, 2097, 2342, 2019, 3169, 2127, 2672, 2279, 2733, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

291 / 12941


11/12/2021 02:22:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: arsenal were now totally dominant  and were desperately unlucky not to take the lead after 62 minutes when fabregas crashed a rising drive against the bar from 20 yards 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9433, 2020, 2085, 6135, 7444, 1998, 2020, 9652, 4895, 7630, 17413, 2025, 2000, 2202, 1996, 2599, 2044, 5786, 2781, 2043, 6904, 13578, 12617, 8

292 / 12941


11/12/2021 02:22:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   ronald koeman quit as ajax boss last week after their exit from the uefa cup 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8923, 12849, 16704, 8046, 2004, 18176, 5795, 2197, 2733, 2044, 2037, 6164, 2013, 1996, 6663, 2452, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

293 / 12941


11/12/2021 02:22:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i was disappointed but i am not thinking of leaving right now 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2001, 9364, 2021, 1045, 2572, 2025, 3241, 1997, 2975, 2157, 2085, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

294 / 12941


11/12/2021 02:22:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:22:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i apologise to the ref and linesman  who were only doing their job 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 9706, 12898, 17701, 2063, 2000, 1996, 25416, 1998, 3210, 2386, 2040, 2020, 2069, 2725, 2037, 3105, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

295 / 12941


11/12/2021 02:23:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: robben plays down european returninjured chelsea winger arjen robben has insisted that he only has a 10  chance of making a return against barcelona in the champions league 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 26211, 2368, 3248, 2091, 2647, 2709, 2378, 26949, 9295, 16072, 12098, 6460, 2078, 26211, 2368, 2038, 7278, 2008, 2002, 2069, 2038, 1037, 

296 / 12941


11/12/2021 02:23:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: mourinho said he was  just practising my portuguese with him because i don t need strikers  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9587, 9496, 25311, 2080, 2056, 2002, 2001, 2074, 10975, 18908, 9355, 2026, 5077, 2007, 2032, 2138, 1045, 2123, 1056, 2342, 26049, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

297 / 12941


11/12/2021 02:23:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the premier league is also continuing investigations into allegations chelsea officials tapped up arsenal defender ashley cole in january 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4239, 2223, 2003, 2036, 5719, 9751, 2046, 9989, 9295, 4584, 10410, 2039, 9433, 8291, 9321, 5624, 1999, 2254, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

298 / 12941


11/12/2021 02:23:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: aragones angered by racism finespain coach luis aragones is furious after being fined by the spanish football federation for his comments about thierry henry 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 16146, 2229, 18748, 2011, 14398, 21892, 4502, 2378, 2873, 6446, 16146, 2229, 2003, 9943, 2044, 2108, 16981, 2011, 1996, 3009, 2374, 4657, 2005, 2010, 79

299 / 12941


11/12/2021 02:23:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 18 year old  who has played in 13 of the club s last 14 games  had surgery to repair a double hernia 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2324, 2095, 2214, 2040, 2038, 2209, 1999, 2410, 1997, 1996, 2252, 1055, 2197, 2403, 2399, 2018, 5970, 2000, 7192, 1037, 3313, 2014, 6200, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

300 / 12941


11/12/2021 02:23:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   robson was speaking after being formally granted the freedom of the city of newcastle 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 23698, 2001, 4092, 2044, 2108, 6246, 4379, 1996, 4071, 1997, 1996, 2103, 1997, 8142, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

301 / 12941


11/12/2021 02:23:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: qpr have also signed italian generoso rossi 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1053, 18098, 2031, 2036, 2772, 3059, 4962, 7352, 2080, 18451, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

302 / 12941


11/12/2021 02:23:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 22 year old czech republic international has set a new premiership record of 961 consecutive minutes without conceding a goal  a mark which is still running 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2570, 2095, 2214, 5569, 3072, 2248, 2038, 2275, 1037, 2047, 11264, 2501, 1997, 5986, 2487, 5486, 2781, 2302, 9530, 11788, 2075, 1037, 3125, 1037

303 / 12941


11/12/2021 02:23:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he added   he did speak to the police but will not be pressing charges 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2794, 2002, 2106, 3713, 2000, 1996, 2610, 2021, 2097, 2025, 2022, 7827, 5571, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

304 / 12941


11/12/2021 02:23:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he can get better if we can supply him better   added keegan 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2064, 2131, 2488, 2065, 2057, 2064, 4425, 2032, 2488, 2794, 17710, 20307, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

305 / 12941


11/12/2021 02:23:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:23:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i hope he s hardening to the fact he will have big decisions to make but i hope it is to the benefit of steven gerrard and i hope it is worthwhile for liverpool 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 3246, 2002, 1055, 28751, 2075, 2000, 1996, 2755, 2002, 2097, 2031, 2502, 6567, 2000, 2191, 2021, 1045, 3246, 2009, 2003, 2000, 1996, 5770, 1997

306 / 12941


11/12/2021 02:24:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: thierry henry is ruled out with an achilles tendon injury but cole said   no one is putting the blame on robin 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 26413, 2888, 2003, 5451, 2041, 2007, 2019, 23167, 7166, 2239, 4544, 2021, 5624, 2056, 2053, 2028, 2003, 5128, 1996, 7499, 2006, 5863, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

307 / 12941


11/12/2021 02:24:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: that was in the season before last  when they disposed of premiership fulham at this fifth round stage 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 2001, 1999, 1996, 2161, 2077, 2197, 2043, 2027, 21866, 1997, 11264, 21703, 2012, 2023, 3587, 2461, 2754, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

308 / 12941


11/12/2021 02:24:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we ll battle but if it comes to a football match i think we ll win 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2222, 2645, 2021, 2065, 2009, 3310, 2000, 1037, 2374, 2674, 1045, 2228, 2057, 2222, 2663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

309 / 12941


11/12/2021 02:24:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  everything is new and there is a huge determination to win 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2673, 2003, 2047, 1998, 2045, 2003, 1037, 4121, 9128, 2000, 2663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

310 / 12941


11/12/2021 02:24:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  mentally they are much stronger  even though a lot of their players are young   the 36 year old said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 10597, 2027, 2024, 2172, 6428, 2130, 2295, 1037, 2843, 1997, 2037, 2867, 2024, 2402, 1996, 4029, 2095, 2214, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

311 / 12941


11/12/2021 02:24:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:24:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: nevertheless  barcelona are accustomed to playing big games at the nou camp  where they have to face the likes of real madrid each season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6600, 7623, 2024, 17730, 2000, 2652, 2502, 2399, 2012, 1996, 2053, 2226, 3409, 2073, 2027, 2031, 2000, 2227, 1996, 7777, 1997, 2613, 6921, 2169, 2161, 102, 0, 0, 0, 0, 0, 0,

312 / 12941


11/12/2021 02:25:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: united were drawn against everton  while chelsea face a trip to newcastle 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2142, 2020, 4567, 2114, 18022, 2096, 9295, 2227, 1037, 4440, 2000, 8142, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

313 / 12941


11/12/2021 02:25:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 31 year old former france international gave his last press conference as a roma player on monday  anouncing his move to bolton 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2861, 2095, 2214, 2280, 2605, 2248, 2435, 2010, 2197, 2811, 3034, 2004, 1037, 12836, 2447, 2006, 6928, 2019, 23709, 6129, 2010, 2693, 2000, 12118, 102, 0, 0, 0, 0, 0, 0, 0, 

314 / 12941


11/12/2021 02:25:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: stuart spent just over four years at goodison park  making 125 senior appearances and scoring 25 goals  before signing for sheffield united   where he scored 12 goals in 68 appearances 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6990, 2985, 2074, 2058, 2176, 2086, 2012, 2204, 10929, 2380, 2437, 8732, 3026, 3922, 1998, 4577, 2423, 3289, 2077, 6608, 2005,

315 / 12941


11/12/2021 02:25:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he can t go on television and accuse me of telling lies 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2064, 1056, 2175, 2006, 2547, 1998, 26960, 2033, 1997, 4129, 3658, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

316 / 12941


11/12/2021 02:25:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we are up for this one 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2024, 2039, 2005, 2023, 2028, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

317 / 12941


11/12/2021 02:25:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the way things are looking  i think it is unlikely we are going to 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2126, 2477, 2024, 2559, 1045, 2228, 2009, 2003, 9832, 2057, 2024, 2183, 2000, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

318 / 12941


11/12/2021 02:25:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: last year bafana bafana were humbled in the first by minnows mauritius who beat them 2 0 in curepipe 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2197, 2095, 8670, 15143, 2050, 8670, 15143, 2050, 2020, 15716, 2094, 1999, 1996, 2034, 2011, 8117, 19779, 2015, 18004, 2040, 3786, 2068, 1016, 1014, 1999, 9526, 24548, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

319 / 12941


11/12/2021 02:25:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: tottenham bid   163 8m for forest duonottingham forest have confirmed they have received an Â£8m bid from tottenham for andy reid and michael dawson 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 18127, 7226, 17867, 1022, 2213, 2005, 3224, 6829, 17048, 3436, 3511, 3224, 2031, 4484, 2027, 2031, 2363, 2019, 1037, 29646, 2620, 2213, 7226, 2013, 18127, 2005, 

320 / 12941


11/12/2021 02:25:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we all want to get through to the next round and face a massive team  that s the way it is 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2035, 2215, 2000, 2131, 2083, 2000, 1996, 2279, 2461, 1998, 2227, 1037, 5294, 2136, 2008, 1055, 1996, 2126, 2009, 2003, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

321 / 12941


11/12/2021 02:25:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   everton must decide whether to cash in now on the denmark midfield man  or risk losing him for nothing in the summer 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 18022, 2442, 5630, 3251, 2000, 5356, 1999, 2085, 2006, 1996, 5842, 23071, 2158, 2030, 3891, 3974, 2032, 2005, 2498, 1999, 1996, 2621, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

322 / 12941


11/12/2021 02:25:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the situation for jens is that he is currently the number two keeper at arsenal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3663, 2005, 25093, 2003, 2008, 2002, 2003, 2747, 1996, 2193, 2048, 10684, 2012, 9433, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

323 / 12941


11/12/2021 02:25:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:25:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if you are not going to get a game under one manager  you look for another whose style of play suits you 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2017, 2024, 2025, 2183, 2000, 2131, 1037, 2208, 2104, 2028, 3208, 2017, 2298, 2005, 2178, 3005, 2806, 1997, 2377, 11072, 2017, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

324 / 12941


11/12/2021 02:26:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  when we come towards the end of his contract we will both review the situation 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2043, 2057, 2272, 2875, 1996, 2203, 1997, 2010, 3206, 2057, 2097, 2119, 3319, 1996, 3663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

325 / 12941


11/12/2021 02:26:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: scotland yard said there had been 11 arrests for alleged public order  drugs and offensive weapon offences 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3885, 4220, 2056, 2045, 2018, 2042, 2340, 17615, 2005, 6884, 2270, 2344, 5850, 1998, 5805, 5195, 18421, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

326 / 12941


11/12/2021 02:26:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i like to play in games like this with this intense rivalry 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2066, 2000, 2377, 1999, 2399, 2066, 2023, 2007, 2023, 6387, 10685, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

327 / 12941


11/12/2021 02:26:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   fifa s disciplinary code stipulates that a first doping offence should be followed by a six month ban 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5713, 1055, 17972, 3642, 2358, 11514, 18969, 2008, 1037, 2034, 23799, 15226, 2323, 2022, 2628, 2011, 1037, 2416, 3204, 7221, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

328 / 12941


11/12/2021 02:26:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the united boss said it was worse than ruud van nistelrooy s foul on ashley cole for which he got a three game ban 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2142, 5795, 2056, 2009, 2001, 4788, 2084, 21766, 6784, 3158, 9152, 13473, 20974, 9541, 2100, 1055, 12487, 2006, 9321, 5624, 2005, 2029, 2002, 2288, 1037, 2093, 2208, 7221, 102, 0, 0, 0, 0, 0

329 / 12941


11/12/2021 02:26:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and mutu agreed  adding   it is unfair 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 14163, 8525, 3530, 5815, 2009, 2003, 15571, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

330 / 12941


11/12/2021 02:26:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i shall make a further statement on monday  clarifying our position 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 4618, 2191, 1037, 2582, 4861, 2006, 6928, 25037, 2075, 2256, 2597, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

331 / 12941


11/12/2021 02:26:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: mutu  banned by the english fa  can resume playing next may 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 14163, 8525, 7917, 2011, 1996, 2394, 6904, 2064, 13746, 2652, 2279, 2089, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

332 / 12941


11/12/2021 02:26:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: international matches can also be played on such pitches  although games at major tournaments have to be contested on grass 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2248, 3503, 2064, 2036, 2022, 2209, 2006, 2107, 19299, 2348, 2399, 2012, 2350, 8504, 2031, 2000, 2022, 7259, 2006, 5568, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

333 / 12941


11/12/2021 02:26:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:26:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: third placed everton visit plymouth  liverpool travel to burnley  crystal palace go to sunderland  fulham face carling cup semi finalists watford  bolton meet ipswich  while aston villa were drawn against sheffield united 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2353, 2872, 18022, 3942, 10221, 6220, 3604, 2000, 23028, 6121, 4186, 2175, 2000, 15518, 2

334 / 12941


11/12/2021 02:27:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if we stagnate between eighth and 15th place it s impossible to progress 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2057, 2358, 8490, 12556, 2090, 5964, 1998, 6286, 2173, 2009, 1055, 5263, 2000, 5082, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

335 / 12941


11/12/2021 02:27:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he added   jacques has never gone into exactly what it was 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2794, 7445, 2038, 2196, 2908, 2046, 3599, 2054, 2009, 2001, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

336 / 12941


11/12/2021 02:27:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   other candidates for the job include former scotland midfielders gordon strachan and gary mcallister and vogts  assistant tommy burns 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2060, 5347, 2005, 1996, 3105, 2421, 2280, 3885, 8850, 2015, 5146, 2358, 22648, 4819, 1998, 5639, 22432, 21711, 3334, 1998, 29536, 13512, 2015, 3353, 6838, 7641, 102, 0, 0, 0, 

337 / 12941


11/12/2021 02:27:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it s only normal when you have got a team put together of such big names that you put the finishing touch to it and the finishing touch at chelsea is a fantastic manager like mourinho   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 2069, 3671, 2043, 2017, 2031, 2288, 1037, 2136, 2404, 2362, 1997, 2107, 2502, 3415, 2008, 2017, 2404, 19

338 / 12941


11/12/2021 02:27:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: so now he says that he is heading for corinthians in search of  tranquillity 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 2085, 2002, 2758, 2008, 2002, 2003, 5825, 2005, 2522, 6657, 15222, 6962, 1999, 3945, 1997, 25283, 26147, 18605, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

339 / 12941


11/12/2021 02:27:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:27:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i still have a good relationship with arsene wenger   he s always said he wants me to sign 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2145, 2031, 1037, 2204, 3276, 2007, 29393, 8625, 19181, 4590, 2002, 1055, 2467, 2056, 2002, 4122, 2033, 2000, 3696, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

340 / 12941


11/12/2021 02:28:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but is smith the man for what must be one of the hardest jobs in football  the 56 year old takes over at a time when the national side is in the doldrums 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2003, 3044, 1996, 2158, 2005, 2054, 2442, 2022, 2028, 1997, 1996, 18263, 5841, 1999, 2374, 1996, 5179, 2095, 2214, 3138, 2058, 2012, 1037, 2051, 2043, 

341 / 12941


11/12/2021 02:28:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: you can t put a real time on a comeback  we ll see how he progresses 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2017, 2064, 1056, 2404, 1037, 2613, 2051, 2006, 1037, 12845, 2057, 2222, 2156, 2129, 2002, 22901, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

342 / 12941


11/12/2021 02:28:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: subs not used  morris  wardley  newey  zakuani  mcmahon 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4942, 2015, 2025, 2109, 6384, 4829, 3051, 2047, 3240, 23564, 5283, 7088, 17741, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

343 / 12941


11/12/2021 02:28:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: that victory for angola also marked a first defeat in 14 years for zambia at lusaka s independence stadium  where saturday s game is being played 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 3377, 2005, 13491, 2036, 4417, 1037, 2034, 4154, 1999, 2403, 2086, 2005, 15633, 2012, 11320, 29289, 1055, 4336, 3346, 2073, 5095, 1055, 2208, 2003, 2108, 2209,

344 / 12941


11/12/2021 02:28:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: spain s first half performance is showered with praise  with xavi singled out as the biggest star 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3577, 1055, 2034, 2431, 2836, 2003, 23973, 2007, 8489, 2007, 1060, 18891, 25369, 2041, 2004, 1996, 5221, 2732, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

345 / 12941


11/12/2021 02:28:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the conditions were difficult but he did well and is definitely one for the future 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3785, 2020, 3697, 2021, 2002, 2106, 2092, 1998, 2003, 5791, 2028, 2005, 1996, 2925, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

346 / 12941


11/12/2021 02:28:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:28:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as i said  winning the major honours is the hardest task of all  but in mourinho they have a manager who will make it a whole lot easier to handle the anticipation and expectation that will come their way now 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 1045, 2056, 3045, 1996, 2350, 8762, 2003, 1996, 18263, 4708, 1997, 2035, 2021, 1999, 9587, 9496,

347 / 12941


11/12/2021 02:29:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hearts of oak started the game needing a win to qualify for the final while cotonsport only needed to avoid defeat to go through 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8072, 1997, 6116, 2318, 1996, 2208, 11303, 1037, 2663, 2000, 7515, 2005, 1996, 2345, 2096, 26046, 5644, 6442, 2069, 2734, 2000, 4468, 4154, 2000, 2175, 2083, 102, 0, 0, 0, 0, 0, 0, 0

348 / 12941


11/12/2021 02:29:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he s part of our squad and he got us a couple of important goals early on 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 1055, 2112, 1997, 2256, 4686, 1998, 2002, 2288, 2149, 1037, 3232, 1997, 2590, 3289, 2220, 2006, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

349 / 12941


11/12/2021 02:29:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

350 / 12941


11/12/2021 02:29:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  no matter what anybody says  lazio are favourites to win this competition 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2053, 3043, 2054, 10334, 2758, 2474, 12426, 2024, 28271, 2000, 2663, 2023, 2971, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

351 / 12941


11/12/2021 02:29:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but in the end because of the way that  dietmar  hamann and  igor  biscan performed  we did not need to change things until right at the end of the match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1999, 1996, 2203, 2138, 1997, 1996, 2126, 2008, 8738, 7849, 10654, 11639, 1998, 14661, 20377, 9336, 2864, 2057, 2106, 2025, 2342, 2000, 2689, 2477, 21

352 / 12941


11/12/2021 02:29:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: subs not used  orr  brown 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4942, 2015, 2025, 2109, 26914, 2829, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

353 / 12941


11/12/2021 02:29:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: mawson  travis  mkandawire  james  robinson  daniel williams  stanley  hyde  pitman 105   purdie  mills 83   brown  stansfield  green 102  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5003, 9333, 2239, 10001, 12395, 13832, 20357, 2508, 6157, 3817, 3766, 6156, 11804, 6770, 2386, 8746, 16405, 17080, 2063, 6341, 6640, 2829, 9761, 15951, 2665, 9402, 102, 0, 

354 / 12941


11/12/2021 02:29:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hoddle began his managerial career as player boss with swindon before moving on to chelsea and then taking up the england job 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7570, 20338, 2211, 2010, 24465, 2476, 2004, 2447, 5795, 2007, 22350, 2077, 3048, 2006, 2000, 9295, 1998, 2059, 2635, 2039, 1996, 2563, 3105, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

355 / 12941


11/12/2021 02:29:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i ve done it with dennis bergkamp  kanu  everybody 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2310, 2589, 2009, 2007, 6877, 15214, 27052, 2361, 22827, 2226, 7955, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

356 / 12941


11/12/2021 02:29:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:29:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it has been interesting to watch games from a different perspective and i have learned things 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2038, 2042, 5875, 2000, 3422, 2399, 2013, 1037, 2367, 7339, 1998, 1045, 2031, 4342, 2477, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

357 / 12941


11/12/2021 02:30:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but how well we do depends how often we can get our best team out 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2129, 2092, 2057, 2079, 9041, 2129, 2411, 2057, 2064, 2131, 2256, 2190, 2136, 2041, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

358 / 12941


11/12/2021 02:30:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the reason thomas is not speaking to the club is because the agent wants to see the outcome of what happens to me 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3114, 2726, 2003, 2025, 4092, 2000, 1996, 2252, 2003, 2138, 1996, 4005, 4122, 2000, 2156, 1996, 9560, 1997, 2054, 6433, 2000, 2033, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

359 / 12941


11/12/2021 02:30:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: what i saw and felt made it easier to understand a few things 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2054, 1045, 2387, 1998, 2371, 2081, 2009, 6082, 2000, 3305, 1037, 2261, 2477, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

360 / 12941


11/12/2021 02:30:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he has a six month contract so we can test each other out and see if it works 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2038, 1037, 2416, 3204, 3206, 2061, 2057, 2064, 3231, 2169, 2060, 2041, 1998, 2156, 2065, 2009, 2573, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

361 / 12941


11/12/2021 02:30:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i said to harry  i hope you don t go to southampton   and he told me  absolutely not    he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2056, 2000, 4302, 1045, 3246, 2017, 2123, 1056, 2175, 2000, 11833, 1998, 2002, 2409, 2033, 7078, 2025, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

362 / 12941


11/12/2021 02:30:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but lions chairman theo paphitis has denied the claims 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 7212, 3472, 14833, 6643, 21850, 7315, 2038, 6380, 1996, 4447, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

363 / 12941


11/12/2021 02:30:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:30:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i have no diplomatic relations with him   the arsenal boss is quoted as saying 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2031, 2053, 8041, 4262, 2007, 2032, 1996, 9433, 5795, 2003, 9339, 2004, 3038, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

364 / 12941


11/12/2021 02:31:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: for liverpool  this result is very  very important 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2005, 6220, 2023, 2765, 2003, 2200, 2200, 2590, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

365 / 12941


11/12/2021 02:31:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: english clubs make euro historyall four of england s champions league representatives have reached the knockout stages for the first time 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2394, 4184, 2191, 9944, 2381, 8095, 2176, 1997, 2563, 1055, 3966, 2223, 4505, 2031, 2584, 1996, 11369, 5711, 2005, 1996, 2034, 2051, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

366 / 12941


11/12/2021 02:31:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  in the second half we had some good moments in attack 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1999, 1996, 2117, 2431, 2057, 2018, 2070, 2204, 5312, 1999, 2886, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

367 / 12941


11/12/2021 02:31:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

368 / 12941


11/12/2021 02:31:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: man utd through after exeter testmanchester united avoided an fa cup upset by edging past exeter city in their third round replay 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2158, 21183, 2094, 2083, 2044, 12869, 3231, 2386, 25322, 2142, 9511, 2019, 6904, 2452, 6314, 2011, 3968, 4726, 2627, 12869, 2103, 1999, 2037, 2353, 2461, 15712, 102, 0, 0, 0, 0, 0,

369 / 12941


11/12/2021 02:31:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   the friendly also saw the end of julie foudy and joy fawcett s us careers 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 5379, 2036, 2387, 1996, 2203, 1997, 7628, 1042, 19224, 2100, 1998, 6569, 6904, 16526, 6582, 1055, 2149, 10922, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

370 / 12941


11/12/2021 02:31:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but wenger said earlier this week that his indifferent form was down to pressure caused by being under scrutiny from the media 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 19181, 4590, 2056, 3041, 2023, 2733, 2008, 2010, 24436, 2433, 2001, 2091, 2000, 3778, 3303, 2011, 2108, 2104, 17423, 2013, 1996, 2865, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

371 / 12941


11/12/2021 02:31:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: our intention is that we will never let him go 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2256, 6808, 2003, 2008, 2057, 2097, 2196, 2292, 2032, 2175, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

372 / 12941


11/12/2021 02:31:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:31:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   pearce initially joined city as a player under keegan in 2001 before becoming part of the coaching staff 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 19560, 3322, 2587, 2103, 2004, 1037, 2447, 2104, 17710, 20307, 1999, 2541, 2077, 3352, 2112, 1997, 1996, 7748, 3095, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

373 / 12941


11/12/2021 02:32:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: more than 70 000 people abandoned the ground with the score at 1 1 and only three minutes left to play 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2062, 2084, 3963, 2199, 2111, 4704, 1996, 2598, 2007, 1996, 3556, 2012, 1015, 1015, 1998, 2069, 2093, 2781, 2187, 2000, 2377, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

374 / 12941


11/12/2021 02:32:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the homes side regained the lead in controversial fashion when robert pires won a dubious free kick 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 5014, 2217, 11842, 1996, 2599, 1999, 6801, 4827, 2043, 2728, 14255, 6072, 2180, 1037, 22917, 2489, 5926, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

375 / 12941


11/12/2021 02:32:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the bernabeu was evacuated with the score at 1 1 and two minutes of normal time remaining in the game 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 16595, 16336, 2226, 2001, 13377, 2007, 1996, 3556, 2012, 1015, 1015, 1998, 2048, 2781, 1997, 3671, 2051, 3588, 1999, 1996, 2208, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

376 / 12941


11/12/2021 02:32:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he said  can i take it please   he was very polite 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2056, 2064, 1045, 2202, 2009, 3531, 2002, 2001, 2200, 13205, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

377 / 12941


11/12/2021 02:32:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it s going as it should with the knee 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 2183, 2004, 2009, 2323, 2007, 1996, 6181, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

378 / 12941


11/12/2021 02:32:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i definitely want to stay at city because i have really improved as a player here 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 5791, 2215, 2000, 2994, 2012, 2103, 2138, 1045, 2031, 2428, 5301, 2004, 1037, 2447, 2182, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

379 / 12941


11/12/2021 02:32:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the other outgoing directors have agreed to leave their loans of Â£numberm in the company for the next four years 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2060, 22011, 5501, 2031, 3530, 2000, 2681, 2037, 10940, 1997, 1037, 29646, 19172, 5677, 2213, 1999, 1996, 2194, 2005, 1996, 2279, 2176, 2086, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

380 / 12941


11/12/2021 02:32:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:32:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it was a great opportunity   and we haven t delivered 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 1037, 2307, 4495, 1998, 2057, 4033, 1056, 5359, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

381 / 12941


11/12/2021 02:33:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: malcolm  a substitute on the day  was taken from the rangers dug out and spoken to by police about an alleged gesture he made 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8861, 1037, 7681, 2006, 1996, 2154, 2001, 2579, 2013, 1996, 7181, 8655, 2041, 1998, 5287, 2000, 2011, 2610, 2055, 2019, 6884, 9218, 2002, 2081, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

382 / 12941


11/12/2021 02:33:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: no one has come to me and said i would like to buy nicolas anelka 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2053, 2028, 2038, 2272, 2000, 2033, 1998, 2056, 1045, 2052, 2066, 2000, 4965, 9473, 2019, 2884, 2912, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

383 / 12941


11/12/2021 02:33:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   united went ahead through alan smith in the 33rd minute before bouba diop s superb 25 yard strike cancelled out the visitors  lead in the 87th minute 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2142, 2253, 3805, 2083, 5070, 3044, 1999, 1996, 20883, 3371, 2077, 8945, 19761, 4487, 7361, 1055, 21688, 2423, 4220, 4894, 8014, 2041, 1996, 5731, 2599, 1999, 

384 / 12941


11/12/2021 02:33:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: weir told bbc radio five live   we don t want to rest on our laurels and say we have achieved anything yet 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16658, 2409, 4035, 2557, 2274, 2444, 2057, 2123, 1056, 2215, 2000, 2717, 2006, 2256, 11893, 2015, 1998, 2360, 2057, 2031, 4719, 2505, 2664, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

385 / 12941


11/12/2021 02:33:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: spain s minister of sport jaime lissavetzky was quick to give his backing to the federation s decision 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3577, 1055, 2704, 1997, 4368, 14519, 20244, 19510, 2480, 4801, 2001, 4248, 2000, 2507, 2010, 5150, 2000, 1996, 4657, 1055, 3247, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

386 / 12941


11/12/2021 02:33:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: aragones insisted the comments  made to henry s arsenal club mate jose antonio reyes  were meant to motivate the player  and were not intended to be offensive 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16146, 2229, 7278, 1996, 7928, 2081, 2000, 2888, 1055, 9433, 2252, 6775, 4560, 4980, 12576, 2020, 3214, 2000, 9587, 29068, 3686, 1996, 2447, 1998, 2020,

387 / 12941


11/12/2021 02:33:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: benitez said   it was difficult for jerzy 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3841, 4221, 2480, 2056, 2009, 2001, 3697, 2005, 15333, 28534, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

388 / 12941


11/12/2021 02:33:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   souness dropped bellamy for sunday s game against arsenal  claiming the welshman had feigned injury after being asked to play out of position 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 26639, 2015, 3333, 25544, 2005, 4465, 1055, 2208, 2114, 9433, 6815, 1996, 6124, 2386, 2018, 24664, 19225, 4544, 2044, 2108, 2356, 2000, 2377, 2041, 1997, 2597, 1

389 / 12941


11/12/2021 02:33:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but boss mcclaren is looking for a victory which would mean they avoid a team that has played in the champions league in friday s third round draw 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 5795, 23680, 8017, 2368, 2003, 2559, 2005, 1037, 3377, 2029, 2052, 2812, 2027, 4468, 1037, 2136, 2008, 2038, 2209, 1999, 1996, 3966, 2223, 1999, 5958, 1055, 2

390 / 12941


11/12/2021 02:33:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:33:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he had offers from two other clubs but he decided to come to tottenham   said spurs sporting director frank arnesen 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2018, 4107, 2013, 2048, 2060, 4184, 2021, 2002, 2787, 2000, 2272, 2000, 18127, 2056, 18205, 7419, 2472, 3581, 12098, 14183, 2078, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

391 / 12941


11/12/2021 02:34:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we ve let the fans down 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2310, 2292, 1996, 4599, 2091, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

392 / 12941


11/12/2021 02:34:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i want to make sure by the end of the five years i would have been in charge that villa are achieving top six finishes in the premiership on a regular basis   said o leary  who took over at villa park in may 2003 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2215, 2000, 2191, 2469, 2011, 1996, 2203, 1997, 1996, 2274, 2086, 1045, 2052, 2031, 2042, 1

393 / 12941


11/12/2021 02:34:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we will meet with the player s representative to finalise the contract and decide when he will sign   said atletico sporting director toni munoz 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2097, 3113, 2007, 1996, 2447, 1055, 4387, 2000, 2345, 5562, 1996, 3206, 1998, 5630, 2043, 2002, 2097, 3696, 2056, 16132, 7419, 2472, 16525, 23685, 102, 0, 0, 0

394 / 12941


11/12/2021 02:34:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he added   unfortunately  i m not in control of the situation 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2794, 6854, 1045, 1049, 2025, 1999, 2491, 1997, 1996, 3663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

395 / 12941


11/12/2021 02:34:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if we get through and have another european tie it may encourage players to stay at least until the end of the season   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2057, 2131, 2083, 1998, 2031, 2178, 2647, 5495, 2009, 2089, 8627, 2867, 2000, 2994, 2012, 2560, 2127, 1996, 2203, 1997, 1996, 2161, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

396 / 12941


11/12/2021 02:34:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: campbell number 170 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6063, 2193, 10894, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

397 / 12941


11/12/2021 02:34:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bowyer  77  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6812, 10532, 6255, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

398 / 12941


11/12/2021 02:34:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  everything s right   it s 10 minutes away  there are good players there  a good set up  a good atmosphere at the ground 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2673, 1055, 2157, 2009, 1055, 2184, 2781, 2185, 2045, 2024, 2204, 2867, 2045, 1037, 2204, 2275, 2039, 1037, 2204, 7224, 2012, 1996, 2598, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

399 / 12941


11/12/2021 02:34:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:34:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: male and female captains of every national team will be able to vote  as well as their coaches and fipro   the global organisation for professional players 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3287, 1998, 2931, 15755, 1997, 2296, 2120, 2136, 2097, 2022, 2583, 2000, 3789, 2004, 2092, 2004, 2037, 7850, 1998, 10882, 21572, 1996, 3795, 5502, 2005, 26

400 / 12941


11/12/2021 02:35:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i am in a position where i want to play  and i will have to look elsewhere to do that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2572, 1999, 1037, 2597, 2073, 1045, 2215, 2000, 2377, 1998, 1045, 2097, 2031, 2000, 2298, 6974, 2000, 2079, 2008, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

401 / 12941


11/12/2021 02:35:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the crowd gave me a massive standing ovation when i came off on saturday which was nice   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 4306, 2435, 2033, 1037, 5294, 3061, 1051, 21596, 2043, 1045, 2234, 2125, 2006, 5095, 2029, 2001, 3835, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

402 / 12941


11/12/2021 02:35:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he said   i m signed here for this season and another two so there is no situation 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2056, 1045, 1049, 2772, 2182, 2005, 2023, 2161, 1998, 2178, 2048, 2061, 2045, 2003, 2053, 3663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

403 / 12941


11/12/2021 02:35:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i have wanted to do that since i went to the bobby charlton school 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2031, 2359, 2000, 2079, 2008, 2144, 1045, 2253, 2000, 1996, 6173, 17821, 2082, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

404 / 12941


11/12/2021 02:35:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

405 / 12941


11/12/2021 02:35:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  they have organised a game at a rather awkward time   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2031, 7362, 1037, 2208, 2012, 1037, 2738, 9596, 2051, 2002, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

406 / 12941


11/12/2021 02:35:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and cole put a bit of gloss on a hard fought win when he put a low shot into the bottom of the pompey net 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 5624, 2404, 1037, 2978, 1997, 27068, 2006, 1037, 2524, 4061, 2663, 2043, 2002, 2404, 1037, 2659, 2915, 2046, 1996, 3953, 1997, 1996, 13433, 8737, 3240, 5658, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

407 / 12941


11/12/2021 02:35:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we have said from day one we want to strengthen  and that is what we are hoping to do in the coming weeks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2031, 2056, 2013, 2154, 2028, 2057, 2215, 2000, 12919, 1998, 2008, 2003, 2054, 2057, 2024, 5327, 2000, 2079, 1999, 1996, 2746, 3134, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

408 / 12941


11/12/2021 02:35:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  playing so many games is certainly not healthy  especially for teams who still have european commitment 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2652, 2061, 2116, 2399, 2003, 5121, 2025, 7965, 2926, 2005, 2780, 2040, 2145, 2031, 2647, 8426, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

409 / 12941


11/12/2021 02:35:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:35:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but it is more enjoyable when you play like we did against fulham 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2009, 2003, 2062, 22249, 2043, 2017, 2377, 2066, 2057, 2106, 2114, 21703, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

410 / 12941


11/12/2021 02:36:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ferguson hails man utd s resolvemanchester united s alex ferguson has praised his players  gutsy performance in the 1 0 win at aston villa 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 11262, 16889, 2015, 2158, 21183, 2094, 1055, 10663, 2386, 25322, 2142, 1055, 4074, 11262, 2038, 5868, 2010, 2867, 18453, 2100, 2836, 1999, 1996, 1015, 1014, 2663, 2012, 14

411 / 12941


11/12/2021 02:36:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  when the whole world apart from the referee has seen there should be a goal at old trafford  that just reinforces what i feel   there should be video evidence   said wenger 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2043, 1996, 2878, 2088, 4237, 2013, 1996, 5330, 2038, 2464, 2045, 2323, 2022, 1037, 3125, 2012, 2214, 26894, 2008, 2074, 19444, 2015, 205

412 / 12941


11/12/2021 02:36:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   however  the fa will not take action over comments on the same issue by chelsea defender john terry after the match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 1996, 6904, 2097, 2025, 2202, 2895, 2058, 7928, 2006, 1996, 2168, 3277, 2011, 9295, 8291, 2198, 6609, 2044, 1996, 2674, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

413 / 12941


11/12/2021 02:36:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we want to win something  but i think it is impossible to win all four trophies   mourinho admitted 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2215, 2000, 2663, 2242, 2021, 1045, 2228, 2009, 2003, 5263, 2000, 2663, 2035, 2176, 22236, 9587, 9496, 25311, 2080, 4914, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

414 / 12941


11/12/2021 02:36:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i still think arsenal will be the champions  even though it doesn t look that way at the moment   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2145, 2228, 9433, 2097, 2022, 1996, 3966, 2130, 2295, 2009, 2987, 1056, 2298, 2008, 2126, 2012, 1996, 2617, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

415 / 12941


11/12/2021 02:36:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   henchoz seems certain to leave liverpool after recently criticising manager rafael benitez 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 21863, 9905, 2480, 3849, 3056, 2000, 2681, 6220, 2044, 3728, 6232, 9355, 3208, 10999, 3841, 4221, 2480, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

416 / 12941


11/12/2021 02:36:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: clinton morrison  stephen carr  andy o brien  matt holland  andy reid and jon macken are all back after injury 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7207, 9959, 4459, 12385, 5557, 1051, 9848, 4717, 7935, 5557, 9027, 1998, 6285, 11349, 2368, 2024, 2035, 2067, 2044, 4544, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

417 / 12941


11/12/2021 02:36:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   souness may offer olivier bernard  titus bramble or lee bowyer as bait in a deal for the 27 year old 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 26639, 2015, 2089, 3749, 14439, 6795, 18828, 20839, 3468, 2030, 3389, 6812, 10532, 2004, 17395, 1999, 1037, 3066, 2005, 1996, 2676, 2095, 2214, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

418 / 12941


11/12/2021 02:36:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if we do not make it this time  then there is the european championship 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2057, 2079, 2025, 2191, 2009, 2023, 2051, 2059, 2045, 2003, 1996, 2647, 2528, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

419 / 12941


11/12/2021 02:36:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:36:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: under delaney s leadership  the fai has agreed to irish government demands to adhere to the recommendations of the genesis report 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2104, 22101, 1055, 4105, 1996, 26208, 2038, 3530, 2000, 3493, 2231, 7670, 2000, 25276, 2000, 1996, 11433, 1997, 1996, 11046, 3189, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

420 / 12941


11/12/2021 02:37:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:37:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: on thursday we will work on something but when it comes to dealing with bergkamp and henry i will probably say to my defenders just close your eyes and get your fingers crossed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2006, 9432, 2057, 2097, 2147, 2006, 2242, 2021, 2043, 2009, 3310, 2000, 7149, 2007, 15214, 27052, 2361, 1998, 2888, 1045, 2097, 2763, 

421 / 12941


11/12/2021 02:37:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:37:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the most important thing for me is that i m comfortable with the way i m playing   i must admit  the last month has been my best month for a while 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2087, 2590, 2518, 2005, 2033, 2003, 2008, 1045, 1049, 6625, 2007, 1996, 2126, 1045, 1049, 2652, 1045, 2442, 6449, 1996, 2197, 3204, 2038, 2042, 2026, 2190, 3

422 / 12941


11/12/2021 02:37:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:37:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i m not worried because i am proud of the england team and i want the best for it 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 2025, 5191, 2138, 1045, 2572, 7098, 1997, 1996, 2563, 2136, 1998, 1045, 2215, 1996, 2190, 2005, 2009, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

423 / 12941


11/12/2021 02:37:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:37:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: tenemos que trabajar mucho 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2702, 6633, 2891, 10861, 19817, 19736, 16084, 2172, 2080, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

424 / 12941


11/12/2021 02:37:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:37:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but to pull together a sub plot or conspiracy theory that the own goal  combined with liverpool s defeat  has finally put gerrard on the road to stamford bridge is nonsense 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2000, 4139, 2362, 1037, 4942, 5436, 2030, 9714, 3399, 2008, 1996, 2219, 3125, 4117, 2007, 6220, 1055, 4154, 2038, 2633, 2404, 16216,

425 / 12941


11/12/2021 02:38:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: however  it could be that the fa decide to leave that matter to the premier league  in which case barwick would certainly have enough to get to grips with 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 2009, 2071, 2022, 2008, 1996, 6904, 5630, 2000, 2681, 2008, 3043, 2000, 1996, 4239, 2223, 1999, 2029, 2553, 3347, 7184, 2052, 5121, 2031, 2438, 2000, 

426 / 12941


11/12/2021 02:38:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   barwick is determined to get more respect for referees at all levels of the game 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3347, 7184, 2003, 4340, 2000, 2131, 2062, 4847, 2005, 25118, 2012, 2035, 3798, 1997, 1996, 2208, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

427 / 12941


11/12/2021 02:38:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   mendes shot from 50 yards and united goalkeeper roy carroll spilled the ball into his own net before hooking it clear 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 27916, 2915, 2013, 2753, 4210, 1998, 2142, 9653, 6060, 10767, 13439, 1996, 3608, 2046, 2010, 2219, 5658, 2077, 8103, 2075, 2009, 3154, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

428 / 12941


11/12/2021 02:38:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   and the william hill concession does not apply to correct score or outright bet results 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 1996, 2520, 2940, 16427, 2515, 2025, 6611, 2000, 6149, 3556, 2030, 13848, 6655, 3463, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

429 / 12941


11/12/2021 02:38:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: teenager rooney returned to everton after euro 2004 with superstar status assured after stunning performances 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 10563, 24246, 2513, 2000, 18022, 2044, 9944, 2432, 2007, 18795, 3570, 8916, 2044, 14726, 4616, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

430 / 12941


11/12/2021 02:38:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the fact that there is no delay to the game has impressed the ifab  which is made up of four fifa representatives plus a member of each of the four home associations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2755, 2008, 2045, 2003, 2053, 8536, 2000, 1996, 2208, 2038, 7622, 1996, 2065, 7875, 2029, 2003, 2081, 2039, 1997, 2176, 5713, 4505, 4606, 1

431 / 12941


11/12/2021 02:38:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:38:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they beat blackburn  who didn t half put a foot in against them  and then came through a real tough one at everton  albeit aided by james beattie s sending off 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 3786, 13934, 2040, 2134, 1056, 2431, 2404, 1037, 3329, 1999, 2114, 2068, 1998, 2059, 2234, 2083, 1037, 2613, 7823, 2028, 2012, 18022, 12167, 1155

432 / 12941


11/12/2021 02:39:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:39:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  not in a million years would john terry have gone down in the same way 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2025, 1999, 1037, 2454, 2086, 2052, 2198, 6609, 2031, 2908, 2091, 1999, 1996, 2168, 2126, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

433 / 12941


11/12/2021 02:39:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:39:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he is a bit introverted but he has got character 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2003, 1037, 2978, 17174, 26686, 2021, 2002, 2038, 2288, 2839, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

434 / 12941


11/12/2021 02:39:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:39:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i don t personally like to see weakened sides in the competition  but it seems in some cases it s the only chance for the young lads to get some first team experience 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2123, 1056, 7714, 2066, 2000, 2156, 11855, 3903, 1999, 1996, 2971, 2021, 2009, 3849, 1999, 2070, 3572, 2009, 1055, 1996, 2069, 3382, 2005,

435 / 12941


11/12/2021 02:39:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:39:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i send a minder with them to look after them  not because i don t trust them 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 4604, 1037, 2568, 2121, 2007, 2068, 2000, 2298, 2044, 2068, 2025, 2138, 1045, 2123, 1056, 3404, 2068, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

436 / 12941


11/12/2021 02:39:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:39:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fa executive director david davies said   we will be very interested to see this presentation   it s true we have been more interested in the use of technology than other members of the board 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6904, 3237, 2472, 2585, 9082, 2056, 2057, 2097, 2022, 2200, 4699, 2000, 2156, 2023, 8312, 2009, 1055, 2995, 2057, 2031,

437 / 12941


11/12/2021 02:40:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  they are making the odd one or two serious errors and the spotlight is going to be on them but they have to put it to the back of their minds and roll their sleeves up 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2024, 2437, 1996, 5976, 2028, 2030, 2048, 3809, 10697, 1998, 1996, 17763, 2003, 2183, 2000, 2022, 2006, 2068, 2021, 2027, 2031, 2000, 24

438 / 12941


11/12/2021 02:40:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as for potential winners  you can t look beyond the top three or four clubs  and i can t see too many upsets 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2005, 4022, 4791, 2017, 2064, 1056, 2298, 3458, 1996, 2327, 2093, 2030, 2176, 4184, 1998, 1045, 2064, 1056, 2156, 2205, 2116, 6314, 2015, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

439 / 12941


11/12/2021 02:40:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but the ex france coach admitted he  dug his own grave  by agreeing to join the club before the end of euro 2004 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1996, 4654, 2605, 2873, 4914, 2002, 8655, 2010, 2219, 6542, 2011, 16191, 2000, 3693, 1996, 2252, 2077, 1996, 2203, 1997, 9944, 2432, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

440 / 12941


11/12/2021 02:40:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but the premier league say it is  extremely unlikely  it would be implemented in england 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1996, 4239, 2223, 2360, 2009, 2003, 5186, 9832, 2009, 2052, 2022, 7528, 1999, 2563, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

441 / 12941


11/12/2021 02:40:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the european xi were more subdued despite the presence of zinedine zidane and raul  though del piero did halve the deficit with a lovely finish 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2647, 8418, 2020, 2062, 20442, 2750, 1996, 3739, 1997, 1062, 21280, 3170, 1062, 8524, 2638, 1998, 16720, 2295, 3972, 10356, 2080, 2106, 11085, 3726, 1996, 15074,

442 / 12941


11/12/2021 02:40:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: there are also several players with african roots set to play in the match with french stars zinedine zidane and patrick vieira and belgium s vincent kompany among those confirmed by fifa 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2024, 2036, 2195, 2867, 2007, 3060, 6147, 2275, 2000, 2377, 1999, 1996, 2674, 2007, 2413, 3340, 1062, 21280, 3170, 10

443 / 12941


11/12/2021 02:40:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fa cup losing its sheenthe fa cup used to be sacrosanct 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 6904, 2452, 3974, 2049, 20682, 10760, 6904, 2452, 2109, 2000, 2022, 17266, 7352, 2319, 6593, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

444 / 12941


11/12/2021 02:40:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:40:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the dutch lost 2 1 to portugal in the semi final and that was enough to prompt a change in the dutch hierarchy 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3803, 2439, 1016, 1015, 2000, 5978, 1999, 1996, 4100, 2345, 1998, 2008, 2001, 2438, 2000, 25732, 1037, 2689, 1999, 1996, 3803, 12571, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

445 / 12941


11/12/2021 02:41:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: his victory in fifa s annual vote may be hard for henry and shevchenko  who have amazed with their exploits  but ronaldinho just has that little extra va va voom 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2010, 3377, 1999, 5713, 1055, 3296, 3789, 2089, 2022, 2524, 2005, 2888, 1998, 2016, 25465, 19767, 2040, 2031, 15261, 2007, 2037, 20397, 2021, 8923, 2

446 / 12941


11/12/2021 02:41:13 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:13 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the thought struck me that the accumulation of talent on the belo horizonte pitch was at least as outstanding as that about to go into action in portugal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2245, 4930, 2033, 2008, 1996, 20299, 1997, 5848, 2006, 1996, 19337, 2080, 9154, 2618, 6510, 2001, 2012, 2560, 2004, 5151, 2004, 2008, 2055, 2000, 2175,

447 / 12941


11/12/2021 02:41:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the striker enjoyed a glittering career  playing with boca juniors  river plate  roma  benfica  dundee and rangers among others 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 11854, 5632, 1037, 20332, 2476, 2652, 2007, 22765, 16651, 2314, 5127, 12836, 26542, 14252, 1998, 7181, 2426, 2500, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

448 / 12941


11/12/2021 02:41:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i was surprised by the number of people who questioned whether the referee was trying to level things out 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2001, 4527, 2011, 1996, 2193, 1997, 2111, 2040, 8781, 3251, 1996, 5330, 2001, 2667, 2000, 2504, 2477, 2041, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

449 / 12941


11/12/2021 02:41:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: blatter wants to simplify the current laws which partially rely on the interpretation of assistant referees 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1038, 20051, 3334, 4122, 2000, 21934, 28250, 1996, 2783, 4277, 2029, 6822, 11160, 2006, 1996, 7613, 1997, 3353, 25118, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

450 / 12941


11/12/2021 02:41:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but it is a once in a lifetime opportunity and i don t think he will spoil it by being too petulant 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2009, 2003, 1037, 2320, 1999, 1037, 6480, 4495, 1998, 1045, 2123, 1056, 2228, 2002, 2097, 27594, 2009, 2011, 2108, 2205, 9004, 7068, 3372, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

451 / 12941


11/12/2021 02:41:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:41:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  meanwhile  everton boss david moyes told bbc radio five live that managers often approach a player s agent before talking to the club   but he insisted that talking with a player without his club s consent is a no go area 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5564, 18022, 5795, 2585, 9587, 23147, 2409, 4035, 2557, 2274, 2444, 2008, 10489, 2411, 3

452 / 12941


11/12/2021 02:42:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: barcelona  for me  is perfect 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7623, 2005, 2033, 2003, 3819, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

453 / 12941


11/12/2021 02:42:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and to gaucci s reported description of her as  very beautiful  with a  great figure   prinz responded   i m not especially suited to be a glamour girl 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2000, 11721, 16835, 1055, 2988, 6412, 1997, 2014, 2004, 2200, 3376, 2007, 1037, 2307, 3275, 26927, 14191, 5838, 1045, 1049, 2025, 2926, 10897, 2000, 2022

454 / 12941


11/12/2021 02:42:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: cantona was appearing on a show where supporters quiz guests and it was broadcast at numberpm 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8770, 2050, 2001, 6037, 2006, 1037, 2265, 2073, 6793, 19461, 6368, 1998, 2009, 2001, 3743, 2012, 2193, 9737, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

455 / 12941


11/12/2021 02:42:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  ryan isn t asking for any more money and he certainly doesn t want to leave manchester united 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4575, 3475, 1056, 4851, 2005, 2151, 2062, 2769, 1998, 2002, 5121, 2987, 1056, 2215, 2000, 2681, 5087, 2142, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

456 / 12941


11/12/2021 02:42:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

457 / 12941


11/12/2021 02:42:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: city tie on as weather hits gamesoldham athletic s fa cup tie against manchester city will go ahead despite severe weather conditons  which have hit several games across britain 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2103, 5495, 2006, 2004, 4633, 4978, 2399, 11614, 3511, 5188, 1055, 6904, 2452, 5495, 2114, 5087, 2103, 2097, 2175, 3805, 2750, 5729,

458 / 12941


11/12/2021 02:42:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it certainly doesn t move around like a remote control bumble bee or something 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 5121, 2987, 1056, 2693, 2105, 2066, 1037, 6556, 2491, 26352, 3468, 10506, 2030, 2242, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

459 / 12941


11/12/2021 02:42:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  my only regret is having to put an end to my career because of an injury like marco van basten or glenn hoddle 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2026, 2069, 9038, 2003, 2383, 2000, 2404, 2019, 2203, 2000, 2026, 2476, 2138, 1997, 2019, 4544, 2066, 8879, 3158, 19021, 6528, 2030, 9465, 7570, 20338, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

460 / 12941


11/12/2021 02:42:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:42:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: tapping up row is so much hot airthe big talking point of the week is the issue of making illegal approaches or  tapping up  a player 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 15135, 2039, 5216, 2003, 2061, 2172, 2980, 2250, 10760, 2502, 3331, 2391, 1997, 1996, 2733, 2003, 1996, 3277, 1997, 2437, 6206, 8107, 2030, 15135, 2039, 1037, 2447, 102, 0, 0, 

461 / 12941


11/12/2021 02:43:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  there is a way the player gets got to but that is part of football 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2003, 1037, 2126, 1996, 2447, 4152, 2288, 2000, 2021, 2008, 2003, 2112, 1997, 2374, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

462 / 12941


11/12/2021 02:43:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

463 / 12941


11/12/2021 02:43:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

464 / 12941


11/12/2021 02:43:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but wherever pennant ends up  talk will not be of his potential but whether he can steer clear of trouble off the pitch 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 11210, 22690, 4515, 2039, 2831, 2097, 2025, 2022, 1997, 2010, 4022, 2021, 3251, 2002, 2064, 20634, 3154, 1997, 4390, 2125, 1996, 6510, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

465 / 12941


11/12/2021 02:43:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but if one day we lose a game or two points and the gap goes from 10 to seven or eight  we are ready to accept it as natural and keep going  controlling the distance 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2065, 2028, 2154, 2057, 4558, 1037, 2208, 2030, 2048, 2685, 1998, 1996, 6578, 3632, 2013, 2184, 2000, 2698, 2030, 2809, 2057, 2024, 3201, 

466 / 12941


11/12/2021 02:43:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   faye could well stay after fellow midfielder nigel quashie completed his move to re join former pompey boss harry redknapp at rivals southampton on monday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 19243, 2071, 2092, 2994, 2044, 3507, 8850, 12829, 24209, 12914, 2063, 2949, 2010, 2693, 2000, 2128, 3693, 2280, 13433, 8737, 3240, 5795, 4302, 2417, 2243,

467 / 12941


11/12/2021 02:43:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the last player to score past cech in the premiership was arsenal s thierry henry in the 2 2 draw on 12 december 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2197, 2447, 2000, 3556, 2627, 8292, 2818, 1999, 1996, 11264, 2001, 9433, 1055, 26413, 2888, 1999, 1996, 1016, 1016, 4009, 2006, 2260, 2285, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

468 / 12941


11/12/2021 02:43:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:43:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it was dutch destroyer robben who made the early breakthrough  latching on to an eidur gudjohnsen header to turn ryan nelson inside out and fire under the body of friedel from 20 yards 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 3803, 9799, 26211, 2368, 2040, 2081, 1996, 2220, 12687, 25635, 2075, 2006, 2000, 2019, 1041, 3593, 3126, 19739, 20

469 / 12941


11/12/2021 02:44:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it will be an easier decision for him if he scores every week as he is now only 14 away from jackie milburn s club record  of 186  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2097, 2022, 2019, 6082, 3247, 2005, 2032, 2065, 2002, 7644, 2296, 2733, 2004, 2002, 2003, 2085, 2069, 2403, 2185, 2013, 9901, 23689, 8022, 1055, 2252, 2501, 1997, 19609, 102,

470 / 12941


11/12/2021 02:44:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: tenga said that fifa has also offered to organize courses for tanzanian referees and support development programmes 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2702, 3654, 2056, 2008, 5713, 2038, 2036, 3253, 2000, 10939, 5352, 2005, 11959, 2078, 25118, 1998, 2490, 2458, 8497, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

471 / 12941


11/12/2021 02:44:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and riise came within inches of the breakthrough when he latched onto a superb ball from garcia and hit a rasping drive  which kiely tipped on to the charlton bar 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 15544, 5562, 2234, 2306, 5282, 1997, 1996, 12687, 2043, 2002, 25635, 2098, 3031, 1037, 21688, 3608, 2013, 7439, 1998, 2718, 1037, 20710, 4691,

472 / 12941


11/12/2021 02:44:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but my team would never lose their confidence or mentality just because of a defeat here 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2026, 2136, 2052, 2196, 4558, 2037, 7023, 2030, 5177, 3012, 2074, 2138, 1997, 1037, 4154, 2182, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

473 / 12941


11/12/2021 02:44:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   however  o driscoll accepts that with the loan market blocked  clubs such as bournemouth will have to look to their own youth ranks when injuries begin to bite 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2174, 1051, 2852, 2483, 26895, 13385, 2008, 2007, 1996, 5414, 3006, 8534, 4184, 2107, 2004, 22882, 2097, 2031, 2000, 2298, 2000, 2037, 2219, 3360, 69

474 / 12941


11/12/2021 02:44:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:44:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: despite his concerns  he said he was relishing the move and said there was a  certain magic  about anfield  which was far calmer than the bernabeu 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2750, 2010, 5936, 2002, 2056, 2002, 2001, 2128, 13602, 2075, 1996, 2693, 1998, 2056, 2045, 2001, 1037, 3056, 3894, 2055, 2019, 3790, 2029, 2001, 2521, 5475, 2121, 2

475 / 12941


11/12/2021 02:45:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: chelsea have not dropped a point in the premiership since drawing 2 2 at arsenal on 12 december and are moving inexorably towards their first premiership title 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9295, 2031, 2025, 3333, 1037, 2391, 1999, 1996, 11264, 2144, 5059, 1016, 1016, 2012, 9433, 2006, 2260, 2285, 1998, 2024, 3048, 1999, 10288, 6525, 6321,

476 / 12941


11/12/2021 02:45:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

477 / 12941


11/12/2021 02:45:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: sheff utd  kenny  thirlwell 28   geary  bromby  jagielka  harley  liddell  tonge  quinn 59   montgomery  cullip  shaw  forte 59   gray 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2016, 4246, 21183, 2094, 8888, 16215, 4313, 2140, 4381, 2654, 6718, 2100, 22953, 14905, 2100, 14855, 11239, 26518, 13653, 11876, 12662, 15740, 2063, 8804, 5354, 8482, 12731, 68

478 / 12941


11/12/2021 02:45:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they  government  are banning football in uganda and not obua 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2231, 2024, 21029, 2374, 1999, 10031, 1998, 2025, 27885, 6692, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

479 / 12941


11/12/2021 02:45:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: man utd women s team to be axedmanchester united will scrap their women s team once the current season ends  just three months before the north west hosts the women s euro 2005 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 2158, 21183, 2094, 2308, 1055, 2136, 2000, 2022, 12946, 21804, 25322, 2142, 2097, 15121, 2037, 2308, 1055, 2136, 2320, 1996, 2783, 21

480 / 12941


11/12/2021 02:45:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   the acceptance of blatter s suggestions means oliphant will stand for re election as safa president instead of stepping aside later this year as he had indicated he would 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 9920, 1997, 1038, 20051, 3334, 1055, 15690, 2965, 19330, 11514, 4819, 2102, 2097, 3233, 2005, 2128, 2602, 2004, 7842, 7011, 2343, 26

481 / 12941


11/12/2021 02:45:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and pizarro profited from more poor defending by toure to head mehmet scholl s free kick past goalkeeper jens lehmann 12 minutes after half time 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 14255, 9057, 3217, 5618, 2098, 2013, 2062, 3532, 6984, 2011, 2778, 2063, 2000, 2132, 2033, 14227, 3388, 8040, 14854, 2140, 1055, 2489, 5926, 2627, 9653, 25093, 

482 / 12941


11/12/2021 02:45:54 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:45:54 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fabregas was lucky to escape with just a booking  but montgomery was perhaps luckier to get up from the challenge 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6904, 13578, 12617, 2001, 5341, 2000, 4019, 2007, 2074, 1037, 21725, 2021, 8482, 2001, 3383, 6735, 3771, 2000, 2131, 2039, 2013, 1996, 4119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

483 / 12941


11/12/2021 02:46:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: crusaders made sure that ballymena had a nervous final few minutes when morrow darted in to score 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 18831, 2081, 2469, 2008, 3608, 25219, 2532, 2018, 1037, 6091, 2345, 2261, 2781, 2043, 19084, 14051, 1999, 2000, 3556, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

484 / 12941


11/12/2021 02:46:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: united have also been hit by injuries to both alan smith and louis saha during van nistelrooy s absence  meaning wayne rooney has sometimes had to play in a lone role up front 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2142, 2031, 2036, 2042, 2718, 2011, 6441, 2000, 2119, 5070, 3044, 1998, 3434, 7842, 3270, 2076, 3158, 9152, 13473, 20974, 9541, 2100, 1

485 / 12941


11/12/2021 02:46:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it takes two to tango 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 3138, 2048, 2000, 17609, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

486 / 12941


11/12/2021 02:46:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we will have 10 000 tickets which may extend to 13 000 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2097, 2031, 2184, 2199, 9735, 2029, 2089, 7949, 2000, 2410, 2199, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

487 / 12941


11/12/2021 02:46:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: morocco 5 1 kenyamorocco thrashed kenya 5 1 in a 2006 world cup qualifier on wednesday in rabat 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 9835, 1019, 1015, 7938, 5302, 3217, 21408, 27042, 2098, 7938, 1019, 1015, 1999, 1037, 2294, 2088, 2452, 10981, 2006, 9317, 1999, 10958, 14479, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

488 / 12941


11/12/2021 02:46:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i felt as soon as it hit my boot it had missed 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2371, 2004, 2574, 2004, 2009, 2718, 2026, 9573, 2009, 2018, 4771, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

489 / 12941


11/12/2021 02:46:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he only returned to the england side last weekend after a long term back injury  which was followed by a fractured eye socket 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2069, 2513, 2000, 1996, 2563, 2217, 2197, 5353, 2044, 1037, 2146, 2744, 2067, 4544, 2029, 2001, 2628, 2011, 1037, 21726, 3239, 22278, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

490 / 12941


11/12/2021 02:46:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   france coach bernard laporte accepted his side had not played well 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2605, 2873, 6795, 5001, 11589, 2063, 3970, 2010, 2217, 2018, 2025, 2209, 2092, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

491 / 12941


11/12/2021 02:46:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:46:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the return of jauzion is going to be a plus for us   said laporte 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2709, 1997, 14855, 17040, 3258, 2003, 2183, 2000, 2022, 1037, 4606, 2005, 2149, 2056, 5001, 11589, 2063, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

492 / 12941


11/12/2021 02:47:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  as far as i m concerned  that incident and mark cueto s effort from charlie hodgson s cross field kick that led to what looked like a good try were the two key elements in the game 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2521, 2004, 1045, 1049, 4986, 2008, 5043, 1998, 2928, 16091, 3406, 1055, 3947, 2013, 4918, 26107, 1055, 2892, 2492, 5926, 2

493 / 12941


11/12/2021 02:47:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we have done everything but win a game of rugby  but ireland are a good side 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2031, 2589, 2673, 2021, 2663, 1037, 2208, 1997, 4043, 2021, 3163, 2024, 1037, 2204, 2217, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

494 / 12941


11/12/2021 02:47:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: o gara revels in ireland victoryireland fly half ronan o gara hailed his side s 19 13 victory over england as a  special  win 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 1051, 11721, 2527, 7065, 9050, 1999, 3163, 3377, 7442, 3122, 4875, 2431, 18633, 1051, 11721, 2527, 16586, 2010, 2217, 1055, 2539, 2410, 3377, 2058, 2563, 2004, 1037, 2569, 2663, 102, 0

495 / 12941


11/12/2021 02:47:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he has been vice captain all along throughout the championship 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2038, 2042, 3580, 2952, 2035, 2247, 2802, 1996, 2528, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

496 / 12941


11/12/2021 02:47:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we worked hard as a squad and i m a proud welshman 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2499, 2524, 2004, 1037, 4686, 1998, 1045, 1049, 1037, 7098, 6124, 2386, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

497 / 12941


11/12/2021 02:47:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: with three minutes of normal time left  narraway was driven over for a try in the corner which levelled the scores at 27 27 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2007, 2093, 2781, 1997, 3671, 2051, 2187, 6583, 11335, 4576, 2001, 5533, 2058, 2005, 1037, 3046, 1999, 1996, 3420, 2029, 2504, 3709, 1996, 7644, 2012, 2676, 2676, 102, 0, 0, 0, 0, 0, 0, 0

498 / 12941


11/12/2021 02:47:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:47:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: irish replied with three penalties and a mark mapletoft drop goal before scott staniforth ran in a consolation try 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3493, 3880, 2007, 2093, 12408, 1998, 1037, 2928, 11035, 3406, 6199, 4530, 3125, 2077, 3660, 9761, 10128, 28610, 2743, 1999, 1037, 24831, 3046, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

499 / 12941


11/12/2021 02:48:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i m just glad to be part of it all 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 2074, 5580, 2000, 2022, 2112, 1997, 2009, 2035, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

500 / 12941


11/12/2021 02:48:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we will go to twickenham with a little fear and it ll give us a boost   said the french coach 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2097, 2175, 2000, 1056, 7184, 23580, 2007, 1037, 2210, 3571, 1998, 2009, 2222, 2507, 2149, 1037, 12992, 2056, 1996, 2413, 2873, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

501 / 12941


11/12/2021 02:48:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: davies  one of the stars of saturday s rbs six nations win over england  is only on a year contract at kingsholm 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9082, 2028, 1997, 1996, 3340, 1997, 5095, 1055, 21144, 2015, 2416, 3741, 2663, 2058, 2563, 2003, 2069, 2006, 1037, 2095, 3206, 2012, 5465, 18884, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

502 / 12941


11/12/2021 02:48:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  ultimately there is a match assessor at every international game to give an impartial and objective view of the performance of the officials 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4821, 2045, 2003, 1037, 2674, 14358, 2953, 2012, 2296, 2248, 2208, 2000, 2507, 2019, 17727, 8445, 4818, 1998, 7863, 3193, 1997, 1996, 2836, 1997, 1996, 4584, 102, 0, 0, 

503 / 12941


11/12/2021 02:48:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: robinson now has a tricky decision over whether to withdraw from the firing line  after just one outing  a player he regards as central to england s future 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6157, 2085, 2038, 1037, 24026, 3247, 2058, 3251, 2000, 10632, 2013, 1996, 7493, 2240, 2044, 2074, 2028, 26256, 1037, 2447, 2002, 12362, 2004, 2430, 2000, 2

504 / 12941


11/12/2021 02:48:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bath coach john connolly rates barkley as no better than a 50 50 chance to make the dublin trip 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7198, 2873, 2198, 21018, 6165, 11286, 3051, 2004, 2053, 2488, 2084, 1037, 2753, 2753, 3382, 2000, 2191, 1996, 5772, 4440, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

505 / 12941


11/12/2021 02:48:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we had our first attack in the italian half after 22 minutes   said o sullivan 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2018, 2256, 2034, 2886, 1999, 1996, 3059, 2431, 2044, 2570, 2781, 2056, 1051, 7624, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

506 / 12941


11/12/2021 02:48:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: murphy  horgan  o driscoll  d arcy  hickie  o gara  stringer  corrigan  byrne  hayes  o kelly  o connell  s easterby  leamy  foley 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7104, 7570, 16998, 1051, 2852, 2483, 26895, 1040, 8115, 2100, 7632, 18009, 2063, 1051, 11721, 2527, 5164, 2121, 2522, 28706, 14928, 10192, 1051, 5163, 1051, 17199, 1055, 10957, 376

507 / 12941


11/12/2021 02:48:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i m glad he s welsh 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 5580, 2002, 1055, 6124, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

508 / 12941


11/12/2021 02:48:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:48:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: there was so much build up before england  but we fly out to rome on thursday and we ll be back playing again 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2001, 2061, 2172, 3857, 2039, 2077, 2563, 2021, 2057, 4875, 2041, 2000, 4199, 2006, 9432, 1998, 2057, 2222, 2022, 2067, 2652, 2153, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

509 / 12941


11/12/2021 02:49:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  orquera s kicking was off but he showed great courage in defence 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2030, 4226, 2527, 1055, 10209, 2001, 2125, 2021, 2002, 3662, 2307, 8424, 1999, 4721, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

510 / 12941


11/12/2021 02:49:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: laporte is expected to announce france s starting line up on wednesday 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5001, 11589, 2063, 2003, 3517, 2000, 14970, 2605, 1055, 3225, 2240, 2039, 2006, 9317, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

511 / 12941


11/12/2021 02:49:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: grewcock was sin binned with wales captain gareth thomas for retaliation 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3473, 13959, 2001, 8254, 8026, 7228, 2007, 3575, 2952, 20243, 2726, 2005, 18695, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

512 / 12941


11/12/2021 02:49:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hogg made a couple of good runs while white had a pretty robust game   his defence is right up there 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 27589, 2290, 2081, 1037, 3232, 1997, 2204, 3216, 2096, 2317, 2018, 1037, 3492, 15873, 2208, 2010, 4721, 2003, 2157, 2039, 2045, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

513 / 12941


11/12/2021 02:49:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as yet we do not know the full extent of the injuries  but it does not that good 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2664, 2057, 2079, 2025, 2113, 1996, 2440, 6698, 1997, 1996, 6441, 2021, 2009, 2515, 2025, 2008, 2204, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

514 / 12941


11/12/2021 02:49:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  jonathan thomas is unlucky to lose his spot after performing well against italy and scoring a try  but such is the competition for places that every position is debated in detail 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5655, 2726, 2003, 4895, 7630, 17413, 2000, 4558, 2010, 3962, 2044, 4488, 2092, 2114, 3304, 1998, 4577, 1037, 3046, 2021, 2107, 2003

515 / 12941


11/12/2021 02:49:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:49:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if that was a paltry reward for their early pressure  scotland got the try they deserved when paterson s searing break and andy craig s pass sent southwell streaking to the right corner 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2008, 2001, 1037, 14412, 11129, 10377, 2005, 2037, 2220, 3778, 3885, 2288, 1996, 3046, 2027, 10849, 2043, 19162, 1055, 

516 / 12941


11/12/2021 02:50:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but a clever high kick from henson almost brought a try for hal luscombe when roland de marigny and ludovico nitoglia made a hash of claiming it as the ball bounced into touch 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1037, 12266, 2152, 5926, 2013, 27227, 2471, 2716, 1037, 3046, 2005, 11085, 11320, 9363, 18552, 2043, 8262, 2139, 16266, 19393, 19

517 / 12941


11/12/2021 02:50:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  gethin jenkins is starting at loose head for them 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2131, 10606, 11098, 2003, 3225, 2012, 6065, 2132, 2005, 2068, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

518 / 12941


11/12/2021 02:50:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: shane horgan threw an overhead pass as he was about to be forced into touch and stringer scooted over  with o gara landing the tricky conversion 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8683, 7570, 16998, 4711, 2019, 8964, 3413, 2004, 2002, 2001, 2055, 2000, 2022, 3140, 2046, 3543, 1998, 5164, 2121, 24289, 2058, 2007, 1051, 11721, 2527, 4899, 1996, 2

519 / 12941


11/12/2021 02:50:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  and it s the same again in getting into the senior squad 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2009, 1055, 1996, 2168, 2153, 1999, 2893, 2046, 1996, 3026, 4686, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

520 / 12941


11/12/2021 02:50:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   ruddock  though  is keen to protect his players from injury and fatigue 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 25298, 7432, 2295, 2003, 10326, 2000, 4047, 2010, 2867, 2013, 4544, 1998, 16342, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

521 / 12941


11/12/2021 02:50:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ulster were the last irish team to play at the paseo de anoeta stadium where they faced a euskarians side during a pre season tour in 1998 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11059, 2020, 1996, 2197, 3493, 2136, 2000, 2377, 2012, 1996, 14674, 8780, 2139, 2019, 8913, 2696, 3346, 2073, 2027, 4320, 1037, 7327, 8337, 23543, 2217, 2076, 1037, 3653, 2

522 / 12941


11/12/2021 02:50:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it is so absurd that it borders on the humorous 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2003, 2061, 18691, 2008, 2009, 6645, 2006, 1996, 14742, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

523 / 12941


11/12/2021 02:50:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:50:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: pountney pleaded guilty to the offence before a panel consisting of chairman robert horner  nigel gillingham and jeff probyn 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 13433, 16671, 5420, 12254, 5905, 2000, 1996, 15226, 2077, 1037, 5997, 5398, 1997, 3472, 2728, 24084, 2099, 12829, 12267, 16445, 1998, 5076, 4013, 3762, 2078, 102, 0, 0, 0, 0, 0, 0, 0, 0,

524 / 12941


11/12/2021 02:51:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: w servat  o milloud  g lamboley  i harinordoquy  p mignoni  f michalak  j p grandclaude 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1059, 14262, 22879, 1051, 4971, 19224, 1043, 12559, 9890, 2100, 1045, 21291, 12131, 3527, 28940, 2100, 1052, 19117, 8540, 2072, 1042, 23025, 19531, 2243, 1046, 1052, 2882, 20464, 19513, 2063, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0,

525 / 12941


11/12/2021 02:51:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  players tend to know better than most coaches 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2867, 7166, 2000, 2113, 2488, 2084, 2087, 7850, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

526 / 12941


11/12/2021 02:51:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it will be fundamental to keep cool in the difficult moments   in the key situations of the game 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2097, 2022, 8050, 2000, 2562, 4658, 1999, 1996, 3697, 5312, 1999, 1996, 3145, 8146, 1997, 1996, 2208, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

527 / 12941


11/12/2021 02:51:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we have had a good week s training and we are all looking forward to the challenge 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2031, 2018, 1037, 2204, 2733, 1055, 2731, 1998, 2057, 2024, 2035, 2559, 2830, 2000, 1996, 4119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

528 / 12941


11/12/2021 02:51:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: simon raiwalui and ben russell are also selected in the pack while kevin sorrell comes in at outside centre 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 4079, 15547, 13476, 10179, 1998, 3841, 5735, 2024, 2036, 3479, 1999, 1996, 5308, 2096, 4901, 2061, 14069, 3310, 1999, 2012, 2648, 2803, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

529 / 12941


11/12/2021 02:51:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hanley  mayor  payne  rhys jones  wigglesworth  hercus  redpath  capt   turner  roddam  stewart  day  schofield  caillet  carter  chabal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 7658, 3051, 3664, 13470, 13919, 3557, 24405, 17125, 5172, 2014, 7874, 2417, 15069, 14408, 6769, 8473, 17130, 5954, 2154, 8040, 14586, 12891, 29080, 22592, 5708, 15775, 10264,

530 / 12941


11/12/2021 02:51:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but if we can under perform and lose by only two points then i am sure if we play well this week we will get the win we need 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2065, 2057, 2064, 2104, 4685, 1998, 4558, 2011, 2069, 2048, 2685, 2059, 1045, 2572, 2469, 2065, 2057, 2377, 2092, 2023, 2733, 2057, 2097, 2131, 1996, 2663, 2057, 2342, 102, 0, 0, 

531 / 12941


11/12/2021 02:51:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: sella admitted he had been impressed by current fly half yann delaigue in the rbs six nations to date 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5271, 2050, 4914, 2002, 2018, 2042, 7622, 2011, 2783, 4875, 2431, 13619, 2078, 3972, 4886, 9077, 1999, 1996, 21144, 2015, 2416, 3741, 2000, 3058, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

532 / 12941


11/12/2021 02:51:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:51:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  there are so many people that were affected  are still affected and will be affected for a long time 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2024, 2061, 2116, 2111, 2008, 2020, 5360, 2024, 2145, 5360, 1998, 2097, 2022, 5360, 2005, 1037, 2146, 2051, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

533 / 12941


11/12/2021 02:52:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he told bbc sport   it s potentially the most fearsome line up i ve ever come up against 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2409, 4035, 4368, 2009, 1055, 9280, 1996, 2087, 10069, 8462, 2240, 2039, 1045, 2310, 2412, 2272, 2039, 2114, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

534 / 12941


11/12/2021 02:52:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  added to that  the senior players aren t standing up and they can t do anything when the pressure mounts 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2794, 2000, 2008, 1996, 3026, 2867, 4995, 1056, 3061, 2039, 1998, 2027, 2064, 1056, 2079, 2505, 2043, 1996, 3778, 19363, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

535 / 12941


11/12/2021 02:52:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the north side have been hit by the withdrawals of scotland duo gordon bulloch and chris cusiter  plus france captain fabien pelous 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2167, 2217, 2031, 2042, 2718, 2011, 1996, 10534, 2015, 1997, 3885, 6829, 5146, 7087, 11663, 1998, 3782, 12731, 28032, 2121, 4606, 2605, 2952, 6904, 11283, 2078, 21877, 15534

536 / 12941


11/12/2021 02:52:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bourgoin lock pascal pape  who has recovered from a sprained ankle  returns to the 22 man squad 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8945, 12514, 28765, 5843, 17878, 6643, 5051, 2040, 2038, 6757, 2013, 1037, 11867, 27361, 10792, 5651, 2000, 1996, 2570, 2158, 4686, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

537 / 12941


11/12/2021 02:52:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: r mcbryde  scarlets   j yapp  blues   j thomas  ospreys   r jones  ospreys   g cooper  dragons   c sweeney  dragons   k morgan  dragons  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1054, 11338, 25731, 3207, 11862, 2015, 1046, 8038, 9397, 5132, 1046, 2726, 9808, 28139, 7274, 1054, 3557, 9808, 28139, 7274, 1043, 6201, 8626, 1039, 21178, 8626, 1047, 5253, 

538 / 12941


11/12/2021 02:52:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it will be stevens  first start after two caps as a replacement against the all blacks last year 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2097, 2022, 8799, 2034, 2707, 2044, 2048, 9700, 2004, 1037, 6110, 2114, 1996, 2035, 10823, 2197, 2095, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

539 / 12941


11/12/2021 02:52:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: now the italians travel to edinburgh hoping to claim their first away win in the six nations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2085, 1996, 16773, 3604, 2000, 5928, 5327, 2000, 4366, 2037, 2034, 2185, 2663, 1999, 1996, 2416, 3741, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

540 / 12941


11/12/2021 02:52:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:52:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it s a huge task but it is a great opportunity for us 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 1037, 4121, 4708, 2021, 2009, 2003, 1037, 2307, 4495, 2005, 2149, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

541 / 12941


11/12/2021 02:53:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we will see how good france are and the scrum is the key 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2097, 2156, 2129, 2204, 2605, 2024, 1998, 1996, 8040, 6824, 2003, 1996, 3145, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

542 / 12941


11/12/2021 02:53:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a decision will be taken on saturday as to whether the 26 year old will be declared fit 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 3247, 2097, 2022, 2579, 2006, 5095, 2004, 2000, 3251, 1996, 2656, 2095, 2214, 2097, 2022, 4161, 4906, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

543 / 12941


11/12/2021 02:53:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i m looking forward to working with such outstanding players   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 2559, 2830, 2000, 2551, 2007, 2107, 5151, 2867, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

544 / 12941


11/12/2021 02:53:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i m just glad i m not the one who has to make the decision 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 2074, 5580, 1045, 1049, 2025, 1996, 2028, 2040, 2038, 2000, 2191, 1996, 3247, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

545 / 12941


11/12/2021 02:53:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: charvis has missed all three of wales  victories with an ankle injury and his recovery has been slower than expected 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 25869, 11365, 2038, 4771, 2035, 2093, 1997, 3575, 9248, 2007, 2019, 10792, 4544, 1998, 2010, 7233, 2038, 2042, 12430, 2084, 3517, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

546 / 12941


11/12/2021 02:53:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: robinson out of six nationsengland captain jason robinson will miss the rest of the six nations because of injury 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 6157, 2041, 1997, 2416, 3741, 13159, 3122, 2952, 4463, 6157, 2097, 3335, 1996, 2717, 1997, 1996, 2416, 3741, 2138, 1997, 4544, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

547 / 12941


11/12/2021 02:53:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:53:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  should the rfu resolve the issue to our satisfaction  as happened last month when the scotland coach matt williams apologised for remarks made  it would be the end of the matter 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2323, 1996, 21792, 2226, 10663, 1996, 3277, 2000, 2256, 9967, 2004, 3047, 2197, 3204, 2043, 1996, 3885, 2873, 4717, 3766, 9706, 1289

548 / 12941


11/12/2021 02:54:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  imanol has been dropped from the squad because the least i can say is that he didn t make a thundering comeback against wales   said laporte 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 10047, 6761, 2140, 2038, 2042, 3333, 2013, 1996, 4686, 2138, 1996, 2560, 1045, 2064, 2360, 2003, 2008, 2002, 2134, 1056, 2191, 1037, 8505, 2075, 12845, 2114, 3575, 2056,

549 / 12941


11/12/2021 02:54:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he will probably have to come off the bench to start and it would be ridiculous and irresponsible to put him straight back into a test match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2097, 2763, 2031, 2000, 2272, 2125, 1996, 6847, 2000, 2707, 1998, 2009, 2052, 2022, 9951, 1998, 20868, 6072, 26029, 19307, 2000, 2404, 2032, 3442, 2067, 2046, 1037

550 / 12941


11/12/2021 02:54:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it would not have been acceptable in the zurich premiership 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2052, 2025, 2031, 2042, 11701, 1999, 1996, 10204, 11264, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

551 / 12941


11/12/2021 02:54:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and moore believes the last two games against italy and scotland are a good opportunity to experiment further 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 5405, 7164, 1996, 2197, 2048, 2399, 2114, 3304, 1998, 3885, 2024, 1037, 2204, 4495, 2000, 7551, 2582, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

552 / 12941


11/12/2021 02:54:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he has put his own unique stamp on things 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2038, 2404, 2010, 2219, 4310, 11359, 2006, 2477, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

553 / 12941


11/12/2021 02:54:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: italy aim to rattle englanditaly coach john kirwan believes his side can upset england as the six nations wooden spoon battle hots up 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 3304, 6614, 2000, 23114, 2563, 18400, 2100, 2873, 2198, 11382, 2099, 7447, 7164, 2010, 2217, 2064, 6314, 2563, 2004, 1996, 2416, 3741, 4799, 15642, 2645, 2980, 2015, 2039, 102,

554 / 12941


11/12/2021 02:54:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we are going to scotland for the first away win and nothing else   said manager marco bollesan 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2024, 2183, 2000, 3885, 2005, 1996, 2034, 2185, 2663, 1998, 2498, 2842, 2056, 3208, 8879, 8945, 20434, 2319, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

555 / 12941


11/12/2021 02:54:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: j robinson  sale sharks  capt   m cueto  sale sharks   m tait  newcastle   j noon  newcastle   j lewsey  wasps   c hodgson  sale sharks   m dawson  wasps   g rowntree  leicester   s thompson  northampton   j white  leicester   d grewcock  bath   b kay  leicester   l moody  leicester   a hazell  gloucester   j worsley  wasps  
Tokenized: 
 	None
Features: 
 	i

556 / 12941


11/12/2021 02:54:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but despite their traditional hospitality when the irish are visiting  wood believes wales might end their four match losing run against england in cardiff 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2750, 2037, 3151, 15961, 2043, 1996, 3493, 2024, 5873, 3536, 7164, 3575, 2453, 2203, 2037, 2176, 2674, 3974, 2448, 2114, 2563, 1999, 10149, 102, 0, 0

557 / 12941


11/12/2021 02:54:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the borders flanker has a knee injury and joins donnie macfadyen and allister hogg on the sidelines 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 6645, 12205, 2121, 2038, 1037, 6181, 4544, 1998, 9794, 28486, 6097, 7011, 5149, 2368, 1998, 2035, 12911, 27589, 2290, 2006, 1996, 2217, 12735, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

558 / 12941


11/12/2021 02:54:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:54:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ireland coach eddie o sullivan appears to have done that quite successfully in the run up to this season s six nations championship 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3163, 2873, 5752, 1051, 7624, 3544, 2000, 2031, 2589, 2008, 3243, 5147, 1999, 1996, 2448, 2039, 2000, 2023, 2161, 1055, 2416, 3741, 2528, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

559 / 12941


11/12/2021 02:55:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   last week williams  along with fellow flanker colin charvis   who is unlikely to play for at least a month while he recovers from a foot injury   was all but ruled out of the millennium stadium clash 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2197, 2733, 3766, 2247, 2007, 3507, 12205, 2121, 6972, 25869, 11365, 2040, 2003, 9832, 2000, 2377, 2005, 2012

560 / 12941


11/12/2021 02:55:15 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:15 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dawson joins england injury listscrum half matt dawson is an injury doubt for england s six nations opener against wales next weekend 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 11026, 9794, 2563, 4544, 7201, 26775, 2819, 2431, 4717, 11026, 2003, 2019, 4544, 4797, 2005, 2563, 1055, 2416, 3741, 16181, 2114, 3575, 2279, 5353, 102, 0, 0, 0, 0, 0, 0, 0, 0,

561 / 12941


11/12/2021 02:55:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: saints have a week s training in portugal next week  while wales will play england in the opening six nations match on 5 february 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6586, 2031, 1037, 2733, 1055, 2731, 1999, 5978, 2279, 2733, 2096, 3575, 2097, 2377, 2563, 1999, 1996, 3098, 2416, 3741, 2674, 2006, 1019, 2337, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

562 / 12941


11/12/2021 02:55:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it also comes at the end of a very  very difficult week 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2036, 3310, 2012, 1996, 2203, 1997, 1037, 2200, 2200, 3697, 2733, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

563 / 12941


11/12/2021 02:55:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he s developing quickly  but i hope he isn t pushed too quickly in a way that would hurt his development 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 1055, 4975, 2855, 2021, 1045, 3246, 2002, 3475, 1056, 3724, 2205, 2855, 1999, 1037, 2126, 2008, 2052, 3480, 2010, 2458, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

564 / 12941


11/12/2021 02:55:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i think there s an injury audit coming out in march that s got some great information in there that i think everybody in the english game has got to look at   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2228, 2045, 1055, 2019, 4544, 15727, 2746, 2041, 1999, 2233, 2008, 1055, 2288, 2070, 2307, 2592, 1999, 2045, 2008, 1045, 2228, 7955, 1999

565 / 12941


11/12/2021 02:55:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

566 / 12941


11/12/2021 02:55:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: an amalgamated under 18 side formed from separate schools and national youth teams plays its first match on thursday  against italy at the gnoll 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2019, 17143, 2104, 2324, 2217, 2719, 2013, 3584, 2816, 1998, 2120, 3360, 2780, 3248, 2049, 2034, 2674, 2006, 9432, 2114, 3304, 2012, 1996, 1043, 3630, 3363, 102, 0, 0

567 / 12941


11/12/2021 02:55:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:55:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  so i wouldn t be surprised if they reviewed their position 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2061, 1045, 2876, 1056, 2022, 4527, 2065, 2027, 8182, 2037, 2597, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

568 / 12941


11/12/2021 02:56:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: there can be no doubt that ireland s representation will be the biggest ever  albeit in a proposed 44 man squad 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2064, 2022, 2053, 4797, 2008, 3163, 1055, 6630, 2097, 2022, 1996, 5221, 2412, 12167, 1999, 1037, 3818, 4008, 2158, 4686, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

569 / 12941


11/12/2021 02:56:16 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:16 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but we were told in no uncertain terms that the financial situation did not allow that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2057, 2020, 2409, 1999, 2053, 9662, 3408, 2008, 1996, 3361, 3663, 2106, 2025, 3499, 2008, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

570 / 12941


11/12/2021 02:56:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the only person who saw me at my worst was my wife   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2069, 2711, 2040, 2387, 2033, 2012, 2026, 5409, 2001, 2026, 2564, 2002, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

571 / 12941


11/12/2021 02:56:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   johnson will always be revered by england fans for captaining england to their dramatic world cup win against australia in sydney  but his list of achievements does not stop at that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3779, 2097, 2467, 2022, 23886, 2011, 2563, 4599, 2005, 2952, 2075, 2563, 2000, 2037, 6918, 2088, 2452, 2663, 2114, 2660, 1999, 

572 / 12941


11/12/2021 02:56:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the final weekend threw the world pecking order into renewed confusion  with australia s triumph in london followed by france s capitulation to new zealand 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2345, 5353, 4711, 1996, 2088, 18082, 2075, 2344, 2046, 9100, 6724, 2007, 2660, 1055, 10911, 1999, 2414, 2628, 2011, 2605, 1055, 6178, 4183, 9513, 200

573 / 12941


11/12/2021 02:56:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   amor  who will captain england in this season s opening irb sevens tournament  the dubai sevens  which start on thursday  was delighted with his award 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16095, 2040, 2097, 2952, 2563, 1999, 2023, 2161, 1055, 3098, 20868, 2497, 19463, 2977, 1996, 11558, 19463, 2029, 2707, 2006, 9432, 2001, 15936, 2007, 2010, 24

574 / 12941


11/12/2021 02:56:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:56:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: robinson  a former rugby league international before switching codes in 2000  leads england against australia at twickenham at 1430 gmt 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6157, 1037, 2280, 4043, 2223, 2248, 2077, 11991, 9537, 1999, 2456, 5260, 2563, 2114, 2660, 2012, 1056, 7184, 23580, 2012, 16065, 2692, 13938, 2102, 102, 0, 0, 0, 0, 0, 0, 0, 0

575 / 12941


11/12/2021 02:57:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the first south africa tour of new zealand in 1921 saw honours shared in a three test series  starting the greatest rivalry in rugby   and the long running controversy between the countries over the all blacks  inclusion of maori players 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2034, 2148, 3088, 2778, 1997, 2047, 3414, 1999, 4885, 2387, 8762, 4

576 / 12941


11/12/2021 02:57:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the season would start with the celtic league in october  followed by the heineken cup in february and march  and the six nations moved to april and may 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2161, 2052, 2707, 2007, 1996, 8730, 2223, 1999, 2255, 2628, 2011, 1996, 2002, 3170, 7520, 2452, 1999, 2337, 1998, 2233, 1998, 1996, 2416, 3741, 2333, 20

577 / 12941


11/12/2021 02:57:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: england s acting head coach andy robinson said   he is a natural leader  holds the respect of the squad and is a formidable talent on the pitch 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2563, 1055, 3772, 2132, 2873, 5557, 6157, 2056, 2002, 2003, 1037, 3019, 3003, 4324, 1996, 4847, 1997, 1996, 4686, 1998, 2003, 1037, 18085, 5848, 2006, 1996, 6510, 102,

578 / 12941


11/12/2021 02:57:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   tokumasu added that the 2002 football world cup  co hosted by japan and south korea  had been a huge success 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2000, 5283, 9335, 2226, 2794, 2008, 1996, 2526, 2374, 2088, 2452, 2522, 4354, 2011, 2900, 1998, 2148, 4420, 2018, 2042, 1037, 4121, 3112, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

579 / 12941


11/12/2021 02:57:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: even the end of his nine year career came out of the blue  just four days before the start of the season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2130, 1996, 2203, 1997, 2010, 3157, 2095, 2476, 2234, 2041, 1997, 1996, 2630, 2074, 2176, 2420, 2077, 1996, 2707, 1997, 1996, 2161, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

580 / 12941


11/12/2021 02:57:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:57:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: even their lack of discipline in defence   which presented the admirable van ginsberg with 26 points   could not undo them as they held out for a famous win 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2130, 2037, 3768, 1997, 9009, 1999, 4721, 2029, 3591, 1996, 4748, 14503, 3085, 3158, 18353, 11711, 2007, 2656, 2685, 2071, 2025, 25672, 2068, 2004, 2027, 

581 / 12941


11/12/2021 02:58:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   h shimange  cj van der linde  g britz  d rossouw  m claassens  j de villiers  g du toit j fourie 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1044, 11895, 2386, 3351, 1039, 3501, 3158, 4315, 11409, 3207, 1043, 28101, 2480, 1040, 5811, 7140, 2860, 1049, 18856, 11057, 14416, 2015, 1046, 2139, 25333, 1043, 4241, 2000, 4183, 1046, 2176, 2666, 102, 0, 0, 0,

582 / 12941


11/12/2021 02:58:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  they ran five tries past the french in the summer  so we will not take them for granted 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2743, 2274, 5363, 2627, 1996, 2413, 1999, 1996, 2621, 2061, 2057, 2097, 2025, 2202, 2068, 2005, 4379, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

583 / 12941


11/12/2021 02:58:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: t bowe  ulster   k campbell  ulster   g d arcy  ulster   g dempsey  leinster   g duffy  harlequins   g easterby  leinster   d hickie  leinster   a horgan  munster   s horgan  leinster   d humphreys  ulster   k maggs  ulster   g murphy  leicester   b o driscoll   leinster   r o gara  munster   s payne  munster   p stringer  munster  
Tokenized: 
 	None
Feature

584 / 12941


11/12/2021 02:58:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ireland turned the ball over and manuel contepomi broke through an unstructured defence before feeding his midfield partner aramburu to sprint in under the posts 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3163, 2357, 1996, 3608, 2058, 1998, 7762, 9530, 2618, 6873, 4328, 3631, 2083, 2019, 4895, 3367, 26134, 4721, 2077, 8521, 2010, 23071, 4256, 19027, 14

585 / 12941


11/12/2021 02:58:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they then took charge with scores from pat sanderson  kai horstman  mathew tait and rob thirlby  but fiji rallied to force a tense finale 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2059, 2165, 3715, 2007, 7644, 2013, 6986, 12055, 2239, 11928, 28565, 2386, 25436, 13843, 2102, 1998, 6487, 16215, 4313, 14510, 2021, 11464, 24356, 2000, 2486, 1037, 90

586 / 12941


11/12/2021 02:58:40 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:40 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: marty holah got the all blacks onslaught under way with his fifth minute try before rush hit back moments later 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 12578, 7570, 14431, 2288, 1996, 2035, 10823, 28644, 2104, 2126, 2007, 2010, 3587, 3371, 3046, 2077, 5481, 2718, 2067, 5312, 2101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

587 / 12941


11/12/2021 02:58:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   newcastle s 18 year old centre mathew tait is also in the training squad 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8142, 1055, 2324, 2095, 2214, 2803, 25436, 13843, 2102, 2003, 2036, 1999, 1996, 2731, 4686, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

588 / 12941


11/12/2021 02:58:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:58:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: calum macrae surrendered possession before centre steinmetz sent a chip into the danger zone 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 10250, 2819, 6097, 16652, 10795, 6664, 2077, 2803, 14233, 11368, 2480, 2741, 1037, 9090, 2046, 1996, 5473, 4224, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

589 / 12941


11/12/2021 02:59:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i took lessons from 2001 in that they did make a mistake in taking lawrence dallaglio when he wasn t fit and went on the trip 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2165, 8220, 2013, 2541, 1999, 2008, 2027, 2106, 2191, 1037, 6707, 1999, 2635, 5623, 17488, 17802, 12798, 2043, 2002, 2347, 1056, 4906, 1998, 2253, 2006, 1996, 4440, 102, 0, 0, 0,

590 / 12941


11/12/2021 02:59:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i hope i can contribute to the planning and preparation  and to ensuring the media and public get the most out of the tour itself   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 3246, 1045, 2064, 9002, 2000, 1996, 4041, 1998, 7547, 1998, 2000, 12725, 1996, 2865, 1998, 2270, 2131, 1996, 2087, 2041, 1997, 1996, 2778, 2993, 2002, 2056, 102, 0,

591 / 12941


11/12/2021 02:59:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it is an open book   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2003, 2019, 2330, 2338, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

592 / 12941


11/12/2021 02:59:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he was subsequently replaced as england captain by full back jason robinson 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2001, 3525, 2999, 2004, 2563, 2952, 2011, 2440, 2067, 4463, 6157, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

593 / 12941


11/12/2021 02:59:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but dallaglio  who called time on england earlier this year  said   i assure you i wouldn t let anyone down 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 17488, 17802, 12798, 2040, 2170, 2051, 2006, 2563, 3041, 2023, 2095, 2056, 1045, 14306, 2017, 1045, 2876, 1056, 2292, 3087, 2091, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

594 / 12941


11/12/2021 02:59:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dawson set for new wasps contracteuropean champions wasps are set to offer matt dawson a new deal 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 11026, 2275, 2005, 2047, 23146, 3206, 11236, 17635, 2319, 3966, 23146, 2024, 2275, 2000, 3749, 4717, 11026, 1037, 2047, 3066, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

595 / 12941


11/12/2021 02:59:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we have got to show the same desire again this week 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2031, 2288, 2000, 2265, 1996, 2168, 4792, 2153, 2023, 2733, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

596 / 12941


11/12/2021 02:59:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  if you retire from international rugby  you know what you re getting into   or rather  you know what you re getting out of   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2017, 11036, 2013, 2248, 4043, 2017, 2113, 2054, 2017, 2128, 2893, 2046, 2030, 2738, 2017, 2113, 2054, 2017, 2128, 2893, 2041, 1997, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 

597 / 12941


11/12/2021 02:59:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 02:59:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wilkinson  who also endured an eight month lay off with a shoulder injury  said   i was physically prepared for my first return but mentally it was hard 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16237, 2040, 2036, 16753, 2019, 2809, 3204, 3913, 2125, 2007, 1037, 3244, 4544, 2056, 1045, 2001, 8186, 4810, 2005, 2026, 2034, 2709, 2021, 10597, 2009, 2001,

598 / 12941


11/12/2021 03:00:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: we could never relax and get into any stride 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2071, 2196, 9483, 1998, 2131, 2046, 2151, 18045, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

599 / 12941


11/12/2021 03:00:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  when steve was appointed  it was with a view that he would succeed rod should he take the decision to leave saracens 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2043, 3889, 2001, 2805, 2009, 2001, 2007, 1037, 3193, 2008, 2002, 2052, 9510, 8473, 2323, 2002, 2202, 1996, 3247, 2000, 2681, 7354, 19023, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

600 / 12941


11/12/2021 03:00:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: together with jim mallinder  diamond steered the club to second in the premiership as well as parker pen shield glory 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2362, 2007, 3958, 6670, 22254, 2121, 6323, 23424, 1996, 2252, 2000, 2117, 1999, 1996, 11264, 2004, 2092, 2004, 6262, 7279, 6099, 8294, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

601 / 12941


11/12/2021 03:00:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   hurter said   i ve had an amazing time here 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3480, 2121, 2056, 1045, 2310, 2018, 2019, 6429, 2051, 2182, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

602 / 12941


11/12/2021 03:00:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he then became the exiles  director of rugby before taking up the managing director role in 2003 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2059, 2150, 1996, 27127, 2472, 1997, 4043, 2077, 2635, 2039, 1996, 6605, 2472, 2535, 1999, 2494, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

603 / 12941


11/12/2021 03:00:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but the news they will not be able to advance this time around will further dent their ambitions this season 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1996, 2739, 2027, 2097, 2025, 2022, 2583, 2000, 5083, 2023, 2051, 2105, 2097, 2582, 21418, 2037, 19509, 2023, 2161, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

604 / 12941


11/12/2021 03:00:45 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:45 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a knee injury has ruled him out of england s first two matches against wales and france  but he is hoping to feature against ireland on 27 february 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 6181, 4544, 2038, 5451, 2032, 2041, 1997, 2563, 1055, 2034, 2048, 3503, 2114, 3575, 1998, 2605, 2021, 2002, 2003, 5327, 2000, 3444, 2114, 3163, 2006, 2676, 2

605 / 12941


11/12/2021 03:00:51 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:51 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i am committed to new zealand rugby this season   he added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2572, 5462, 2000, 2047, 3414, 4043, 2023, 2161, 2002, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

606 / 12941


11/12/2021 03:00:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:00:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

607 / 12941


11/12/2021 03:01:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  mike has been at bath for eight years and wants to remain with the club and his demands are anything but excessive   the agent added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3505, 2038, 2042, 2012, 7198, 2005, 2809, 2086, 1998, 4122, 2000, 3961, 2007, 1996, 2252, 1998, 2010, 7670, 2024, 2505, 2021, 11664, 1996, 4005, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 

608 / 12941


11/12/2021 03:01:05 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:05 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the bath player was already out of the opener against wales on 5 february because of a hand problem 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 7198, 2447, 2001, 2525, 2041, 1997, 1996, 16181, 2114, 3575, 2006, 1019, 2337, 2138, 1997, 1037, 2192, 3291, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

609 / 12941


11/12/2021 03:01:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wales have some way to go before they can be remotely considered in a similar light 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3575, 2031, 2070, 2126, 2000, 2175, 2077, 2027, 2064, 2022, 19512, 2641, 1999, 1037, 2714, 2422, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

610 / 12941


11/12/2021 03:01:19 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:19 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  we all realise how difficult a task it is to go up to scotland and beat them 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2057, 2035, 19148, 2129, 3697, 1037, 4708, 2009, 2003, 2000, 2175, 2039, 2000, 3885, 1998, 3786, 2068, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

611 / 12941


11/12/2021 03:01:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: there are a lot of good young players who are pushing for places anyway 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2024, 1037, 2843, 1997, 2204, 2402, 2867, 2040, 2024, 6183, 2005, 3182, 4312, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

612 / 12941


11/12/2021 03:01:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: if he does move across to union  wells believes he would better off playing in the backs  at least initially 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2065, 2002, 2515, 2693, 2408, 2000, 2586, 7051, 7164, 2002, 2052, 2488, 2125, 2652, 1999, 1996, 10457, 2012, 2560, 3322, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

613 / 12941


11/12/2021 03:01:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  this guy is an absolute sporting icon 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2023, 3124, 2003, 2019, 7619, 7419, 12696, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

614 / 12941


11/12/2021 03:01:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: scotland and ireland are in pool a together with the all blacks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3885, 1998, 3163, 2024, 1999, 4770, 1037, 2362, 2007, 1996, 2035, 10823, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

615 / 12941


11/12/2021 03:01:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:01:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: without further ado  he nervelessly slotted the kick that ended five years of english dominance and 12 years of waiting in cardiff 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2302, 2582, 4748, 2080, 2002, 9113, 10895, 10453, 3064, 1996, 5926, 2008, 3092, 2274, 2086, 1997, 2394, 13811, 1998, 2260, 2086, 1997, 3403, 1999, 10149, 102, 0, 0, 0, 0, 0, 0, 0, 

616 / 12941


11/12/2021 03:02:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:02:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wallabies captain george gregan said the charity match was a  great initiative  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2813, 28518, 2229, 2952, 2577, 6754, 2319, 2056, 1996, 5952, 2674, 2001, 1037, 2307, 6349, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

617 / 12941


11/12/2021 03:02:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:02:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: england are enduring their worst run in the championship since captain richard hill was dumped in favour of mike harrison after three straight losses in 1987 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2563, 2024, 16762, 2037, 5409, 2448, 1999, 1996, 2528, 2144, 2952, 2957, 2940, 2001, 14019, 1999, 7927, 1997, 3505, 6676, 2044, 2093, 3442, 6409, 1999, 3

618 / 12941


11/12/2021 03:02:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:02:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but there were many more  and one should not take away from those 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2045, 2020, 2116, 2062, 1998, 2028, 2323, 2025, 2202, 2185, 2013, 2216, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

619 / 12941


11/12/2021 03:02:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:02:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but despite one near miss with the pack over the line   not checked on the tv replay by referee jonathan kaplan   england were unable to pull off a face saving win 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2750, 2028, 2379, 3335, 2007, 1996, 5308, 2058, 1996, 2240, 2025, 7039, 2006, 1996, 2694, 15712, 2011, 5330, 5655, 22990, 2563, 2020, 4039, 2

620 / 12941


11/12/2021 03:02:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:02:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: they defended magnificently and they ve got every chance of winning this six nations 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 8047, 12047, 2135, 1998, 2027, 2310, 2288, 2296, 3382, 1997, 3045, 2023, 2416, 3741, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

621 / 12941


11/12/2021 03:03:01 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:01 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as the half wore on  both sides squandered promising spells of momentum with sloppy penalties  and the period fizzled out with scotland numerically  if not psychologically  on top 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 1996, 2431, 5078, 2006, 2119, 3903, 5490, 13860, 4063, 2098, 10015, 11750, 1997, 11071, 2007, 28810, 12408, 1998, 1996, 2558,

622 / 12941


11/12/2021 03:03:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: this year  italy opened up with a stubborn display against ireland but ended up losing 28 17 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2023, 2095, 3304, 2441, 2039, 2007, 1037, 14205, 4653, 2114, 3163, 2021, 3092, 2039, 3974, 2654, 2459, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

623 / 12941


11/12/2021 03:03:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it s very dangerous to think that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 2200, 4795, 2000, 2228, 2008, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

624 / 12941


11/12/2021 03:03:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   m blair  edinburgh   a craig  glasgow   c cusiter  borders   s danielli  borders   m di rollo  edinburgh   a henderson  glasgow   b hinshelwood  worcester   r lamont  glasgow   s lamont  glasgow   d parks  glasgow   c paterson  edinburgh   g ross  leeds   h southwell  edinburgh   s webster  edinburgh   r beattie  northampton   g bulloch  glasgow  capt   b d

625 / 12941


11/12/2021 03:03:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he will undergo an operation on monday and is expected to be out for at least six weeks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2097, 13595, 2019, 3169, 2006, 6928, 1998, 2003, 3517, 2000, 2022, 2041, 2005, 2012, 2560, 2416, 3134, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

626 / 12941


11/12/2021 03:03:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the champions only just saw off the scots in paris  then needed england to self destruct in last week s 18 17 win 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 3966, 2069, 2074, 2387, 2125, 1996, 12196, 1999, 3000, 2059, 2734, 2563, 2000, 2969, 4078, 18300, 1999, 2197, 2733, 1055, 2324, 2459, 2663, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

627 / 12941


11/12/2021 03:03:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but mike  ruddock  asked me to carry on for another season  which i ve done  still part of the squad  still trying to help them out as much as i can 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 3505, 25298, 7432, 2356, 2033, 2000, 4287, 2006, 2005, 2178, 2161, 2029, 1045, 2310, 2589, 2145, 2112, 1997, 1996, 4686, 2145, 2667, 2000, 2393, 2068, 2041

628 / 12941


11/12/2021 03:03:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:03:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: england sent on ben cohen and matt dawson  and after barkley s kick saw christophe dominici take the ball over his own line  the stage was set for a victory platform 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2563, 2741, 2006, 3841, 9946, 1998, 4717, 11026, 1998, 2044, 11286, 3051, 1055, 5926, 2387, 23978, 11282, 2072, 2202, 1996, 3608, 2058, 2010, 221

629 / 12941


11/12/2021 03:04:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   the 25 year old has not played for england since the 2003 world cup final after a succession of injuries 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2423, 2095, 2214, 2038, 2025, 2209, 2005, 2563, 2144, 1996, 2494, 2088, 2452, 2345, 2044, 1037, 8338, 1997, 6441, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

630 / 12941


11/12/2021 03:04:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: de marigny landed a penalty to make it 19 8 and a nitoglia break through the middle threatened a try only for the move to break down with a knock on 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2139, 16266, 19393, 5565, 1037, 6531, 2000, 2191, 2009, 2539, 1022, 1998, 1037, 9152, 3406, 20011, 3338, 2083, 1996, 2690, 5561, 1037, 3046, 2069, 2005, 1996, 269

631 / 12941


11/12/2021 03:04:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: o connor will be winning his third cap after making his debut in the victory over south africa last november 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1051, 6720, 2097, 2022, 3045, 2010, 2353, 6178, 2044, 2437, 2010, 2834, 1999, 1996, 3377, 2058, 2148, 3088, 2197, 2281, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

632 / 12941


11/12/2021 03:04:29 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:29 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the only other change to the ireland side sees wasps flanker johnny o connor replacing denis leamy 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2069, 2060, 2689, 2000, 1996, 3163, 2217, 5927, 23146, 12205, 2121, 5206, 1051, 6720, 6419, 11064, 12203, 8029, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

633 / 12941


11/12/2021 03:04:33 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:33 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but after missing out on the 2001 tour to australia with a knee injury  tindall says he will be happy just to have an opportunity to wear the red shirt 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2044, 4394, 2041, 2006, 1996, 2541, 2778, 2000, 2660, 2007, 1037, 6181, 4544, 9543, 9305, 2140, 2758, 2002, 2097, 2022, 3407, 2074, 2000, 2031, 2019, 449

634 / 12941


11/12/2021 03:04:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   it looks a definite head to head battle between himself and 23 year old leamy   three stone heavier than o connor   for the number seven role against the world champions 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 3504, 1037, 15298, 2132, 2000, 2132, 2645, 2090, 2370, 1998, 2603, 2095, 2214, 12203, 8029, 2093, 2962, 11907, 2084, 1051, 6720, 2005

635 / 12941


11/12/2021 03:04:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: a furious rusedski slammed his racket onto the ground in disgust and was warned by the umpire 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 9943, 26307, 5104, 3211, 7549, 2010, 14513, 3388, 3031, 1996, 2598, 1999, 12721, 1998, 2001, 7420, 2011, 1996, 20887, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

636 / 12941


11/12/2021 03:04:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: and he maintained the momentum early in the second set  breaking the russian with the help of an inspired volley 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 2002, 5224, 1996, 11071, 2220, 1999, 1996, 2117, 2275, 4911, 1996, 2845, 2007, 1996, 2393, 1997, 2019, 4427, 28073, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

637 / 12941


11/12/2021 03:04:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:04:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it was difficult for me not to be in the friday matches but i had to understand 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 3697, 2005, 2033, 2025, 2000, 2022, 1999, 1996, 5958, 3503, 2021, 1045, 2018, 2000, 3305, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

638 / 12941


11/12/2021 03:05:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  roddick won the last four points of the first set tie break before being broken at the start of the second set 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8473, 24066, 2180, 1996, 2197, 2176, 2685, 1997, 1996, 2034, 2275, 5495, 3338, 2077, 2108, 3714, 2012, 1996, 2707, 1997, 1996, 2117, 2275, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

639 / 12941


11/12/2021 03:05:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  after those first three games it was no match at all 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2044, 2216, 2034, 2093, 2399, 2009, 2001, 2053, 2674, 2012, 2035, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

640 / 12941


11/12/2021 03:05:16 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:16 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: despite that he was happy with his tour debut 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2750, 2008, 2002, 2001, 3407, 2007, 2010, 2778, 2834, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

641 / 12941


11/12/2021 03:05:24 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:24 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i didn t want him to play his game 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2134, 1056, 2215, 2032, 2000, 2377, 2010, 2208, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

642 / 12941


11/12/2021 03:05:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i haven t been playing well  but i ve been coming through 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 4033, 1056, 2042, 2652, 2092, 2021, 1045, 2310, 2042, 2746, 2083, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

643 / 12941


11/12/2021 03:05:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: argentine wild card mariano puerta continued his improbable run  outlasting felix mantilla 6 4 3 6 7 6 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8511, 3748, 4003, 22695, 16405, 8743, 2050, 2506, 2010, 17727, 3217, 3676, 3468, 2448, 2041, 8523, 3436, 8383, 2158, 28345, 2050, 1020, 1018, 1017, 1020, 1021, 1020, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

644 / 12941


11/12/2021 03:05:39 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:39 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hantuchova in dubai last eightdaniela hantuchova moved into the quarter finals of the dubai open  after beating elene likhotseva of russia 7 5 6 4  and now faces serena williams 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 7658, 8525, 9905, 3567, 1999, 11558, 2197, 2809, 7847, 9257, 2050, 7658, 8525, 9905, 3567, 2333, 2046, 1996, 4284, 4399, 1997, 1996,

645 / 12941


11/12/2021 03:05:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: bogdanovic and murray both pulled out of tournaments last week through injury but are expected to be fit 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 22132, 7847, 9142, 1998, 6264, 2119, 2766, 2041, 1997, 8504, 2197, 2733, 2083, 4544, 2021, 2024, 3517, 2000, 2022, 4906, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

646 / 12941


11/12/2021 03:05:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  hopefully we will be able to change people s minds 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 11504, 2057, 2097, 2022, 2583, 2000, 2689, 2111, 1055, 9273, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

647 / 12941


11/12/2021 03:05:59 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:05:59 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dogged federer claims dubai crownworld number one roger federer added the dubai championship trophy to his long list of successes   but not before he was given a test by ivan ljubicic 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 28844, 2098, 28294, 4447, 11558, 4410, 11108, 2193, 2028, 5074, 28294, 2794, 1996, 11558, 2528, 5384, 2000, 2010, 2146, 2862, 

648 / 12941


11/12/2021 03:06:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: henman had looked on course to level the match after going 2 0 up in the second set  but his progress was halted as the rain intervened again 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 21863, 2386, 2018, 2246, 2006, 2607, 2000, 2504, 1996, 2674, 2044, 2183, 1016, 1014, 2039, 1999, 1996, 2117, 2275, 2021, 2010, 5082, 2001, 12705, 2004, 1996, 4542, 21116

649 / 12941


11/12/2021 03:06:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it was mirza s sixth straight victory following her first wta tournament win in hyderabad last month 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2001, 18366, 1055, 4369, 3442, 3377, 2206, 2014, 2034, 21925, 2977, 2663, 1999, 13624, 2197, 3204, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

650 / 12941


11/12/2021 03:06:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: hantuchova in dubai last eightdaniela hantuchova moved into the quarter finals of the dubai open  after beating elene likhotseva of russia 7 5 6 4  and now faces serena williams 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 7658, 8525, 9905, 3567, 1999, 11558, 2197, 2809, 7847, 9257, 2050, 7658, 8525, 9905, 3567, 2333, 2046, 1996, 4284, 4399, 1997, 1996,

651 / 12941


11/12/2021 03:06:26 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:26 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  he s got to be stronger  he s got a lot of ability but he s got to be more disciplined mentally and physically and if he does that he s got a good chance 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 1055, 2288, 2000, 2022, 6428, 2002, 1055, 2288, 1037, 2843, 1997, 3754, 2021, 2002, 1055, 2288, 2000, 2022, 2062, 28675, 10597, 1998, 8186, 1998, 2065

652 / 12941


11/12/2021 03:06:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  women s tennis is exciting 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2308, 1055, 5093, 2003, 10990, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

653 / 12941


11/12/2021 03:06:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but everyone is beatable and i am looking forward to a great match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 3071, 2003, 3786, 3085, 1998, 1045, 2572, 2559, 2830, 2000, 1037, 2307, 2674, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

654 / 12941


11/12/2021 03:06:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 1994 wimbledon champion won 5 7 6 0 6 2 to earn a second round meeting with french open champion anastasia myskina 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2807, 13411, 3410, 2180, 1019, 1021, 1020, 1014, 1020, 1016, 2000, 7796, 1037, 2117, 2461, 3116, 2007, 2413, 2330, 3410, 19447, 2026, 29334, 2050, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

655 / 12941


11/12/2021 03:06:46 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:46 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: men s champion marat safin remains fourth in the atp rankings while beaten finalist lleyton hewitt replaces andy roddick as world number two 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2273, 1055, 3410, 13955, 2102, 7842, 16294, 3464, 2959, 1999, 1996, 12649, 10385, 2096, 7854, 9914, 2222, 3240, 2669, 19482, 20736, 5557, 8473, 24066, 2004, 2088, 2193, 2

656 / 12941


11/12/2021 03:06:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   her appearance will also benefit charities in the region and the swiss star will donate her prize money 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2014, 3311, 2097, 2036, 5770, 15430, 1999, 1996, 2555, 1998, 1996, 5364, 2732, 2097, 21357, 2014, 3396, 2769, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

657 / 12941


11/12/2021 03:06:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: clijsters set for february returntennis star kim clijsters will make her return from a career threatening injury at the antwerp wta event in february 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 18856, 28418, 15608, 2275, 2005, 2337, 2709, 6528, 8977, 2732, 5035, 18856, 28418, 15608, 2097, 2191, 2014, 2709, 2013, 1037, 2476, 8701, 4544, 2012, 1996, 1400

658 / 12941


11/12/2021 03:06:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:06:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dent will face juan ignacio chela next after the fourth seed was too strong for jurgen melzer 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 21418, 2097, 2227, 5348, 22988, 18178, 2721, 2279, 2044, 1996, 2959, 6534, 2001, 2205, 2844, 2005, 23171, 11463, 6290, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

659 / 12941


11/12/2021 03:07:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but serena denied their challenge was fading  saying   that s not fair   i m tired of not saying anything 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 14419, 6380, 2037, 4119, 2001, 14059, 3038, 2008, 1055, 2025, 4189, 1045, 1049, 5458, 1997, 2025, 3038, 2505, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

660 / 12941


11/12/2021 03:07:08 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:08 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the 10th seed equalled her best performance at a grand slam event when she beat unseeded russian nadia petrova 6 3 6 2 to reach the fourth round 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 6049, 6534, 5020, 3709, 2014, 2190, 2836, 2012, 1037, 2882, 9555, 2724, 2043, 2016, 3786, 4895, 19763, 5732, 2845, 14942, 9004, 12298, 2050, 1020, 1017, 1020, 1

661 / 12941


11/12/2021 03:07:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  then finally  the coach  called my parents up and said  the way she hits the ball  i ve never seen a six year old hit a ball like that    mirza told the associated press 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2059, 2633, 1996, 2873, 2170, 2026, 3008, 2039, 1998, 2056, 1996, 2126, 2016, 4978, 1996, 3608, 1045, 2310, 2196, 2464, 1037, 2416, 2095, 22

662 / 12941


11/12/2021 03:07:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he conceded kuznetsova might have taken a medicine which contained the banned substance 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 15848, 13970, 2480, 22781, 7103, 2453, 2031, 2579, 1037, 4200, 2029, 4838, 1996, 7917, 9415, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

663 / 12941


11/12/2021 03:07:38 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:38 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the yugoslavian born bogdanovic  though  is 184 places below henman in the world rankings and has played just two cup ties   winning one and losing the other 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 8936, 2078, 2141, 22132, 7847, 9142, 2295, 2003, 19681, 3182, 2917, 21863, 2386, 1999, 1996, 2088, 10385, 1998, 2038, 2209, 2074, 2048, 2452, 7208,

664 / 12941


11/12/2021 03:07:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it just keeps getting better and better every year   hewitt said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2074, 7906, 2893, 2488, 1998, 2488, 2296, 2095, 19482, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

665 / 12941


11/12/2021 03:07:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:07:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the spaniard then donated his Â£28 000 prize money to relief efforts for the victims of the asian tsunami 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 8487, 14619, 2059, 6955, 2010, 1037, 29646, 22407, 2199, 3396, 2769, 2000, 4335, 4073, 2005, 1996, 5694, 1997, 1996, 4004, 19267, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

666 / 12941


11/12/2021 03:08:02 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:02 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

667 / 12941


11/12/2021 03:08:06 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:06 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: afterwards  dent said he rated us open semi finalist johansson as a top contender at the australian open  which starts on 17 january 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5728, 21418, 2056, 2002, 6758, 2149, 2330, 4100, 9914, 26447, 2004, 1037, 2327, 20127, 2012, 1996, 2827, 2330, 2029, 4627, 2006, 2459, 2254, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

668 / 12941


11/12/2021 03:08:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i don t know how my body will react 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2123, 1056, 2113, 2129, 2026, 2303, 2097, 10509, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

669 / 12941


11/12/2021 03:08:12 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:12 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  world number 31 hantuchova  ranked two places above dulko  looked nervous as she dropped the first four games of the match 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2088, 2193, 2861, 7658, 8525, 9905, 3567, 4396, 2048, 3182, 2682, 4241, 13687, 2080, 2246, 6091, 2004, 2016, 3333, 1996, 2034, 2176, 2399, 1997, 1996, 2674, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0

670 / 12941


11/12/2021 03:08:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dementieva prevails in hong kongelena dementieva swept aside defending champion venus williams 6 3 6 2 to win hong kong s champions challenge event 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 17183, 4765, 2666, 3567, 3653, 3567, 12146, 1999, 4291, 4290, 12260, 2532, 17183, 4765, 2666, 3567, 7260, 4998, 6984, 3410, 11691, 3766, 1020, 1017, 1020, 1016, 2

671 / 12941


11/12/2021 03:08:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: defending women s champion justine henin hardenne is also out of the sydney event because of a knee injury 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6984, 2308, 1055, 3410, 26377, 21863, 2378, 28751, 2638, 2003, 2036, 2041, 1997, 1996, 3994, 2724, 2138, 1997, 1037, 6181, 4544, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

672 / 12941


11/12/2021 03:08:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

673 / 12941


11/12/2021 03:08:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: lopez showed glimpses of resolve early in the second set when he held his first service game and came close to breaking federer 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8685, 3662, 12185, 2015, 1997, 10663, 2220, 1999, 1996, 2117, 2275, 2043, 2002, 2218, 2010, 2034, 2326, 2208, 1998, 2234, 2485, 2000, 4911, 28294, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

674 / 12941


11/12/2021 03:08:34 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:34 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: her career has been hit by a series of injuries but last year she started hitting top form and won seven titles 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2014, 2476, 2038, 2042, 2718, 2011, 1037, 2186, 1997, 6441, 2021, 2197, 2095, 2016, 2318, 7294, 2327, 2433, 1998, 2180, 2698, 4486, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

675 / 12941


11/12/2021 03:08:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: rusedski angry over supplementsgreg rusedski has criticised the governing body of men s tennis for not releasing contamination free supplements in time for the new season 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 26307, 5104, 3211, 4854, 2058, 25654, 17603, 2290, 26307, 5104, 3211, 2038, 10648, 1996, 8677, 2303, 1997, 2273, 1055, 5093, 2005, 2025, 82

676 / 12941


11/12/2021 03:08:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i might be playing some singles events this season  depending on the surface   she added 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2453, 2022, 2652, 2070, 3895, 2824, 2023, 2161, 5834, 2006, 1996, 3302, 2016, 2794, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

677 / 12941


11/12/2021 03:08:48 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:48 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: fabrice santoro  the 2000 champion  beat sweden s thomas johansson 6 4 6 2 but fourth seed mikhail youzhny lost 6 3 7 6  7 3  to rafael nadal 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8313, 2063, 11685, 3217, 1996, 2456, 3410, 3786, 4701, 1055, 2726, 26447, 1020, 1018, 1020, 1016, 2021, 2959, 6534, 11318, 2017, 27922, 4890, 2439, 1020, 1017, 1021, 102

678 / 12941


11/12/2021 03:08:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:08:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  they know what s been given to them and all they have to do is give back the effort  and every minute of practice they were doing that 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2027, 2113, 2054, 1055, 2042, 2445, 2000, 2068, 1998, 2035, 2027, 2031, 2000, 2079, 2003, 2507, 2067, 1996, 3947, 1998, 2296, 3371, 1997, 3218, 2027, 2020, 2725, 2008, 102, 0,

679 / 12941


11/12/2021 03:09:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

680 / 12941


11/12/2021 03:09:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: roddick battled hard and had chances in the second set  but moya s clay court expertise proved the difference 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8473, 24066, 19787, 2524, 1998, 2018, 9592, 1999, 1996, 2117, 2275, 2021, 9587, 3148, 1055, 5726, 2457, 11532, 4928, 1996, 4489, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

681 / 12941


11/12/2021 03:09:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: capriati also decided against competing in the australian open warm up event  the sydney international 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 6178, 4360, 3775, 2036, 2787, 2114, 6637, 1999, 1996, 2827, 2330, 4010, 2039, 2724, 1996, 3994, 2248, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

682 / 12941


11/12/2021 03:09:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: as much as i enjoyed the two weeks off i don t think it s so productive 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2004, 2172, 2004, 1045, 5632, 1996, 2048, 3134, 2125, 1045, 2123, 1056, 2228, 2009, 1055, 2061, 13318, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

683 / 12941


11/12/2021 03:09:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   spain s victory was also remarkable for the performance of rafael nadal  who beat roddick in the opening singles 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3577, 1055, 3377, 2001, 2036, 9487, 2005, 1996, 2836, 1997, 10999, 23233, 2389, 2040, 3786, 8473, 24066, 1999, 1996, 3098, 3895, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

684 / 12941


11/12/2021 03:09:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   
Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

685 / 12941


11/12/2021 03:09:52 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:52 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: she is believed to have picked up the injury at the advanta championships at philadelphia in november and had to pull out of an exhibition match with wimbledon champion maria sharapova on 17 december 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2016, 2003, 3373, 2000, 2031, 3856, 2039, 1996, 4544, 2012, 1996, 4748, 18941, 2050, 3219, 2012, 4407, 1999, 22

686 / 12941


11/12/2021 03:09:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:09:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he played the right way but the bryans are great doubles players 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2209, 1996, 2157, 2126, 2021, 1996, 8527, 2015, 2024, 2307, 7695, 2867, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

687 / 12941


11/12/2021 03:10:03 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:03 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: people say thanks back and that is nice 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2111, 2360, 4283, 2067, 1998, 2008, 2003, 3835, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

688 / 12941


11/12/2021 03:10:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   mauresmo completed her first match in the season ending championship in 54 minutes as russia s zvonareva struggled to return her serve and failed to achieve a single break point 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 5003, 14900, 5302, 2949, 2014, 2034, 2674, 1999, 1996, 2161, 4566, 2528, 1999, 5139, 2781, 2004, 3607, 1055, 1062, 17789, 12069, 35

689 / 12941


11/12/2021 03:10:25 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:25 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  that wasn t comfortable out there at all  what i was feeling 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2008, 2347, 1056, 6625, 2041, 2045, 2012, 2035, 2054, 1045, 2001, 3110, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

690 / 12941


11/12/2021 03:10:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it s a tricky match   federer said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 1037, 24026, 2674, 28294, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

691 / 12941


11/12/2021 03:10:42 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:42 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i d worked hard to be down here and ready 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1040, 2499, 2524, 2000, 2022, 2091, 2182, 1998, 3201, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

692 / 12941


11/12/2021 03:10:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: open chief paul mcnamee had said   kim s wrist obviously isn t going to be rehabilitated 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2330, 2708, 2703, 11338, 18442, 2063, 2018, 2056, 5035, 1055, 7223, 5525, 3475, 1056, 2183, 2000, 2022, 24497, 18622, 16238, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

693 / 12941


11/12/2021 03:10:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:10:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: open chief paul mcnamee had said   kim s wrist obviously isn t going to be rehabilitated 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2330, 2708, 2703, 11338, 18442, 2063, 2018, 2056, 5035, 1055, 7223, 5525, 3475, 1056, 2183, 2000, 2022, 24497, 18622, 16238, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

694 / 12941


11/12/2021 03:11:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i have had a lot of problems with that ankle before   it will be ok   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2031, 2018, 1037, 2843, 1997, 3471, 2007, 2008, 10792, 2077, 2009, 2097, 2022, 7929, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

695 / 12941


11/12/2021 03:11:11 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:11 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: roddick hired gilbert after deciding to part from coach tarik benhabiles in the wake of his first round exit at the 2003 french open 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8473, 24066, 5086, 7664, 2044, 10561, 2000, 2112, 2013, 2873, 16985, 5480, 3841, 25459, 9463, 2015, 1999, 1996, 5256, 1997, 2010, 2034, 2461, 6164, 2012, 1996, 2494, 2413, 2330, 

696 / 12941


11/12/2021 03:11:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the best fit 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2190, 4906, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

697 / 12941


11/12/2021 03:11:21 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:21 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he then produced a love service game to finish off the match in four hours and five minutes 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2059, 2550, 1037, 2293, 2326, 2208, 2000, 3926, 2125, 1996, 2674, 1999, 2176, 2847, 1998, 2274, 2781, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

698 / 12941


11/12/2021 03:11:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: moya  spain s davis cup final hero in their recent win over the us  had to retire with an ankle injury in the first set of the final 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 9587, 3148, 3577, 1055, 4482, 2452, 2345, 5394, 1999, 2037, 3522, 2663, 2058, 1996, 2149, 2018, 2000, 11036, 2007, 2019, 10792, 4544, 1999, 1996, 2034, 2275, 1997, 1996, 2345, 10

699 / 12941


11/12/2021 03:11:32 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:32 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: koubek suspended after drugs teststefan koubek says he has been banned for three months by the international tennis federation  itf  after testing positive for a banned substance 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 12849, 12083, 5937, 6731, 2044, 5850, 5852, 2618, 15143, 12849, 12083, 5937, 2758, 2002, 2038, 2042, 7917, 2005, 2093, 2706, 2011, 

700 / 12941


11/12/2021 03:11:35 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:35 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   roddick was furious with himself for failing to take advantage of leads in both tie breaks 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8473, 24066, 2001, 9943, 2007, 2370, 2005, 7989, 2000, 2202, 5056, 1997, 5260, 1999, 2119, 5495, 7807, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

701 / 12941


11/12/2021 03:11:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: but he said he would play on whatever surface he had to in order to have a chance of winning 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 2002, 2056, 2002, 2052, 2377, 2006, 3649, 3302, 2002, 2018, 2000, 1999, 2344, 2000, 2031, 1037, 3382, 1997, 3045, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

702 / 12941


11/12/2021 03:11:49 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:49 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: williams later explained her injury problem  saying it was the result of lunging for a ball early in the first set 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 3766, 2101, 4541, 2014, 4544, 3291, 3038, 2009, 2001, 1996, 2765, 1997, 11192, 2075, 2005, 1037, 3608, 2220, 1999, 1996, 2034, 2275, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

703 / 12941


11/12/2021 03:11:57 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:11:57 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: i feel like i can t waste my time  my energy on that surface   he said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2514, 2066, 1045, 2064, 1056, 5949, 2026, 2051, 2026, 2943, 2006, 2008, 3302, 2002, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

704 / 12941


11/12/2021 03:12:09 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:12:09 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  it s a relief for me 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 1055, 1037, 4335, 2005, 2033, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

705 / 12941


11/12/2021 03:12:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:12:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: roddick hangs in on his serve to level matters but nadal is making him fight for every point 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 8473, 24066, 17991, 1999, 2006, 2010, 3710, 2000, 2504, 5609, 2021, 23233, 2389, 2003, 2437, 2032, 2954, 2005, 2296, 2391, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

706 / 12941


11/12/2021 03:12:55 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:12:55 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i think that will give me the best possible chance of doing well at the australian open 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2228, 2008, 2097, 2507, 2033, 1996, 2190, 2825, 3382, 1997, 2725, 2092, 2012, 1996, 2827, 2330, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

707 / 12941


11/12/2021 03:13:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  the impression i came away with after just seeing david and the other coaches for three days was one like i ve never seen before  especially over here 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 8605, 1045, 2234, 2185, 2007, 2044, 2074, 3773, 2585, 1998, 1996, 2060, 7850, 2005, 2093, 2420, 2001, 2028, 2066, 1045, 2310, 2196, 2464, 2077, 2926, 205

708 / 12941


11/12/2021 03:13:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: federer was named swiss sportsman of the year on saturday  to add to the bbc overseas sportsman and european sports journalists association awards he has already won 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 28294, 2001, 2315, 5364, 27168, 1997, 1996, 2095, 2006, 5095, 2000, 5587, 2000, 1996, 4035, 6931, 27168, 1998, 2647, 2998, 8845, 2523, 2982, 2002

709 / 12941


11/12/2021 03:13:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the year started in regulation fashion as justine henin hardenne beat compatriot kim clijsters at the australian open 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2095, 2318, 1999, 7816, 4827, 2004, 26377, 21863, 2378, 28751, 2638, 3786, 4012, 4502, 18886, 4140, 5035, 18856, 28418, 15608, 2012, 1996, 2827, 2330, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

710 / 12941


11/12/2021 03:13:30 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:30 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: he played one more game  but his movement was hampered and he quit 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2002, 2209, 2028, 2062, 2208, 2021, 2010, 2929, 2001, 22532, 1998, 2002, 8046, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

711 / 12941


11/12/2021 03:13:36 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:36 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: wessels was unable to compete in the mixed doubles but slovakia had already booked their place in the final for the second year running 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 14008, 11246, 2015, 2001, 4039, 2000, 5566, 1999, 1996, 3816, 7695, 2021, 10991, 2018, 2525, 17414, 2037, 2173, 1999, 1996, 2345, 2005, 1996, 2117, 2095, 2770, 102, 0, 0, 0, 0

712 / 12941


11/12/2021 03:13:41 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:41 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: dent continues adelaide progressamerican taylor dent reached the final of the australian hardcourt event in adelaide with a crushing 6 1 6 1 win over argentine juan ignacio chela 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 21418, 4247, 7364, 5082, 14074, 14735, 2078, 4202, 21418, 2584, 1996, 2345, 1997, 1996, 2827, 2524, 13421, 2724, 1999, 7364, 2007, 

713 / 12941


11/12/2021 03:13:44 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:44 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  since the davis cup in seville i have been working on my condition as well as technical and medical aspects of my game which will allow me to come into the big events of the year in top form 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2144, 1996, 4482, 2452, 1999, 18983, 1045, 2031, 2042, 2551, 2006, 2026, 4650, 2004, 2092, 2004, 4087, 1998, 2966, 5919

714 / 12941


11/12/2021 03:13:47 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:47 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: great britain captain jeremy bates paid tribute to henman s efforts over the years 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2307, 3725, 2952, 7441, 11205, 3825, 7050, 2000, 21863, 2386, 1055, 4073, 2058, 1996, 2086, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

715 / 12941


11/12/2021 03:13:56 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:56 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ivanovic beat czink in the last round of qualifying but the hungarian made the main draw as a lucky loser after katarina srebotnik withdrew injured 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 26333, 2594, 3786, 1039, 17168, 2243, 1999, 1996, 2197, 2461, 1997, 6042, 2021, 1996, 5588, 2081, 1996, 2364, 4009, 2004, 1037, 5341, 10916, 2044, 29354, 11796, 50

716 / 12941


11/12/2021 03:13:58 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:13:58 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  how can you be happy when you see your face on the cover page and talking about doping   dechy said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2129, 2064, 2017, 2022, 3407, 2043, 2017, 2156, 2115, 2227, 2006, 1996, 3104, 3931, 1998, 3331, 2055, 23799, 11703, 10536, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

717 / 12941


11/12/2021 03:14:07 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:07 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  you can win big titles and you can beat huge players in the finals and semi finals 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2017, 2064, 2663, 2502, 4486, 1998, 2017, 2064, 3786, 4121, 2867, 1999, 1996, 4399, 1998, 4100, 4399, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

718 / 12941


11/12/2021 03:14:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: veteran martinez wins thai titleconchita martinez won her first title in almost five years with victory over anna lena groenefeld at the volvo women s open in pattaya  thailand 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 8003, 10337, 5222, 7273, 2516, 8663, 5428, 2696, 10337, 2180, 2014, 2034, 2516, 1999, 2471, 2274, 2086, 2007, 3377, 2058, 4698, 14229

719 / 12941


11/12/2021 03:14:18 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:18 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i m delighted to have won against such a good opponent in a tournament of this importance   said soderling 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 1049, 15936, 2000, 2031, 2180, 2114, 2107, 1037, 2204, 7116, 1999, 1037, 2977, 1997, 2023, 5197, 2056, 2061, 4063, 2989, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

720 / 12941


11/12/2021 03:14:20 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:20 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: there were also wins for slovakia s karol beck and croatian duo ivan ljubicic and mario ancic 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2045, 2020, 2036, 5222, 2005, 10991, 1055, 10556, 13153, 10272, 1998, 7963, 6829, 7332, 1048, 9103, 13592, 2594, 1998, 7986, 2019, 19053, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

721 / 12941


11/12/2021 03:14:23 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:23 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  a lot of people are under the assumption that it s easy to play well every week and it s not 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1037, 2843, 1997, 2111, 2024, 2104, 1996, 11213, 2008, 2009, 1055, 3733, 2000, 2377, 2092, 2296, 2733, 1998, 2009, 1055, 2025, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

722 / 12941


11/12/2021 03:14:28 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:28 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: nadal  playing in the outdoor clay event for the first time  hit some powerful forehands to oust starace in a match delayed over an hour by rain 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 23233, 2389, 2652, 1999, 1996, 7254, 5726, 2724, 2005, 1996, 2034, 2051, 2718, 2070, 3928, 18921, 11774, 2015, 2000, 15068, 3367, 2732, 10732, 1999, 1037, 2674, 8394,

723 / 12941


11/12/2021 03:14:31 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:31 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  but i didn t convert on the big points 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2021, 1045, 2134, 1056, 10463, 2006, 1996, 2502, 2685, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

724 / 12941


11/12/2021 03:14:37 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:37 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i wanted to win this tournament very badly since it was in my hometown 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2359, 2000, 2663, 2023, 2977, 2200, 6649, 2144, 2009, 2001, 1999, 2026, 9627, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

725 / 12941


11/12/2021 03:14:43 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:43 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   saulnier battled to a 6 7  3 7  6 3 6 3 win over seventh seed jurgen melzer  who twisted his ankle early in the second set 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 16897, 14862, 19787, 2000, 1037, 1020, 1021, 1017, 1021, 1020, 1017, 1020, 1017, 2663, 2058, 5066, 6534, 23171, 11463, 6290, 2040, 6389, 2010, 10792, 2220, 1999, 1996, 2117, 2275, 102, 0

726 / 12941


11/12/2021 03:14:50 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:50 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   china  taiwan  india  korea  new zealand  singapore and kazakhstan are the other nations competing in new delhi from 18 24 april 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2859, 6629, 2634, 4420, 2047, 3414, 5264, 1998, 11769, 2024, 1996, 2060, 3741, 6637, 1999, 2047, 6768, 2013, 2324, 2484, 2258, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

727 / 12941


11/12/2021 03:14:53 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:14:53 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   and the belgian is looking forward to facing a severe test from williams 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1998, 1996, 6995, 2003, 2559, 2830, 2000, 5307, 1037, 5729, 3231, 2013, 3766, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

728 / 12941


11/12/2021 03:15:00 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:00 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: it would affect my team mates and playing for my country as well   roddick said 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 2009, 2052, 7461, 2026, 2136, 14711, 1998, 2652, 2005, 2026, 2406, 2004, 2092, 8473, 24066, 2056, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

729 / 12941


11/12/2021 03:15:04 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:04 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: henman hopes ended in dubai rainthird seed tim henman slumped to a straight sets defeat in his rain interrupted dubai championships quarter final against ivan ljubicic 

Tokenized: 
 	None
Features: 
 	input_ids: [101, 21863, 2386, 8069, 3092, 1999, 11558, 4542, 15222, 4103, 6534, 5199, 21863, 2386, 14319, 2000, 1037, 3442, 4520, 4154, 1999, 2010, 4542, 7153,

730 / 12941


11/12/2021 03:15:10 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:10 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: ferrero  the 2001 champion  overcame florian mayer of germany 6 3 6 4 while robredo had to fight hard to beat olivier rochus of belgium 7 6  7 5  7 6  7 5  
Tokenized: 
 	None
Features: 
 	input_ids: [101, 28390, 2080, 1996, 2541, 3410, 26463, 29517, 14687, 1997, 2762, 1020, 1017, 1020, 1018, 2096, 6487, 23417, 2018, 2000, 2954, 2524, 2000, 3786, 14439, 21326

731 / 12941


11/12/2021 03:15:14 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:14 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text: the frenchwoman produced some superb volleys in the final set and claimed victory on her fifth match point after two hours and 18 minutes 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1996, 2413, 10169, 2550, 2070, 21688, 28073, 2015, 1999, 1996, 2345, 2275, 1998, 3555, 3377, 2006, 2014, 3587, 2674, 2391, 2044, 2048, 2847, 1998, 2324, 2781, 102, 0, 0, 0, 

732 / 12941


11/12/2021 03:15:17 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:17 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i knew i had to fight hard today and that s exactly what happened 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2354, 1045, 2018, 2000, 2954, 2524, 2651, 1998, 2008, 1055, 3599, 2054, 3047, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

733 / 12941


11/12/2021 03:15:22 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:22 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:  i will try not to think about what happened but i don t think it will have any effect on the british team s concentration 
Tokenized: 
 	None
Features: 
 	input_ids: [101, 1045, 2097, 3046, 2025, 2000, 2228, 2055, 2054, 3047, 2021, 1045, 2123, 1056, 2228, 2009, 2097, 2031, 2151, 3466, 2006, 1996, 2329, 2136, 1055, 6693, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

734 / 12941


11/12/2021 03:15:27 - INFO - farm.data_handler.processor -   *** Show 1 random examples ***
11/12/2021 03:15:27 - INFO - farm.data_handler.processor -   

      .--.        _____                       _      
    .'_\/_'.     / ____|                     | |     
    '. /\ .'    | (___   __ _ _ __ ___  _ __ | | ___ 
      "||"       \___ \ / _` | '_ ` _ \| '_ \| |/ _ \ 
       || /\     ____) | (_| | | | | | | |_) | |  __/
    /\ ||//\)   |_____/ \__,_|_| |_| |_| .__/|_|\___|
   (/\||/                             |_|           
______\||/___________________________________________                     

ID: None
Clear Text: 
 	text:   

Tokenized: 
 	None
Features: 
 	input_ids: [101, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

735 / 12941
736
Done with embedding file.


## 🔥 4. Preprocessing, Topic Modeling and Contextual Embedding Aggregation

Here, we stick for most part to the original HOTT code. In this step, the vocabulary is created and it is ensured that there is no out of vocabulary problem by looking up the words from pre-trained Word2Vec or GloVe embeddings and restricting the overall vocabulary for LDA on these words.

Make sure you select the correct dataset below

In [12]:
!pip install lda
!pip install pot

# download glove vectors
!wget -P data https://nlp.stanford.edu/data/glove.6B.zip


--2021-11-12 09:01:04--  https://nlp.stanford.edu/data/glove.6B.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://downloads.cs.stanford.edu/nlp/data/glove.6B.zip [following]
--2021-11-12 09:01:05--  http://downloads.cs.stanford.edu/nlp/data/glove.6B.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 862182613 (822M) [application/zip]
Saving to: ‘data/glove.6B.zip’


2021-11-12 09:05:24 (3,18 MB/s) - ‘data/glove.6B.zip’ saved [862182613/862182613]



In [13]:
!mkdir data/glove.6B
!unzip data/glove.6B.zip -d data/glove.6B

Archive:  data/glove.6B.zip
  inflating: glove.6B.50d.txt        
  inflating: glove.6B.100d.txt       
  inflating: glove.6B.200d.txt       
  inflating: glove.6B.300d.txt       


In [4]:
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split
import time
import lda
import numpy as np
from knn_classifier import knn
import distances
import hott
import pickle
import lda
import os
import csv
import re
import sys
from collections import defaultdict
from hott import sparse_ot
from sklearn.metrics.pairwise import euclidean_distances
from scipy.io import loadmat
from nltk.corpus import stopwords
from nltk.stem.snowball import SnowballStemmer
from bert import tokenization
data_path = 'data/'

data_name = 'bbcsport-emd_tr_te_split.mat'
raw_data_name='bbsport_by_sentence.txt'
train_test_dataset=False
by_sentence=True

embeddings_path = './data/glove.6B/glove.6B.300d.txt'

In [5]:

''' this function maps the tokens used in the topic model to the tokens 
 from the contextual word embeddings
 obtained by the BERT tokenizer'''
 
def check_previous(i, x, z):
    m=z[i-1]+x.strip("##")
    if m.startswith("##"):
        m=check_previous(i-1,m,z)
    return m

 
def sub_aggregate(i,g,embs):
    # helper function used to average the embedding values of multi part subwords from BERT
    em=[]
    for x in range(i,g+1):
        em.append(embs[x])
    if len(em)>1:
        return np.mean(np.asarray(em), axis=0)
    else:
        return np.asarray(em[0])
    
 
def pre_aggregate(c_embeddings):
    # this function is the first step for resolving BERT subwords into the topic model's tokens
    todo=str(len(c_embeddings))
    
    word_emb = defaultdict(list)
    word_doc = defaultdict(list)
    doc_word = defaultdict(list)
    doc_emb ={}
    try:
      
            for d in range(0,len(c_embeddings)):
                print(str(str(d)+"\\"+todo))
                for c in range(len(c_embeddings[d])):
                    if isinstance(c_embeddings[d][c]["context"], list):
                        docwords= c_embeddings[d][c]["context"]
                    else:
                         docwords= c_embeddings[d][c]["context"].split()
                    for i in range(1,len(docwords)):
                        v=docwords[i]
                        if v.startswith("##") and not docwords[i-1].startswith("##"):
                            u=str(docwords[i-1]+v.lstrip("##"))
                            for g in range(i, len(docwords)):
                                if len(docwords)>g+1:
                                    if docwords[g+1].startswith("##"):
                                        u=u+str(docwords[g+1].lstrip("##"))
                                    else:
                                        word_emb[u].append(sub_aggregate(i-1,g, c_embeddings[d][c]["vec"]))
                                        word_doc[u].append(d)
                                        doc_emb[d]=np.mean(list(c_embeddings[d][c]["vec"]), axis=0)
                                        doc_word[d].append(docwords)
                                        break
                        elif len(docwords)>i+1:
                            if docwords[i+1].startswith("##"):
                                continue
                            if not docwords[i+1].startswith("##"):
                                word_emb[v].append(c_embeddings[d][c]["vec"][i])
                                word_doc[v].append(d)
                                doc_emb[d]=np.mean(list(c_embeddings[d][c]["vec"]), axis=0)
                                doc_word[d].append(docwords)
                                                        
                        else:
                            if len(docwords)==i and not docwords[i].startswith("##"):
                                word_emb[v].append(c_embeddings[d][c]["vec"][i])
                                word_doc[v].append(d)
                                doc_emb[d]=np.mean(list(c_embeddings[d][c]["vec"]), axis=0)
                                doc_word[d].append(docwords)   
       
    except Exception as E:
                    print("error")
                    print(E)
                    print(d)
                    print(v)
                    traceback.print_exc()
    print("reached the end")                
    return [word_emb, word_doc, doc_word, doc_emb]

def map_to_emb(o,w, word_emb, word_doc, doc_word, doc_emb ,m_topic_wdocs):
    """"uses the output of the pre-aggregate function to come up with the final word embeddings"""
    #print(o)
    print(w)
    try:
        semb=[]
        memb=[]
        msemb=[]
        emb=[]
        stemmer=SnowballStemmer("english")
        print(len(word_doc.keys()))
        for g,v in enumerate(word_doc.keys()):
    
                if v.lower()==w:
                    if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                        semb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                    else:
                        semb.append(np.asarray(word_emb[v],dtype=float))
                    #print("semb 1")
                
                    for d in word_doc[v]:
                        emb.append(np.asarray(doc_emb[d],dtype=float))
                        if w in m_topic_wdocs:
                           
                            for b in m_topic_wdocs[w]: 
                                if b==d:
                                    #for d in word_doc[v]:
                                    memb.append(doc_emb[d])
                                    if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                        msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                    else:
                                        msemb.append(np.asarray(word_emb[v],dtype=float))
                           
                        else:
                            #for d in word_doc[v]:
                            memb.append(doc_emb[d])
                            if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                            else:
                                msemb.append(np.asarray(word_emb[v],dtype=float))
        else:
            for g,v in enumerate(word_doc.keys()):
              if v.lower().startswith(w):
                    if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                        semb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                    else:
                        semb.append(np.asarray(word_emb[v],dtype=float))
                    #print("semb 2")
                   
                    for d in word_doc[v]:
                        emb.append(np.asarray(doc_emb[d],dtype=float))
                        if w in m_topic_wdocs:
                           
                            for b in m_topic_wdocs[w]: 
                                if b==d:
                                    #for d in word_doc[v]:
                                    memb.append(doc_emb[d])
                                    if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                        msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                    else:
                                        msemb.append(np.asarray(word_emb[v],dtype=float))
                           
                        else:
                            for d in word_doc[v]:
                                memb.append(doc_emb[d])
                            if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                            else:
                                msemb.append(np.asarray(word_emb[v],dtype=float))
            else:
                for g,v in enumerate(word_doc.keys()):
                  j=stemmer.stem(v)
                  if j.lower().startswith(w):
                        if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                            semb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                        else:
                            semb.append(np.asarray(word_emb[v],dtype=float))
                        #print("semb 3")
                        
                        
                        for d in word_doc[v]:
                            emb.append(np.asarray(doc_emb[d],dtype=float))
                            if w in m_topic_wdocs:
                               
                                for b in m_topic_wdocs[w]: 
                                    if b==d:
                                        #for d in word_doc[v]:
                                        memb.append(doc_emb[d])
                                        if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                            msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                        else:
                                            msemb.append(np.asarray(word_emb[v],dtype=float))
                               
                            else:
                                #or d in word_doc[v]:
                                memb.append(doc_emb[d])
                                if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                    msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                else:
                                    msemb.append(np.asarray(word_emb[v],dtype=float))
                
        if len(emb)>1 and len(emb)<=300000000:
            #print(emb)
            #input()
            emb=np.mean(emb, axis=0)
        elif len(emb)>1 and len(emb)>300000000:
            emb=emb[:300000000]
            emb=np.mean(emb, axis=0)
            #print(emb)
            #print("emb after mean")
            #input()
        if len(semb)>1 and len(semb)<=300000000:

            #print(semb)
            #input()
            semb=np.mean(semb, axis=0)
        elif len(semb)>1 and len(semb)>300000000:
            semb=semb[:300000000]
            semb=np.mean(semb, axis=0)

            #print(semb)
            #print("semb after mean")
            #input()
        if len(memb)>1 and len(memb)<=300000000:

            memb=np.mean(memb, axis=0)

        elif len(memb)>1 and len(memb)>300000000:
            memb=memb[:300000000]
            memb=np.mean(memb, axis=0)

        if len(msemb)>1 and len(msemb)<=300000000:

            msemb=np.mean(msemb, axis=0)
        elif len(msemb)>1 and len(msemb)>300000000:
            msemb=msemb[:300000000]
            msemb=np.mean(msemb, axis=0)

        if np.asarray(emb).shape==(1,768):
            emb=np.asarray(emb).reshape(768,)
        if np.asarray(semb).shape==(1,768):
            semb=np.asarray(semb).reshape(768,)
        if np.asarray(memb).shape==(1,768):
            memb=np.asarray(memb).reshape(768,)
        if np.asarray(msemb).shape==(1,768):
            msemb=np.asarray(msemb).reshape(768,)
        if np.asarray(emb).shape==(768,) and np.asarray(semb).shape==(768,) and np.asarray(memb).shape==(768,) and np.asarray(msemb).shape==(768,):
            return [w, np.asarray(emb), np.asarray(semb), np.asarray(memb), np.asarray(msemb)]
        else:
            
            
            print("Word could not be resolved in BERT tokens")
            #print(w)
            should_restart = True
            z=w
            while should_restart:
                should_restart = False
                
                z = z[:-1]
                #print(np.asarray(emb).shape)
                #print(np.asarray(semb).shape)
                #print(np.asarray(msemb).shape)
                #print(np.asarray(memb).shape)
                #print(semb)
                #print(m_topic_wdocs[w])
                semb=list(semb)
                emb=list(emb)
                memb=list(memb)
                msemb=list(msemb)
                print(z)
                #input()
                for g,v in enumerate(word_doc.keys()):
                    if v.lower()==z:
                        if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                            semb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                        else:
                            semb.append(np.asarray(word_emb[v],dtype=float))
                        #print("semb 4")
                       
                        for d in word_doc[v]:
                            emb.append(np.asarray(doc_emb[d],dtype=float))
                            if w in m_topic_wdocs:
                               
                                for b in m_topic_wdocs[w]: 
                                    if b==d:
                                        #for d in word_doc[v]:
                                        memb.append(doc_emb[d])
                                        if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                            msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                        else:
                                            msemb.append(np.asarray(word_emb[v],dtype=float))
                                               
                            else:
                                #for d in word_doc[v]:
                                memb.append(doc_emb[d])
                                if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                    msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                else:
                                    msemb.append(np.asarray(word_emb[v],dtype=float))
                            
                else:
                    for g,v in enumerate(word_doc.keys()):
                      if v.lower().startswith(z):
                            if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                semb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                            else:
                                semb.append(np.asarray(word_emb[v],dtype=float))
                            #print("semb 5")
                      
                            for d in word_doc[v]:
                                emb.append(np.asarray(doc_emb[d],dtype=float))
                                if w in m_topic_wdocs:
                                   
                                    for b in m_topic_wdocs[w]: 
                                        if b==d:
                                            #for d in word_doc[v]:
                                            memb.append(doc_emb[d])
                                            if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                                msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                            else:
                                                msemb.append(np.asarray(word_emb[v],dtype=float))
                                   
                                    else:
                                        memb.append(doc_emb[d])
                                        if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                            msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                        else:
                                            msemb.append(np.asarray(word_emb[v],dtype=float))
                                else:
                                  
                                    memb.append(doc_emb[d])
                                    if np.asarray(word_emb[v],dtype=float).shape!=(768,):
                                        msemb.append(np.mean(np.asarray(word_emb[v],dtype=float), axis=0))
                                    else:
                                        msemb.append(np.asarray(word_emb[v],dtype=float))
                #print(memb)
                #print(len(memb))
                print(np.asarray(emb).shape)
                print(np.asarray(semb).shape)
                print(np.asarray(memb).shape)
                print(np.asarray(msemb).shape)
                #input()
                if len(emb)>1:
                    emb=np.mean(emb, axis=0)
                if len(semb)>1:
                    semb=np.mean(semb, axis=0)
                if len(memb)>1:
                    memb=np.mean(memb, axis=0)
                if len(msemb)>1:
                    msemb=np.mean(msemb, axis=0)
                if np.asarray(emb).shape==(1,768):
                    emb=np.asarray(emb).reshape(768,)
                if np.asarray(semb).shape==(1,768):
                    semb=np.asarray(semb).reshape(768,)
                if np.asarray(memb).shape==(1,768):
                    memb=np.asarray(memb).reshape(768,)
                if np.asarray(msemb).shape==(1,768):
                    msemb=np.asarray(msemb).reshape(768,)
                if  np.asarray(emb).shape==(768,) and np.asarray(semb).shape==(768,) and np.asarray(memb).shape==(768,) and np.asarray(msemb).shape==(768,):
                    return [w, np.asarray(emb), np.asarray(semb), np.asarray(memb), np.asarray(msemb)]
                else:
                        should_restart=True
    except Exception as e:
        print(e)
        print(traceback.print_exc()) 
 
    
 

def load_wmd_data(path):
    """Load data used in the WMD paper for baselines.
    """
    
    mat_data = loadmat(path, squeeze_me=True, chars_as_strings=True)

    try:
        y = mat_data['Y'].astype(np.int)
    except KeyError:
        y = np.concatenate((mat_data['ytr'].astype(np.int),
                            mat_data['yte'].astype(np.int)))
    try:
        embeddings_of_doc_words = mat_data['X']
    except KeyError:
        embeddings_of_doc_words = np.concatenate((mat_data['xtr'],
                                                  mat_data['xte']))
    try:
        doc_word_counts = mat_data['BOW_X']
    except KeyError:
        doc_word_counts = np.concatenate((mat_data['BOW_xtr'], mat_data['BOW_xte']))
    try:
        doc_words = mat_data['words']
    except KeyError:
        print(mat_data.keys())
        doc_words = np.concatenate((mat_data['words_tr'], mat_data['words_te']))
        
    vocab = []
    embed_vocab = {}
    for d_w, d_e in zip(doc_words, embeddings_of_doc_words):
        if type(d_w) == str:
            d_w = [d_w]
        words = [w for w in d_w if type(w) == str]
        if len(words) == 1:
            d_e = d_e.reshape((-1, 1))
        for i, w in enumerate(words):
            if w not in vocab:
                vocab.append(w)
                embed_vocab[w] = d_e[:, i]
            else:
                if not np.allclose(embed_vocab[w], d_e[:, i]):
                    print('Problem with embeddings')
                    break
    
    bow_data = np.zeros((len(doc_word_counts), len(vocab)),
                        dtype=np.int)
    for doc_idx, (d_w, d_c) in enumerate(zip(doc_words, doc_word_counts)):
        if type(d_w) == str:
            d_w = [d_w]
        words = [w for w in d_w if type(w) == str]
        if len(words) == 1:
            d_c = np.array([d_c])
        words_idx = np.array([vocab.index(w) for w in words])
        bow_data[doc_idx, words_idx] = d_c.astype(np.int)

    return vocab, embed_vocab, bow_data, y, doc_words

def load_20news():
    #20ng is not in kusner's dataset, we need to fetch it elsewhere
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import CountVectorizer
    newsgroups_train = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
    newsgroups_test = fetch_20newsgroups(subset='test', remove=('headers', 'footers', 'quotes'))
    
    tf_vectorizer = CountVectorizer(max_df=0.95, min_df=5, stop_words='english')
    X_train = tf_vectorizer.fit_transform(newsgroups_train.data)
    vocab = sorted(tf_vectorizer.vocabulary_.items(), key = lambda x: x[1])
    vocab = [v[0] for v in vocab]
    vocab_start = 1520 # removing weird words
    min_doc_length = 20 - 1 # removing short docs
    vocab = vocab[vocab_start:]
    X_train = X_train.toarray()[:,vocab_start:]
    y_train = newsgroups_train.target[X_train.sum(axis=1)>min_doc_length]
    X_train = X_train[X_train.sum(axis=1)>min_doc_length]
    doc_words=newsgroups_train.data + newsgroups_test.data
    X_test = tf_vectorizer.transform(newsgroups_test.data)
    X_test = X_test.toarray()[:,vocab_start:]
    y_test = newsgroups_test.target[X_test.sum(axis=1)>min_doc_length]
    X_test = X_test[X_test.sum(axis=1)>min_doc_length]
    
    X = np.vstack((X_train, X_test)).astype(np.int)
    y = np.concatenate((y_train, y_test))
    

    
    return vocab, X, y,  doc_words



def reduce_vocab(bow_data, vocab, embed_vocab, embed_aggregate='mean'):
    """Reduce vocabulary size by stemming and removing stop words.
    """
    vocab = np.array(vocab)
    short = np.array([len(w) > 2 for w in vocab])
    stop_words = set(stopwords.words('english'))
    stop = np.array([w not in stop_words for w in vocab])
    reduced_vocab = vocab[np.logical_and(short, stop)]
    reduced_bow_data = bow_data[:, np.logical_and(short, stop)]
    stemmer = SnowballStemmer("english")
    stemmed_dict = {}
    stemmed_idx_mapping = {}
    stemmed_vocab = []
    for i, w in enumerate(reduced_vocab):
        stem_w = stemmer.stem(w)
        if stem_w in stemmed_vocab:
            stemmed_dict[stem_w].append(w)
            stemmed_idx_mapping[stemmed_vocab.index(stem_w)].append(i)
        else:
            stemmed_dict[stem_w] = [w]
            stemmed_vocab.append(stem_w)
            stemmed_idx_mapping[stemmed_vocab.index(stem_w)] = [i]

    stemmed_bow_data = np.zeros((bow_data.shape[0], len(stemmed_vocab)),
                                dtype=np.int)
    for i in range(len(stemmed_vocab)):
        stemmed_bow_data[:, i] = reduced_bow_data[:, stemmed_idx_mapping[i]].sum(axis=1).flatten()

    word_counts = stemmed_bow_data.sum(axis=0)
    stemmed_reduced_vocab = np.array(stemmed_vocab)[word_counts > 2].tolist()
    stemmed_reduced_bow_data = stemmed_bow_data[:, word_counts > 2]

    stemmed_reduced_embed_vocab = {}
    for w in stemmed_reduced_vocab:
        old_w_embed = [embed_vocab[w_old] for w_old in stemmed_dict[w]]
        if embed_aggregate == 'mean':
            new_w_embed = np.mean(old_w_embed, axis=0)
        elif embed_aggregate == 'first':
            new_w_embed = old_w_embed[0]
        else:
            print('Unknown embedding aggregation')
            break
        stemmed_reduced_embed_vocab[w] = new_w_embed

    return (stemmed_reduced_vocab,
            stemmed_reduced_embed_vocab,
            stemmed_reduced_bow_data)


def get_embedded_data(bow_data, embed_vocab, vocab):
    """Map bag-of-words data to embedded representation."""
    M, V = bow_data.shape
    embed_data = [[] for _ in range(M)]
    for i in range(V):
        for d in range(M):
            if bow_data[d, i] > 0:
                for _ in range(bow_data[d, i]):
                    embed_data[d].append(embed_vocab[vocab[i]])
    return [np.array(embed_doc) for embed_doc in embed_data]


def change_embeddings(vocab, bow_data, embed_path):
    """Change embedding data if vocabulary has been reduced."""
    all_embed_vocab = {}
    with open(embed_path, 'r') as file:
        for line in file.readlines():
            word = line.split(' ')[0]
            embedding = [float(x) for x in line.split(' ')[1:]]
            all_embed_vocab[word] = embedding

    data_embed_vocab = {}
    new_vocab_idx = []
    new_vocab = []
    for i, w in enumerate(vocab):
        if w in all_embed_vocab:
            data_embed_vocab[w] = all_embed_vocab[w]
            new_vocab_idx.append(i)
            new_vocab.append(w)
    #print(new_vocab_idx[-1])
    #print(len(bow_data))
    
    bow_data = bow_data[:, new_vocab_idx]
    return new_vocab, data_embed_vocab, bow_data


def fit_topics(data, embeddings, vocab, K, tpath, ldapath):
    """Fit a topic model to bag-of-words data."""
    if os.path.exists(ldapath):
        model=pickle.load(open(ldapath, "rb"))
    else:
        model = lda.LDA(n_topics=K, n_iter=1500, random_state=1)
        model.fit(data)
        pickle.dump(model, open(ldapath, 'wb'))
    topics = model.topic_word_
    #lda_centers not used?
    #lda_centers = np.matmul(topics, embeddings)
    
    # todo add ITF support by multiplying ITF with topic dist
    
    print('LDA Gibbs topics')
    n_top_words = 20
    top_topic_words = {}
    for i, topic_dist in enumerate(topics):
        topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words + 1):-1]
        print('Topic {}: {}'.format(i, ' '.join(topic_words)))
        top_topic_words[i] = topic_words
    print('\n')
    
    topic_proportions = model.doc_topic_
    with open(tpath, "w", encoding="UTF-8") as f:
        for i in range(K):
            f.write(str(i) +"\t"+ re.sub("\n","",str(top_topic_words[i])))
            f.write("\n")

    return topics,  topic_proportions, top_topic_words




def loader(data_path,
           embeddings_path,
           contextual_embeddings,
           raw_data,
           p=1,
           K_lda=70,
           glove_embeddings=True,
           stemming=True,
           n_words_keep = 20, 
           train_test_dataset=False, by_sentence=False):
    """ Load dataset and embeddings from data path."""
    sys.setrecursionlimit(100000)
    print("reclimit")
    print(sys.getrecursionlimit())
    # Will segfault without this line.
    import resource
    resource.setrlimit(resource.RLIMIT_STACK, [0x10000000, resource.RLIM_INFINITY])
    sys.setrecursionlimit(0x100000)
    # Load dataset from data_path
    if data_path =="data/20news":
        vocab, bow_data, y, words  = load_20news()
    else:
        vocab, embed_vocab, bow_data, y, words = load_wmd_data(data_path)
    y = y - 1
    raw_docs= None
    context_emb=None
    # create contextual embeddings using the raw data as an input
    if contextual_embeddings == True:
        if train_test_dataset==False:
            handle= open("data/raw_data/%s_emb.pkl"%os.path.splitext(raw_data)[0],"rb") 
            c_embeddings=pickle.load(handle)
           
        #for raw datasets with train test split  
        '''
        else:
            x=np.load(str(os.path.splitext(raw_data)[0]+"_emb.npy"), allow_pickle=True)
            if os.path.exists(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy")):
                print("loading already merged dicts")
                c_embeddings=np.load(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy"),allow_pickle=True)

            elif "train" in raw_data:              
                if os.path.exists("".join(["test" if x=="train" else x for x in raw_data.partition("train")])):
                    if os.path.exists(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_emb.npy"):
                        z=np.load(str(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_emb.npy"), allow_pickle=True)
                        merged={}
                        print("merging dicts now")
                        last=[]
                        for i in x.item().keys():
                            merged[int(i)] = x.item().get(i)
                            last.append(i)
                        last=int(sorted(last)[-1])+1
                        # merging both contextual embedding files together
                        for i in z.item().keys():
                            merged[int(i+last)] = z.item().get(i)
                        np.save(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy"),merged)
                        c_embeddings=np.load(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy"),allow_pickle=True)

                    else:
                        print("does not exist, but will create now:")
                        print(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_emb.npy")
                        z_docs, z=map_raw("".join(["test" if x=="train" else x for x in raw_data.partition("train")]), by_sentence)
                        np.save(str(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_emb.npy"), z)
                        z=np.load(str(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_emb.npy"), allow_pickle=True)
                        with open(str(os.path.splitext("".join(["test" if x=="train" else x for x in raw_data.partition("train")]))[0]+"_proc.csv"),"w", encoding="utf-8") as f:
                            w=csv.writer(f)            
                            for key, val in z_docs.items():
                                w.writerow([key, val])
                        merged={}
                        last=[]
                        for i in x.item().keys():
                            merged[int(i)] = x.item().get(i)
                            last.append(i)
                        last=int(sorted(last)[-1])+1
                        # merging both contextual embedding files together
                        for i in z.item().keys():
                            merged[int(i+last)] = z.item().get(i)
                            #del z.item()[i]
                        np.save(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy"),merged)
                        c_embeddings=np.load(str(os.path.splitext("".join(["" if x=="train_" else x for x in raw_data.partition("train_")]))[0]+"_emb.npy"),allow_pickle=True)


            elif "test" in raw_data:              
                if os.path.exists("".join(["train" if x=="test" else x for x in raw_data.partition("test")])):
                    if os.path.exists(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_emb.npy"):
                        z=np.load(str(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_emb.npy"), allow_pickle=True)
                        merged={}
                        last=[]
                        for i in x.item().keys():
                            merged[int(i)] = x.item().get(i)
                            last.append(i)
                        last=int(sorted(last)[-1])+1
                        # merging both contextual embedding files together
                        for i in z.item().keys():
                            merged[int(i+last)] = z.item().get(i)
                            #del z.item()[i]
                        np.save(str(os.path.splitext("".join(["" if x=="test_" else x for x in raw_data.partition("test_")]))[0]+"_emb.npy"),merged)
                        c_embeddings=np.load(str(os.path.splitext("".join(["" if x=="test_" else x for x in raw_data.partition("test_")]))[0]+"_emb.npy"),allow_pickle=True)

                    else:
                        print("does not exist, but will create now:")
                        print(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_emb.npy")
                        z_docs, z=map_raw("".join(["train" if x=="test" else x for x in raw_data.partition("test")]), by_sentence)
                        np.save(str(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_emb.npy"), z)
                        z=np.load(str(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_emb.npy"), allow_pickle=True)
                        with open(str(os.path.splitext("".join(["train" if x=="test" else x for x in raw_data.partition("test")]))[0]+"_proc.csv"),"w", encoding="utf-8") as f:
                            w=csv.writer(f)            
                            for key, val in z_docs.items():
                                w.writerow([key, val])
                        merged={}
                        last=[]
                        for i in x.item().keys():
                            merged[int(i)] = x.item().get(i)
                            last.append(i)
                        last=int(sorted(last)[-1])+1
                        # merging both contextual embedding files together
                        for i in z.item().keys():
                            merged[int(i+last)] = z.item().get(i)
                            #del z.item()[i]
                        np.save(str(os.path.splitext("".join(["" if x=="test_" else x for x in raw_data.partition("test_")]))[0]+"_emb.npy"),merged)
                        c_embeddings=np.load(str(os.path.splitext("".join(["" if x=="test_" else x for x in raw_data.partition("test_")]))[0]+"_emb.npy"),allow_pickle=True)

            else:
                raise("wrong flag selected, no file with train or test prefix")
            # overwrite document length if we merge raw train and test'''
      
    # Use GLOVE word embeddings
    if glove_embeddings:
        vocab, embed_vocab, bow_data = change_embeddings(
            vocab, bow_data, embeddings_path)
    # Reduce vocabulary by removing short words, stop words, and stemming
    if stemming:
        vocab, embed_vocab, bow_data = reduce_vocab(
            bow_data, vocab, embed_vocab, embed_aggregate='mean')

    # Matrix of word embeddings
    embeddings = np.array([embed_vocab[w] for w in vocab])
    '''print(embeddings.shape)
    print(embeddings[0].shape)
    print(type(embeddings))
    print(type(embeddings[0]))
    print(embeddings[0])
    print(depth(embeddings[0]))
    input()'''
    ldapath=str(os.path.splitext(raw_data)[0]+"_lda.pkl")
    tpath=str(os.path.splitext(raw_data)[0]+"_topics.txt")
    topics, topic_proportions, top_topic_words = fit_topics(
        bow_data, embeddings, vocab, K_lda, tpath, ldapath)
 
    cost_embeddings = euclidean_distances(embeddings, embeddings) ** p
    
    cost_topics = np.zeros((topics.shape[0], topics.shape[0]))
    
    
    ## Reduce topics to top-20 words
    if n_words_keep is not None:                                                                                                                                                                                                            
        for k in range(K_lda):
            to_0_idx = np.argsort(-topics[k])[n_words_keep:]
            topics[k][to_0_idx] = 0
    
    # for S-HOTTER
    cost_scontextual = np.zeros((topics.shape[0], topics.shape[0]))        
    # for A-HOTTER
    cost_contextual = np.zeros((topics.shape[0], topics.shape[0]))
    # for M-HOTTER
    cost_mcontextual = np.zeros((topics.shape[0], topics.shape[0]))
    # for MS-HOTTER
    cost_msincontextual = np.zeros((topics.shape[0], topics.shape[0]))
   
  
    for i in range(cost_topics.shape[0]):
        for j in range(i + 1, cost_topics.shape[1]):
            cost_topics[i, j] = sparse_ot(topics[i], topics[j], cost_embeddings)
    cost_topics = cost_topics + cost_topics.T
    
    # for M-HOTTER
    # getting for each topic a set of top documents
  

   
    '''Mapping the contextual embeddings from the BERT Tokenizer 
    to the tokens used by the topic model '''
    if not os.path.exists(str(os.path.splitext(raw_data)[0]+"_con_emb.npy")): 
        print("vocab list length")
        print(len(vocab))
        todo=len(vocab)
        stemmer = SnowballStemmer("english")
        # since the results come in a strange format we have to put it in a dict
        all_embeddings={}
        single_embeddings={}
        m_embeddings={}
        ms_embeddings={}
        #results=[pool.apply_async(map_to_emb, (o,w, c_embeddings, m_topic_words)) for o, w in enumerate(vocab)]
        #output = [p.get() for p in results]
        try:
            x= pre_aggregate(c_embeddings)
            print("now continuing with aggregation")
            c_embeddings=[]
            word_emb = x[0] 
            word_doc= x[1]  
            
            doc_word= x[2]  
            doc_emb= x[3] 
            
            
            
            ''''process for m hotter and t hotter (later in the code t hotter is called ms hotter)'''
            top_documents=defaultdict(list)
            for i, doc in enumerate(topic_proportions):
                prominent_topic=np.max(doc)
                prominent_index=np.argmax(doc)
                top_documents[prominent_index].append([i, prominent_topic])
            second_tier=[]
            # checking if all topics have at least one document that has that topic
            # as a promoninent topic
            for i in range(K_lda):
                try:
                    print(i)
                    print(top_documents[i])
                except KeyError as E:
                    print(E)
                    print(traceback.print_exc())
                    second_tier.append(i)
            # if there is one topic that does not have a document assigned, check all
            # documents for the second most probable topic
            if second_tier:
                #print("found second tier docs")
                #print(second_tier)
            for i, doc in enumerate(topic_proportions):
                    print(doc)
                    prominent_topic=np.sort(doc)[-2]
                    prominent_index=np.argsort(doc)[-2]
                    if prominent_index in second_tier:
                        top_documents[prominent_index].append([i, prominent_topic])
            third_tier=[]
            for i in range(K_lda):
                try:
                    print(top_documents[i])
                except KeyError as E:
                    print(E)
                    third_tier.append(i)
            # if there is one topic that does not have a document assigned, check all
            # documents for the third most probable topic
            if third_tier:
                #print("found third tier docs")
                #print(third_tier)
                #input()
            for i, doc in enumerate(topic_proportions):
                    #print(doc)
                    prominent_topic=np.sort(doc)[-3]
                    prominent_index=np.argsort(doc)[-3]
                    if prominent_index in third_tier:
                        top_documents[prominent_index].append([i, prominent_topic])
            # continue with that as needed...
            '''fourth_tier=[]
            for i in range(0,69):
                try:
                    print(top_documents[i])
                except KeyError as E:
                    print(E)
                    fourth_tier.append(i)
            if fourth_tier:
                print("found fourth tier docs")
                print(fourth_tier)
                input()'''
            m_topic_words=defaultdict(list)
            
            stemmed_word_doc=defaultdict(list)
            for w,d in word_doc.items():
                stemmed_word_doc[stemmer.stem(w)].append(d)
            
            #retrieve documents for the the top words to aggregate for M-HOTTER
            
            
            for t in range(K_lda):
                try:
                    #print(topics[t])
                    doc_list=sorted(top_documents[t], key=lambda x: x[1])
                    #print(doc_list)
                    for w in top_topic_words[t]:
                        if w in stemmed_word_doc.keys():
                            for d in stemmed_word_doc[w]:
                                if d in doc_list:
                                    m_topic_words[w].append(d)
                                
                except Exception as E:
                            print(traceback.print_exc())
                            print(E)
                            print(w)
                            #input()
            np.save("%s_topwords"%raw_data, m_topic_words)    
            
            
            
            x=None
          
          
            for o,w in enumerate(vocab):
                print(str(o)+"/"+str(todo))
                x=map_to_emb(o,w, word_emb, word_doc, doc_word, doc_emb,m_topic_words )
                # x[0] is a word
                # x[1] is the (possibly aggregated) contextual embedding for it
            
                if isinstance(x, list) or isinstance(x, tuple):
                   all_embeddings[x[0]]=x[1]
                   single_embeddings[x[0]]=x[2]
                   m_embeddings[x[0]]=x[3]
                   ms_embeddings[x[0]]=x[4]
        except Exception as e:
            print(e)
            traceback.print_exc()
        #pool.close()
        #pool.join()
        #pool.terminate()
        
  
        #saving this as a numpy array offers compression    
        np.save(str(os.path.splitext(raw_data)[0]+"_con_emb.npy"), all_embeddings)
        np.save(str(os.path.splitext(raw_data)[0]+"_s_emb.npy"), single_embeddings)
        np.save(str(os.path.splitext(raw_data)[0]+"_m_emb.npy"), m_embeddings)
        np.save(str(os.path.splitext(raw_data)[0]+"_ms_emb.npy"), ms_embeddings)
        # we have to load it back because numpy arrays 
        #are accessed differently than dictionaries
        single_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_s_emb.npy"), allow_pickle=True)
        m_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_m_emb.npy"), allow_pickle=True)
        ms_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_ms_emb.npy"), allow_pickle=True)
        all_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_con_emb.npy"), allow_pickle=True)
    else:
        
        single_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_s_emb.npy"), allow_pickle=True)
        m_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_m_emb.npy"), allow_pickle=True)
        ms_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_ms_emb.npy"), allow_pickle=True)
        all_embeddings=np.load(str(os.path.splitext(raw_data)[0]+"_con_emb.npy"), allow_pickle=True)
    cms_embeddings=np.array([np.array(ms_embeddings.item().get(w)) for w in vocab])
    cm_embeddings=np.array([np.array(m_embeddings.item().get(w)) for w in vocab])
    s_embeddings=np.array([np.array(single_embeddings.item().get(w)) for w in vocab])
    con_embeddings=np.array([np.array(all_embeddings.item().get(w)) for w in vocab])
    # making sure that everything has the same shape
    for i, w in enumerate(con_embeddings):
        #print("shape")
        #print(w.shape)
        if w.shape!=(768,):
            try:
                con_embeddings[i]=w.reshape(768,)
                #print(con_embeddings[i].shape) 
            except ValueError:
                con_embeddings[i]=np.mean(np.asarray(w), axis=0)
                print(con_embeddings[i].shape)
    con_embeddings=np.stack(con_embeddings)
    for i, w in enumerate(s_embeddings):
        if w.shape!=(768,):
            try:
                s_embeddings[i]=w.reshape(768,)
                #print(s_embeddings[i].shape) 
            except ValueError:
                s_embeddings[i]=np.mean(np.asarray(w), axis=0)
                print(s_embeddings[i].shape)
    s_embeddings=np.stack(s_embeddings)
    for i, w in enumerate(cms_embeddings):
        if w.shape!=(768,):
            try:
                cms_embeddings[i]=w.reshape(768,)
                #print(cms_embeddings[i].shape) 
            except ValueError:
                cms_embeddings[i]=np.mean(np.asarray(w), axis=0)
                print(cms_embeddings[i].shape)
    cms_embeddings=np.stack(cms_embeddings)
    for i, w in enumerate(cm_embeddings):
        if w.shape!=(768,):
            try:
                cm_embeddings[i]=w.reshape(768,)
                #print(cm_embeddings[i].shape) 
            except ValueError:
                cm_embeddings[i]=np.mean(np.asarray(w), axis=0)
                print(cm_embeddings[i].shape)
    cm_embeddings=np.stack(cm_embeddings)
    print(len(topics))            
    print(len(vocab))
    print(len(embeddings))
    print(len(con_embeddings))
    print(len(cm_embeddings))
    print(len(cms_embeddings))
    print(len(s_embeddings))
   
   
        
            
    # A-HOTTER        
    cost_c_embeddings = euclidean_distances(con_embeddings, con_embeddings) ** p
    # M-HOTTER
    cost_cm_embeddings = euclidean_distances(cm_embeddings, cm_embeddings) ** p
    # MS-HOTTER
    cost_cms_embeddings = euclidean_distances(cms_embeddings, cms_embeddings) ** p
    # S-HOTTER
    cost_s_embeddings = euclidean_distances(s_embeddings, s_embeddings) ** p
    
    
    #using the document topic proportions to obtain representative contextual embeddings per topic
    for i in range(cost_contextual.shape[0]):
        for j in range(i + 1, cost_contextual.shape[1]):
            cost_contextual[i, j] = sparse_ot(topics[i], topics[j], cost_c_embeddings)
            cost_mcontextual[i, j] = sparse_ot(topics[i], topics[j], cost_cm_embeddings)
            cost_scontextual[i, j] = sparse_ot(topics[i], topics[j], cost_s_embeddings)
            cost_msincontextual[i, j] = sparse_ot(topics[i], topics[j], cost_cms_embeddings)

    cost_contextual = cost_contextual + cost_contextual.T
    cost_mcontextual = cost_mcontextual + cost_mcontextual.T
    cost_scontextual = cost_scontextual + cost_scontextual.T
    cost_msincontextual = cost_msincontextual + cost_msincontextual.T
    
    out = {'X': bow_data, 'y': y,
           'embeddings': embeddings,
           'topics': topics, 'proportions': topic_proportions,
           'cost_E': cost_embeddings, 'cost_T': cost_topics, 'raw_docs':raw_docs, 'contextual_emb':context_emb, 'cost_CT':cost_contextual, 'cost_MCT':cost_mcontextual, 'cost_MSCT':cost_msincontextual, 'cost_SCT':cost_scontextual}

    return out


## 🔥 5. HOTT(ER) Distance +  Baseline Calculation

In [6]:

# p=1 is for W1 metric, p=2 for W2
p = 1
data = loader(data_path + data_name, embeddings_path, contextual_embeddings=True, raw_data=raw_data_name, p=p,K_lda=70,glove_embeddings=True, stemming=True,n_words_keep=20, train_test_dataset=train_test_dataset, by_sentence=by_sentence)
cost_E = data['cost_E']
#print(np.argwhere(np.isnan(cost_E)))
#input()
cost_T = data['cost_T']

bow_data, y = data['X'], data['y']
topic_proportions = data['proportions']

seed = 0
bow_train, bow_test, topic_train, topic_test, y_train, y_test = train_test_split(bow_data, topic_proportions, y, random_state=seed)

# Pick a method among RWMD, WMD, WMD-T20, HOTT, HOFTT, A-HOTTER, S-HOTTER and M-HOTTER

methods = {'S-HOTTER': hott.hott,
           'A-HOTTER': hott.hott,
           'M-HOTTER': hott.hott,
           'T-HOTTER': hott.hott,
           'HOTT': hott.hott,
           'HOFTT': hott.hoftt,
           'RWMD': distances.rwmd,
           'WMD-T20': lambda p, q, C: distances.wmd(p, q, C, truncate=20),
           'WMD': distances.wmd}
    
for method in methods.keys():
    
    t_s = time.time()
    # Get train/test data representation and transport cost
    if method in ['HOTT', 'HOFTT']:
        # If method is HOTT or HOFTT train LDA and compute topic-topic transport cost
        X_train, X_test = topic_train, topic_test
        C = data['cost_T']
    elif method in ['A-HOTTER']:
        # If method is HOTTER do the same as for HOTT except for taking contextual costs from averaged tier 1 documents
        X_train, X_test = topic_train, topic_test
        C = data['cost_CT']
    elif method in ['S-HOTTER']:
        # If method is HOTTER do the same as for HOTT except for taking contextual costs from averaged tier 1 documents
        X_train, X_test = topic_train, topic_test
        C = data['cost_SCT']
    elif method in ['M-HOTTER']:
        # If method is HOTTER do the same as for HOTT except for taking contextual costs from the maximum probability tier 1 doc
        X_train, X_test = topic_train, topic_test
        C = data['cost_MCT']
    elif method in ['T-HOTTER']:
        # If method is HOTTER do the same as for HOTT except for taking contextual costs from the maximum probability tier 1 doc
        X_train, X_test = topic_train, topic_test
        C = data['cost_MSCT']
    else:
        # Normalize BOW and compute word-word transport cost
        X_train, X_test = normalize(bow_train, 'l1'), normalize(bow_test, 'l1')
        C = data['cost_E']
    
    # Compute test error
    test_error = knn(X_train, X_test, y_train, y_test, methods[method], C, n_neighbors=7)
    print(method + ' test error is %f; took %.2f seconds' % (test_error, time.time()-t_s))

# Done!

reclimit
100000
10102
737


INFO:lda:n_documents: 737
INFO:lda:vocab_size: 3657
INFO:lda:n_words: 102776
INFO:lda:n_topics: 70
INFO:lda:n_iter: 1500
INFO:lda:<0> log likelihood: -1432582
INFO:lda:<10> log likelihood: -850644
INFO:lda:<20> log likelihood: -814569
INFO:lda:<30> log likelihood: -800267
INFO:lda:<40> log likelihood: -794318
INFO:lda:<50> log likelihood: -788032
INFO:lda:<60> log likelihood: -785772
INFO:lda:<70> log likelihood: -783560
INFO:lda:<80> log likelihood: -781132
INFO:lda:<90> log likelihood: -780450
INFO:lda:<100> log likelihood: -779637
INFO:lda:<110> log likelihood: -777405
INFO:lda:<120> log likelihood: -777171
INFO:lda:<130> log likelihood: -775959
INFO:lda:<140> log likelihood: -775654
INFO:lda:<150> log likelihood: -775088
INFO:lda:<160> log likelihood: -774323
INFO:lda:<170> log likelihood: -773964
INFO:lda:<180> log likelihood: -774161
INFO:lda:<190> log likelihood: -773395
INFO:lda:<200> log likelihood: -772434
INFO:lda:<210> log likelihood: -771497
INFO:lda:<220> log likelihood: 

LDA Gibbs topics
Topic 0: test day wicket bowl bat open one run score seri batsman inning michael centuri left team man set skipper start
Topic 1: year roddick cup beat spain play win davi doubl singl andi carlo open clay surfac nadal victori guy day court
Topic 2: goal score win half match top place premiership leagu great charlton paid conced cup back champion earn striker one spot
Topic 3: day make time lot posit thing peopl come made part put plan decis want happen point happi respect month ask
Topic 4: cont jone perform take enhanc steroid alleg white posit agenc charg investig anti world dope televis wednesday marion victor american
Topic 5: year time season old ad join good return appear made left play four talent moor first kent injur choic domest
Topic 6: olymp athlet holm britain kelli athen birmingham gold lewi garden franci doubl relay medal green compet british event campbel track
Topic 7: match world tsunami player rais hit chris said star fund earli asian game contribut 

vocab list length
3657
0\737
1\737
2\737
3\737
4\737
5\737
6\737
7\737
8\737
9\737
10\737
11\737
12\737
13\737
14\737
15\737
16\737
17\737
18\737
19\737
20\737
21\737
22\737
23\737
24\737
25\737
26\737
27\737
28\737
29\737
30\737
31\737
32\737
33\737
34\737
35\737
36\737
37\737
38\737
39\737
40\737
41\737
42\737
43\737
44\737
45\737
46\737
47\737
48\737
49\737
50\737
51\737
52\737
53\737
54\737
55\737
56\737
57\737
58\737
59\737
60\737
61\737
62\737
63\737
64\737
65\737
66\737
67\737
68\737
69\737
70\737
71\737
72\737
73\737
74\737
75\737
76\737
77\737
78\737
79\737
80\737
81\737
82\737
83\737
84\737
85\737
86\737
87\737
88\737
89\737
90\737
91\737
92\737
93\737
94\737
95\737
96\737
97\737
98\737
99\737
100\737
101\737
102\737
103\737
104\737
105\737
106\737
107\737
108\737
109\737
110\737
111\737
112\737
113\737
114\737
115\737
116\737
117\737
118\737
119\737
120\737
121\737
122\737
123\737
124\737
125\737
126\737
127\737
128\737
129\737
130\737
131\737
132\737
133\737
134\737
135\737

[0.00097087 0.00097087 0.00097087 0.03980583 0.00097087 0.00097087
 0.00097087 0.00097087 0.01067961 0.00097087 0.00097087 0.00097087
 0.00097087 0.00097087 0.00097087 0.0592233  0.02038835 0.00097087
 0.00097087 0.01067961 0.01067961 0.01067961 0.1368932  0.00097087
 0.00097087 0.00097087 0.00097087 0.00097087 0.00097087 0.00097087
 0.00097087 0.00097087 0.00097087 0.09805825 0.00097087 0.00097087
 0.01067961 0.03980583 0.00097087 0.23398058 0.02038835 0.00097087
 0.01067961 0.01067961 0.02038835 0.00097087 0.00097087 0.00097087
 0.03980583 0.00097087 0.00097087 0.00097087 0.03009709 0.00097087
 0.00097087 0.00097087 0.00097087 0.00097087 0.00097087 0.00097087
 0.00097087 0.00097087 0.00097087 0.03980583 0.00097087 0.03980583
 0.01067961 0.04951456 0.00097087 0.00097087]
[0.00071942 0.01510791 0.01510791 0.00071942 0.00071942 0.00071942
 0.00071942 0.00071942 0.00071942 0.00071942 0.07985612 0.00071942
 0.00071942 0.00071942 0.00071942 0.00071942 0.02230216 0.00071942
 0.01510791 0.02

[0.00065789 0.08618421 0.00065789 0.08618421 0.00065789 0.00723684
 0.00065789 0.00065789 0.1125     0.00065789 0.00065789 0.04671053
 0.00065789 0.00065789 0.17828947 0.00065789 0.00065789 0.00065789
 0.00065789 0.00065789 0.00065789 0.07960526 0.00065789 0.00065789
 0.00065789 0.00065789 0.00065789 0.00065789 0.00065789 0.00065789
 0.09276316 0.00065789 0.00065789 0.00065789 0.00065789 0.00723684
 0.00065789 0.00065789 0.00723684 0.00065789 0.00065789 0.00065789
 0.00065789 0.00723684 0.00065789 0.00065789 0.00065789 0.00065789
 0.00065789 0.12565789 0.06644737 0.00065789 0.00065789 0.00065789
 0.00065789 0.00065789 0.00065789 0.01381579 0.00065789 0.00065789
 0.00065789 0.01381579 0.00065789 0.00723684 0.00065789 0.00065789
 0.00065789 0.00065789 0.02697368 0.00065789]
[0.00982143 0.00089286 0.00089286 0.00089286 0.00089286 0.00089286
 0.00089286 0.00089286 0.00982143 0.00089286 0.00089286 0.30446429
 0.03660714 0.00089286 0.37589286 0.00089286 0.00089286 0.00089286
 0.00089286 0.00

1/3657
major
13533
2/3657
medal
13533
3/3657
british
13533
4/3657
hurdler
13533
5/3657
sarah
13533
6/3657
confid
13533
7/3657
win
13533
8/3657
month
13533
9/3657
european
13533
10/3657
indoor
13533
11/3657
championship
13533
12/3657
madrid
13533
13/3657
year
13533
14/3657
old
13533
15/3657
smash
13533
16/3657
record
13533
17/3657
hurdl
13533
18/3657
season
13533
19/3657
set
13533
20/3657
mark
13533
21/3657
second
13533
22/3657
titl
13533
23/3657
race
13533
24/3657
come
13533
25/3657
long
13533
26/3657
train
13533
27/3657
chanc
13533
28/3657
nation
13533
29/3657
past
13533
30/3657
struggl
13533
31/3657
translat
13533
32/3657
domest
13533
33/3657
success
13533
34/3657
intern
13533
35/3657
stage
13533
36/3657
scotland
13533
37/3657
born
13533
38/3657
athlet
13533
39/3657
own
13533
40/3657
equal
13533
41/3657
fifth
13533
42/3657
fastest
13533
43/3657
time
13533
44/3657
world
13533
45/3657
week
13533
46/3657
birmingham
13533
47/3657
grand
13533
48/3657
prix
13533
49/3657
left
13533
50/3657


389/3657
qualifi
13533
390/3657
high
13533
391/3657
class
13533
392/3657
belgium
13533
393/3657
kim
13533
394/3657
low
13533
395/3657
key
13533
396/3657
threw
13533
397/3657
shot
13533
398/3657
collin
13533
399/3657
compet
13533
400/3657
februari
13533
401/3657
join
13533
402/3657
sydney
13533
403/3657
silver
13533
404/3657
forward
13533
405/3657
strong
13533
406/3657
recept
13533
407/3657
excit
13533
408/3657
venu
13533
409/3657
shape
13533
410/3657
front
13533
411/3657
support
13533
412/3657
sprinter
13533
413/3657
even
13533
414/3657
big
13533
415/3657
defend
13533
416/3657
answer
13533
417/3657
paula
13533
418/3657
grant
13533
419/3657
extra
13533
420/3657
concern
13533
421/3657
question
13533
422/3657
huge
13533
423/3657
asset
13533
424/3657
hyde
13533
425/3657
peter
13533
426/3657
work
13533
427/3657
accommod
13533
428/3657
compromis
13533
429/3657
decis
13533
430/3657
tuesday
13533
431/3657
deadlin
13533
432/3657
hayley
13533
433/3657
yell
13533
434/3657
opt
13533
435/3657
fanta

768/3657
right
13533
769/3657
fact
13533
770/3657
edinburgh
13533
771/3657
initi
13533
772/3657
unsur
13533
773/3657
declar
13533
774/3657
add
13533
775/3657
strength
13533
776/3657
depth
13533
777/3657
newcastl
13533
778/3657
endur
13533
779/3657
north
13533
780/3657
border
13533
781/3657
scot
13533
782/3657
kathi
13533
783/3657
butler
13533
784/3657
target
13533
785/3657
rare
13533
786/3657
otherwis
13533
787/3657
point
13533
788/3657
go
13533
789/3657
lifetim
13533
790/3657
clinch
13533
791/3657
hungri
13533
792/3657
assess
13533
793/3657
difficult
13533
794/3657
prior
13533
795/3657
special
13533
796/3657
definit
13533
797/3657
possibl
13533
798/3657
michell
13533
799/3657
lodg
13533
800/3657
eight
13533
801/3657
cas
13533
802/3657
receiv
13533
803/3657
connect
13533
804/3657
scandal
13533
805/3657
admiss
13533
806/3657
use
13533
807/3657
agenc
13533
808/3657
pattern
13533
809/3657
observ
13533
810/3657
blood
13533
811/3657
investig
13533
812/3657
strip
13533
813/3657
san
13533
814

1142/3657
fee
13533
1143/3657
esteem
13533
1144/3657
stay
13533
1145/3657
control
13533
1146/3657
ignor
13533
1147/3657
olympian
13533
1148/3657
els
13533
1149/3657
agenda
13533
1150/3657
expos
13533
1151/3657
think
13533
1152/3657
enter
13533
1153/3657
tick
13533
1154/3657
simpl
13533
1155/3657
figur
13533
1156/3657
sort
13533
1157/3657
island
13533
1158/3657
central
13533
1159/3657
park
13533
1160/3657
fast
13533
1161/3657
suit
13533
1162/3657
better
13533
1163/3657
reserv
13533
1164/3657
unknown
13533
1165/3657
suggest
13533
1166/3657
health
13533
1167/3657
stress
13533
1168/3657
primari
13533
1169/3657
trip
13533
1170/3657
conced
13533
1171/3657
advantag
13533
1172/3657
one
13533
1173/3657
massiv
13533
1174/3657
opportun
13533
1175/3657
say
13533
1176/3657
drive
13533
1177/3657
hay
13533
1178/3657
owen
13533
1179/3657
jess
13533
1180/3657
usa
13533
1181/3657
establish
13533
1182/3657
outstand
13533
1183/3657
femal
13533
1184/3657
know
13533
1185/3657
sens
13533
1186/3657
worth
1353

1497/3657
vy
13533
1498/3657
turner
13533
1499/3657
comfort
13533
1500/3657
lay
13533
1501/3657
welsh
13533
1502/3657
attack
13533
1503/3657
thie
13533
1504/3657
eas
13533
1505/3657
showdown
13533
1506/3657
deacon
13533
1507/3657
stole
13533
1508/3657
thunder
13533
1509/3657
fraser
13533
1510/3657
veteran
13533
1511/3657
carl
13533
1512/3657
sale
13533
1513/3657
robert
13533
1514/3657
mitchel
13533
1515/3657
temper
13533
1516/3657
bar
13533
1517/3657
ashley
13533
1518/3657
tremend
13533
1519/3657
assur
13533
1520/3657
occupi
13533
1521/3657
nine
13533
1522/3657
john
13533
1523/3657
tough
13533
1524/3657
pollock
13533
1525/3657
urg
13533
1526/3657
undecid
13533
1527/3657
everyon
13533
1528/3657
fell
13533
1529/3657
manner
13533
1530/3657
act
13533
1531/3657
follow
13533
1532/3657
save
13533
1533/3657
streak
13533
1534/3657
entri
13533
1535/3657
ricki
13533
1536/3657
swing
13533
1537/3657
cold
13533
1538/3657
fortnight
13533
1539/3657
corner
13533
1540/3657
anniversari
13533
1541/3657
ev

1856/3657
lesson
13533
1857/3657
arm
13533
1858/3657
spinner
13533
1859/3657
batsmen
13533
1860/3657
daryl
13533
1861/3657
ball
13533
1862/3657
wicket
13533
1863/3657
rope
13533
1864/3657
marshal
13533
1865/3657
stride
13533
1866/3657
kiwi
13533
1867/3657
tempo
13533
1868/3657
centuri
13533
1869/3657
kyle
13533
1870/3657
mill
13533
1871/3657
glenn
13533
1872/3657
bat
13533
1873/3657
danger
13533
1874/3657
halt
13533
1875/3657
declin
13533
1876/3657
fifti
13533
1877/3657
limit
13533
1878/3657
squar
13533
1879/3657
hogg
13533
1880/3657
lunch
13533
1881/3657
inning
13533
1882/3657
blast
13533
1883/3657
deliveri
13533
1884/3657
replac
13533
1885/3657
hall
13533
1886/3657
bowl
13533
1887/3657
duck
13533
1888/3657
ali
13533
1889/3657
pad
13533
1890/3657
boundari
13533
1891/3657
nel
13533
1892/3657
reveng
13533
1893/3657
steer
13533
1894/3657
vacant
13533
1895/3657
midwicket
13533
1896/3657
hampshir
13533
1897/3657
batsman
13533
1898/3657
scrambl
13533
1899/3657
spell
13533
1900/3657
partners

2219/3657
tail
13533
2220/3657
banish
13533
2221/3657
langer
13533
2222/3657
sehwag
13533
2223/3657
percent
13533
2224/3657
hundr
13533
2225/3657
karthik
13533
2226/3657
deplet
13533
2227/3657
weak
13533
2228/3657
regul
13533
2229/3657
straighten
13533
2230/3657
icc
13533
2231/3657
doosra
13533
2232/3657
within
13533
2233/3657
toler
13533
2234/3657
refere
13533
2235/3657
supervis
13533
2236/3657
biomechan
13533
2237/3657
expert
13533
2238/3657
bruce
13533
2239/3657
elliott
13533
2240/3657
circul
13533
2241/3657
camera
13533
2242/3657
comparison
13533
2243/3657
video
13533
2244/3657
rotat
13533
2245/3657
techniqu
13533
2246/3657
arbit
13533
2247/3657
amid
13533
2248/3657
guard
13533
2249/3657
wicketkeep
13533
2250/3657
crush
13533
2251/3657
apiec
13533
2252/3657
bundl
13533
2253/3657
purchas
13533
2254/3657
surfac
13533
2255/3657
elton
13533
2256/3657
drew
13533
2257/3657
hodg
13533
2258/3657
prolif
13533
2259/3657
brad
13533
2260/3657
shane
13533
2261/3657
watson
13533
2262/3657
victor

2579/3657
convert
13533
2580/3657
penetr
13533
2581/3657
rash
13533
2582/3657
awkward
13533
2583/3657
dispos
13533
2584/3657
rattl
13533
2585/3657
drove
13533
2586/3657
immacul
13533
2587/3657
cash
13533
2588/3657
crisp
13533
2589/3657
reviv
13533
2590/3657
mini
13533
2591/3657
dawn
13533
2592/3657
fuller
13533
2593/3657
former
13533
2594/3657
midway
13533
2595/3657
visitor
13533
2596/3657
demis
13533
2597/3657
unthink
13533
2598/3657
sluggish
13533
2599/3657
pop
13533
2600/3657
industri
13533
2601/3657
snatch
13533
2602/3657
deficit
13533
2603/3657
seem
13533
2604/3657
belong
13533
2605/3657
could
13533
2606/3657
shine
13533
2607/3657
dhoni
13533
2608/3657
sharma
13533
2609/3657
near
13533
2610/3657
endors
13533
2611/3657
ligament
13533
2612/3657
fire
13533
2613/3657
destin
13533
2614/3657
legsid
13533
2615/3657
volum
13533
2616/3657
neat
13533
2617/3657
casualti
13533
2618/3657
aggress
13533
2619/3657
undo
13533
2620/3657
yard
13533
2621/3657
rap
13533
2622/3657
bang
13533
2623/3657


2938/3657
floyd
13533
2939/3657
blush
13533
2940/3657
palm
13533
2941/3657
offsid
13533
2942/3657
sink
13533
2943/3657
pen
13533
2944/3657
fish
13533
2945/3657
dunde
13533
2946/3657
aberdeen
13533
2947/3657
richi
13533
2948/3657
stevi
13533
2949/3657
crawford
13533
2950/3657
bullock
13533
2951/3657
robson
13533
2952/3657
tap
13533
2953/3657
onrush
13533
2954/3657
lob
13533
2955/3657
samuel
13533
2956/3657
kerr
13533
2957/3657
kenneth
13533
2958/3657
hart
13533
2959/3657
diamond
13533
2960/3657
clyde
13533
2961/3657
celtic
13533
2962/3657
brush
13533
2963/3657
slid
13533
2964/3657
bellami
13533
2965/3657
blown
13533
2966/3657
sheridan
13533
2967/3657
scenario
13533
2968/3657
flare
13533
2969/3657
intervent
13533
2970/3657
unmark
13533
2971/3657
culmin
13533
2972/3657
burn
13533
2973/3657
gibson
13533
2974/3657
lennon
13533
2975/3657
livingston
13533
2976/3657
simmon
13533
2977/3657
lifelin
13533
2978/3657
eric
13533
2979/3657
dair
13533
2980/3657
gordon
13533
2981/3657
webster
13533
298

3275/3657
manuel
13533
3276/3657
son
13533
3277/3657
mexican
13533
3278/3657
shark
13533
3279/3657
rubber
13533
3280/3657
stamp
13533
3281/3657
wood
13533
3282/3657
millwal
13533
3283/3657
mutu
13533
3284/3657
cocain
13533
3285/3657
argentin
13533
3286/3657
edu
13533
3287/3657
destroy
13533
3288/3657
mirror
13533
3289/3657
regret
13533
3290/3657
privat
13533
3291/3657
artifici
13533
3292/3657
swindon
13533
3293/3657
wigan
13533
3294/3657
milton
13533
3295/3657
anelka
13533
3296/3657
haunt
13533
3297/3657
rift
13533
3298/3657
bill
13533
3299/3657
mexico
13533
3300/3657
strachan
13533
3301/3657
ice
13533
3302/3657
cake
13533
3303/3657
laugh
13533
3304/3657
tevez
13533
3305/3657
boca
13533
3306/3657
argentina
13533
3307/3657
brightest
13533
3308/3657
bueno
13533
3309/3657
passion
13533
3310/3657
copa
13533
3311/3657
feint
13533
3312/3657
shoot
13533
3313/3657
miser
13533
3314/3657
hardest
13533
3315/3657
ghana
13533
3316/3657
off
13533
3317/3657
blight
13533
3318/3657
cartilag
13533
3319/

3637/3657
volvo
13533
3638/3657
hardcourt
13533
3639/3657
chela
13533
3640/3657
lisa
13533
3641/3657
ephedrin
13533
3642/3657
exhibit
13533
3643/3657
medicin
13533
3644/3657
bailey
13533
3645/3657
lawn
13533
3646/3657
memphi
13533
3647/3657
contamin
13533
3648/3657
gaston
13533
3649/3657
manga
13533
3650/3657
unassail
13533
3651/3657
sharapova
13533
3652/3657
houston
13533
3653/3657
hip
13533
3654/3657
mario
13533
3655/3657
crosscourt
13533
3656/3657
roch
13533
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
s

(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape
(768,)
shape

S-HOTTER test error is 0.027027; took 7.36 seconds
A-HOTTER test error is 0.021622; took 7.51 seconds
M-HOTTER test error is 0.021622; took 7.49 seconds
T-HOTTER test error is 0.016216; took 7.43 seconds
HOTT test error is 0.021622; took 7.47 seconds
HOFTT test error is 0.021622; took 31.48 seconds
RWMD test error is 0.032432; took 15.68 seconds
WMD-T20 test error is 0.048649; took 22.85 seconds
WMD test error is 0.027027; took 96.43 seconds
