This notebook implements a <b>TextFooler</b>-style attack on SQuAD 2.0 (the Stanford Question Answering Dataset).<br /><br />

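The main purpose of this notebook is to create adversarial examples for question-answering systems using a simplified, TextFooler-style method. It replaces randomly selected words with WordNet synonyms, aiming to preserve the overall structure and meaning of the text. Unlike the original TextFooler, which ranks words by their importance to a target model, word selection here is random. Words are filtered by part of speech so that certain grammatical elements (punctuation, determiners, adpositions, conjunctions) are never modified, which helps maintain readability.<br /><br />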

The generated adversarial examples are designed to challenge question-answering models by introducing subtle word-level changes that may affect the model's performance without significantly altering the human-readable content. This approach allows for testing the robustness of QA models against lexical variations while keeping the core information intact.<br /><br />

The notebook processes the SQuAD v2.0 training set and writes a new version containing adversarial examples, which can be used to evaluate, and potentially improve, the resilience of question-answering systems against this type of word-level perturbation.

### Loading libraries

Imports the libraries needed for natural language processing (spaCy, NLTK), fuzzy string matching (difflib), JSON handling (json), random sampling (random), and progress tracking (tqdm).

In [1]:
import os
import json
import random
import spacy
from tqdm import tqdm
import nltk
from nltk.corpus import wordnet
from difflib import SequenceMatcher

### NLP model loading
Loads the spaCy English model (en_core_web_sm) for tokenization and part-of-speech tagging, then downloads the WordNet corpus used for synonym lookup.

In [2]:
# Load spaCy model
spacy.prefer_gpu()
nlp = spacy.load("en_core_web_sm")

In [3]:
# Download WordNet
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/ferhatsarikaya/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

### Synonym generation
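get_synonyms() uses NLTK's WordNet to collect the lemma names of every synset of a given word, across all senses and parts of speech, and returns them as candidate synonyms.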

In [4]:
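def get_synonyms(word):
    """Return WordNet lemma names for `word`, excluding the word itself."""
    synonyms = set()
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            candidate = lemma.name().replace('_', ' ')
            # Skip the original word so a replacement always changes the text
            if candidate.lower() != word.lower():
                synonyms.add(candidate)
    # Sort so the candidate order does not depend on set iteration order
    return sorted(synonyms)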
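
Because get_synonyms() pools lemmas from every sense and every part of speech, a noun may end up replaced by a verb sense. One possible refinement, sketched below under assumptions, restricts the lookup to the WordNet category that matches the spaCy tag. The helpers `spacy_to_wordnet_pos` and `get_synonyms_for_pos` are hypothetical and not part of the notebook; the `pos` argument of `wordnet.synsets` is NLTK's own.

```python
# Hypothetical helpers (not part of the original notebook).
# WordNet's POS codes are single letters: 'n', 'v', 'a', 'r'.
SPACY_TO_WORDNET = {"NOUN": "n", "VERB": "v", "ADJ": "a", "ADV": "r"}


def spacy_to_wordnet_pos(spacy_pos):
    """Return the WordNet POS code for a spaCy coarse tag, or None if unmapped."""
    return SPACY_TO_WORDNET.get(spacy_pos)


def get_synonyms_for_pos(word, spacy_pos):
    """Collect WordNet lemmas for `word`, restricted to the matching part of speech."""
    # Imported lazily so the mapping above can be used without NLTK installed.
    from nltk.corpus import wordnet

    wn_pos = spacy_to_wordnet_pos(spacy_pos)
    if wn_pos is None:
        return []  # e.g. PROPN or NUM: no WordNet category, so leave the word alone
    synonyms = set()
    for syn in wordnet.synsets(word, pos=wn_pos):
        for lemma in syn.lemmas():
            name = lemma.name().replace("_", " ")
            if name.lower() != word.lower():
                synonyms.add(name)
    return sorted(synonyms)
```

Inside the attack loop this could be called with the token's text and its spaCy tag in place of get_synonyms(word). Returning an empty list for unmapped tags also leaves proper nouns and numbers untouched, which may or may not be desirable for a given experiment.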


### TextFooler attack implementation
textfooler_attack() is the core function that:
<ul>
    <li>Tokenizes the input text and gets part-of-speech tags.</li>
    <li>Preserves specified phrases from modification.</li>
    <li>Replaces each eligible word that has WordNet synonyms with a randomly chosen one, with probability 1 - importance_threshold (0.5 by default); punctuation, determiners, adpositions, and conjunctions are never replaced.</li>
</ul>

In [5]:
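import re

def textfooler_attack(text, importance_threshold=0.5, preserve_phrases=None):
    """Randomly replace eligible words in `text` with WordNet synonyms.

    Each eligible token is replaced with probability 1 - importance_threshold.
    Tokens overlapping any phrase in `preserve_phrases` are left untouched.
    """
    doc = nlp(text)

    # Character spans of every case-insensitive occurrence of a preserved phrase.
    # Matching on characters (not on space-joined tokens) also covers phrases
    # that spaCy splits into several tokens, such as "well-known".
    protected = [
        match.span()
        for phrase in (preserve_phrases or [])
        if phrase
        for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE)
    ]

    skip_pos = {'PUNCT', 'DET', 'ADP', 'CCONJ', 'SCONJ'}
    pieces = []
    for token in doc:
        start, end = token.idx, token.idx + len(token.text)
        preserved = any(start < p_end and end > p_start for p_start, p_end in protected)
        word = token.text
        if not preserved and token.pos_ not in skip_pos and random.random() > importance_threshold:
            synonyms = get_synonyms(word)
            if synonyms:
                word = random.choice(synonyms)
        # Re-attach the original trailing whitespace instead of joining on ' ',
        # so untouched text (including preserved answers) keeps its spacing.
        pieces.append(word + token.whitespace_)

    return ''.join(pieces)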
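
The test `random.random() > importance_threshold` means each eligible word is replaced with probability of roughly `1 - importance_threshold`; despite its name, the parameter is a sampling rate rather than a model-derived importance score. The sketch below estimates that rate by simulation. `replacement_rate` is a hypothetical helper that uses only the standard library.

```python
import random


def replacement_rate(importance_threshold, trials=100_000, seed=0):
    """Estimate how often `random.random() > importance_threshold` is true."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    hits = sum(rng.random() > importance_threshold for _ in range(trials))
    return hits / trials


for threshold in (0.2, 0.5, 0.8):
    rate = replacement_rate(threshold)
    print(f"threshold={threshold}: ~{rate:.2f} of eligible words replaced")
```

Raising the threshold therefore makes the attack gentler, and a threshold of 1.0 disables replacement entirely because `random.random()` never returns a value above 1.0.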


### Answer position finding
find_answer_position() locates the position of an answer within a modified context using fuzzy matching.

In [6]:
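import re

def find_answer_position(context, answer):
    """Return the character offset of `answer` in `context`, or -1 if no good match."""
    # Fast path: the answer survived the attack verbatim
    exact = context.find(answer)
    if exact != -1:
        return exact

    # Record each whitespace-delimited word together with its true character offset
    spans = [(m.start(), m.group()) for m in re.finditer(r'\S+', context)]
    n_words = len(answer.split())
    best_start, best_ratio = -1, 0.0

    for i in range(len(spans) - n_words + 1):
        substring = ' '.join(word for _, word in spans[i:i + n_words])
        ratio = SequenceMatcher(None, substring.lower(), answer.lower()).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, spans[i][0]

    # Using the recorded offset avoids the off-by-one at position 0 and stays
    # correct when words are separated by more than one space
    return best_start if best_ratio > 0.8 else -1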
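
The 0.8 cutoff is applied to `SequenceMatcher.ratio()`, which returns `2 * M / T`, where `M` is the number of matching characters and `T` is the combined length of both strings. The sketch below shows how a few candidate windows score against an answer. The helper name `similarity` and the example strings are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(candidate, answer):
    """Case-insensitive similarity in [0, 1], as used by the fuzzy matcher."""
    return SequenceMatcher(None, candidate.lower(), answer.lower()).ratio()


answer = "Normandy"
for candidate in ("Normandy", "Normandy,", "Norman", "France"):
    score = similarity(candidate, answer)
    verdict = "accepted" if score > 0.8 else "rejected"
    print(f"{candidate!r}: {score:.2f} ({verdict})")
```

Since trailing punctuation costs only a little similarity, a window such as "Normandy," still passes, while an unrelated word does not. Short answers are more fragile: a single extra character removes a larger share of the score.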


### Dataset processing
process_squad_file() applies the TextFooler attack to the SQuAD dataset by:
<ul>
    <li>Modifying both context paragraphs and questions.</li>
    <li>Preserving answer phrases in the context.</li>
    <li>Updating answer positions in modified contexts.</li>
</ul>

In [7]:
def process_squad_file(input_file, output_file):
    with open(input_file, 'r', encoding='utf-8') as f:
        data = json.load(f)

    # Carry over the version string from the input file, defaulting to v2.0
    new_data = {"version": data.get("version", "v2.0"), "data": []}

    for article in tqdm(data['data'], desc="Processing articles"):
        new_article = {"title": article['title'], "paragraphs": []}
        
        for paragraph in article['paragraphs']:
            context = paragraph['context']
            
            # Collect every answer text in the paragraph so the attack leaves it intact
            answers = {
                ans['text']
                for qa in paragraph['qas']
                for ans in qa['answers']
            }
            
            # Apply TextFooler to the context, preserving answers
            adv_context = textfooler_attack(context, preserve_phrases=answers)
            
            new_qas = []
            for qa in paragraph['qas']:
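                # Copy the answer dicts too, so updating answer_start does not mutate the loaded data
                new_qa = {**qa, 'answers': [dict(ans) for ans in qa['answers']]}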
                
                # Apply TextFooler to the question, preserving the answer
                if qa['answers']:
                    answer = qa['answers'][0]['text']
                    new_qa['question'] = textfooler_attack(qa['question'], preserve_phrases=[answer])
                else:
                    new_qa['question'] = textfooler_attack(qa['question'])
                
                # Recompute the start offset of every answer in the modified context
                for ans in new_qa['answers']:
                    answer_start = find_answer_position(adv_context, ans['text'])

                    if answer_start != -1:
                        ans['answer_start'] = answer_start
                    else:
                        # The original answer_start is kept, but it may no longer be valid
                        print(f"Warning: Answer '{ans['text']}' not found in modified context. Keeping original position.")
                
                new_qas.append(new_qa)
            
            new_paragraph = {
                "context": adv_context,
                "qas": new_qas
            }
            new_article['paragraphs'].append(new_paragraph)
        
        new_data['data'].append(new_article)

    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(new_data, f, ensure_ascii=False, indent=2)
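
In SQuAD-format files, each answer is expected to satisfy `context[answer_start:answer_start + len(text)] == text`. Because the attack rewrites the context and the fuzzy matcher accepts near matches, it may be worth checking that invariant on the generated data. The helper below is a sketch: `count_misaligned_answers` is a hypothetical name, and the small record is invented for illustration.

```python
def count_misaligned_answers(squad_data):
    """Count answers whose text is not found at `answer_start` in the context."""
    total = misaligned = 0
    for article in squad_data["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for ans in qa.get("answers", []):
                    total += 1
                    start = ans["answer_start"]
                    if context[start:start + len(ans["text"])] != ans["text"]:
                        misaligned += 1
    return misaligned, total


# Invented record: the first offset is correct, the second is off by one.
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The Normans settled in Normandy.",
            "qas": [
                {"id": "q1", "question": "Who settled?", "is_impossible": False,
                 "answers": [{"text": "Normans", "answer_start": 4}]},
                {"id": "q2", "question": "Where?", "is_impossible": False,
                 "answers": [{"text": "Normandy", "answer_start": 24}]},
            ],
        }],
    }],
}

misaligned, total = count_misaligned_answers(sample)
print(f"{misaligned} of {total} answers are misaligned")
```

To check the generated file, load it with `json.load` and pass the resulting dictionary to the helper. A high misaligned count would suggest that the stored offsets should not be trusted for training or evaluation.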

### Main execution
The script applies the TextFooler attack to the SQuAD v2.0 training set and generates a new file with adversarial examples.
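
A full run over the training set can take a while, so a dry run on a handful of articles may be a cheap way to inspect the output first. The sketch below writes a truncated copy of a SQuAD-format file. `write_subset` and the sample file names are hypothetical and not part of the notebook.

```python
import json


def write_subset(input_file, output_file, n_articles=5):
    """Copy the first `n_articles` articles of a SQuAD-format file to a new file."""
    with open(input_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    subset = {
        "version": data.get("version", "v2.0"),
        "data": data["data"][:n_articles],
    }
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(subset, f, ensure_ascii=False, indent=2)
    return len(subset["data"])


# Example usage (paths are assumptions; adjust them to your layout):
# random.seed(42)  # seed the global RNG that the attack draws from
# write_subset("SQuAD/train-v2.0.json", "SQuAD/train-sample.json", n_articles=5)
# process_squad_file("SQuAD/train-sample.json", "SQuAD/sample-textfooler.json")
```

Seeding is only one step toward repeatable runs: if synonym candidates come from an unsorted set of strings, their order can still differ between Python processes unless the list is sorted or PYTHONHASHSEED is fixed.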

In [8]:
# Create data set
path = "SQuAD/"
input_file = os.path.join(path, "train-v2.0.json")
output_file = os.path.join(path, "squad-v2.0-textfooler.json")
process_squad_file(input_file, output_file)

Processing articles:   0%|                              | 0/442 [00:00<?, ?it/s]






Processing articles:  72%|██████████████▍     | 318/442 [10:47<03:44,  1.81s/it]



Processing articles:  72%|██████████████▍     | 319/442 [10:48<03:30,  1.71s/it]



Processing articles:  72%|██████████████▍     | 320/442 [10:50<03:19,  1.64s/it]



Processing articles:  73%|██████████████▌     | 321/442 [10:51<03:06,  1.54s/it]



Processing articles:  73%|██████████████▌     | 322/442 [10:53<03:05,  1.54s/it]



Processing articles:  73%|██████████████▌     | 323/442 [10:54<02:42,  1.36s/it]



Processing articles:  73%|██████████████▋     | 324/442 [10:56<03:00,  1.53s/it]



Processing articles:  74%|██████████████▋     | 325/442 [10:57<02:49,  1.45s/it]



Processing articles:  74%|██████████████▊     | 326/442 [10:58<02:52,  1.48s/it]



Processing articles:  74%|██████████████▊     | 327/442 [10:59<02:37,  1.37s/it]



Processing articles:  74%|██████████████▊     | 328/442 [11:01<02:44,  1.44s/it]



Processing articles:  74%|██████████████▉     | 329/442 [11:02<02:30,  1.33s/it]



Processing articles:  75%|██████████████▉     | 330/442 [11:06<03:57,  2.12s/it]



Processing articles:  75%|██████████████▉     | 331/442 [11:07<03:20,  1.81s/it]



Processing articles:  75%|███████████████     | 332/442 [11:08<03:02,  1.66s/it]



Processing articles:  75%|███████████████     | 333/442 [11:10<03:10,  1.75s/it]



Processing articles:  76%|███████████████     | 334/442 [11:12<03:04,  1.70s/it]



Processing articles:  76%|███████████████▏    | 335/442 [11:15<03:57,  2.22s/it]



Processing articles:  76%|███████████████▏    | 336/442 [11:17<03:44,  2.11s/it]



Processing articles:  76%|███████████████▏    | 337/442 [11:20<03:48,  2.17s/it]



Processing articles:  76%|███████████████▎    | 338/442 [11:22<04:01,  2.32s/it]



Processing articles:  77%|███████████████▎    | 339/442 [11:24<03:31,  2.05s/it]



Processing articles:  77%|███████████████▍    | 340/442 [11:25<03:13,  1.90s/it]



Processing articles:  77%|███████████████▍    | 341/442 [11:27<03:14,  1.92s/it]



Processing articles:  77%|███████████████▍    | 342/442 [11:29<02:58,  1.78s/it]



Processing articles:  78%|███████████████▌    | 343/442 [11:30<02:41,  1.63s/it]



Processing articles:  78%|███████████████▌    | 344/442 [11:31<02:34,  1.58s/it]



Processing articles:  78%|███████████████▌    | 345/442 [11:33<02:33,  1.58s/it]



Processing articles:  78%|███████████████▋    | 346/442 [11:35<02:39,  1.66s/it]



Processing articles:  79%|███████████████▋    | 347/442 [11:36<02:24,  1.52s/it]





Processing articles:  79%|███████████████▋    | 348/442 [11:43<04:45,  3.04s/it]



Processing articles:  79%|███████████████▊    | 349/442 [11:44<04:05,  2.64s/it]





Processing articles:  79%|███████████████▊    | 350/442 [11:51<05:51,  3.82s/it]



Processing articles:  79%|███████████████▉    | 351/442 [11:53<05:02,  3.32s/it]



Processing articles:  80%|███████████████▉    | 352/442 [11:56<04:54,  3.27s/it]



Processing articles:  80%|███████████████▉    | 353/442 [12:01<05:31,  3.72s/it]



Processing articles:  80%|████████████████    | 354/442 [12:03<04:35,  3.13s/it]



Processing articles:  80%|████████████████    | 355/442 [12:05<04:05,  2.82s/it]



Processing articles:  81%|████████████████    | 356/442 [12:08<04:07,  2.88s/it]



Processing articles:  81%|████████████████▏   | 357/442 [12:10<03:38,  2.58s/it]



Processing articles:  81%|████████████████▏   | 358/442 [12:15<04:34,  3.27s/it]



Processing articles:  81%|████████████████▏   | 359/442 [12:17<04:01,  2.90s/it]



Processing articles:  81%|████████████████▎   | 360/442 [12:19<03:33,  2.60s/it]



Processing articles:  82%|████████████████▎   | 361/442 [12:21<03:31,  2.61s/it]



Processing articles:  82%|████████████████▍   | 362/442 [12:24<03:24,  2.56s/it]



Processing articles:  82%|████████████████▍   | 363/442 [12:24<02:39,  2.02s/it]



Processing articles:  82%|████████████████▍   | 364/442 [12:25<02:13,  1.72s/it]



Processing articles:  83%|████████████████▌   | 365/442 [12:27<02:01,  1.58s/it]





Processing articles:  83%|████████████████▌   | 366/442 [12:32<03:23,  2.68s/it]



Processing articles:  83%|████████████████▌   | 367/442 [12:34<03:06,  2.49s/it]



Processing articles:  83%|████████████████▋   | 368/442 [12:36<02:59,  2.43s/it]



Processing articles:  83%|████████████████▋   | 369/442 [12:37<02:28,  2.04s/it]



Processing articles:  84%|████████████████▋   | 370/442 [12:40<02:44,  2.28s/it]



Processing articles:  84%|████████████████▊   | 371/442 [12:41<02:14,  1.89s/it]



Processing articles:  84%|████████████████▊   | 372/442 [12:43<02:14,  1.92s/it]



Processing articles:  84%|████████████████▉   | 373/442 [12:44<01:43,  1.50s/it]



Processing articles:  85%|████████████████▉   | 374/442 [12:45<01:41,  1.49s/it]



Processing articles:  85%|████████████████▉   | 375/442 [12:47<01:37,  1.45s/it]



Processing articles:  85%|█████████████████   | 376/442 [12:49<02:02,  1.85s/it]



Processing articles:  85%|█████████████████   | 377/442 [12:51<01:57,  1.81s/it]



Processing articles:  86%|█████████████████   | 378/442 [12:54<02:08,  2.01s/it]



Processing articles:  86%|█████████████████▏  | 379/442 [12:55<01:52,  1.79s/it]





Processing articles:  86%|█████████████████▏  | 380/442 [13:01<03:12,  3.11s/it]





Processing articles:  86%|█████████████████▏  | 381/442 [13:08<04:14,  4.18s/it]



Processing articles:  86%|█████████████████▎  | 382/442 [13:11<03:57,  3.95s/it]



Processing articles:  87%|█████████████████▎  | 383/442 [13:15<03:53,  3.96s/it]



Processing articles:  87%|█████████████████▍  | 384/442 [13:18<03:29,  3.62s/it]



Processing articles:  87%|█████████████████▍  | 385/442 [13:19<02:46,  2.91s/it]



Processing articles:  87%|█████████████████▍  | 386/442 [13:21<02:29,  2.67s/it]



Processing articles:  88%|█████████████████▌  | 387/442 [13:24<02:26,  2.66s/it]



Processing articles:  88%|█████████████████▌  | 388/442 [13:25<02:00,  2.24s/it]



Processing articles:  88%|█████████████████▌  | 389/442 [13:26<01:38,  1.85s/it]



Processing articles:  88%|█████████████████▋  | 390/442 [13:27<01:23,  1.61s/it]





Processing articles:  88%|█████████████████▋  | 391/442 [13:33<02:25,  2.85s/it]



Processing articles:  89%|█████████████████▋  | 392/442 [13:35<02:09,  2.59s/it]



Processing articles:  89%|█████████████████▊  | 393/442 [13:38<02:20,  2.86s/it]



Processing articles:  89%|█████████████████▊  | 394/442 [13:39<01:48,  2.26s/it]



Processing articles:  89%|█████████████████▊  | 395/442 [13:40<01:30,  1.93s/it]



Processing articles:  90%|█████████████████▉  | 396/442 [13:42<01:19,  1.72s/it]



Processing articles:  90%|█████████████████▉  | 397/442 [13:44<01:22,  1.84s/it]



Processing articles:  90%|██████████████████  | 398/442 [13:45<01:12,  1.64s/it]



Processing articles:  90%|██████████████████  | 399/442 [13:46<01:08,  1.58s/it]



Processing articles:  90%|██████████████████  | 400/442 [13:48<01:11,  1.70s/it]



Processing articles:  91%|██████████████████▏ | 401/442 [13:52<01:35,  2.32s/it]



Processing articles:  91%|██████████████████▏ | 402/442 [13:54<01:23,  2.09s/it]



Processing articles:  91%|██████████████████▏ | 403/442 [13:55<01:15,  1.95s/it]



Processing articles:  91%|██████████████████▎ | 404/442 [13:58<01:20,  2.12s/it]



Processing articles:  92%|██████████████████▎ | 405/442 [14:00<01:17,  2.10s/it]



Processing articles:  92%|██████████████████▎ | 406/442 [14:01<01:05,  1.83s/it]



Processing articles:  92%|██████████████████▍ | 407/442 [14:04<01:14,  2.12s/it]



Processing articles:  92%|██████████████████▍ | 408/442 [14:09<01:41,  3.00s/it]



Processing articles:  93%|██████████████████▌ | 409/442 [14:11<01:31,  2.77s/it]



Processing articles:  93%|██████████████████▌ | 410/442 [14:13<01:21,  2.55s/it]



Processing articles:  93%|██████████████████▌ | 411/442 [14:15<01:13,  2.37s/it]



Processing articles:  93%|██████████████████▋ | 412/442 [14:18<01:12,  2.41s/it]



Processing articles:  93%|██████████████████▋ | 413/442 [14:19<01:04,  2.24s/it]



Processing articles:  94%|██████████████████▋ | 414/442 [14:23<01:14,  2.66s/it]



Processing articles:  94%|██████████████████▊ | 415/442 [14:25<01:01,  2.29s/it]



Processing articles:  94%|██████████████████▊ | 416/442 [14:27<00:58,  2.26s/it]



Processing articles:  94%|██████████████████▊ | 417/442 [14:30<01:02,  2.50s/it]



Processing articles:  95%|██████████████████▉ | 418/442 [14:31<00:50,  2.09s/it]



Processing articles:  95%|██████████████████▉ | 419/442 [14:32<00:41,  1.79s/it]



Processing articles:  95%|███████████████████ | 420/442 [14:34<00:42,  1.93s/it]



Processing articles:  95%|███████████████████ | 421/442 [14:36<00:36,  1.76s/it]



Processing articles:  95%|███████████████████ | 422/442 [14:38<00:40,  2.02s/it]



Processing articles:  96%|███████████████████▏| 423/442 [14:41<00:44,  2.35s/it]



Processing articles:  96%|███████████████████▏| 424/442 [14:43<00:41,  2.29s/it]



Processing articles:  96%|███████████████████▏| 425/442 [14:47<00:43,  2.59s/it]





Processing articles:  96%|███████████████████▎| 426/442 [14:52<00:52,  3.27s/it]



Processing articles:  97%|███████████████████▎| 427/442 [14:54<00:42,  2.86s/it]



Processing articles:  97%|███████████████████▎| 428/442 [14:55<00:33,  2.40s/it]



Processing articles:  97%|███████████████████▍| 429/442 [14:57<00:28,  2.20s/it]



Processing articles:  97%|███████████████████▍| 430/442 [14:58<00:24,  2.01s/it]



Processing articles:  98%|███████████████████▌| 431/442 [14:59<00:18,  1.67s/it]



Processing articles:  98%|███████████████████▌| 432/442 [15:01<00:18,  1.86s/it]



Processing articles:  98%|███████████████████▌| 433/442 [15:04<00:20,  2.25s/it]



Processing articles:  98%|███████████████████▋| 434/442 [15:06<00:16,  2.08s/it]



Processing articles:  98%|███████████████████▋| 435/442 [15:08<00:13,  1.93s/it]



Processing articles:  99%|███████████████████▋| 436/442 [15:09<00:10,  1.71s/it]



Processing articles:  99%|███████████████████▊| 437/442 [15:11<00:09,  1.84s/it]



Processing articles:  99%|███████████████████▊| 438/442 [15:13<00:07,  1.87s/it]



Processing articles:  99%|███████████████████▊| 439/442 [15:16<00:06,  2.09s/it]



Processing articles: 100%|███████████████████▉| 440/442 [15:18<00:04,  2.03s/it]



Processing articles: 100%|████████████████████| 442/442 [15:19<00:00,  2.08s/it]
