# Query Re-writing with BART-FC (Full Context)
By Carlos Gemmell (carlos.gemmell@glasgow.ac.uk | Twitter: @aquaktus) & Jeffrey Dalton (Jeff.Dalton@glasgow.ac.uk | Twitter: @JeffD)

BART-FC is a query re-writer trained only on CAsT 2019. It takes any number of unresolved queries in a conversation and returns every turn fully resolved. It does this by rewriting the queries one at a time and feeding each resolved query back in as context for the next turn. This keeps the relevant entities available at every turn, so they are not forgotten.

Limitations: BART-FC can suffer from cascading errors, since a rewriting error early in the conversation can propagate through all later turns.
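The self-feeding loop described above can be sketched in a few lines. This is only an illustration of the control flow, not the code inside `BART_Full_Conversational_Rewriter_Transform`: `rewrite_conversation`, `toy_rewrite` and `ENTITIES` are hypothetical names, and the toy rewriter is a string substitution standing in for the BART model.

```python
from typing import Callable, List

# Toy entity vocabulary, purely illustrative (the real model learns this).
ENTITIES = ["digital cameras", "raspberry jam"]


def toy_rewrite(context: List[str], query: str) -> str:
    """Hypothetical stand-in for the model: resolve 'they' from the last resolved turn."""
    if not context:
        return query
    last_resolved = context[-1].lower()
    for entity in ENTITIES:
        if entity in last_resolved:
            return query.replace("they", entity)
    return query


def rewrite_conversation(queries: List[str],
                         rewrite_fn: Callable[[List[str], str], str]) -> List[str]:
    """Rewrite turns in order, feeding earlier *rewritten* turns back in as context."""
    resolved: List[str] = []
    for query in queries:
        resolved.append(rewrite_fn(list(resolved), query))
    return resolved


conversation = ["How old are digital cameras?",
                "Where were they invented?",
                "Were they around the industrial era?"]
resolved_turns = rewrite_conversation(conversation, toy_rewrite)
print(resolved_turns)
```

Because the toy rewriter only looks at the most recent resolved turn, the third turn is resolved correctly only if the second one was. That is the same dependency that makes cascading errors possible.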

![BART](images/BART_feedback_rewriter.png)

### Download the BART-FC checkpoint

In [8]:
from src.useful_utils import download_from_url
download_from_url('https://storage.googleapis.com/artifacts.grill-search.appspot.com/model_checkpoints/BART_save_dict_fixed.ckpt', './BART-FC_huggingface.ckpt')

BART_save_dict_fixed.ckpt: 1.63GB [00:38, 42.2MB/s]                                


1625476222
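Since the checkpoint is about 1.63GB, it can be worth checking whether a complete copy is already on disk before downloading again. The sketch below assumes that the number printed above (`1625476222`) is the file size in bytes; `checkpoint_looks_complete` is a hypothetical helper, not part of `src.useful_utils`.

```python
import os

# Assumption: the value printed after the download is the checkpoint size in bytes.
EXPECTED_BYTES = 1625476222


def checkpoint_looks_complete(path: str, expected_bytes: int = EXPECTED_BYTES) -> bool:
    """Return True if a file exists at `path` and has exactly the expected size."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_bytes


print(checkpoint_looks_complete("./BART-FC_huggingface.ckpt"))
```

If this returns `False`, re-run the `download_from_url` cell above. A size check does not detect corruption; it only catches missing or truncated files.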

### Initialise the re-writer object

In [1]:
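# Transforms from this repository's src package
from src.text_transforms import *
from src.complex_transforms import BART_Query_Rewriter_Transform, BART_Full_Conversational_Rewriter_Transform

from tqdm.auto import tqdm

# IPython magics: auto-reload edited modules and enable line profiling
# (the last one requires the line_profiler package)
%load_ext autoreload
%autoreload 2
%load_ext line_profiler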

In [2]:
BART_conv_transform = BART_Full_Conversational_Rewriter_Transform("BART-FC_huggingface.ckpt", device="cuda:0")

Numericaliser. Ex: 'This is a test' -> [0, 713, 16, 10, 1296, 2]
Denumericaliser. Ex: [0,1,2,3,4,5,6,7,8,9] -> <s><pad></s><unk>. the, to and of
BERT ReRanker initialised on cuda:0. Batch size 1
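The cell above hard-codes `device="cuda:0"`. On a machine without a GPU, one option is to choose the device string at runtime, as sketched below. `pick_device` is a hypothetical helper, and whether the transform runs acceptably (or at all) with `device="cpu"` is an assumption that this notebook does not verify.

```python
def pick_device(preferred: str = "cuda:0") -> str:
    """Return `preferred` if PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed: nothing can run on a GPU anyway.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result could then be passed as the `device` argument when constructing the re-writer.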


### Single query with context

In [16]:
test_samples = [{'previous_queries':['How old are digital cameras?', 'Where were they invented?'],
                'unresolved_query':'Were they around the industrial era?'}]

eval_raw_samples = BART_conv_transform(test_samples)
resolved_query = eval_raw_samples[0]['full_rewritten_queries'][-1]

print("Output:", resolved_query)

BART self feeding rewrites: 100%|██████████| 1/1 [00:00<00:00,  1.21it/s]

Output:  Were digital cameras around the industrial era?





### Resolving a fully unresolved conversation with BART-FC

In [17]:
print(eval_raw_samples[0]['full_rewritten_queries'])

[' How old are digital cameras?', ' Where were digital cameras invented?', ' Were digital cameras around the industrial era?']
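The input format used above separates the conversation history (`previous_queries`) from the current turn (`unresolved_query`). If a conversation is stored as a flat list of turns, a small helper can build that dictionary. `make_sample` is a hypothetical convenience function, and whether the transform accepts an empty `previous_queries` list is not shown in this notebook.

```python
from typing import Dict, List, Union


def make_sample(turns: List[str]) -> Dict[str, Union[List[str], str]]:
    """Split a list of turns into history plus the final, still unresolved, turn."""
    if not turns:
        raise ValueError("a conversation needs at least one turn")
    return {"previous_queries": list(turns[:-1]), "unresolved_query": turns[-1]}


sample = make_sample(["How old are digital cameras?",
                      "Where were they invented?",
                      "Were they around the industrial era?"])
print(sample)
```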


### Batched rewriting

In [20]:
test_samples = [{'previous_queries':['How old are digital cameras?', 'Where were they invented?'],
                'unresolved_query':'Were they around the industrial era?'},
               
                {'previous_queries':['How do you make raspberry jam?'],
                'unresolved_query':'is there a recipe?'}]                       #    <- zero anaphora case!

eval_raw_samples = BART_conv_transform(test_samples)

BART self feeding rewrites: 100%|██████████| 2/2 [00:01<00:00,  1.24it/s]


In [21]:
eval_raw_samples

[{'previous_queries': ['How old are digital cameras?',
   'Where were they invented?'],
  'unresolved_query': 'Were they around the industrial era?',
  'full_rewritten_queries': [' How old are digital cameras?',
   ' Where were digital cameras invented?',
   ' Were digital cameras around the industrial era?'],
  'rewritten_query': ' Were digital cameras around the industrial era?'},
 {'previous_queries': ['How do you make raspberry jam?'],
  'unresolved_query': 'is there a recipe?',
  'full_rewritten_queries': [' How do you make raspberry jam?',
   ' is there a recipe for raspberry jam.'],
  'rewritten_query': ' is there a recipe for raspberry jam.'}]
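Note that every rewritten query in the output above starts with a space. Before passing the rewrites to a retrieval system, it may be convenient to strip that whitespace. The sketch below works on a copy of the output shown above; `clean_rewrites` is a hypothetical helper.

```python
from typing import Dict, List

# Subset of the output above, copied by hand so the example runs without the model.
example_output = [
    {"rewritten_query": " Were digital cameras around the industrial era?"},
    {"rewritten_query": " is there a recipe for raspberry jam."},
]


def clean_rewrites(samples: List[Dict[str, str]]) -> List[str]:
    """Collect the final rewrite of each sample with surrounding whitespace removed."""
    return [sample["rewritten_query"].strip() for sample in samples]


cleaned = clean_rewrites(example_output)
print(cleaned)
```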