# Evaluation of Results

## NOTE: This is still a work in progress, and the description of the script is not yet finished. It will be completed by mid-October 2022.

This Jupyter notebook gives a full description of the evaluation process as currently proposed. To run the evaluation in full, use [evaluate.py](https://github.com/DigilabNLCR/BibleCitations/blob/main/evaluate.py).

The evaluation process consists of several steps. During the process, an individual CSV file is created for each step.
1) Drop duplicate results and transform the initial 'batch_results.csv' file into a CSV file with a different structure (more information is needed for the evaluation).
    --> creates 'UNFILTERED_batch_results.csv' file
2) Connecting consecutive results and calculating approximate match probabilities.
    --> creates 'FILTERED_UNFILTERED_batch_results.csv' file
3) Filtering stop-subverses.
    --> creates 'ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file
    --> + creates 'FILTERED_BY_STOP_SUBS.csv' file (so you can check these results, too)
4) Drop 'hidden' duplicates. These are duplicate results that formally have different query strings, although one of the query strings actually contains the other.
    --> creates 'DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file
5) Marking multiple attributions. In this case the multiple attributions are not dropped, but kept with a column that suggests whether the result should be dropped or not.
    --> creates 'MA_DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file
6) Marking 'sure' citations:
    --> creates 'FINAL_MA_DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file

7) Check the results by yourselves ;-)
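
The file names listed above follow one pattern: each step prepends its own prefix to the name produced by the previous step. A minimal sketch of that naming scheme follows; the helper name `chained_filenames` and the `STEP_PREFIXES` list are illustrative (derived from the list of steps, not taken from `evaluate.py`).

```python
# Prefixes in the order in which the steps run (read off the list of steps above).
STEP_PREFIXES = ['UNFILTERED', 'FILTERED', 'ST_SUBS', 'DUPS', 'MA', 'FINAL']


def chained_filenames(initial='batch_results.csv', prefixes=STEP_PREFIXES):
    """Return the output file name of every step, in order (hypothetical helper)."""
    names = []
    current = initial
    for prefix in prefixes:
        # Every step adds its prefix in front of the previous file name.
        current = f'{prefix}_{current}'
        names.append(current)
    return names


for step, name in enumerate(chained_filenames(), start=1):
    print(step, name)
```

Reading a file name from right to left therefore tells you which steps have already been applied to it.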

In [1]:
import biblical_intertextuality_package as bip

import pandas as pd
import os
import Levenshtein
import numpy as np
import joblib

from collections import defaultdict
from nltk import word_tokenize, sent_tokenize

In [2]:
""" Defining paths. """
ROOT_PATH = os.getcwd()

BIBLES_PATH = os.path.join(ROOT_PATH, 'Bible_files')
DATASETS_PATH = os.path.join(ROOT_PATH, 'datasets')
DICTS_PATH = os.path.join(ROOT_PATH, 'dictionaries')
CORPUS_PATH = os.path.join(ROOT_PATH, 'corpuses')
RESULTS_PATH = os.path.join(ROOT_PATH, 'results')
ALL_JSONS_PATH = os.path.join(ROOT_PATH, 'query_jsons')

JOURNAL_FULLDATA_PATH = os.path.join(ROOT_PATH, 'journals_fulldata.joblib')

BATCHES_FILE_PATH = os.path.join(ROOT_PATH, 'batches.csv')
BATCH_RESULTS_FILE_PATH = os.path.join(RESULTS_PATH, 'batch_results.csv')

STOP_WORDS_PATH = os.path.join(ROOT_PATH, 'stop_words.txt')
STOP_SUBVERSES_PATH = os.path.join(ROOT_PATH, 'stop_subverses_21.txt')
EXCLUSIVES_PATH = os.path.join(ROOT_PATH, 'exclusives.txt')

### General functions for evaluation

In [None]:
def load_results(results_filename='batch_results.csv', delimiter=',') -> pd.core.frame.DataFrame:
    """ This function loads the selected results from the results folder and returns them as a pandas dataframe.
    
    :param results_filename: filename of results; 'batch_results.csv' is the default parameter, as this is the default filename of results from the search functions.
    """
    return pd.read_csv(os.path.join(RESULTS_PATH, results_filename), quotechar='"', delimiter=delimiter, encoding='utf-8')


def get_verseid_queryfile(dataframe:pd.core.frame.DataFrame, row_id:int):
    """ This function returns search properties of a given row in the results dataframe. """

    verse_id = dataframe.loc[row_id]['verse_id']
    query_file = dataframe.loc[row_id]['query_file']

    return verse_id, query_file


def get_book_id(verse_id:str) -> str:
    """ Gets book's ID from a verse_id (e.g. "Gn 1:1"). """
    book_id = verse_id.split(' ')[0]
    return book_id

### 1) Drop duplicate results
+ transform the initial 'batch_results.csv' file into CSV file with different structure (more info needed for evaluation).
--> creates 'UNFILTERED_batch_results.csv' file
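
Step 1 treats two rows as duplicates when they share the same `verse_id`, `query_file` and `index_query_part`, and keeps the first one. The sketch below reproduces that rule on plain dictionaries; the rows and the helper `drop_duplicate_rows` are made up for illustration, while the notebook itself relies on `DataFrame.drop_duplicates`.

```python
KEY_COLUMNS = ('verse_id', 'query_file', 'index_query_part')


def drop_duplicate_rows(rows, key_columns=KEY_COLUMNS):
    """Keep only the first row for every combination of key column values."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Toy data: the second row repeats the first one, the third differs in the query part.
rows = [
    {'verse_id': 'Gn 1:1', 'query_file': 'a.json', 'index_query_part': 0},
    {'verse_id': 'Gn 1:1', 'query_file': 'a.json', 'index_query_part': 0},
    {'verse_id': 'Gn 1:1', 'query_file': 'a.json', 'index_query_part': 1},
]
print(len(drop_duplicate_rows(rows)))
```

The same verse found in two different query parts of one file is kept at this point; such cases are handled in step 2.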

In [None]:
def make_unfiltered_search_dataframe(results_filename='batch_results.csv', save=True, return_df=False):
    """ This function converts the preliminary results to the same structure as all of the other results (filtered and improved). This is purely for statistical reasons. It only drops duplicates. """
    # Load results:
    results_dataframe = load_results(results_filename)

    # Load metadata from journals_metadata.joblib (created with prepare_query_documents.py)
    jsons_metadata = joblib.load(os.path.join(ROOT_PATH, 'journals_metadata.joblib'))

    # Remove duplicate rows from the result dataframe
    print('Original size of the results dataframe:', len(results_dataframe))
    results_dataframe.drop_duplicates(subset=['verse_id', 'query_file', 'index_query_part'], keep='first', inplace=True)
    print('Size of the results dataframe after dropping duplicates:', len(results_dataframe))

    # Create (empty) final results dataframe:
    final_results = {}
    res_id = 0
    print_progress = 0
    iter_ = 0

    print('Dropping duplicates...')
    for row_id in results_dataframe.index:
        iter_ += 1
        if print_progress == 500:
            print('\t', iter_, 'of', len(results_dataframe))
            print_progress = 0
      
        verse_id, query_file = get_verseid_queryfile(dataframe=results_dataframe, row_id=row_id)

        # NOTE: repair Syr verses to Sir (there has been a mistake in my dataset, now it is repaired but not in the initial batch_results.csv file in PUBLIC_RESULTS)
        if 'Syr' in verse_id:
            verse_id = verse_id.replace('Syr', 'Sir')

        row_dict = results_dataframe.loc[row_id].to_dict()

        # NOTE: 334149b0-877c-11e6-8aeb-5ef3fc9ae867 has wrong date --> it is repaired here in the process:
        if '334149b0-877c-11e6-8aeb-5ef3fc9ae867' in row_dict['query_file']:
            row_dict['date'] = '30.06.1935'
        else:
            row_dict['date'] = jsons_metadata[query_file]['issue_date']

        row_dict['verse_id'] = verse_id
        row_dict['book'] = get_book_id(verse_id)
        row_dict['journal'] = jsons_metadata[query_file]['journal']
        row_dict['page_num'] = jsons_metadata[query_file]['issue_page']

        # NOTE: filtering out years outside the scope of 1925-1939 (because for some reason, some other years also entered our dataset, possibly due to wrong metadata)
        issue_year = row_dict['date'].split('.')[-1]
        years_to_consider = ['1925', '1926', '1927', '1928', '1929', '1930', '1931', '1932', '1933', '1934', '1935', '1936', '1937', '1938', '1939', '1937-1938']
        if issue_year not in years_to_consider:
            continue
        else:
            final_results[res_id] = row_dict
            res_id += 1
        
        print_progress += 1

    final_results_df = pd.DataFrame.from_dict(final_results)
    final_results_df = final_results_df.transpose()
    
    if save:
        final_results_df.to_csv(os.path.join(RESULTS_PATH, f'UNFILTERED_{results_filename}'), encoding='utf-8', quotechar='"', sep=';')

    if return_df:
        return final_results_df

In [None]:
make_unfiltered_search_dataframe(results_filename='batch_results.csv')
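
Besides dropping duplicates, the function above discards results whose issue year falls outside 1925-1939. A small sketch of that check follows, assuming dates in the `dd.mm.yyyy` form used in the code (for example `30.06.1935`); the helper name `is_in_scope` is hypothetical.

```python
# Accepted years, including the combined '1937-1938' value that appears in the year list above.
YEARS_TO_CONSIDER = {str(year) for year in range(1925, 1940)} | {'1937-1938'}


def is_in_scope(date):
    """Return True if the last dot-separated part of the date is an accepted year."""
    issue_year = date.split('.')[-1]
    return issue_year in YEARS_TO_CONSIDER


print(is_in_scope('30.06.1935'))
print(is_in_scope('01.01.1940'))
```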

### 2) Connecting consecutive results
- In this part, citations that span the border between two consecutive query parts are joined, so that they are not reported twice.
+ approximate match probabilities are "calculated"
- The probability calculation is only a suggestion; the resulting values should not be taken too seriously.
    - in general, the match probability is matched_characters*matched_subverses_score, i.e. it takes into account how large a part of the verse is matched (and how large the edit distance is)
- In addition, the script checks for irregularities in "exclusive" words, e.g. "He said a sentence" vs. "I said a sentence": the edit distance is within tolerance, but the sentences should probably not be considered "the same". For each matched subverse the test returns either True or False, which may help you with the manual evaluation. The exclusive words are defined in [exclusives.txt](https://github.com/DigilabNLCR/BibleCitations/blob/main/exclusives.txt); always put mutually exclusive words on one line.
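
The match-probability product described in the list above can be worked through on toy numbers. The sketch mirrors the formulas used in `check_for_verse`; the function name `match_probability` and the input values are illustrative only.

```python
def match_probability(matched_chars, edit_distance, matched_subverses, total_subverses):
    """Return (matched_characters, matched_subverses_score, match_probability)."""
    # Share of characters of the matched subverses that did not need editing.
    matched_characters = (matched_chars - edit_distance) / matched_chars
    # Share of the verse's subverses that were matched at all.
    matched_subverses_score = matched_subverses / total_subverses
    return matched_characters, matched_subverses_score, matched_characters * matched_subverses_score


# Toy numbers: 2 of 4 subverses matched, 50 characters in them, 5 edits in total.
print(match_probability(50, 5, 2, 4))
```

A verse matched in half of its subverses with a few typos thus ends up well below 0.5, which is why the value is best read as a ranking hint rather than a real probability.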

--> creates 'FILTERED_UNFILTERED_batch_results.csv' file
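
To illustrate the exclusiveness check, here is a simplified, self-contained sketch. It assumes that each line of the exclusives file holds comma-separated words, uses plain lowercasing and whitespace splitting in place of `bip.normalize_string` and `bip.word_tokenize_no_punctuation`, and leaves out the special handling of "je"/"jest". The helper names and the example groups are hypothetical.

```python
from collections import defaultdict


def build_exclusives(lines):
    """Map every word to the set of words it is mutually exclusive with."""
    exclusives = defaultdict(set)
    for line in lines:
        words = [word.strip().lower() for word in line.split(',') if word.strip()]
        for word_from in words:
            for word_to in words:
                if word_from != word_to:
                    exclusives[word_from].add(word_to)
    return exclusives


def passes_exclusiveness_test(subverse, query, exclusives):
    """Return False if an exclusive word pair occupies the same word position."""
    for sub_word, query_word in zip(subverse.lower().split(), query.lower().split()):
        if query_word in exclusives.get(sub_word, ()):
            return False
    return True


# Hypothetical groups, one group per line as in exclusives.txt.
exclusives = build_exclusives(['he, i', 'naše, vaše'])
print(passes_exclusiveness_test('He said a sentence', 'I said a sentence', exclusives))
print(passes_exclusiveness_test('He said a sentence', 'He said a sentense', exclusives))
```

Because the words are compared position by position, the check only catches substitutions that keep the word order intact.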

In [None]:
""" Define mutually exclusive words in exclusives.txt """
with open(EXCLUSIVES_PATH, 'r', encoding='utf-8') as exclusives_file:
    data = exclusives_file.read()
    words_lines = data.split('\n')

    exclusives_dict = defaultdict(list)
    list_of_exclusives = []

    for line in words_lines:
        word_list = line.split(', ')
        for word_from in word_list:
            for word_to in word_list:
                if word_from == word_to:
                    continue
                if word_from == 'je' and word_to == 'jest':
                    continue
                if word_from == 'jest' and word_to == 'je':
                    continue
                else:
                    exclusives_dict[bip.normalize_string(word_from)].append(bip.normalize_string(word_to))
            list_of_exclusives.append(bip.normalize_string(word_from))


def exclusiveness_test(subverse_string:str, query_string:str) -> bool:
    """
    This function checks whether the detected string is a false positive based on mutually exclusive words, e.g. naše vs. vaše, je vs. není, etc.
    """
    subverse_string = bip.normalize_string(subverse_string)
    query_string = bip.normalize_string(query_string)

    subverse_words = bip.word_tokenize_no_punctuation(subverse_string)
    query_words = bip.word_tokenize_no_punctuation(query_string)

    for i, word in enumerate(subverse_words):
        if word in list_of_exclusives:
            list_to_ex = exclusives_dict[word]
            try:
                if query_words[i] in list_to_ex:
                    return False
            except IndexError:
                continue

    return True


def get_row_data_for_initial_filter(dataframe:pd.core.frame.DataFrame, row_id:int):
    """ This function returns search properties of a given row in the results dataframe to be used in check_results function. """

    query_file = dataframe.loc[row_id]['query_file']
    query_window_len = dataframe.loc[row_id]['query_window_len']
    query_overlap = dataframe.loc[row_id]['query_overlap']

    return query_file, query_window_len, query_overlap


def get_verse_et_idx(dataframe:pd.core.frame.DataFrame, row_id:int):
    """ This function returns search properties of a given row in the results dataframe. """

    verse_id = dataframe.loc[row_id, 'verse_id']
    index_query_part = dataframe.loc[row_id, 'index_query_part']

    return verse_id, index_query_part


def select_attributions_to_json(dataframe:pd.core.frame.DataFrame, query_file:str):
    """ This function selects all attributions to a given JSON file. 
    
    It returns: dataframe of all of the results, row_ids to skip
    """
    subset_dataframe = dataframe[dataframe['query_file'] == query_file]

    # If the subset dataframe contains only one result, return it and empty skips.
    if len(subset_dataframe) == 1:
        verse_id, index_query_part = get_verse_et_idx(subset_dataframe, subset_dataframe.index[0])
        attributed_verses = {verse_id: [index_query_part]}
        return attributed_verses, []

    # If the subset dataframe contains more rows, check if further.
    else:
        row_ids_to_skip = subset_dataframe.index
        attributed_verses = defaultdict(list)
        for row_id in row_ids_to_skip:
            verse_id, index_query_part = get_verse_et_idx(dataframe=subset_dataframe, row_id=row_id)
            attributed_verses[verse_id].append(index_query_part)

        return attributed_verses, row_ids_to_skip


def join_overlap(list_of_parts:list, query_index:int) -> str:
    """ This function serves to join two parts of a query into one string (when the citation has been discovered in two consecutive parts of the query document). """
    output = ''

    sentences_in_1 = sent_tokenize(list_of_parts[query_index])
    try:
        sentences_in_2 = sent_tokenize(list_of_parts[query_index+1])
    except IndexError:
        # There is no following part (query_index points to the last part), so only the first part is returned.
        sentences_in_2 = []

    for sent_1 in sentences_in_1:
        if sent_1 not in sentences_in_2:
            output += sent_1 + ' '
        else:
            break

    for sent_2 in sentences_in_2:
        output += sent_2 + ' '

    return output.strip()


def fuzzy_string_matching_for_implementation_with_text(subverse_string:str, query_string:str, tolerance=0.85):
    """ 
    Contrary to fuzzy_string_matching_for_implementation(), this function also returns the matched part of the query string and the edit distance of the compared strings. The function is duplicated so as not to slow down the function used in the broad search. However, the speed difference has not been tested yet.

    This function implements typo-tolerant similarity detection for two strings. It returns a tuple: (bool value of match, matched part of the query string, edit distance).

    :param subverse_string: string of the biblical subverse we are searching for.
    :param query_string: string in which we are searching for the subverse_string.
    :param tolerance: how large a proportion of the subverse_string must be present in query_string to consider it a match.
    """
    subverse_string = bip.normalize_string(subverse_string)
    subverse_len = len(subverse_string)

    query_string = bip.normalize_string(query_string)
    query_len = len(query_string)

    tolerance = subverse_len * (1-tolerance)

    if subverse_len-tolerance > query_len:
        # If subverse is longer than query string, it is not a match by default
        return False, '', 0
    elif subverse_len-tolerance <= query_len <= subverse_len+tolerance:
        # If the subverse is more or less of the same length as the query string, just compare them.
        edit_distance = Levenshtein.distance(subverse_string, query_string)
        if edit_distance <= tolerance:
            return True, query_string, edit_distance
    else:
        char_len_sub = len(subverse_string)
        word_len_subv = len(word_tokenize(subverse_string))
        words_in_query_string = word_tokenize(query_string)
        word_len_query_string = len(words_in_query_string)

        for i, cycle in enumerate(range(word_len_subv, word_len_query_string+1)):
            gram_str = ' '.join(words_in_query_string[i:])[:char_len_sub]
            edit_distance = Levenshtein.distance(subverse_string, gram_str)
            if edit_distance <= tolerance:
                return True, gram_str, edit_distance
            else:
                continue
    
    return False, '', 0


def check_for_verse(verse_id:str, string_to_check:str) -> dict:
    """ This function performs the inner check for a verse in all available translations. It is implemented in the check_results() function. """
    possible_citations = []

    for trsl in bip.all_translations:
        verse_text = bip.get_verse_text(trsl, verse_id, print_exceptions=False)
        
        if verse_text:
            subverses = bip.split_verse(verse_text, tole_len=21, return_shorts=True, short_limit=9)

            fuzzy_matched_subs_num = 0
            fuzzy_matched_subs = []
            matched_subs_edit_distance = 0
            matched_subs_chars = 0
            exclusive_matched_subs_num = 0

            for subverse in subverses:
                # check for every subverse in edit distance
                fuzzy_match, query_match, edit_distance = fuzzy_string_matching_for_implementation_with_text(subverse, query_string=string_to_check, tolerance=0.85)
                if fuzzy_match:
                    fuzzy_matched_subs_num += 1
                    fuzzy_matched_subs.append(subverse)
                    matched_subs_edit_distance += edit_distance
                    matched_subs_chars += len(subverse)

                    # run the exclusiveness test
                    if exclusiveness_test(subverse, query_match):
                        exclusive_matched_subs_num += 1

                else:
                    continue

            if fuzzy_matched_subs_num == 0:
                continue
            else:
                matched_characters = (matched_subs_chars-matched_subs_edit_distance)/matched_subs_chars
                matched_subverses_score = fuzzy_matched_subs_num/len(subverses)

                match_probability = matched_characters*matched_subverses_score

                result_for_trsl = {'verse_id': verse_id,
                                    'verse_text': verse_text, 
                                    'matched_subverses': fuzzy_matched_subs, 
                                    'query_string': string_to_check, 
                                    'matched_characters': (matched_subs_chars-matched_subs_edit_distance)/matched_subs_chars, 
                                    'matched_subverses_score': fuzzy_matched_subs_num/len(subverses),
                                    'exclusives_match': exclusive_matched_subs_num/fuzzy_matched_subs_num,
                                    'match_probability': match_probability}
            
                possible_citations.append(result_for_trsl)

    # If there is no possible citation (which should not happen and probably means that the verses are split differently in the originally used BibleDataset), return a result that is basically False:
    if not possible_citations:
        result = {'verse_id': verse_id,
                    'verse_text': verse_text, 
                    'matched_subverses': [], 
                    'query_string': string_to_check, 
                    'matched_characters': 0, 
                    'matched_subverses_score': 0,
                    'exclusives_match': 0,
                    'match_probability': 'FALSE'}
        
        return result
    
    # Now, if the results seem OK, select the best match (translations as such are not evaluated, just select the best of all possible results). The result with the highest matched-subverses score wins; ties are broken by the matched-characters score and finally by the exclusiveness test result.
    # max() compares the key tuples element by element and returns the first candidate with the best key.
    return max(possible_citations, key=lambda pc: (pc['matched_subverses_score'], pc['matched_characters'], pc['exclusives_match']))


def load_data_from_journals_fulldata(journals_fulldata:dict, query_file:str):
    journal = journals_fulldata[query_file]['journal']
    issue_date = journals_fulldata[query_file]['issue_date']
    issue_page = journals_fulldata[query_file]['issue_page']
    issue_uuid = journals_fulldata[query_file]['issue_uuid']
    kramerius_url = journals_fulldata[query_file]['kramerius_url']
    full_query_string = journals_fulldata[query_file]['text']

    return journal, issue_date, issue_page, issue_uuid, kramerius_url, full_query_string


def evaluate_attributions_in_doc(attributed_verses:dict, query_file:str, query_window_len:int, query_overlap:int, journals_fulldata:dict) -> list:
    """ This function evaluates attributed verses, supposedly detected in a single JSON file. """
    # Load data from journals_fulldata dictionary:
    journal, issue_date, issue_page, issue_uuid, kramerius_url, full_query_string = load_data_from_journals_fulldata(journals_fulldata=journals_fulldata, query_file=query_file)
    
    # NOTE: repair wrong date with 334149b0-877c-11e6-8aeb-5ef3fc9ae867
    if '334149b0-877c-11e6-8aeb-5ef3fc9ae867' in issue_uuid:
        issue_date = '30.06.1935'

    query_parts = bip.split_query(full_query_string, window_len=query_window_len, overlap=query_overlap)
    
    results_of_attributions = []
  
    for verse_id in attributed_verses:
        attributed_idxs = attributed_verses[verse_id]
        if len(attributed_idxs) == 1:
            string_to_check = query_parts[attributed_idxs[0]]
            possible_citation = check_for_verse(verse_id=verse_id, string_to_check=string_to_check)
            results_of_attributions.append(possible_citation)
            
        else:
            skip = False
            for i, q_idx in enumerate(attributed_idxs):
                if not skip:
                    try:
                        if attributed_idxs[i+1] == q_idx+1:
                            # checking if the next part is a joined sequence
                            skip = True
                            string_to_check = join_overlap(query_parts, q_idx)
                            possible_citation = check_for_verse(verse_id=verse_id, string_to_check=string_to_check)
                            results_of_attributions.append(possible_citation)
                        else:
                            string_to_check = query_parts[q_idx]
                            possible_citation = check_for_verse(verse_id=verse_id, string_to_check=string_to_check)
                            results_of_attributions.append(possible_citation)
                    except IndexError:
                        string_to_check = query_parts[attributed_idxs[i]]
                        possible_citation = check_for_verse(verse_id=verse_id, string_to_check=string_to_check)
                        results_of_attributions.append(possible_citation)
                else:
                    skip = False
                    continue

    # TODO: add all other parameters of the found citation here --> they are then returned and added to the resulting DF.
    if len(results_of_attributions) == 1:
        results_of_attributions[0]['multiple_attribution'] = False
        results_of_attributions[0]['journal'] = journal
        results_of_attributions[0]['date'] = issue_date
        results_of_attributions[0]['page_num'] = issue_page
        results_of_attributions[0]['uuid'] = issue_uuid
        results_of_attributions[0]['kramerius_url'] = kramerius_url
    else:
        for res in results_of_attributions:
            res['multiple_attribution'] = True
            res['journal'] = journal
            res['date'] = issue_date
            res['page_num'] = issue_page
            res['uuid'] = issue_uuid
            res['kramerius_url'] = kramerius_url

    return results_of_attributions
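
`evaluate_attributions_in_doc` checks a verse against a joined string whenever it was found in two neighbouring query parts, and against a single part otherwise. The grouping of part indexes can be sketched on its own as follows; `pair_consecutive` is a hypothetical helper, and the sketch assumes the indexes are sorted in ascending order.

```python
def pair_consecutive(indexes):
    """Group sorted query-part indexes into pairs of neighbours or singletons."""
    groups = []
    i = 0
    while i < len(indexes):
        if i + 1 < len(indexes) and indexes[i + 1] == indexes[i] + 1:
            # Two neighbouring parts: they are joined and checked as one string.
            groups.append((indexes[i], indexes[i + 1]))
            i += 2
        else:
            # An isolated part is checked on its own.
            groups.append((indexes[i],))
            i += 1
    return groups


print(pair_consecutive([3, 4, 7, 9, 10]))
```

Only two parts are joined at a time, so a run of three neighbouring indexes yields one pair and one single part.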

In [None]:
def make_filtered_search_dataframe(results_filename='UNFILTERED_batch_results.csv', save=True, return_df=False):
    """ This function applies initial checks to the preliminary results. """
    # Load results:
    print('Loading UNFILTERED results...')
    results_dataframe = load_results(results_filename, delimiter=';')

    # Load journals_fulldata:
    print('Loading journals_fulldata.joblib...')
    journals_full_data = joblib.load(os.path.join(ROOT_PATH, 'journals_fulldata.joblib'))

    # Create (empty) final results dataframe:
    final_results = {}
    res_id = 0

    print('Running initial filtering...')
    rows_to_skip = []
    print_progress = 0
    iter_ = 0
    for row_id in results_dataframe.index:
        print_progress += 1
        iter_ += 1
        if print_progress >= 500:
            print('\t', iter_, '/', len(results_dataframe))
            print_progress = 0

        if row_id in rows_to_skip:
            continue
        else:
            query_file, query_window_len, query_overlap = get_row_data_for_initial_filter(dataframe=results_dataframe, row_id=row_id)
            attributed_verses, add_to_skip = select_attributions_to_json(dataframe=results_dataframe, query_file=query_file)
            rows_to_skip.extend(add_to_skip)

            results = evaluate_attributions_in_doc(attributed_verses=attributed_verses, query_file=query_file, query_window_len=query_window_len, query_overlap=query_overlap, journals_fulldata=journals_full_data)

            for res in results:
                final_results[res_id] = res
                res_id += 1

    final_results_df = pd.DataFrame.from_dict(final_results)
    final_results_df = final_results_df.transpose()
    
    if save:
        final_results_df.to_csv(os.path.join(RESULTS_PATH, f'FILTERED_{results_filename}'), encoding='utf-8', quotechar='"', sep=';')

    if return_df:
        return final_results_df

In [None]:
make_filtered_search_dataframe(results_filename='UNFILTERED_batch_results.csv')
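
Step 2 hinges on the fuzzy comparison in `fuzzy_string_matching_for_implementation_with_text`. The sketch below shows the sliding-window idea in isolation. It is a simplification under stated assumptions: a pure-Python edit distance stands in for `Levenshtein.distance`, no string normalization is applied, the length-based shortcuts are left out, and every word of the query is tried as a window start. The names `levenshtein` and `find_fuzzy` are illustrative.

```python
def levenshtein(a, b):
    """Plain dynamic-programming edit distance (stand-in for Levenshtein.distance)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def find_fuzzy(subverse, query, tolerance=0.85):
    """Compare `subverse` with windows of equal character length starting at each word of `query`."""
    max_edits = len(subverse) * (1 - tolerance)
    words = query.split()
    for start in range(len(words)):
        window = ' '.join(words[start:])[:len(subverse)]
        distance = levenshtein(subverse, window)
        if distance <= max_edits:
            return True, window, distance
    return False, '', 0


print(find_fuzzy('in the beginning god created',
                 'as it is written in the begining god created the heaven'))
```

With `tolerance=0.85`, a 28-character subverse may differ from the window by about four edits, so the misspelled "begining" still counts as a match.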

### 3) Filtering stop-subverses.
- Define the evaluation stop-subverses in [evaluation_stop_subverses_21.txt](https://github.com/DigilabNLCR/BibleCitations/blob/main/evaluation_stop_subverses_21.txt). The structure of this file must reflect the structure of the matched_subverses column of the result dataset, i.e. ['subverse text'].
- During this step, citations that include only subverses defined in [evaluation_stop_subverses_21.txt](https://github.com/DigilabNLCR/BibleCitations/blob/main/evaluation_stop_subverses_21.txt) are ignored.
- Choose the subverses to filter wisely: some subverses that do not seem worth keeping may indeed be true Bible quotes.
- In addition, the [100_hit_needed_subs_21.txt](https://github.com/DigilabNLCR/BibleCitations/blob/main/100_hit_needed_subs_21.txt) file lists the verses that need to be cited in full (the subverse string must exactly match the passage where it is supposed to be cited, i.e. no "typos" are allowed).
--> creates 'ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file
--> + creates 'FILTERED_BY_STOP_SUBS.csv' file (so you can check the filtered out citations, too)
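
Since this step is not yet implemented in the notebook, the following is only a sketch of the filtering rule described above: a result is set aside when every one of its matched subverses is a stop-subverse. It assumes `matched_subverses` is available as a Python list (the real files store its string representation), and the helper name and data are made up.

```python
def split_by_stop_subverses(results, stop_subverses):
    """Separate results whose matched subverses are all stop-subverses."""
    stop = set(stop_subverses)
    kept, filtered_out = [], []
    for result in results:
        matched = result['matched_subverses']
        if matched and all(subverse in stop for subverse in matched):
            filtered_out.append(result)
        else:
            kept.append(result)
    return kept, filtered_out


# Toy data, not taken from the real result files.
stop_subverses = ['and he said to them']
results = [
    {'verse_id': 'A', 'matched_subverses': ['and he said to them']},
    {'verse_id': 'B', 'matched_subverses': ['and he said to them', 'blessed are the meek']},
]
kept, filtered_out = split_by_stop_subverses(results, stop_subverses)
print([r['verse_id'] for r in kept], [r['verse_id'] for r in filtered_out])
```

Returning the filtered-out results as a second list corresponds to writing them to a separate file so that they can still be inspected.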


In [None]:
# TODO: continue here

### 4) Drop 'hidden' duplicates
These are duplicate results that formally have different query strings, although one of the query strings actually contains the other (this means that there was an overlap in the split of the query document, most likely at the end of the document).
--> creates 'DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file
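
A possible way to detect such hidden duplicates is a substring test between the query strings of results that point to the same verse in the same document. The sketch below is an assumption about how this could be done, not the implementation in `evaluate.py`: it groups by `uuid` and `verse_id`, keeps the result with the longer query string, and keeps the first one when two strings are identical.

```python
def drop_hidden_duplicates(results):
    """Drop a result when another result for the same document and verse contains its query string."""
    kept = []
    for i, result in enumerate(results):
        contained = False
        for j, other in enumerate(results):
            if i == j:
                continue
            if (result['uuid'], result['verse_id']) != (other['uuid'], other['verse_id']):
                continue
            shorter, longer = result['query_string'], other['query_string']
            # Identical strings: only the later one counts as the duplicate.
            if shorter in longer and (len(shorter) < len(longer) or j < i):
                contained = True
                break
        if not contained:
            kept.append(result)
    return kept


# Toy data: the first query string is contained in the second one.
results = [
    {'uuid': 'u1', 'verse_id': 'J 3:16', 'query_string': 'for god so loved the world'},
    {'uuid': 'u1', 'verse_id': 'J 3:16', 'query_string': 'it is written: for god so loved the world that he gave'},
    {'uuid': 'u2', 'verse_id': 'J 3:16', 'query_string': 'for god so loved the world'},
]
print(len(drop_hidden_duplicates(results)))
```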


### 5) Marking multiple attributions
In this case the multiple attributions are not dropped, but kept with a column that suggests whether the result should be dropped or not.
--> creates 'MA_DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file

### 6) Marking 'sure' citations
- We consider a 'sure' citation to be one that is cited [TO BE COMPLETED]
--> creates 'FINAL_MA_DUPS_ST_SUBS_FILTERED_UNFILTERED_batch_results.csv' file

## Working with results.

In [None]:
mutual_verses = {
    'L 11:3/Mt 6:11': ['L 11:3', 'Mt 6:11'],
    'Mk 13:31/Mt 24:35/L 21:33': ['Mk 13:31', 'Mt 24:35', 'L 21:33'],
    'Ex 20:16/Dt 5:20': ['Ex 20:16', 'Dt 5:20'],
    '2K 1:2/Fp 1:2/2Te 1:2/1K 1:3/Ef 1:2/Ga 1:3': ['2K 1:2', 'Fp 1:2', '2Te 1:2', '1K 1:3', 'Ef 1:2', 'Ga 1:3'],
    'Mt 11:15/Mt 13:9': ['Mt 11:15', 'Mt 13:9']
    
}