# Weekly Meeting 9

* Check whether the ancestors (category graph) and the value intersections (infobox graph) are complementary.
* Separate group and individual information in the input for BART.
* Experiments on reducing the scores of negative candidates.

In [4]:
from pyvis import network as net
import itertools
import pickle as pkl
import networkx as nx
import numpy as np
from networkx.algorithms.shortest_paths import shortest_path as get_shortest_path
import sys
sys.path.insert(0, "../")
import modeling

## Are the ancestors and the intersections complementary?

"61.68% of the entity tuples have no intersections within 1 hop of the infobox graph"

| Entity tuple | Infobox | Category |
| --- | --- | --- |
| (A, B) | ✅ | ❌ |
| ... | | |
| (Y, Z) | ❌ | ✅ |

In [5]:
def get_tesa_information(path):
    with open(path, "rb") as fr:
        modeling_task = pkl.load(fr)
    
    entities = []
    aggregations = []
    gold_standards = []
    types = []
    
    # Iterate over the ranking tasks of the train, validation and test loaders.
    for ranking_task in list(itertools.chain(*[modeling_task.train_loader,
                                               modeling_task.valid_loader,
                                               modeling_task.test_loader])):
        aggregations.append([])
        gold_standards.append([])
        entities.append(ranking_task[0][0]["entities"])
        types.append(ranking_task[0][0]["entities_type"])
        for (inp, trg) in ranking_task:
            gold_standards[-1].extend(trg)
            aggregations[-1].extend(inp["choices"])
    
    return entities, aggregations, gold_standards, types

def get_hops(entity, infobox_graph, hops=1):
    """Map each depth (0..hops) to the nodes reached at that depth, plus "all" for their union."""
    hops_map = {0: {entity}}
    for i in range(1, hops + 1):
        hops_map[i] = set()
        if entity not in infobox_graph:
            continue
        for node in hops_map[i - 1]:
            # Skip empty values and values containing "WP:".
            successors = {successor for successor in infobox_graph.successors(node)
                          if successor and "WP:" not in successor}
            hops_map[i] |= successors

    # If the entity is not in the graph, every deeper level is empty and the union is just {entity}.
    hops_map["all"] = set.union(*hops_map.values())
    return hops_map

def get_depth_intersection(intersection, entities_hops):
    depth_intersection = []
    for node in intersection:
        sum_depths = 0
        for entity in entities_hops:
            for depth in entities_hops[entity]:
                if depth != "all" and node in entities_hops[entity][depth]:
                    sum_depths += depth
                    break
        depth_intersection.append((node, sum_depths))
    return sorted(list(set(depth_intersection)), key=lambda x: x[1])
                        
    
def infobox_value_intersection(aggregatable_entities, infobox_graph, hops=1, topk=10):
    """
    Given a list of entity tuples, for each tuple:
        1) Computes the successors of each entity up to a maximum depth *hops*.
        2) Intersects the successor sets (common successors).
        3) Sorts the common successors by the sum of their distances to the entities and keeps the *topk* lowest.
    """
    
    intersections = {}
    
    for entities in aggregatable_entities:
        # 1)
        entities_hops = {entity: get_hops(entity, infobox_graph, hops) for entity in entities}
        # 2)
        all_hops = [entities_hops[entity]["all"] for entity in entities_hops]
        intersection = set.intersection(*all_hops)
        # 3)
        depth_intersections = get_depth_intersection(intersection, entities_hops)[:topk]

        intersections[tuple(entities)] = [k for k, _ in depth_intersections]
    
    return intersections
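
A toy illustration of steps 1) and 2) above, restricted to one hop. This is only a sketch: `toy_graph`, `one_hop` and `common_values` are hypothetical names, a plain dict stands in for the real infobox graph, and the depth-based sorting of step 3) is left out.

```python
# Toy adjacency map standing in for the infobox graph (all names are made up).
toy_graph = {
    "Paris": {"France", "Seine"},
    "Lyon": {"France", "Rhone"},
}


def one_hop(entity, graph):
    """Depth-0 node (the entity itself) plus its direct successors."""
    return {entity} | graph.get(entity, set())


def common_values(entities, graph):
    """Values shared by every entity of the tuple within one hop."""
    return set.intersection(*(one_hop(entity, graph) for entity in entities))


print(common_values(("Paris", "Lyon"), toy_graph))    # {'France'}
print(common_values(("Paris", "Berlin"), toy_graph))  # set()
```

An entity that is missing from the graph contributes only itself, so its tuple ends up with an empty intersection, as in `get_hops`.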

In [6]:
with open("./knowledge_graphs/infobox_graph_depth-3.pkl", "rb") as fr:
    infobox_graph = pkl.load(fr)
    
with open("./knowledge_graphs/6_lowest_common_ancestors_graph.pkl", "rb") as fr:
    ancestors = pkl.load(fr)

In [7]:
path = "./context-dependent-same-type_50-25-25_rs24_bs4_cf-v0_tf-v2.pkl"
aggregatable_entities, candidates, gold_standards, types = get_tesa_information(path)

In [8]:
def make_analysis(intersections):
    # Fraction of entity tuples whose intersection is empty.
    n_empty = sum(1 for values in intersections.values() if not values)
    return {"empty_intersections": n_empty / len(intersections)}

In [9]:
intersections_1 = infobox_value_intersection(aggregatable_entities, infobox_graph, hops=1, topk=999)

In [10]:
aggregatable_entities = set([tuple(ae) for ae in aggregatable_entities])
h = {"infobox": 0, "category": 0, "all": 0, "any": 0, "none": 0}

for entities in aggregatable_entities:
    inters = intersections_1[tuple(entities)]
    lca = ancestors[tuple(entities)]
    
    if inters or lca:
        h["any"] += 1
        if inters:
            h["infobox"] += 1
        if lca:
            h["category"] += 1
    
        if inters and lca:
            h["all"] += 1
        
    else:
        h["none"] += 1

#for key in h:
#    print(key, ":", h[key] / len(aggregatable_entities))

Percentages over unique entity tuples

| Infobox | Category | All | Any | None |
| --- | --- | --- | --- | --- |
| 38.32% | 80.84% | 38.10% | 81.06% | 18.94% |

The share of entity tuples with information from both graphs (38.10%) is almost the same as the share with infobox intersections (38.32%, the smaller of the two graphs), and the share with information from at least one graph (81.06%) is almost the same as the share with category ancestors (80.84%, the larger of the two). In other words, the set of entity tuples that have intersections within 1 hop of the infobox graph is $\approx$ a subset of the set of entity tuples that have lowest common ancestors in the category graph, so in terms of coverage the infobox graph adds very little on top of the category graph.

| Entity tuple | Infobox | Category |
| --- | --- | --- |
| (A, B) | ✅ | ✅ |
| ... | | |
| (Y, Z) | ❌ | ✅ |

In [17]:
# Entity tuples with a non-empty list of ancestors / infobox intersections.
entities_with_ancestors = {ent for ent, lca in ancestors.items() if lca}
entities_with_infobox = {ent for ent, inters in intersections_1.items() if inters}
print("Entities with ancestors:", len(entities_with_ancestors))
print("Entities with infobox:", len(entities_with_infobox))
print("Len intersection:", len(entities_with_ancestors.intersection(entities_with_infobox)))

Entities with ancestors: 1080
Entities with infobox: 512
Len intersection: 509
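
The counts printed above can be turned into a containment ratio, which makes the "approximately a subset" claim concrete. This is a small arithmetic sketch; the variable names are illustrative and the numbers are copied from the output above.

```python
# Counts copied from the cell output above.
n_ancestors = 1080  # tuples with lowest common ancestors (category graph)
n_infobox = 512     # tuples with 1-hop intersections (infobox graph)
n_both = 509        # tuples with both

containment = n_both / n_infobox               # share of infobox tuples that also have ancestors
infobox_only = n_infobox - n_both              # tuples covered only by the infobox graph
n_any = n_ancestors + n_infobox - n_both       # tuples covered by at least one graph

print(f"{containment:.2%}")  # 99.41%
print(infobox_only)          # 3
print(n_any)                 # 1083
```

Only 3 entity tuples have infobox intersections but no category ancestors.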


## Separators for group/individual information

I treated the backgrounds of the entities as individual information, and the ancestors from the category graph, the value intersections from the infobox graph and the context as group information. The set of entities forms a separate "query" block:

> &lt;s>
<br>**(S)** $B_1$ ... $B_N$ **<single/group sep>**
<br>**(G)** $E_1$, ... and $E_N$ are related to: $A_1$, ... ,$A_P$, $I_1$, ... and $I_Q$ **<graph/article sep>**
<br>**(G)** $T: C$ **<group/query sep>**
<br>**(Q)** $E_1$, ..., $E_N$
<br>&lt;/s>


<single/group sep> = "µ"
<br><graph/article sep> = "." in SGGQ-1 and "§" in SGGQ-2
<br><group/query sep> = "£"
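
The template and separators above could be assembled roughly as follows. This is a minimal sketch, not the actual preprocessing code: `build_sggq_input`, `join_with_and` and their parameters are hypothetical, `title` and `context` are assumed to correspond to $T$ and $C$, the whitespace around separators is a guess, and `<s>` / `</s>` are assumed to be added by the tokenizer.

```python
SINGLE_GROUP_SEP = "µ"
GROUP_QUERY_SEP = "£"
GRAPH_ARTICLE_SEP = {"SGGQ-1": ".", "SGGQ-2": "§"}


def join_with_and(items):
    """'a', 'b', 'c' -> 'a, b and c' (hypothetical helper)."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def build_sggq_input(backgrounds, entities, ancestors, intersections,
                     title, context, variant="SGGQ-1"):
    """Concatenate the (S), (G), (G) and (Q) blocks with their separators."""
    single = " ".join(backgrounds)                                # (S) individual information
    related = join_with_and(list(ancestors) + list(intersections))
    graph = f"{join_with_and(entities)} are related to: {related}"  # (G) graph information
    article = f"{title}: {context}"                               # (G) article information
    query = ", ".join(entities)                                   # (Q) query block
    return (f"{single} {SINGLE_GROUP_SEP} {graph} {GRAPH_ARTICLE_SEP[variant]} "
            f"{article} {GROUP_QUERY_SEP} {query}")


print(build_sggq_input(["B1", "B2"], ["E1", "E2"], ["A1"], ["I1"],
                       "T", "C", variant="SGGQ-2"))
# B1 B2 µ E1 and E2 are related to: A1 and I1 § T: C £ E1, E2
```

The sketch does not handle tuples without any ancestors or intersections; in that case the graph block would presumably be dropped or left empty.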


| System | MAP | R@10 | MRR |
| --- | --- | --- | --- |
| BCE | 83.07 | 93.02 | 93.90 |
| AIBCE (I: 1-hop, k=all, A: k=6) | 83.25 | **93.40** | 94.55 |
| SGGQ-1 | **83.65** | 92.58 | **95.12** |
| SGGQ-2 | 83.48 | 92.52 | 94.73 |

## Reducing the scores for negative candidates

So far we have focused on using the knowledge graphs to rank the gold aggregations higher; these experiments instead aim to rank the distractors lower.

Idea: use symbolic information extracted from the knowledge graphs as candidates during training. Three options (from the previous meeting):

1. $C^+$ = {G, $S^+$}, $C^-$ = N
2. $C^+$ = {G, $S^+$}, $C^-$ = {N, $S^-$} 
3. $C^+_1$ = {G, $S^+$}, $C^-_1$ = {N, $S^-$} and $C^+_2$ = G, $C^-_2$ = N; alternate them during training.
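
The three options above can be sketched as a single selection function. This is illustrative only: `build_candidates` and its parameters are hypothetical names, the candidates are plain lists of strings, and alternating on the parity of a step counter is an assumption about how option 3 "alternates".

```python
def build_candidates(option, gold, negatives, sym_pos, sym_neg, step=0):
    """Return (positives, negatives) for one training step.

    gold      -> G,   negatives -> N
    sym_pos   -> S+,  sym_neg   -> S-
    """
    if option == 1:
        return gold + sym_pos, negatives
    if option == 2:
        return gold + sym_pos, negatives + sym_neg
    if option == 3:
        # Assumption: even steps use the symbolic candidates, odd steps do not.
        if step % 2 == 0:
            return gold + sym_pos, negatives + sym_neg
        return gold, negatives
    raise ValueError(f"unknown option: {option}")


pos, neg = build_candidates(3, ["g"], ["n"], ["s+"], ["s-"], step=1)
print(pos, neg)  # ['g'] ['n']
```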


Experimental details:

* $C^-$ is not used in the generative setup, so in this case I use only $C^+$ = {G, $S^+$}. Open question: how could $C^-$ be integrated into the training process? One option is [unlikelihood training](https://arxiv.org/pdf/1908.04319.pdf).


* I didn't use any information from the graphs in the input (input format: BCE).
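
On the open question about $C^-$ in the first bullet: a rough sketch of what an unlikelihood-style penalty for a negative candidate could look like. This is an assumption-laden adaptation, not something that was run: it applies the $-\log(1 - p)$ term of the linked paper to every token of a negative candidate, `unlikelihood_penalty` is a hypothetical name, and the token probabilities would have to come from the generative model.

```python
import math


def unlikelihood_penalty(token_probs, eps=1e-8):
    """Sum of -log(1 - p) over the tokens of a negative candidate.

    token_probs: probabilities the model assigns to each token of the candidate.
    The penalty grows as the model becomes more confident in the negative.
    """
    total = 0.0
    for p in token_probs:
        p = min(max(p, 0.0), 1.0 - eps)  # clamp to keep the log finite
        total += -math.log(1.0 - p)
    return total


# A confidently generated negative is penalised more than an unlikely one.
print(unlikelihood_penalty([0.9, 0.9]) > unlikelihood_penalty([0.1, 0.1]))  # True
```

How such a term would be weighted against the usual likelihood loss on $C^+$ is left open here.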


| System | MAP | R@10 | MRR |
| --- | --- | --- | --- |
| Generative ($C^{+}$ $=$ {$G$, $A$})| 81.58 | 92.05 | 92.83 |
| Generative ($C^{+}$ $=$ {$G$, $I$})| 83.20 | 92.77 | 94.46 |
| Generative ($C^{+}$ $=$ {$G$, $A$, $I$})| 81.58 | 92.40 | 92.55 |
| Disc-1 | | | |
| Disc-2 | | | |
| Disc-3 | | | |

Aggregation statistics of the training set (the validation and test sets are not modified):

| System | # Pos | # Neg |
| --- | --- | --- |
| Generative | 2303 | 18289 |
| Generative ($S^{+}=A$)| 6545 | 14047 |
| Generative ($S^{+}=I$)| 3679 | 16913 |
| Generative ($S^{+}$ $=$ {$A$, $I$})| 7420 | 13172 |
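
To make the shift in class balance easier to read, the counts in the table above can be converted into positive shares. The snippet below is plain arithmetic on the table values; the dictionary keys are shorthand labels, not system names used elsewhere.

```python
# (# Pos, # Neg) copied from the table above.
counts = {
    "Generative": (2303, 18289),
    "S+ = A": (6545, 14047),
    "S+ = I": (3679, 16913),
    "S+ = {A, I}": (7420, 13172),
}

totals = {}
positive_share = {}
for system, (n_pos, n_neg) in counts.items():
    totals[system] = n_pos + n_neg
    positive_share[system] = n_pos / totals[system]
    print(f"{system}: total={totals[system]}, positive share={positive_share[system]:.2%}")
```

The total number of training aggregations is the same in every row (20592); only the split between positives and negatives changes.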