# MULTIVAC Pre-Cooked Output
## Introduction
Gallup's other Jupyter notebook, `precooked_replication.ipynb`, walks step by step through the various pieces of MULTIVAC, showing the data inputs and outputs of each. This notebook takes the end product created by those steps -- a knowledge graph and attributes, including semantic clustering in the form of a Markov logic network (MLN), built from the parsed articles and saved as a pickle file called `mln.pkl` -- and runs the evaluation steps against the data to assess value and accuracy.

## Query mapping
Once the MLN knowledge base has been compiled and test queries have been generated, the prototype MULTIVAC system provides a basic interface for mapping these queries to the knowledge base to retrieve answers.
 
The evaluation script is called from `USP.py` and takes three parameters determining the eval and results directories and the name of the file with queries to test. The directory parameters specify the locations of the query file and saved MLN data files, respectively; the eval directory is also where any resulting answers will be written out.
 
The system first reads in the MLN data, followed by the query file. Each question in the query file (one per line) is parsed using the Stanford CoreNLP parser engine, and a custom parse object is returned containing a list of token objects specifying form, lemma, POS tag, NER tag, as well as dependency and parent/child token relationships.
 
Once parsed, each question is analyzed, and key relationships and related arguments and dependencies are extracted. To begin, all non-auxiliary verbs with children are extracted as key relations to analyze; if no such verbs are present, the system picks the token identified as the root of the sentence, whatever it is. As this implies, a convoluted or complex question could indeed result in multiple "queries" being applied to the MULTIVAC knowledge base.
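
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the MULTIVAC implementation: `SimpleToken`, `AUX_DEPS`, and `key_relations` are hypothetical names, and the sketch assumes auxiliaries can be recognized by their dependency label.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical, simplified token type; the real MULTIVAC Token class differs.
@dataclass
class SimpleToken:
    form: str
    pos: str   # Penn Treebank POS tag, e.g. "VB"
    dep: str   # dependency label linking the token to its parent
    children: List["SimpleToken"] = field(default_factory=list)


# Assumption: auxiliary verbs are identified by these dependency labels.
AUX_DEPS = {"aux", "auxpass", "cop"}


def key_relations(tokens: List[SimpleToken]) -> List[SimpleToken]:
    """Return non-auxiliary verbs that have children, else the sentence root."""
    verbs = [t for t in tokens
             if t.pos.startswith("VB") and t.dep not in AUX_DEPS and t.children]
    if verbs:
        return verbs
    return [t for t in tokens if t.dep == "ROOT"]


# Hand-built parse of "What does smoking cause?"
what = SimpleToken("What", "WP", "dobj")
does = SimpleToken("does", "VBZ", "aux")
smoking = SimpleToken("smoking", "NN", "nsubj")
cause = SimpleToken("cause", "VB", "ROOT", children=[what, does, smoking])

relations = key_relations([what, does, smoking, cause])
print([t.form for t in relations])
```

A question with several qualifying verbs would yield several key relations, which is how one question can turn into multiple queries.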
 
For each key relationship extracted, MULTIVAC assumes there is at least one "known" component and one "missing" component that the query is attempting to locate. MULTIVAC searches for all child tokens that are nouns or are tagged as `nsubj`, `nsubjpass`, `dobj`, or `obj` dependents. If a matching `nsubj` or `nsubjpass` dependent is found, the missing argument is assumed to be of type `dobj`; otherwise, it is assumed to be of type `nsubj`. If the known argument itself has children, these are added to the argument as a compound phrase before matching against the knowledge base.
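
As a rough sketch of this known/missing split, the function below takes the children of one key relation as plain `(form, POS tag, dependency label)` triples. The function name and the triple representation are assumptions made for illustration, and the compound-phrase step is left out.

```python
from typing import List, Tuple

SUBJECT_DEPS = {"nsubj", "nsubjpass"}
ARGUMENT_DEPS = SUBJECT_DEPS | {"dobj", "obj"}

# (form, POS tag, dependency label); a hypothetical stand-in for parsed tokens.
Child = Tuple[str, str, str]


def split_arguments(children: List[Child]) -> Tuple[List[str], str]:
    """Return the known argument forms and the assumed type of the missing one."""
    # Candidate arguments: nouns, or tokens with a subject/object dependency.
    candidates = [c for c in children
                  if c[1].startswith("NN") or c[2] in ARGUMENT_DEPS]
    subjects = [c for c in candidates if c[2] in SUBJECT_DEPS]
    if subjects:
        # A subject is known, so the query is assumed to ask for the object.
        return [c[0] for c in subjects], "dobj"
    # No subject found, so the query is assumed to ask for the subject.
    return [c[0] for c in candidates], "nsubj"


# Children of "cause" in "What does smoking cause?"
known, missing = split_arguments([
    ("What", "WP", "dobj"),
    ("does", "VBZ", "aux"),
    ("smoking", "NN", "nsubj"),
])
print(known, missing)
```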
 
MULTIVAC maps a query to the MLN knowledge base by searching through the argument clusters contained within the key relation’s semantic cluster and finding matches to the known and missing argument types. For each set of argument type matches, the system attempts to find matching semantic clusters for the arguments, indicating a subnetwork within the MLN that is semantically isomorphic to the query and complete. If it finds such a match, it returns an answer filling in the missing information. Returned answers are compiled and printed out to a file in the eval directory called `Answers.txt`.
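
To make the matching idea concrete, here is a heavily simplified sketch that replaces the MLN with a toy dictionary. The structure of `kb` and the `answer` function are hypothetical; the real system matches semantic clusters rather than literal strings.

```python
from typing import Dict, List

# Hypothetical toy knowledge base: a relation cluster maps to a list of parts,
# and each part maps an argument type to the argument that fills it.
ToyKB = Dict[str, List[Dict[str, str]]]


def answer(kb: ToyKB, relation: str, known_type: str,
           known_value: str, missing_type: str) -> List[str]:
    """Return fillers for the missing argument in parts that match the query."""
    answers = []
    for part in kb.get(relation, []):
        # A part answers the query only if it agrees on the known argument
        # and actually contains an argument of the missing type.
        if part.get(known_type) == known_value and missing_type in part:
            answers.append(part[missing_type])
    return answers


kb: ToyKB = {
    "cause": [
        {"nsubj": "smoking", "dobj": "cancer"},
        {"nsubj": "virus", "dobj": "influenza"},
        {"nsubj": "smoking"},  # incomplete part: no object to return
    ]
}

print(answer(kb, "cause", "nsubj", "smoking", "dobj"))
```

The third part shares the known argument but has no object, so it is skipped, mirroring the requirement that a match be complete.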
 
As of this date, there are still complications with returning answers to the generated test queries from the MLN knowledgebase. The MULTIVAC team continues to work to identify and resolve bugs and this prototype is continually evolving. Several considerations have impacted this effort, not least of which is the complexity involved in accepting an indefinitely broad range of potential questions and question types. The original system this interface is based on depended on receiving one of two types of rigorously formatted questions that were hand-curated.

In [None]:
import argparse
import corenlp
import os
import re
import time

from sortedcontainers import SortedDict, SortedSet

from multivac import settings
from multivac.pymln.eval import Answer, Question
from multivac.pymln.eval.USP import USP
from multivac.pymln import semantic
from multivac.pymln import syntax
from multivac.pymln.semantic import MLN, Part, Clust
from multivac.pymln.syntax.Nodes import Article, Sentence, Token
from multivac.pymln.syntax.Relations import RelType, ArgType
from multivac.pymln.utils import Utils


def run(verbose=False):
    
    if verbose:
        print("Reading questions from file... ")
    
    try:
        USP.readQuestions(verbose=verbose)
    except Exception:
        # Retry once after a short pause if the first attempt fails.
        time.sleep(1)
        USP.readQuestions(verbose=verbose)

    if verbose:
        print("Done.\n")
        print("Loading MLN knowledgebase... ")
    
    mln = MLN.load_mln("{}/mln.pkl".format(USP.resultDir), ret=True)

    if len(Clust.clusts) == 0:
        Clust.clusts = mln['clusts']

    if len(Clust.relTypeIdx_clustIdx) == 0:
        Clust.relTypeIdx_clustIdx = mln['relTypeIdx_clustIdx']

    if len(Part.rootNodeId_part) == 0:
        Part.rootNodeId_part = mln['rootNodeId_part']

    if len(Part.clustIdx_partRootNodeIds) == 0:
        Part.clustIdx_partRootNodeIds = mln['clustIdx_partRootNodeIds']

    if len(Part.pairClustIdxs_pairPartRootNodeIds) == 0:
        Part.pairClustIdxs_pairPartRootNodeIds = mln['pairClustIdxs_pairPartRootNodeIds']

    if verbose:
        print("Done.\n")
        print("Analyzing knowledgebase... ")
    
    USP.readClust()
    USP.readPart()
    USP.preprocArgs()

    if verbose:
        print("Done.\n")
        print("Finding answers... ")
    
    USP.match()
    USP.printAns()

    if verbose:
        print("Done.\n")
    
    return None


In [None]:
from multivac import settings
from multivac.pymln.eval.USP import USP
import sys
sys.modules['semantic'] = semantic
sys.modules['syntax'] = syntax


# Default argument values
params = {'eval_dir': settings.models_dir,
          'results_dir': settings.mln_dir,
          'query_file': 'output_questions_QG-Net.pt.txt'}

USP.query_file = params['query_file']

if os.path.isabs(params['results_dir']):
    USP.resultDir = params['results_dir']
else:
    USP.resultDir = os.path.join(os.getcwd(), params['results_dir'])

if os.path.isabs(params['eval_dir']):
    USP.evalDir = params['eval_dir']
else:
    USP.evalDir = os.path.join(os.getcwd(), params['eval_dir'])

run(verbose=True)


## <a name='gan'>NEXT STEPS: Generative Adversarial Networks (GANs)</a>
### Initial planning
As a final step in developing its machine-assisted inference capabilities, MULTIVAC will train a Generative Adversarial Network (GAN) to produce well-formed, novel expert queries without human intervention. GANs comprise two main components, the generator and the discriminator. The more traditional discriminator network is a standard convolutional neural network that learns the boundaries between classes — for instance, well-formed expert queries and nonsense queries — by training on real-world examples. The generator network is an inverse convolutional network that models the distribution of individual classes in terms of their features. Thus, the generator network generates new query instances, while the discriminator evaluates them for validity.
 
The discriminator network will be trained on the accrued library of queries generated by MULTIVAC and the human expert participants. Meanwhile, the generator will ingest models, parameters, factors and relationships from the MLN knowledgebase and return a "query" constructed from them. The generator network compiles the queries from the formulas in MULTIVAC’s MLN knowledgebase, using Markov chains to mimic the semantic query grammars embedded there. This novel query is fed to the discriminator along with the existing set of curated expert queries. The discriminator considers both the real and the generated queries and assigns probabilities of their authenticity, gradually learning to assign higher probabilities to "authentic" queries and lower ones to inauthentic queries.
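
For readers unfamiliar with Markov-chain text generation, the sketch below shows the basic mechanism using a first-order, word-level chain. It is an assumption-laden illustration: it learns from a handful of made-up example queries rather than from MLN formulas, and `build_chain` and `generate` are hypothetical names.

```python
import random
from collections import defaultdict
from typing import Dict, List

START, END = "<s>", "</s>"


def build_chain(queries: List[str]) -> Dict[str, List[str]]:
    """Record, for each word, every word observed directly after it."""
    chain = defaultdict(list)
    for query in queries:
        words = [START] + query.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return dict(chain)


def generate(chain: Dict[str, List[str]], rng: random.Random,
             max_len: int = 20) -> str:
    """Walk the chain from START until END or max_len words."""
    word, output = START, []
    while len(output) < max_len:
        word = rng.choice(chain[word])
        if word == END:
            break
        output.append(word)
    return " ".join(output)


# Made-up example queries standing in for the curated library.
queries = [
    "what causes cancer",
    "what does smoking cause",
    "does smoking cause cancer",
]
chain = build_chain(queries)
print(generate(chain, random.Random(0)))
```

Every generated query reuses only word transitions seen in the examples, which is why such a generator mimics the grammar of its source but can still recombine it into new questions.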
 
GAN architectures are trained dialectically, first training the discriminator on the existing query library, then training the generator against a static discriminator. The discriminator is then trained again, accounting for examples on which it failed, and so on. MULTIVAC's discriminator will also be augmented by a "real-world" feedback loop; when the generator produces a query, the discriminator scores it, but the query is also submitted against the MLN knowledgebase. If it produces results, the query is added to the discriminator training set as a valid expert query, regardless of the initial score given by the discriminator. Thus, new queries and query types can be added to the training library from successful novel queries. In the final iteration, the system will include a hypothesis evaluation loop looking at the explanatory power of a given machine-generated hypothesis and weighting up those that are novel, have a potentially high explanatory power and are plausible in the current context. This GAN implementation will be written in Python leveraging the Keras API with a TensorFlow backend.
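
The "real-world" feedback loop can be sketched independently of any neural network. In the illustration below, `discriminator_score` and `query_kb` are hypothetical callables standing in for the trained discriminator and the MLN lookup; the sketch is not tied to Keras or TensorFlow.

```python
from typing import Callable, Iterable, List, Set, Tuple


def feedback_loop(generated: Iterable[str],
                  discriminator_score: Callable[[str], float],
                  query_kb: Callable[[str], list],
                  library: Set[str]) -> List[Tuple[str, float, bool]]:
    """Score each generated query, then keep it if the knowledge base answers it."""
    log = []
    for query in generated:
        score = discriminator_score(query)
        answered = bool(query_kb(query))
        if answered:
            # Added regardless of the discriminator's score.
            library.add(query)
        log.append((query, score, answered))
    return log


# Stub components for demonstration only.
toy_answers = {"what causes influenza": ["virus"]}
library = {"what causes cancer"}
log = feedback_loop(
    ["what causes influenza", "cancer what does"],
    discriminator_score=lambda q: 0.1,        # pretend both look inauthentic
    query_kb=lambda q: toy_answers.get(q, []),
    library=library,
)
print(sorted(library))
```

Even though the stub discriminator scores both queries low, the one that returns results is promoted to the training library.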

### Current status and next steps
The generator in a GAN model improves by making small changes to its output until it can fool the discriminator. This is particularly useful when the input data are images. A pixel's color is represented by a number (typically from 0 to 255), and adding to or subtracting from this number brightens or darkens that pixel. We can also move that value to an adjacent pixel, which changes the shape in the picture. Ultimately, these small but distinct changes help the generator produce similar -- yet wrong -- pictures with which the discriminator trains.
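
A tiny numeric illustration of these pixel adjustments, using a single made-up row of grayscale values (the helper `adjust_pixel` is hypothetical):

```python
def adjust_pixel(value: int, delta: int) -> int:
    """Brighten (delta > 0) or darken (delta < 0) a pixel, clamped to 0..255."""
    return max(0, min(255, value + delta))


row = [10, 10, 200, 10]                        # one bright pixel at index 2

brighter = [adjust_pixel(v, 5) for v in row]   # every pixel slightly brighter
shifted = row[-1:] + row[:-1]                  # bright pixel moves one step right

print(brighter)  # [15, 15, 205, 15]
print(shifted)   # [10, 10, 10, 200]
```

Both results are still recognizably close to the original row, which is what makes gradual improvement possible for image generators.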
 
However, when dealing with text, the problem is more nuanced and complex. Machines do not understand text as humans do; they understand numbers. So we need to translate text into numbers that a machine can work with, which we do by assigning a number (a token ID) to each word. In essence, the computer -- and the GAN -- needs input it recognizes to work upon.
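
A minimal sketch of assigning token IDs, assuming simple lowercase whitespace tokenization (real pipelines use more careful tokenizers; `build_vocab` and `encode` are illustrative names):

```python
from typing import Dict, List


def build_vocab(sentences: List[str]) -> Dict[str, int]:
    """Assign each distinct word the next unused integer ID."""
    vocab: Dict[str, int] = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Translate a sentence into its list of token IDs."""
    return [vocab[word] for word in sentence.lower().split()]


vocab = build_vocab(["what causes cancer", "what causes influenza"])
print(vocab)                                   # {'what': 0, 'causes': 1, 'cancer': 2, 'influenza': 3}
print(encode("What causes influenza", vocab))  # [0, 1, 3]
```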
 
This is not a big problem for the discriminator. For example, if we want to train a model that can say if a sentence has a positive or a negative tone, a machine can learn that there are positive and negative words. If there’s "happy" in the sentence it may have a positive tone and if it has "sad" in it, it may carry a negative tone.

But generators improve by making small adjustments to their output, and the pixel-adjustment approach described above does not carry over to text. Changing even a small amount of text can change the entire meaning of a word or sentence. Moreover, it can make certain words unintelligible. This is a hard, and evolving, problem in the field.
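
The contrast with pixels can be shown with a made-up five-word vocabulary. Nudging a token ID by one, the textual analogue of a small pixel change, swaps in an unrelated word:

```python
# Made-up vocabulary; IDs are simply positions in the alphabetized list.
vocab = ["cancer", "cause", "does", "smoking", "what"]
token_ids = {word: i for i, word in enumerate(vocab)}

query = ["what", "does", "smoking", "cause"]
ids = [token_ids[word] for word in query]
print(ids)      # [4, 2, 3, 1]

# "Small" change: lower the third token's ID by one (3 -> 2).
nudged = ids.copy()
nudged[2] -= 1

decoded = [vocab[i] for i in nudged]
print(decoded)  # ['what', 'does', 'does', 'cause']
```

The numeric change is as small as it can be, yet the query is no longer a sensible question, because neighboring IDs carry no notion of similar meaning.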

Since 2018, there have been some attempts to develop GAN models for text. Most earlier models are only good for text completion or longer texts, and the newer models try to address those limitations so that they can produce sentences. However, they are still experimental, and their success is limited. At this stage, Gallup's work aims to study and identify the strengths and weaknesses of these approaches; in phase 2, it looks to build out a GAN for text produced by the QG-Net and MLN portions of MULTIVAC.

![MULTIVAC Schematic](images/MULTIVAC_schematic.png)