# Using PaLM to generate false statements

The aim is to generate a set of true and false statements to test fact-checking approaches. The statements need to be interesting, and the false ones need to be plausible.

For the true statements we use [TriviaQA](http://nlp.cs.washington.edu/triviaqa/), a dataset of questions and answers meant for reading comprehension and question answering.

To generate true and false statements, we use PaLM to:

1. combine the question and answer into a statement;
2. generate a fake answer similar to the true answer (we assume the true answer is unique and not equivalent to the fake one);
3. combine the question and the fake answer into a false statement.
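As a minimal sketch of what these steps are meant to produce, the record below mirrors the five keys used later in this notebook. The `StatementRecord` class and the `to_labeled_pairs` helper are hypothetical names introduced here for illustration; the notebook itself works with plain dictionaries.

```python
from dataclasses import dataclass


@dataclass
class StatementRecord:
    """One generated record; field names mirror the keys used in this notebook."""
    question: str
    answer: str
    statement: str
    fake_answer: str
    fake_statement: str


def to_labeled_pairs(rec: StatementRecord) -> list[tuple[str, bool]]:
    """Turn a record into (statement, is_true) pairs for a fact-checking test set."""
    return [(rec.statement, True), (rec.fake_statement, False)]


# Values taken from the worked example in the prompt.
example = StatementRecord(
    question="What is the fastest mammal?",
    answer="the cheetah",
    statement="The cheetah is the fastest mammal.",
    fake_answer="the sloth",
    fake_statement="The sloth is the fastest mammal.",
)
pairs = to_labeled_pairs(example)
```

Each record yields one true and one false statement, so a fact checker can be scored on both.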

## Differences from the ChatGPT version

It turns out that PaLM does not always produce well-formed JSON. For instance:
~~~
'[{"question": "Which Lloyd Webber musical premiered in the US on 10th December 1993?", "answer": "Sunset Boulevard", "statement": "Sunset Boulevard premiered in the US on 10th December 1993.", "fake_answer": "Cats", "fake_statement": "Cats premiered in the US on 10th December 1993."}]}'
~~~
There is an extra `}` after the end of the list.
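For this particular failure (trailing characters after an otherwise valid value), one option would have been to parse only the leading JSON value with the standard library's `json.JSONDecoder.raw_decode`. This is a sketch and not what the script below does; `parse_leading_json` is a hypothetical helper, and it would not help with other kinds of malformed output.

```python
import json


def parse_leading_json(text: str):
    """Parse the first JSON value in `text`, ignoring anything after it.

    Returns (value, trailing), where `trailing` is the unparsed remainder.
    Raises json.JSONDecodeError if the text does not start with valid JSON.
    """
    decoder = json.JSONDecoder()
    stripped = text.lstrip()  # raw_decode does not skip leading whitespace
    value, end = decoder.raw_decode(stripped)
    return value, stripped[end:]


# Shortened version of the response shown above, with the stray '}' at the end.
raw = '[{"answer": "Sunset Boulevard", "fake_answer": "Cats"}]}'
records, trailing = parse_leading_json(raw)
```

Logging `trailing` when it is non-empty would keep a trace of how often the model appends extra characters.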

I then asked it to generate labeled, newline-separated text like:
~~~
'Question: In the very first Boston Marathon, 15 runners competed. How many finished?\nAnswer: eighty-five thousand  distance\n\nStatement: The very first Boston Marathon had eighty-five thousand runners.\nFake answer: 15\nFake statement: The very first Boston Marathon had 15 runners.
~~~

However, the labels do not always match what was requested, and some labels may be missing entirely.

These behaviors of PaLM have to be handled in the script.
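The label handling can be sketched on its own as follows. This version uses a small pure-Python edit distance so that it runs without extra packages, normalizes the label before comparing, and rejects labels that are too far from every key. The `MAX_DIST` cutoff and the names `edit_distance` and `match_label` are assumptions made for this illustration; the script below uses the `Levenshtein` package and has no cutoff.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


KEYS = ['question', 'answer', 'statement', 'fake_answer', 'fake_statement']
MAX_DIST = 3  # hypothetical cutoff, not tuned on real responses


def match_label(line: str):
    """Return (key, value) for a 'Label: text' line, or (None, None)."""
    label, sep, value = line.partition(':')
    if not sep:
        return None, None
    # 'Fake answer' -> 'fake_answer', so case and spacing cost nothing
    norm = label.strip().lower().replace(' ', '_')
    best = min(KEYS, key=lambda k: edit_distance(k, norm))
    if edit_distance(best, norm) > MAX_DIST:
        return None, None
    return best, value.strip()
```

With this sketch, `match_label("False statement: x")` maps to `fake_statement`, while a chatty line such as "Here is the list you asked for: ok" is rejected instead of being forced onto the nearest key.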


In [39]:
import google.generativeai as palm
import google.generativeai.types.safety_types as safety_types
import json
import os
import re
import logging
import time
import random
from Levenshtein import distance as levDist

api_key = os.environ['PALM_API_KEY']
palm.configure(api_key=api_key)
MODEL = 'models/text-bison-001'
BASEDIR = os.getcwd()
DATADIR = os.path.join(BASEDIR, '../datasets/triviaqa-unfiltered')
OUTDIR = os.path.join(BASEDIR, '../procData/fakeGen/palm')
BATCHSIZE = 8
DUMPINTERVAL = 8
SLEEP = (1, 3)
# Without these safety settings, a number of requests fail.
SAFETY = [
    {'category': safety_types.HarmCategory.HARM_CATEGORY_DEROGATORY,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
    {'category': safety_types.HarmCategory.HARM_CATEGORY_TOXICITY,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
    {'category': safety_types.HarmCategory.HARM_CATEGORY_VIOLENCE,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
    {'category': safety_types.HarmCategory.HARM_CATEGORY_SEXUAL,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
    {'category': safety_types.HarmCategory.HARM_CATEGORY_MEDICAL,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
    {'category': safety_types.HarmCategory.HARM_CATEGORY_DANGEROUS,
     'threshold': safety_types.HarmBlockThreshold.BLOCK_NONE,
    },
]

logging.basicConfig(filename=os.path.join(BASEDIR, 'fakeGen_palm.log'), 
                    level=logging.DEBUG)

def getQuery(queries: str) -> str:
    query = f"""This is a multi-step task.
         1. Given the  question and answer below, combine them to make a statement. Example:
         
         ("What is the fastest mammal?", "the cheetah")
         Statement: The cheetah is the fastest mammal.
         
         2. Given the answer, generate a fake answer in the same class. Example:

         ("What is the fastest mammal?", "the cheetah")
         Fake answer: the sloth

         3. Given the question and the fake answer, combine them to make a fake statement. Example:

         ("What is the fastest mammal?", "the cheetah")
         Fake answer: the sloth
         Fake statement: The sloth is the fastest mammal.

         4. Generate your response as a list of 'question', 'answer', 'statement', 'fake_answer' and 'fake_statement' as in the example:
         Question: What is the fastest mammal?
         Answer: the cheetah
         Statement: The cheetah is the fastest mammal.
         Fake answer: the sloth
         Fake statement: The sloth is the fastest mammal.
         
         List of question and answer:
         {queries}
         """
    return query

# Assumptions about the response:
# - every line returned has a label or is blank
# - every label is separated from its text by ':'
# - labels are not changed in too strange a way
# - different sets of answers are not mixed up
def makeDict(response):
    lod = []
    i = 0
    wd = {'question': None, 'answer': None, 'statement': None, 'fake_answer': None,
          'fake_statement': None}
    keys = list(wd.keys())
    lines  = response.split('\n')
    while i < len(lines):
        l = lines[i].strip()
        if len(l) == 0: 
            for k in keys:
                if wd[k] is not None: 
                    lod.append(wd)
                    break
            wd = {'question': None, 'answer': None, 'statement': None, 'fake_answer': None,
                  'fake_statement': None}
            i += 1
            continue  # assuming \n\n is a new record.
        key, val = getKeyVal(l, keys)
        if key is None:
            logging.error('Not found key: ' + l)
            i += 1
            continue
        if wd[key] is not None:
            lod.append(wd)
            wd = {'question': None, 'answer': None, 'statement': None, 'fake_answer': None,
                  'fake_statement': None}
            continue
        else:
            wd[key] = val
            i += 1
    # flush the last record if any of its fields was filled in
    if any(wd[k] is not None for k in keys):
        lod.append(wd)
    return lod

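# since the labels can vary, use edit distance to pick the best label
def getKeyVal(line, keys):
    colLoc = line.find(':')
    if colLoc == -1: return None, None
    # normalize 'Fake answer' -> 'fake_answer' so case and spacing do not count as edits
    label = line[:colLoc].strip().lower().replace(' ', '_')
    # min() returns the first key with the smallest distance
    bestKey = min(keys, key=lambda k: levDist(k, label))
    val = line[colLoc + 1:].strip()
    return bestKey, val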


def processQA(qaList: list[list[str]]):
    qAndAs = '\n\n'.join([f"Question: {qa[0]}\nAnswer: {qa[1]}" for qa in qaList])
    logging.info(f"processQA q&a:\n{qAndAs}")
    query = getQuery(qAndAs)
    listOfDicts = None
    try:
        palmResp = palm.generate_text(
            model=MODEL,
            prompt=query,
            temperature=0,
            candidate_count=1,
            safety_settings = SAFETY
        )
        logging.debug(str(palmResp))
        logging.debug('Result\n' + str(palmResp.result))
    except Exception as e:
        print('generate_text failed\n', e)
        logging.error('PaLM failure: ' + str(e))
        raise  # re-raise with the original traceback
    if len(palmResp.candidates) == 0:
        logging.error('No candidates from palm')
        raise ValueError('No candidates')
    listOfDicts = makeDict(palmResp.result)
    if listOfDicts is None:
        logging.error('failed making dict')
        raise ValueError('failed making dict')
    if len(listOfDicts) < len(qaList):
        print('Too few responses')
        # redo individually if this was a batch
        raise ValueError('Missing responses')
    return listOfDicts
                 

In [40]:
result = """
Question: What star sign is Jamie Lee Curtis?
Answer: scorpio
Statement: Jamie Lee Curtis's star sign is scorpio.
Fake answer: gemini
Fake statement: Jamie Lee Curtis's star sign is gemini.

Question: What claimed the life of singer Kathleen Ferrier?
Answer: Stroke
Statement: Stroke claimed the life of singer Kathleen Ferrier.
Fake answer: Cancer
Fake statement: Cancer claimed the life of singer Kathleen Ferrier.

Question: How many different animal shapes are there in the Animal Crackers cookie zoo?
Answer: Eighteen--two bears , a bison, camel, cougar, elephant, giraffe, gorilla, hippopotamus, hyena , kangaroo, lion, monkey, rhinoceros, seal, sheep, tier, and zebra
Statement: There are eighteen different animal shapes in the Animal Crackers cookie zoo.
Fake answer: Twenty-four
Fake statement: There are twenty-four different animal shapes in the Animal Crackers cookie zoo.
"""

result2 = """Statement: Barbara Eden was the star of the film based on the record Harper Valley PTA by Jeannie C Riley.
         Fake answer: Barbara Walters
         Fake statement: Barbara Walters was the star of the film based on the record Harper Valley PTA by Jeannie C Riley.
"""

makeDict(result2)

[{'question': None,
  'answer': None,
  'statement': 'Barbara Eden was the star of the film based on the record Harper Valley PTA by Jeannie C Riley.',
  'fake_answer': 'Barbara Walters',
  'fake_statement': 'Barbara Walters was the star of the film based on the record Harper Valley PTA by Jeannie C Riley.'}]

In [None]:
processQA([
          ["What star sign is Jamie Lee Curtis?", "scorpio"],
          ["What claimed the life of singer Kathleen Ferrier?", "Stroke"],
          ["How many different animal shapes are there in the Animal Crackers cookie zoo?",
          "Eighteen--two bears , a bison, camel, cougar, elephant, giraffe, gorilla, hippopotamus, hyena , kangaroo, lion, monkey, rhinoceros, seal, sheep, tier, and zebra"],
          ])

## TriviaQA

The dataset consists of questions and answers, among other things, all human-verified. For instance:
~~~
        {
            "Answer": {
                "Aliases": [
                    "Niger Republic",
                    "Nigerois",
                    "Republic Of Niger",
                    "Republic of Niger",
                    "The Republic of Niger",
                    "Nigerien",
                    "Niger (country)",
                    "République du Niger",
                    "Republique du Niger",
                    "ISO 3166-1:NE",
                    "Niger",
                    "NG-NI"
                ],
                "MatchedWikiEntityName": "Niger",
                "NormalizedAliases": [
                    "republic of niger",
                    "niger republic",
                    "niger",
                    "république du niger",
                    "niger country",
                    "ng ni",
                    "republique du niger",
                    "nigerois",
                    "nigerien",
                    "iso 3166 1 ne"
                ],
                "NormalizedMatchedWikiEntityName": "niger",
                "NormalizedValue": "niger",
                "Type": "WikipediaEntity",
                "Value": "Niger"
            },
            "EntityPages": [
                {
                    "DocSource": "TagMe",
                    "Filename": "Niamey.txt",
                    "LinkProbability": "1.00000",
                    "Rho": "0.67254",
                    "Title": "Niamey"
                }
            ],
            "Question": "Of which African country is Niamey the capital?",
            "QuestionId": "tc_241",
            "QuestionSource": "http://www.triviacountry.com/",
            "SearchResults": [
                {
                    "Description": "Location Niamey is the capital and largest city of the West African ... Niamey is the capital and largest city of the West African country Niger. Niamey lies ...",
                    "DisplayUrl": "fortuneofafrica.com/niger/niamey-city",
                    "Filename": "166/166_2204158.txt",
                    "Rank": 0,
                    "Title": "Niamey City - Fortune of Africa Niger",
                    "Url": "http://fortuneofafrica.com/niger/niamey-city/"
                },
...
                {
                    "Description": "Country: Niger. Diocese of Niamey. Latin Name: Niameyensis; Elevated: 21 March 1961; Immediately Subject to the Holy See Country: Niger. ... Archdiocese of Niamey: 2013:",
                    "DisplayUrl": "www.catholic-hierarchy.org/diocese/dniam.html",
                    "Rank": 45,
                    "Title": "Niamey (Archdiocese) [Catholic-Hierarchy]",
                    "Url": "http://www.catholic-hierarchy.org/diocese/dniam.html"
                }
            ]
        },


~~~

We use `MatchedWikiEntityName` as the answer if it is available, otherwise the first alias, and output a file of questions and answers.
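The selection rule can be sketched in isolation as below. The helper name `pick_answer` is hypothetical, and the non-greedy pattern for dropping parenthesised disambiguators is an assumption of this sketch; the notebook's own code uses a greedy pattern.

```python
import re


def pick_answer(entry: dict) -> str:
    """Choose the answer string for one TriviaQA entry."""
    ans_info = entry['Answer']
    # Prefer the matched Wikipedia entity name; fall back to the first alias.
    ans = ans_info.get('MatchedWikiEntityName') or ans_info['Aliases'][0]
    # Drop parenthesised disambiguators such as "Niger (country)".
    return re.sub(r'\s*\(.*?\)', '', ans).strip()


with_entity = {'Answer': {'MatchedWikiEntityName': 'Niger',
                          'Aliases': ['Niger Republic', 'Nigerois']}}
without_entity = {'Answer': {'Aliases': ['Niger (country)', 'Niger']}}

answers = [pick_answer(with_entity), pick_answer(without_entity)]
```

Both entries resolve to the same answer string, which is the point of preferring the entity name and stripping the disambiguator.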



In [None]:
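# processing the TriviaQA data
from collections.abc import Iterator

# generator: yields batches of up to BATCHSIZE [question, answer] pairs
def batchGen(inFname: str) -> Iterator[list[list[str]]]: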
    batch = []
    bcnt = 0
    with open(inFname, 'r') as ix:
        ds = json.load(ix)
        for d in ds['Data']:
            try:
                if 'MatchedWikiEntityName' in d['Answer']:
                    ans = d['Answer']['MatchedWikiEntityName']
                else: 
                    ans = d['Answer']['Aliases'][0]
                # drop parenthesised disambiguators, e.g. "Niger (country)" -> "Niger"
                ans = re.sub(r'\(.*\)', '', ans).strip()
                batch.append([d['Question'], ans])
                bcnt += 1
            except Exception as e:
                print(e)
                logging.error('BatchGen error: ' + str(e))
                continue
            if bcnt >= BATCHSIZE:
                yield batch
                batch = []
                bcnt = 0
        if bcnt > 0:
            yield batch


def genStmtFile(inFile: str, outFile: str, startBatch: int = 0, batchesToDo: int = -1) -> None:
    outData = []
    bgen = batchGen(inFile)
    done = False
    bcnt = -1
    dbcnt = 0
    ecnt = 0
    # time.strftime avoids the missing datetime import; %S (not %s) is seconds
    logging.info(f"-----------{time.strftime('%Y-%m-%d %H:%M:%S')}--------------------")
    while not done:
        time.sleep(random.random() * SLEEP[1] + SLEEP[0])
        # stop once the requested number of batches has been processed
        if batchesToDo > 0 and dbcnt >= batchesToDo: break
        try:
            batch = next(bgen)
            bcnt += 1
            if bcnt % 10 == 0: print(f"batch: {bcnt}")
            if bcnt < startBatch: continue
        except StopIteration:
            done = True
            continue
        try:
            respListDict = processQA(batch)
            outData.extend(respListDict)
            dbcnt += 1
            if dbcnt % DUMPINTERVAL == 0:
                with open(outFile, 'w') as ox:
                    json.dump(outData, ox, indent=4)
        except Exception as e:
            print('Exception: ', e)
            logging.error(f"processQA failed: {e}")
            print('switch to batch size 1')
            for p in batch:
                time.sleep(random.random() * SLEEP[1] + SLEEP[0])
                rld = None
                try:
                    rld = processQA([p])
                    outData.extend(rld)
                except Exception as e2:
                    print(e2)
                    logging.error(f"Error in single query: {p}\n{str(e2)}")
                    logging.error(f"__ABANDON__ {p}")
            ecnt += 1
            if ecnt > 3: break
    with open(outFile, 'w') as ox:
        json.dump(outData, ox, indent=4)
 
            
genStmtFile(os.path.join(DATADIR, 'unfiltered-web-dev.json'),
            os.path.join(OUTDIR, 'tf_qa-dev.json'),
            0, -1)
    

## Examples of output

As with ChatGPT, the fake answers are generally plausible and the generated statements are as expected. However, there are more unexpected behaviors in this case.

~~~

{
    "question": "In what year's Olympics were electric timing devices and a public-address system used for the first time?",
    "answer": "In 1912, in Stockholm",
    "statement": "Electric timing devices and a public-address system were used for the first time in the 1912 Olympics in Stockholm.",
    "fake_answer": "In 1920, in Antwerp",
    "fake_statement": "Electric timing devices and a public-address system were used for the first time in the 1920 Olympics in Antwerp."
}
~~~
This looks good.

~~~
{
    "question": "Why is the site of a boxing match called a ring when it's square?",
    "answer": "Boxing rings were originally circular",
    "statement": "Boxing rings were originally circular, which is why the site of a boxing match is called a ring even though it's square.",
    "fake_answer": "Boxing rings are called rings because they are round",
    "fake_statement": "Boxing rings are called rings because they are round, even though they are square."
}
~~~
It does not seem to notice the inconsistency in a thing being both round and square.

~~~
{
    "question": "Complete this Biblical quotation: \"It is easier for a camel to go through the eye of a needle, than...\"",
    "answer": "...for a rich man to enter into the kingdom of God. The words are those of Jesus, from Matthew 19:24",
    "statement": "It is easier for a camel to go through the eye of a needle, than for a rich man to enter into the kingdom of God. The words are those of Jesus, from Matthew 19:24.",
    "fake_answer": "...for a poor man to enter into the kingdom of God. The words are those of Jesus, from Matthew 19:24",
    "fake_statement": "It is easier for a poor man to enter into the kingdom of God. The words are those of Jesus, from Matthew 19:24."
}
~~~
While it generates a good true statement from the question and the true answer, and generates a good fake answer, it fails to assemble the fake statement: the camel comparison is dropped entirely.

~~~

{
    "question": "In what state was playwright Tennessee Williams born?",
    "answer": "In Mississippi, as Thomas Lanier Williams. He took the name Tennessee after his father's home state",
    "statement": "Playwright Tennessee Williams was born in Mississippi, as Thomas Lanier Williams. He took the name Tennessee after his father's home state.",
    "fake_answer": "In New York City",
    "fake_statement": "Playwright Tennessee Williams was born in New York City."
}
~~~
New York City is a city, not a state, so it is not in the same class as Mississippi.


~~~
{
    "question": "What was President Gerald Ford's middle name?",
    "answer": "Rudolph",
    "statement": "Rudolph is President Gerald Ford's middle name.",
    "fake_answer": "the sloth",
    "fake_statement": "The sloth is President Gerald Ford's middle name."
}
~~~
It sometimes just reuses the fake answer from the prompt example.

~~~
{
    "question": "In Greek mythology, who were Arges, Brontes and Steropes?",
    "answer": "Cyclopes",
    "statement": "Cyclopes were the names of Arges, Brontes and Steropes in Greek mythology.",
    "fake_answer": "Titans",
    "fake_statement": "Titans were the names of Arges, Brontes and Steropes in Greek mythology."
}
~~~
It gets the statement the wrong way round, even though it seems to have figured out that names are involved.

~~~
{
    "question": "In Greek mythology, what were the names of Oedipus's parents?",
    "answer": "Laius, King of Thebes, and his queen, Jocasta",
    "statement": "Laius, King of Thebes, and his queen, Jocasta were the names of Oedipus's parents in Greek mythology.",
    "fake_answer": "Oedipus and Jocasta",
    "fake_statement": "Oedipus and Jocasta were the names of Oedipus's parents in Greek mythology."
}
~~~
It makes Oedipus his own father. Here too, it does not seem to detect the inconsistency in the statement. However, since this is supposed to be a false statement, it still works.

~~~
{
    "question": "Who or what was Gentle Ben in the 60s TV series?",
    "answer": "Bear",
    "statement": "Bear was the star of the 60s TV series Gentle Ben.",
    "fake_answer": "Dog",
    "fake_statement": "Dog was the star of the 60s TV series Gentle Ben."
}
~~~
It seems to treat 'Bear' and 'Dog' as names.
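A few of the failure modes above are mechanical enough to screen for after generation. The sketch below is one possible filter, not part of the original script: `record_problems`, `PROMPT_FAKE_ANSWER` and `REQUIRED` are hypothetical names, and the checks only cover missing fields, a fake answer copied from the prompt example, and fake output identical to the true output. Semantic problems, such as the round and square boxing ring, would still need manual review or a separate check.

```python
PROMPT_FAKE_ANSWER = 'the sloth'  # fake answer used in the prompt's worked example
REQUIRED = ('question', 'answer', 'statement', 'fake_answer', 'fake_statement')


def record_problems(rec: dict) -> list[str]:
    """Return a list of cheap, mechanical problems found in one record."""
    missing = [k for k in REQUIRED if not rec.get(k)]
    if missing:
        # nothing else can be checked reliably on an incomplete record
        return ['missing: ' + ', '.join(missing)]
    problems = []
    fake = rec['fake_answer'].strip().lower()
    if fake == PROMPT_FAKE_ANSWER:
        problems.append('fake answer copied from prompt example')
    if fake == rec['answer'].strip().lower():
        problems.append('fake answer equals true answer')
    if rec['fake_statement'].strip().lower() == rec['statement'].strip().lower():
        problems.append('fake statement equals true statement')
    return problems


# The Gerald Ford record shown above.
ford = {
    'question': "What was President Gerald Ford's middle name?",
    'answer': 'Rudolph',
    'statement': "Rudolph is President Gerald Ford's middle name.",
    'fake_answer': 'the sloth',
    'fake_statement': "The sloth is President Gerald Ford's middle name.",
}
ford_problems = record_problems(ford)
```

Records with a non-empty problem list could be dropped or queued for regeneration before the dataset is used for fact-checking tests.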


