# Experimenting with LLAMA-2 (Best Results in the TELLER Paper) For Generating Logic-Atom Vector Encodings: Practicing with the WELFake News Dataset


In this notebook, I experiment with deploying the transformer-based LLAMA-2 model to answer yes/no questions about a news text. The goal is to output, for each question, the probability that the answer is "yes", giving a vector that represents a set of claims about the news text. In their TELLER paper, Liu et al. achieved much higher performance scores by first encoding the news texts as vectors of these "logic atoms" and passing the vectors into a Disjunctive Normal Form neural network than by simply querying an LLM directly about whether a news article was fake. Their highest scores were achieved using the free and "open" LLAMA-2 LLM ("open" meaning that it gives users access to the final logits for each yes/no answer). I therefore test this LLM below to see whether it is feasible and efficient to encode news texts as these logic vectors for classification with either an ML-based classifier or a neural decision system.
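To make the idea of a "logic-atom vector" concrete, here is a minimal sketch of one plausible way to turn the final logits for the "yes" and "no" answer tokens into a yes-probability per question. It is an assumption for illustration, not necessarily the exact procedure used in the TELLER paper; the function name `yes_probability` and the logit values are hypothetical.

```python
import math

def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Softmax restricted to the two answer tokens: P(yes) = e^y / (e^y + e^n)."""
    # Subtracting the maximum keeps the exponentials numerically stable
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)

# Made-up (yes_logit, no_logit) pairs for three question templates on one news text
example_logits = [(2.0, 0.0), (0.0, 0.0), (-1.0, 3.0)]

# One probability per question gives the vector encoding of the news text
logic_atom_vector = [yes_probability(y, n) for y, n in example_logits]
print([round(p, 3) for p in logic_atom_vector])
```

Each entry lies between 0 and 1, so the resulting vector can be passed to a downstream classifier as ordinary numeric features.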

## Environment Setup

First, we install and import the packages needed to work with HuggingFace transformers. Some models on HuggingFace also require an access token, which has been stored as a "secret" in the Colab notebook.

In [None]:
# Mounts the Google Drive for Colab access to dataset files...
from google.colab import drive
drive.mount("/content/drive")

# Setup path to the right base folder with the training data files in Google Drive
root_path = "/content/drive/My Drive/NeuroSymbolic_FND/"

Mounted at /content/drive


In [None]:
# Installs the required Python packages: HuggingFace "transformers", bitsandbytes and accelerate for improved model efficiency
!pip install transformers bitsandbytes accelerate

Collecting bitsandbytes
  Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl (69.1 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.45.0


In [None]:
# Imports required libraries
import os
import pandas as pd
import numpy as np
import time # Measures how long it takes for models to generate response to questions
import textwrap
import torch
import transformers # For pipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig # Main classes for loading transformers
from google.colab import userdata # For importing HuggingFace access token
from huggingface_hub import login # For logging into HuggingFace with access token

In [None]:
# Sets up the path to access HuggingFace

# Sets up access token in HuggingFace to be able to use LLAMA-2 LLM model
hf_access_token = userdata.get("HF_TOKEN")

# Logs in to HuggingFace
login(token=hf_access_token)

# Builds the path to the WELFake TRAIN dataset file for experimenting with the LLAMA model
train_path = os.path.join(root_path, "clean_train_wf.csv")

# Check if the GPU is available in Colab for accelerated LLM usage
print(f"Is Runtime connected to GPU?: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Notebook is running on : {torch.cuda.get_device_name(0)} GPU")

Is Runtime connected to GPU?: True
Notebook is running on : Tesla T4 GPU


## Configuring and Loading the LLAMA-2 Model



### Applying Quantization with BitsAndBytes to transformers LLM model: [reference to docs](https://docs.vllm.ai/en/latest/quantization/bnb.html)
- BitsAndBytes is a library that "quantizes" models in order to reduce memory usage and enhance performance without significantly sacrificing accuracy.
- Running a large LLM like LLAMA-2 on limited GPU availability in Colab requires ways to reduce the memory requirements of the model to avoid getting OOM (Out-of-Memory) error, which was a frequent problem.
- Basically, model parameters (weights) are stored as 4-bit floating point numbers instead of 32-bit floating point numbers in order to save on space requirements.
- The BitsAndBytes library enables reducing memory consumption through this conversion while trying to maintain precision; leading to an optimized trade-off between efficiency/cost and accuracy/performance
- Without this, it is very difficult to use the model at all without an industrial-grade GPU.
- The nf4 quantization type has been suggested as optimal for NLP-based tasks: [reference](https://medium.com/@dillipprasad60/qlora-explained-a-deep-dive-into-parametric-efficient-fine-tuning-in-large-language-models-llms-c1a4794b1766) --> "4-bit NormalFloat perform slightly better performance than float4 datatype."
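As a rough back-of-the-envelope check on the points above, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes a round figure of 7 billion parameters and ignores activations, the generation cache, and the small overhead of quantization constants, so the real footprint will be somewhat higher. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

num_params = 7e9  # assumed round parameter count for the 7B model

for bits in (32, 16, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(num_params, bits):.1f} GB")
```

Under these assumptions, only the 4-bit variant leaves comfortable headroom on a single Colab GPU such as the Tesla T4 with its roughly 16 GB of memory.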


In [None]:
# Loads up the LLAMA-2 model with BitsAndBytes quantization to save on memory

def setUpLLMModel(model_name="meta-llama/Llama-2-7b-hf"): # Loads in LLAMA-2 7-billion weight size LLM model for outputting answers
  """
    Configures and sets up a Hugging Face transformer model and appropriate tokenizer.

        Input Parameters:
          model_name (str): name of the model on HuggingFace
    
        Output:
          tokenizer: the correct tokenizer (from HuggingFace) for the transformer model to properly encode text samples
          model: the loaded transformer model from HuggingFace specified by model_name input parameter
  """

  # Configures the quantization with BitsAndBytes in order to reduce the memory requirements
  bnb_quantization_config = BitsAndBytesConfig(
      load_in_4bit=True, # Uses 4-bit floats for model weights instead of default 32-bit to reduce model weight size
      bnb_4bit_quant_type="nf4", # Uses normalized float 4 quantization (better for NLP)
      bnb_4bit_compute_dtype=torch.float16, # Uses 16-bit floats DURING computation for increased precision
      bnb_4bit_use_double_quant=False # Tries first without using double quantization which sacrifices performance score
  )

  # Sets the device_map to use the GPU only if available, else set the device to use the CPU
  device_map = "auto" if torch.cuda.is_available() else "cpu"

  # Loads in the pre-trained tokenizer and the specified LLM model (default: LLAMA2 7B)
  tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_access_token) # "token" replaces the deprecated "use_auth_token" argument

  # Initializes and passes the configuration into the model
  model = AutoModelForCausalLM.from_pretrained(
      model_name,
      quantization_config=bnb_quantization_config, # Inputs the BitsAndBytes quantization config
      device_map=device_map, # Uses GPU if available, else uses CPU as fallback option
      torch_dtype=torch.float16, # Sets the PyTorch computational precision from 32-bit to 16-bit, also for reduced memory requirements
      token=hf_access_token # Passes in the HuggingFace token for authorization ("token" replaces the deprecated "use_auth_token")
  )

  # Sets an EOS padding token to signal padding for shorter-length texts; statements can have different lengths, need to be the same size
  # StackOverflow ref: https://stackoverflow.com/questions/70544129/transformers-asking-to-pad-but-the-tokenizer-does-not-have-a-padding-token
  if tokenizer.pad_token is None:
      tokenizer.pad_token = tokenizer.eos_token

  return tokenizer, model

In [None]:
# Tests out if the LLAMA model initialization is successful
tokenizer, model = setUpLLMModel()
print("LLAMA-2 model loaded successfully!")




LLAMA-2 model loaded successfully!


## Defining a Function for Simple Text Generation Based on a Prompt-Template

In this approach, news texts are wrapped in "prompt templates" containing a question (analogous to the fact-checking questions listed in the *TELLER* paper by Liu et al. [2024]). The prompt template also instructs the LLM how to answer (with YES/NO responses only, which correspond to the question's "logic atoms" or truth values) and ends with a placeholder where the model inserts its answer.
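Since the same template is reused with different news texts and questions, it can be convenient to build it with a small helper. The sketch below is a hypothetical `build_prompt` function, not part of the original notebook; it follows the template wording used in this notebook but drops the leading indentation and trailing newline, which may slightly change how the model responds.

```python
def build_prompt(context: str, question: str) -> str:
    """Fills the yes/no prompt template with a news text and a question."""
    return (
        f"News Text: {context}\n\n"
        f"Question: {question}\n\n"
        'Please respond ONLY with "Yes" or "No". '
        "Answer the question based strictly on the news text.\n"
        "Answer:"
    )

# Usage with a made-up one-sentence "news text"
demo_prompt = build_prompt(
    "The city council approved the new budget on Monday.",
    "Is this news text about sports?",
)
print(demo_prompt)
```

Ending the prompt directly on "Answer:" means the first tokens the model generates are the answer itself.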

In [None]:
def generateSimpleText(
                        prompt, tokenizer, model, device="cuda", max_new_tokens=10,
                        do_sample=False, temperature=0.2, top_k=20, top_p=0.9
                      ):
    """
      Wraps the transformer text generation process. It tokenizes the prompt template
      using the tokenizer corresponding to the model, puts them on the GPU if this is possible,
      and calls "model.generate" with a specific configuration to output the LLM (LLAMA)'s response
      to the constructed prompt template.

      Input Parameters

        prompt (str): prompt template filled in with context (news article) and the question corresponding to 
                      a TELLER question template or predicate
        tokenizer: transformers library tokenizer object for encoding the inputted prompt to tokens learned 
                   from LLM pre-training/like indices
        model: the loaded LLM model to use for text generation
        device (str): which device (CPU or GPU) to use, for placing input tensors on the same device as the model
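        max_new_tokens (int): the maximum number of tokens the LLM should generate. Default is 10: YES/NO answers are short, but
                              the output might contain end-of-text tokens or spaces, so enough tokens must be generated.
        do_sample (Boolean): whether to sample the next token from the probability distribution over
                            possible tokens (True), or to always use the most probable token (False). Default is False,
                            as we do not want high creativity/originality, but predictable and deterministic factual responses.
        temperature (float): scales the next-token distribution when sampling; lower values give more predictable output.
                             Only used if do_sample is True.
        top_k (int): number of most-probable tokens to sample from. Only used if do_sample is True.
        top_p (float): cumulative-probability threshold for nucleus sampling. Only used if do_sample is True.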

      Output:
        response (str): decoded model output as text, in response to the question prompt

    """
    # Encodes the input: returns input_ids (like token indexes learned during LLM pre-training) and attention_mask indicating which
    #tokens should be disregarded for differing length inputs
    encoded_torch_inputs = tokenizer(
        prompt,
        return_tensors="pt", # pt = returns PyTorch tensors
        add_special_tokens=True # Adds the tokenizer's special tokens (for LLaMA-2, a beginning-of-sequence <s> token is prepended)
      )


    # Places the tokenized inputs (PyTorch tensors) onto the device which should be a GPU
    encoded_torch_inputs = {k: v.to(device) for k, v in encoded_torch_inputs.items()} 

    # If do_sample is True, this means that the model can sample from top k possible next-token outputs...
    if do_sample:
       # Generates the text with parameters optimized for more deterministic, factual, predictable responses
      outputs = model.generate(
          input_ids=encoded_torch_inputs["input_ids"], # Enters the tokenizer input IDs
          attention_mask=encoded_torch_inputs["attention_mask"], # Enters the attention mask to adapt to different-length news texts
          max_new_tokens=max_new_tokens, # A cap on how many tokens the decoder model should output; need short answers
          # Configures the text generation parameters for how deterministic the responses are
          do_sample=do_sample,     # True in this branch: samples the next token instead of always taking the most probable one
          num_return_sequences=1,   # Outputs only ONE answer to the question template
          temperature=temperature, # Determines how "creative" the responses are (lower = more predictable)
          top_k=top_k, # Determines how many of the most probable "next tokens" the model may sample from
          top_p=top_p # Nucleus sampling: keeps the smallest set of most-probable tokens whose cumulative probability reaches top_p
      )
    else: # If does not sample from next possible token probability, do this instead
      outputs = model.generate(
        input_ids=encoded_torch_inputs["input_ids"], # Enters the tokenizer's input IDs
        attention_mask=encoded_torch_inputs["attention_mask"], # Enters the attention mask for different-length texts
        max_new_tokens=max_new_tokens,
        do_sample=do_sample
    )

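    # Extracts the first output sequence at index 0 (as there is only one) and converts it to a list of token IDs
    # Note: for decoder-only models, this sequence contains the prompt tokens followed by the newly generated tokens,
    # so the decoded response repeats the prompt before the answer
    # Skips special tokens like the beginning-of-sequence (<s>) and end-of-sequence (</s>) markers, and decodes the text
    response = tokenizer.decode(outputs[0].tolist(), skip_special_tokens=True)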

    # Returns the textual response to the question prompt
    return response
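Because `model.generate` on a decoder-only model returns the prompt tokens followed by the newly generated ones, the decoded response begins with the whole prompt. A minimal sketch of how the continuation could be isolated is shown below on plain lists with made-up token IDs; the helper name `keep_new_tokens` is hypothetical. In the notebook, the equivalent would be slicing `outputs[0]` from index `encoded_torch_inputs["input_ids"].shape[1]` onwards before calling `tokenizer.decode`.

```python
def keep_new_tokens(output_ids, prompt_length):
    """Drops the first prompt_length token IDs, leaving only the generated continuation."""
    return list(output_ids[prompt_length:])

# Toy example with made-up token IDs: 4 prompt tokens followed by 2 generated ones
prompt_ids = [1, 500, 501, 502]
output_ids = prompt_ids + [3869, 2]

new_ids = keep_new_tokens(output_ids, len(prompt_ids))
print(new_ids)  # prints [3869, 2]
```

Decoding only these new IDs would make the printed response consist of the answer alone, which is much easier to inspect than the prompt followed by the answer.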

In [None]:
# Loads in the training WELFake dataset as a pandas DataFrame to test LLAMA model 
welfake_train_df = pd.read_csv(train_path)

# Displays the first few samples
welfake_train_df.head()

Unnamed: 0,id,title,text,label
0,56051,"The Politics of Death: Cancer and Politics, a ...",License DMCA This is not about how politics co...,1
1,30084,Governor-Elect Of Kentucky Tells The EPA To Go...,States have rights too! We love the new conser...,1
2,40781,ARE YOU READY FOR JOE? 91% Of Obama-Biden Bund...,"Bernie, Hillary and Joe a low information vote...",1
3,64772,"Trump win, Democratic setbacks cloud Pelosi's ...",WASHINGTON (Reuters) - Nancy Pelosi may face a...,0
4,67872,Investigators ask White House for details on F...,WASHINGTON (Reuters) - The special counsel inv...,0


In [None]:
# Extracts an example news text and prints the label
example_real_news_text = welfake_train_df.iloc[245]["text"]
example_real_news_label = welfake_train_df.iloc[245]["label"]

example_real_news_category = "Fake News" if example_real_news_label == 1 else "Real News"

#  Wraps the example news text to 80 chars per line to make it readable
wrapped_example_text = textwrap.fill(example_real_news_text, width=80)
print(f"{wrapped_example_text}\n")
print(f"Label: {example_real_news_category}")

BRUSSELS (Reuters) - A leading European rights watchdog called on Turkey on
Friday to ease post-coup state of emergency laws that have seen thousands
arrested and restore power to regional authorities. President Tayyip Erdogan has
overseen a mass purge in the armed forces and the judiciary, as well as a
crackdown on critics including academics and journalists since a failed military
coup in July last year.  An advisory body to the Council of Europe, of which
Turkey is a member, acknowledged in a report  the need for certain extraordinary
steps taken by Turkish authorities to face a dangerous armed conspiracy .
However...Turkish authorities have interpreted these extraordinary powers too
extensively,  said the experts, known as the Venice Commission, in an opinion
that has no legal force. It urged Ankara to lift laws allowing it to pick
mayors, deputy mayors and members of local councils outside of local elections,
a reference to rules the Turkish government has used to replace local pr

In [None]:
# Creates the "context" (i.e. the news text) for the prompt-template to input to the LLM
example_context = example_real_news_text

# Creates the question (based loosely off the questions in the TELLER paper about
# having sufficient background information to support news claims). This
# is a typical reasoning step used by human fact-checking experts to flag news
# texts as potential disinformation if the answer is negative.
example_question = "Does this news text provide sufficient background information to support the claims being made in it?"

# Creates the real news prompt template
example_prompt_template = f"""
    News Text: {example_context}

    Question: {example_question}

    Please respond ONLY with "Yes" or "No". Answer the question based strictly on the news text.
    Answer:
"""

# Prints the example template that will be entered into the LLaMA-2 model for text generation
print(example_prompt_template)


    News Text: BRUSSELS (Reuters) - A leading European rights watchdog called on Turkey on Friday to ease post-coup state of emergency laws that have seen thousands arrested and restore power to regional authorities. President Tayyip Erdogan has overseen a mass purge in the armed forces and the judiciary, as well as a crackdown on critics including academics and journalists since a failed military coup in July last year.  An advisory body to the Council of Europe, of which Turkey is a member, acknowledged in a report  the need for certain extraordinary steps taken by Turkish authorities to face a dangerous armed conspiracy .  However...Turkish authorities have interpreted these extraordinary powers too extensively,  said the experts, known as the Venice Commission, in an opinion that has no legal force. It urged Ankara to lift laws allowing it to pick mayors, deputy mayors and members of local councils outside of local elections, a reference to rules the Turkish government has used to

In [None]:
# Starts the timer to see how long answer generation takes
start_time = time.time()

# Uses the LLaMA-2 model to get answer to question template wrapping the news text
# This should be a YES or NO answer
example_result_real_news = generateSimpleText(example_prompt_template, tokenizer, model)

# Prints the answer
print(example_result_real_news.upper())

# Finishes timing
end_time = time.time()

# Calculates the time elapsed in seconds
time_elapsed = end_time - start_time

print("Time taken was", time_elapsed, "seconds")




    NEWS TEXT: DESPITE PROMISING TO RELEASE HIS TAX RETURNS ALL THROUGHOUT HIS PRESIDENTIAL CAMPAIGN, DONALD TRUMP STILL REFUSES TO MAKE GOOD ON HIS PROMISE TO CONCERNED AMERICANS. INSTEAD, HE S GOTTEN HIS LAWYERS TO MAKE A STATEMENT ABOUT THE LAST TEN YEARS OF HIS TAX RETURNS, AND IT WENT PRETTY MUCH EXACTLY HOW WE THOUGHT IT WOULD   IT WAS A DISASTER.RAISING MORE QUESTIONS THAN THEY ACTUALLY ANSWERED, TRUMP S LAWYERS TOLD REPORTERS THAT TRUMP S TAX RETURNS DON T REFLECT ANY INCOME OF ANY TYPE FROM RUSSIAN SOURCES,  HOWEVER THERE ARE  A FEW EXCEPTIONS. ACCORDING TO THE ASSOCIATED PRESS, TRUMP S LAWYERS SENT A LETTER STATING THAT THE DISHONEST, SHADY POTUS DIDN T OWE ANY MONEY TO RUSSIAN LENDERS AND HAD NO EQUITY INVESTMENT BY RUSSIANS IN ENTITIES CONTROLLED BY TRUMP.  OF COURSE, THESE LAWYERS DIDN T PROVIDE ANY COPIES OF THE TAX RETURNS TO VERIFY THIS, SO IT S VERY POSSIBLE THAT THIS INFORMATION IS FALSE. HOWEVER, THE LETTER DID STATE THAT TRUMP RECEIVED  INCOME FROM THE 2013 MISS UN

While roughly 0.8 to 0.9 seconds per sample might seem fast, we have datasets with about 700,000 samples. At 0.807 seconds per sample, encoding all the texts would take 0.807 * 700,000 ≈ 565,000 seconds, or over 150 hours, for just a single question template; encoding them with *multiple* question templates would multiply this accordingly.
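The estimate can be reproduced with a few lines of arithmetic. The sketch below uses the 0.807 seconds per prompt and 700,000 samples quoted above; the count of 8 question templates is a hypothetical value chosen only to illustrate the scaling, and the helper name `encoding_hours` is likewise an assumption.

```python
def encoding_hours(seconds_per_prompt: float, num_samples: int, num_questions: int = 1) -> float:
    """Total sequential encoding time in hours, assuming one prompt per (sample, question) pair."""
    return seconds_per_prompt * num_samples * num_questions / 3600

single_template_hours = encoding_hours(0.807, 700_000)
eight_template_hours = encoding_hours(0.807, 700_000, num_questions=8)  # hypothetical template count

print(f"1 template:  {single_template_hours:.1f} hours")
print(f"8 templates: {eight_template_hours:.1f} hours ({eight_template_hours / 24:.1f} days)")
```

This assumes prompts are processed one at a time, as in this notebook; batching several prompts per forward pass would change the figures.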

In [None]:
# Does the same for a fake news example text: test out the response to the prompt template but use fake-news as the example text
example_fake_news_text = welfake_train_df.iloc[384]["text"] # Extracts text
example_fake_news_label = welfake_train_df.iloc[384]["label"] # Extracts its label

example_fake_news_category = "Fake News" if example_fake_news_label == 1 else "Real News" # Converts the integer label to text

# Wraps the text to 80-chars-per-line for easier inspection
wrapped_fake_example_text = textwrap.fill(example_fake_news_text, width=80) 
print(wrapped_fake_example_text, "\n")
print(example_fake_news_category)

Despite promising to release his tax returns all throughout his presidential
campaign, Donald Trump still refuses to make good on his promise to concerned
Americans. Instead, he s gotten his lawyers to make a statement about the last
ten years of his tax returns, and it went pretty much exactly how we thought it
would   it was a disaster.Raising more questions than they actually answered,
Trump s lawyers told reporters that Trump s tax returns don t reflect any income
of any type from Russian sources,  however there are  a few exceptions.
According to the Associated Press, Trump s lawyers sent a letter stating that
the dishonest, shady POTUS didn t owe any money to Russian lenders and had no
equity investment by Russians in entities controlled by Trump.  Of course, these
lawyers didn t provide any copies of the tax returns to verify this, so it s
very possible that this information is false. However, the letter did state that
Trump received  income from the 2013 Miss Universe pageant h

In [None]:
example_context2 = example_fake_news_text

# Creates a new prompt template for the LLM, containing the fake news text as the "context" this time, but same question
example_fake_news_prompt_template = f"""
    News Text: {example_context2}

    Question: {example_question}

    Please respond ONLY with "Yes" or "No". Answer the question based strictly on the news text.
    Answer:
"""

In [None]:
# Generates the result for the fake news example
result_fake_news = generateSimpleText(example_fake_news_prompt_template, tokenizer, model)
print(result_fake_news)


    News Text: Despite promising to release his tax returns all throughout his presidential campaign, Donald Trump still refuses to make good on his promise to concerned Americans. Instead, he s gotten his lawyers to make a statement about the last ten years of his tax returns, and it went pretty much exactly how we thought it would   it was a disaster.Raising more questions than they actually answered, Trump s lawyers told reporters that Trump s tax returns don t reflect any income of any type from Russian sources,  however there are  a few exceptions. According to the Associated Press, Trump s lawyers sent a letter stating that the dishonest, shady POTUS didn t owe any money to Russian lenders and had no equity investment by Russians in entities controlled by Trump.  Of course, these lawyers didn t provide any copies of the tax returns to verify this, so it s very possible that this information is false. However, the letter did state that Trump received  income from the 2013 Miss Un

Although the news text is fake, and does not contain much background information to support the claims in it, the model still answers yes. Therefore, it is necessary to investigate whether the model ever actually answers "no" to a question template, even when the answer is obviously negative. This will be tested below:

In [None]:
# Tests if the model ever answers NO rather than a default YES to everything by asking questions that should have obvious "no" responses

example_question2 = "Is this news text about the environment and climate change?"

example_fake_news_prompt_template2 = f"""
    News Text: {example_context2}

    Question: {example_question2}

    Please respond ONLY with "Yes" or "No". Answer the question based strictly on the news text.
    Answer:
"""

example_question3 = "Is this news text about sports?"

example_fake_news_prompt_template3 = f"""
    News Text: {example_context2}

    Question: {example_question3}

    Please respond ONLY with "Yes" or "No". Answer the question based strictly on the news text.
    Answer:
"""

example_question4 = "Is this text a short, snappy headline with only a few words in it?"

example_fake_news_prompt_template4 = f"""
    News Text: {example_context2}

    Question: {example_question4}

    Please respond ONLY with "Yes" or "No". Answer the question based strictly on the news text.
    Answer:
"""

In [None]:
# Tests if the obvious answers are "no"
result_fake_news2 = generateSimpleText(example_fake_news_prompt_template2, tokenizer, model)
print(result_fake_news2)


    News Text: Despite promising to release his tax returns all throughout his presidential campaign, Donald Trump still refuses to make good on his promise to concerned Americans. Instead, he s gotten his lawyers to make a statement about the last ten years of his tax returns, and it went pretty much exactly how we thought it would   it was a disaster.Raising more questions than they actually answered, Trump s lawyers told reporters that Trump s tax returns don t reflect any income of any type from Russian sources,  however there are  a few exceptions. According to the Associated Press, Trump s lawyers sent a letter stating that the dishonest, shady POTUS didn t owe any money to Russian lenders and had no equity investment by Russians in entities controlled by Trump.  Of course, these lawyers didn t provide any copies of the tax returns to verify this, so it s very possible that this information is false. However, the letter did state that Trump received  income from the 2013 Miss Un

The model answers that the text about Donald Trump's relationship with Russia is "about the environment and climate change" when it is clearly not, and this should be an easy question to answer.

In [None]:
result_fake_news3 = generateSimpleText(example_fake_news_prompt_template3, tokenizer, model)
print(result_fake_news3)


    News Text: Despite promising to release his tax returns all throughout his presidential campaign, Donald Trump still refuses to make good on his promise to concerned Americans. Instead, he s gotten his lawyers to make a statement about the last ten years of his tax returns, and it went pretty much exactly how we thought it would   it was a disaster.Raising more questions than they actually answered, Trump s lawyers told reporters that Trump s tax returns don t reflect any income of any type from Russian sources,  however there are  a few exceptions. According to the Associated Press, Trump s lawyers sent a letter stating that the dishonest, shady POTUS didn t owe any money to Russian lenders and had no equity investment by Russians in entities controlled by Trump.  Of course, these lawyers didn t provide any copies of the tax returns to verify this, so it s very possible that this information is false. However, the letter did state that Trump received  income from the 2013 Miss Un

Once again, the model has answered "yes" when the answer to whether this political news text is about sports should clearly be "no".

In [None]:
result_fake_news4 = generateSimpleText(example_fake_news_prompt_template4, tokenizer, model)
print(result_fake_news4)


    News Text: Despite promising to release his tax returns all throughout his presidential campaign, Donald Trump still refuses to make good on his promise to concerned Americans. Instead, he s gotten his lawyers to make a statement about the last ten years of his tax returns, and it went pretty much exactly how we thought it would   it was a disaster.Raising more questions than they actually answered, Trump s lawyers told reporters that Trump s tax returns don t reflect any income of any type from Russian sources,  however there are  a few exceptions. According to the Associated Press, Trump s lawyers sent a letter stating that the dishonest, shady POTUS didn t owe any money to Russian lenders and had no equity investment by Russians in entities controlled by Trump.  Of course, these lawyers didn t provide any copies of the tax returns to verify this, so it s very possible that this information is false. However, the letter did state that Trump received  income from the 2013 Miss Un

Asking whether the moderately long article has only a few words in it (clearly false) also returns "yes". The model is therefore clearly returning incorrect answers, which is a problem. Perhaps the LLAMA-2 model used in the TELLER paper was not suited to long-form texts, but the authors mention using the PolitiFact dataset, which contains relatively long news articles!
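Reading the answer off the printed response by eye is error-prone, because the response contains the whole prompt before the answer. A minimal sketch of a parser that looks only at the text after the last "Answer:" marker is given below. The function name `parse_yes_no` is hypothetical, and the approach assumes the model does not itself generate another "Answer:" marker in its continuation.

```python
from typing import Optional

def parse_yes_no(response: str, marker: str = "Answer:") -> Optional[str]:
    """Returns "yes", "no", or None based on the first word after the last marker."""
    # rpartition keeps everything after the LAST occurrence of the marker;
    # if the marker is missing, the whole string is treated as the answer text
    _, _, tail = response.rpartition(marker)
    words = tail.strip().lower().split()
    if not words:
        return None  # the model generated nothing after the marker
    first_word = words[0].strip('.,!?:;"\'')
    return first_word if first_word in ("yes", "no") else None

# Made-up responses in the same shape as the notebook's outputs
print(parse_yes_no('News Text: Yes, they said.\n\nQuestion: Is this about sports?\nAnswer: No.'))
print(parse_yes_no("News Text: Some article.\nAnswer:"))
```

Returning `None` for an empty or unexpected continuation makes it possible to tell a genuine "yes" apart from a response in which no answer was generated at all.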


The next step is to try the model with a more probabilistic, more creative configuration by adjusting the sampling, temperature and top number of possible tokens (top_k) parameter values. Perhaps this will yield improved answers.
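To illustrate what these parameters do, the sketch below applies top-k filtering and temperature scaling to a toy set of next-token logits using only the standard library. The logit values and the function name `next_token_distribution` are made up for illustration, and this is a simplified stand-in for, not a copy of, the sampling logic inside the transformers library.

```python
import math

def next_token_distribution(logits, temperature=1.0, top_k=None):
    """Keeps the top_k highest logits, divides them by temperature, and applies softmax."""
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]  # discard everything outside the k most likely tokens
    scaled = [(tok, logit / temperature) for tok, logit in items]
    m = max(v for _, v in scaled)  # subtract the max for numerical stability
    exps = [(tok, math.exp(v - m)) for tok, v in scaled]
    total = sum(e for _, e in exps)
    return {tok: e / total for tok, e in exps}

# Made-up logits for four candidate next tokens
toy_logits = {"Yes": 2.0, "No": 1.5, "Maybe": 0.5, "The": 0.0}

for t in (1.0, 0.4):
    dist = next_token_distribution(toy_logits, temperature=t, top_k=2)
    print(t, {tok: round(p, 3) for tok, p in dist.items()})
```

Temperatures below 1 sharpen the distribution towards the most likely token, so a setting such as 0.4 stays fairly close to the deterministic behaviour of `do_sample=False`.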

In [None]:
# Experiments with more probabilistic, creative LLAMA-2 configuration

start_time = time.time()

result_fake_news2_probabilistic = generateSimpleText(example_fake_news_prompt_template2, tokenizer, model, do_sample=True,
                                                    temperature=0.4, top_k=20)
print(result_fake_news2_probabilistic)

result_fake_news3_probabilistic = generateSimpleText(example_fake_news_prompt_template3, tokenizer, model, do_sample=True,
                                                    temperature=0.4, top_k=20)
print(result_fake_news3_probabilistic)

result_fake_news4_probabilistic = generateSimpleText(example_fake_news_prompt_template4, tokenizer, model, do_sample=True,
                                                    temperature=0.4, top_k=20)
print(result_fake_news4_probabilistic)

end_time = time.time()
time_elapsed = end_time - start_time
print(time_elapsed, "seconds")


    News Text: Despite promising to release his tax returns all throughout his presidential campaign, Donald Trump still refuses to make good on his promise to concerned Americans. Instead, he s gotten his lawyers to make a statement about the last ten years of his tax returns, and it went pretty much exactly how we thought it would   it was a disaster.Raising more questions than they actually answered, Trump s lawyers told reporters that Trump s tax returns don t reflect any income of any type from Russian sources,  however there are  a few exceptions. According to the Associated Press, Trump s lawyers sent a letter stating that the dishonest, shady POTUS didn t owe any money to Russian lenders and had no equity investment by Russians in entities controlled by Trump.  Of course, these lawyers didn t provide any copies of the tax returns to verify this, so it s very possible that this information is false. However, the letter did state that Trump received  income from the 2013 Miss Un

## Conclusions

Unfortunately, the experiments with this model for outputting yes/no (i.e. 1/0) answers to basic questions about the news texts have not been promising, for two reasons. First, despite adjusting the model's generation parameters, it always seems to output "yes", even to questions whose answer is evidently "no" given the context. Second, outputting *multiple* truth values per text for an immense combined dataset (WELFake, Fakeddit, Constraint, and PolitiFact, for the five-shot and few-shot scenarios) would take many days, if not weeks, of processing time. As a result, I need to reevaluate the approach I will take towards creating a more generalizable and explainable fake news detection system in the coming weeks, and design a different solution.
