**IMPLEMENTATION OF THE BEST-WORKING PROMPT FOR GENERATING COUNTER-NARRATIVE STORIES FROM GIVEN CLAIM-FACT PAIRS USING THE PHI-3 MODEL**







*   MODEL NAME: microsoft/Phi-3-mini-128k-instruct
*   LINK: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct



In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import pandas as pd

# Setting the path to CSV file
csv_path = "/content/Copy of Prompt outcomes for America Election Stories - Sheet1.csv"
# Loading the model and tokenizer
torch.random.manual_seed(0)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

# Initializing the text generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Setting generation arguments
generation_args = {
    "max_new_tokens": 2000,
    "return_full_text": False,
    # Greedy decoding: "temperature" only takes effect when do_sample=True,
    # so it is omitted here; the output stays deterministic.
    "do_sample": False,
}

# Loading the CSV file
data = pd.read_csv(csv_path)

# Iterating over each row in the CSV and processing its claim and facts
for index, row in data.iterrows():
    claim = row['Claim']
    facts = row['Fact']

    # Generating the prompt for the current claim and facts
    prompt = f"""<|system|>
You are a persuasive and factually accurate story-generating assistant with a journalistic rigor.
<|end|>
<|user|>
Given a claim and corresponding facts, generate a summarized well-structured story that meets the following criteria:

- Refute the claim using every provided fact, along with additional context from historically accurate trends and statistics.
- Limit the response to a maximum of 250 words.
- Use a strongly critical tone, directly highlighting the flaws in the claim and emphasizing its misleading nature.
- Include relevant background information, citing historically significant economic events or definitions (e.g., major inflation periods or economic milestones).
- When writing about public figures, always use their correct title based on their current or former status:
     Example: Use “former President” for Donald Trump, who no longer holds office, and “President” for Joe Biden, who is currently in office.
     Similarly, for any retired officials, use titles like “former” or “ex-” as appropriate.
- The narrative-based summarized story should be coherent and persuasive, leaving no room for doubt about the validity of the refutation.
- The summarized story must follow a clear 3-paragraph structure:

  Paragraph1- Begin with historical background or context around the event or term in question.
  Paragraph2- Progress to opposing the claim in a detailed manner, referencing the provided facts and relevant historical events to refute it.
  Paragraph3- Conclude with a strong, fact-based analysis that highlights the flaws in the claim and clarifies the reality of inflation trends.

Claim: {claim}
Facts: {facts}

<|end|>
<|assistant|>
"""

    # Generate the counter-narrative
    output = pipe(prompt, **generation_args)
    print(f"Counter-narrative for claim {index + 1}:\n", output[0]['generated_text'])
    print("\n" + "="*80 + "\n")
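The loop above rebuilds the whole prompt string inline on every iteration. A minimal sketch of factoring that assembly into a helper is given below, so the Phi-3 chat markers (`<|system|>`, `<|user|>`, `<|end|>`, `<|assistant|>`) are written in one place and empty inputs are caught early. The names `build_prompt` and `SYSTEM_MESSAGE`, and the `instructions` parameter, are hypothetical and not part of the original notebook; the marker layout simply mirrors the prompt used above.

```python
# Hypothetical helper: assembles the same prompt layout used in the loop above.
SYSTEM_MESSAGE = (
    "You are a persuasive and factually accurate story-generating "
    "assistant with a journalistic rigor."
)


def build_prompt(claim: str, facts: str, instructions: str) -> str:
    """Return a Phi-3 style chat prompt for one claim-fact pair."""
    claim = str(claim).strip()
    facts = str(facts).strip()
    if not claim or not facts:
        # An empty claim or fact would yield an unanchored story.
        raise ValueError("claim and facts must both be non-empty")
    return (
        f"<|system|>\n{SYSTEM_MESSAGE}\n<|end|>\n"
        f"<|user|>\n{instructions.strip()}\n\n"
        f"Claim: {claim}\nFacts: {facts}\n\n"
        "<|end|>\n"
        "<|assistant|>\n"
    )


if __name__ == "__main__":
    # Placeholder inputs, for illustration only.
    demo = build_prompt(
        claim="Example claim text",
        facts="Example fact text",
        instructions="Given a claim and corresponding facts, generate a story.",
    )
    print(demo)
```

Inside the loop, the call would then be `prompt = build_prompt(row['Claim'], row['Fact'], instructions)`, with `instructions` holding the criteria list from the prompt above.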




Counter-narrative for claim 1:
  In the wake of the 2008 financial crisis, the United States has witnessed a rollercoaster of economic fluctuations, with inflation rates that have both soared and subsided. The claim by former President Trump, who stated that the inflation rate was the worst in 100 years, is a misleading assertion that fails to align with historical data and economic trends.


Contrary to Trump's assertion, the highest inflation rate recorded during the Biden presidency was 9.1% in June 2022, a figure that pales in comparison to the inflation rates of the early 1980s. This peak, while significant, occurred over 40 to 41 years ago, far from the "100 years" Trump referenced. Moreover, the inflation rate has since shown a marked decline, with the July 2024 rate at 3.2%, a figure that not only surpasses the Biden era's peak but also echoes rates seen in 2011.


The flawed nature of Trump's claim is evident when examining the historical context and current economic indicator



Counter-narrative for claim 10:
  In the tumultuous arena of American politics, claims of election subversion have periodically surfaced, often casting long shadows over the democratic process. The former President, Donald Trump, has been a vocal proponent of such narratives, alleging that the Biden administration orchestrated legal actions against him in various jurisdictions. These claims, however, are not only baseless but also detrimental to the integrity of the electoral system.

The Manhattan case, where Trump was found guilty, is a prime example of the fallacy in Trump's assertions. Attorney General Merrick Garland's testimony to Congress clarified that the Manhattan district attorney operates independently of the federal government, including the Justice Department. This autonomy is a cornerstone of the American legal system, ensuring that local jurisdictions can enforce state laws without federal interference. Furthermore, Matthew Colangelo, the lawyer Trump repeatedly cited, 

**RESULTS**



* **90% of cases**: The model generates stories within the 250-word limit.
* **85% of cases**: The model selects appropriate titles for public figures as per the guidelines.
* **100% of cases**: The model adheres to the required three-paragraph story structure.
* **100% of cases**: The generated stories exhibit coherence.
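
The notebook does not state how these percentages were measured. As a rough sketch, the two criteria that can be checked mechanically (word limit and paragraph count) could be computed as below; title correctness and coherence still need human review. The function names `check_story` and `compliance_rates` are hypothetical, and counting words by whitespace splitting and paragraphs by blank lines are assumptions.

```python
import re


def check_story(story: str, max_words: int = 250, paragraphs: int = 3) -> dict:
    """Check one generated story against the word limit and paragraph count."""
    words = story.split()  # assumption: words are whitespace-separated tokens
    # Assumption: paragraphs are separated by one or more blank lines.
    paras = [p for p in re.split(r"\n\s*\n", story.strip()) if p.strip()]
    return {
        "within_word_limit": len(words) <= max_words,
        "has_expected_paragraphs": len(paras) == paragraphs,
    }


def compliance_rates(stories: list) -> dict:
    """Return the percentage of stories that pass each check."""
    results = [check_story(s) for s in stories]
    if not results:
        return {}
    total = len(results)
    return {
        name: 100.0 * sum(r[name] for r in results) / total
        for name in results[0]
    }


if __name__ == "__main__":
    # Placeholder stories, for illustration only.
    samples = [
        "Background.\n\nRefutation.\n\nConclusion.",
        "A single paragraph without breaks.",
    ]
    print(compliance_rates(samples))
```

Collecting each `output[0]['generated_text']` into a list during the generation loop would provide the input for `compliance_rates`.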








