## Data prep for BART tuning

GPT is used to produce synthetic news data for fine-tuning bart-large-mnli to classify a pair as entailment, contradiction, or neutral. Instead of the usual premise and hypothesis, each pair here consists of a question and a claim.

In [1]:
from openai import OpenAI
from getpass import getpass
import json
import pandas as pd
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm

In [2]:
openai_key = getpass("Enter your API Key:")
client = OpenAI(api_key=openai_key)

GPT is used to generate entailment, contradiction, and neutral sentences for a list of questions.

In [3]:
def set_role(system_prompt, set_json=False):
    """Return a completion function bound to system_prompt; parse replies as JSON if set_json."""
    def get_completion(prompt, model="gpt-4o-mini"):
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": str(prompt)},
        ]
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=0.3,  # degree of randomness in the model's output
        )
        content = response.choices[0].message.content
        # json.loads raises if the reply is not bare JSON; callers must handle that.
        return json.loads(content) if set_json else content

    return get_completion
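
With `set_json=True`, `get_completion` hands the raw reply straight to `json.loads`, so a reply wrapped in a Markdown code fence would raise an exception. The sketch below shows one way to make the parsing step more tolerant. `parse_json_reply` is a hypothetical helper introduced here for illustration; it is not part of the notebook.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker, built this way to keep the example readable

def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating an optional surrounding code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag such as "json").
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence if present.
        if cleaned.rstrip().endswith(FENCE):
            cleaned = cleaned.rstrip()[: -len(FENCE)]
    return json.loads(cleaned)
```

The Chat Completions API also offers a JSON mode through the `response_format` parameter, which may reduce the need for this kind of clean-up; whether to use it is a design choice the notebook does not make.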

## Train dataset

### Entailment

In [12]:
system_prompt = f"""
You are a news generator. Your task is to generate 7 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question, revolve the claims around this question. Make sure they agree exactly with the question but make it vary in phrasing but not the values used
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_claims = set_role(system_prompt, set_json=True)


In [13]:
# Questions used to generate entailment claims
hypothesis_questions = [
    "Did Apple's revenue exceed $94 billion in Q2 2024?",
    "Was Indonesia's population 279 million by the end of 2023?",
    "Did Spain’s unemployment rate drop to 12.3% in early 2024?",
    "Did NASA spend more than $93 billion on the Artemis program by 2025?",
    "Has the average global temperature risen by 1.15°C since pre-industrial times?",
    "Were over 50,000 electric vehicles sold in Norway in the first half of 2024?",
    "Did Nigeria’s literacy rate reach 68.4% in 2023?",
    "Did annual rainfall in the Amazon basin fall by 17% compared to the 10-year average?",
    "Did Meta report 3.96 billion monthly active users in Q1 2025?",
    "Did Germany produce 248 terawatt-hours of renewable electricity in 2023?",
    "Did tuition fees at Harvard rise to $57,246 for the 2024 academic year?",
    "Did the Tokyo Stock Exchange index increase by 3.2% in March 2025?",
    "Did China’s military budget reach $230 billion in 2024?",
    "Did Singapore record 3,102 dengue cases in the first quarter of 2024?",
    "Was the U.S. federal deficit $1.6 trillion in fiscal year 2023?",
    "Did global smartphone sales fall below 1.2 billion units in 2024?",
    "Were there 12 major hurricanes in the Atlantic during the 2023 season?",
    "Did India’s GDP grow by 6.8% in 2024?",
    "Were 45% of university graduates in Europe employed within three months in 2023?",
    "Did the world produce 380 million tonnes of plastic in 2023?",
    "Did the global e-commerce market reach $6.3 trillion in 2024?",
    "Was Japan’s aging population over 29% of its total population in 2023?",
    "Did YouTube generate $44.6 billion in ad revenue in 2023?",
    "Did Brazil’s deforestation rate drop to 9,000 square kilometers in 2024?",
    "Was the average student loan debt in the United States $37,000 in 2023?",
        "Did Tesla deliver more than 1.8 million vehicles in 2024?",
    "Was the inflation rate in the Eurozone 4.3% in Q1 2024?",
    "Did the iPhone 16 launch in September 2024?",
    "Did the average rent in New York City exceed $4,000 per month in 2024?",
    "Was Saudi Arabia's oil output over 10 million barrels per day in 2023?",
    "Did OpenAI release GPT-5 in 2025?",
    "Was global sea level rise measured at 3.3 mm per year in 2023?",
    "Did Microsoft Azure grow faster than AWS in Q4 2024?",
    "Was Argentina’s inflation above 110% in 2023?",
    "Did Singapore's GDP per capita surpass $85,000 in 2024?",
    "Did South Korea export over $680 billion worth of goods in 2023?",
    "Was the unemployment rate in Canada 5.7% in early 2024?",
    "Did Vietnam receive over 14 million international tourists in 2024?",
    "Was the world's total installed solar capacity over 1,200 GW in 2024?",
    "Did global wheat production fall below 770 million tonnes in 2023?",
    "Was the average CO₂ concentration above 420 ppm in 2024?",
    "Did Google spend more than $30 billion on R&D in 2023?",
    "Was the average life expectancy in Japan still above 84 years in 2023?",
    "Did Netflix surpass 280 million subscribers by 2025?",
    "Did Egypt’s population exceed 110 million in 2024?",
    "Was there a magnitude 7.0+ earthquake in Japan during 2024?",
    "Did Indonesia experience more than 300,000 hectares of forest fires in 2023?",
    "Was global smartphone penetration above 72% in 2024?",
    "Did the World Bank project global GDP growth of 2.4% in 2025?",
    "Was the median age of the global population over 31 years in 2024?",
    "Did the U.S. issue more than 450,000 student visas in 2024?",
    "Was Twitter’s daily active user count below 220 million in 2024?",
    "Did Germany import more than $160 billion worth of energy in 2023?",
    "Did the number of electric buses in China exceed 500,000 in 2024?",
    "Was the literacy rate in Afghanistan below 45% in 2023?",
    "Did France produce over 56 million hectoliters of wine in 2023?",
    "Was the Antarctic sea ice extent below 1.5 million km² in Feb 2024?",
    "Did India launch over 10 space missions in 2024?",
    "Was the total number of cryptocurrency users globally above 500 million in 2024?",
    "Did the U.S. average gas price exceed $4.20 per gallon in mid-2024?",
    "Did global electric vehicle sales surpass 17 million units in 2024?",
    "Was the average global internet speed above 50 Mbps in 2024?",
    "Did Facebook’s user base in the U.S. drop below 170 million in 2024?",
    "Did Brazil host COP30 in 2025?",
    "Was the number of endangered species listed by the IUCN over 42,000 in 2024?",
    "Did Pakistan’s GDP per capita remain under $2,000 in 2024?",
    "Did South Africa's unemployment rate stay above 30% in 2024?",
    "Was Tokyo the most populous city in the world in 2024?",
    "Did the Philippines experience over 20 tropical cyclones in 2024?",
    "Was the average global surface temperature anomaly over 1.3°C in 2024?",
    "Did Bitcoin’s price peak above $75,000 in 2024?",
    "Was Australia’s trade surplus above $100 billion in 2023?",
    "Did Malaysia generate more than 40% of its electricity from renewables in 2024?",
    "Was global plastic waste generation above 400 million tonnes in 2024?",
    "Did the FIFA Women’s World Cup 2023 reach over 2 billion viewers worldwide?",
    "Did the U.S. experience over 100 mass shootings in the first half of 2024?"
]

claims_ttl = []

# Wrapper that tags each result with its question and records errors instead of raising
def fetch_claims(question):
    try:
        result = get_claims(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        claims_ttl.append(future.result())


Fetching claims: 100%|██████████| 76/76 [00:56<00:00,  1.34it/s]
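
The same `fetch_claims` wrapper and thread-pool loop are repeated for every label. A minimal sketch of a reusable version follows, assuming a hypothetical helper named `run_batch` with a simple retry (`max_attempts` is also an assumption, not something the notebook uses). The progress bar is left out to keep the example standard-library only.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_batch(get_fn, questions, max_workers=5, max_attempts=2):
    """Call get_fn on every question in parallel, retrying failed calls."""
    def fetch(question):
        last_error = None
        for _ in range(max_attempts):
            try:
                result = dict(get_fn(question))  # copy before adding a key
                result["original"] = question
                return result
            except Exception as exc:  # same broad catch as the notebook's wrapper
                last_error = exc
        # Every attempt failed: record the error next to the question.
        return {"original": question, "error": str(last_error)}

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch, q) for q in questions]
        # Results arrive in completion order, not submission order.
        return [future.result() for future in as_completed(futures)]
```

With this helper, each label would need a single call such as `run_batch(get_claims, hypothesis_questions)`.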


In [14]:
claims_ttl

[{'claims': ["Apple's revenue for the second quarter of 2024 surpassed $94 billion.",
   'In Q2 2024, Apple reported earnings that exceeded $94 billion.',
   'The total revenue generated by Apple in the second quarter of 2024 was over $94 billion.',
   "For the second quarter of 2024, Apple's revenue was greater than $94 billion.",
   'Apple achieved a revenue figure that went beyond $94 billion in Q2 2024.',
   "The financial results for Q2 2024 indicated that Apple's revenue was above $94 billion.",
   "In the second quarter of 2024, Apple's revenue reached levels exceeding $94 billion."],
  'original': "Did Apple's revenue exceed $94 billion in Q2 2024?"},
 {'claims': ["By 2025, NASA's expenditures on the Artemis program are projected to exceed $93 billion.",
   'The budget allocated by NASA for the Artemis program is anticipated to surpass $93 billion by the year 2025.',
   'As of 2025, it is estimated that NASA will have invested more than $93 billion into the Artemis initiative.'

In [15]:
rows = []
for entry in claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
sim_df = pd.DataFrame(rows)
sim_df['label'] = 'entailment' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(sim_df.shape)
sim_df.head()

(525, 3)


Unnamed: 0,original_question,claim,label
0,Did Apple's revenue exceed $94 billion in Q2 2024?,Apple's revenue for the second quarter of 2024 surpassed $94 billion.,entailment
1,Did Apple's revenue exceed $94 billion in Q2 2024?,"In Q2 2024, Apple reported earnings that exceeded $94 billion.",entailment
2,Did Apple's revenue exceed $94 billion in Q2 2024?,The total revenue generated by Apple in the second quarter of 2024 was over $94 billion.,entailment
3,Did Apple's revenue exceed $94 billion in Q2 2024?,"For the second quarter of 2024, Apple's revenue was greater than $94 billion.",entailment
4,Did Apple's revenue exceed $94 billion in Q2 2024?,Apple achieved a revenue figure that went beyond $94 billion in Q2 2024.,entailment
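
The shape printed above, (525, 3), is seven rows short of the 76 × 7 = 532 rows expected if every question returned exactly seven claims. One plausible explanation (an inference, not something the output confirms) is that one request failed and was stored as an `error` entry, which the flattening loop silently skips. A small audit along these lines could surface such cases; `audit_claims` is a hypothetical helper.

```python
def audit_claims(entries, expected_per_question):
    """Return (failed, wrong_count) lists describing entries that need attention."""
    failed, wrong_count = [], []
    for entry in entries:
        question = entry.get("original", "")
        if "error" in entry:
            # The wrapper stored an exception message instead of claims.
            failed.append((question, entry["error"]))
            continue
        n_claims = len(entry.get("claims", []))
        if n_claims != expected_per_question:
            wrong_count.append((question, n_claims))
    return failed, wrong_count
```

Calling `audit_claims(claims_ttl, 7)` before building `sim_df` would list the questions to re-run.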


### Contradiction

In [16]:
system_prompt = f"""
You are a news generator. Your task is to generate 7 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question, revolve the claims around this question. Make sure they give varying numbers that contradict the question but make it vary
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_contradict = set_role(system_prompt, set_json=True)

In [17]:
# Questions used to generate contradiction claims
hypothesis_questions = [
    "Did Apple's revenue exceed $94 billion in Q2 2024?",
    "Was Indonesia's population 279 million by the end of 2023?",
    "Did Spain’s unemployment rate drop to 12.3% in early 2024?",
    "Did NASA spend more than $93 billion on the Artemis program by 2025?",
    "Did the global e-commerce market reach $6.3 trillion in 2024?",  # new
    "Did South Korea invest over $15 billion in semiconductor R&D in 2023?",  # new
    "Was Nigeria’s internet penetration rate 55% in 2024?",  # new
    "Did Meta report 3.96 billion monthly active users in Q1 2025?",
    "Did Germany produce 248 terawatt-hours of renewable electricity in 2023?",
    "Did tuition fees at Harvard rise to $57,246 for the 2024 academic year?",
    "Did China’s military budget reach $230 billion in 2024?",
    "Did Singapore record 3,102 dengue cases in the first quarter of 2024?",
    "Was the U.S. federal deficit $1.6 trillion in fiscal year 2023?",
    "Did global smartphone sales fall below 1.2 billion units in 2024?",
    "Did India’s GDP grow by 6.8% in 2024?",
    "Was the average student loan debt in the United States $37,000 in 2023?",
    "Did Japan’s aging population exceed 29% in 2023?",
    "Did Brazil’s deforestation rate drop to 9,000 square kilometers in 2024?",
    "Did Tesla deliver over 1.8 million vehicles globally in 2023?",  # new
    "Did global CO2 emissions reach 36.8 billion metric tons in 2023?",  # new
    "Did the WHO report 1.2 million antibiotic-resistant infections in 2023?",  # new
    "Did YouTube generate $44.6 billion in ad revenue in 2023?",
    "Did the Tokyo Stock Exchange index increase by 3.2% in March 2025?",
    "Did Ethereum's market cap surpass $450 billion in 2024?",  # new
    "Did the 2024 Summer Olympics host over 11,000 athletes from 200+ countries?",  # new
     "Did global wheat production fall below 770 million tonnes in 2023?",
    "Was Apple's iPhone 17 launched in September 2025?",
    "Did India overtake China as the world's most populous country in 2023?",
    "Did Microsoft invest over $13 billion in OpenAI by 2024?",
    "Did Amazon's total revenue exceed $570 billion in 2024?",
    "Did the average global temperature anomaly reach 1.3°C in 2024?",
    "Did the world install over 300 GW of new solar capacity in 2024?",
    "Did Netflix surpass 285 million global subscribers by 2025?",
    "Did the inflation rate in the U.S. average 3.6% in 2024?",
    "Did the unemployment rate in the UK fall to 4.1% in 2023?",
    "Did Singapore's total exports exceed SGD 600 billion in 2024?",
    "Was the literacy rate in Bangladesh over 77% in 2023?",
    "Did global electric vehicle sales surpass 17 million units in 2024?",
    "Was the total number of crypto users globally above 580 million in 2024?",
    "Did Samsung lead the global smartphone market in Q1 2025?",
    "Did Facebook’s U.S. user base drop below 170 million in 2024?",
    "Did Egypt’s tourism revenue surpass $13 billion in 2024?",
    "Was the Tokyo metro area home to more than 38 million people in 2024?",
    "Did Google Cloud generate more than $40 billion in revenue in 2024?",
    "Was the average broadband speed in South Korea over 110 Mbps in 2024?",
    "Did the Eurozone economy grow by more than 0.8% in 2024?",
    "Did the Philippines experience over 20 typhoons in 2023?",
    "Was the average price of lithium down 20% year-on-year in 2024?",
    "Did the global semiconductor market reach $580 billion in 2023?",
    "Was the Chinese yuan the fifth most traded currency in 2024?",
    "Did Uber report over 9 billion trips worldwide in 2024?",
    "Did global plastic waste exceed 400 million tonnes in 2023?",
    "Did the FIFA 2023 Women's World Cup attract more than 2 billion viewers?",
    "Was TikTok banned in more than five countries by 2024?",
    "Did global ocean heat content reach record levels in 2023?",
    "Did world coal consumption decline by 2% in 2024?",
    "Was Malaysia’s GDP growth rate 4.5% in 2024?",
    "Did Japan's exports of vehicles exceed 4 million units in 2023?",
    "Was Brazil's Amazon deforestation rate at a 10-year low in 2024?",
    "Did Vietnam become the third-largest apparel exporter in 2024?",
    "Did the global fintech market reach a valuation of $350 billion in 2024?",
    "Was the Nasdaq Composite up over 20% in 2024?",
    "Did Saudi Aramco report profits exceeding $120 billion in 2024?",
    "Did Russia export over 200 million tonnes of oil in 2023?",
    "Did global wind power capacity surpass 1,000 GW in 2024?",
    "Was the average global retirement age 64 years in 2024?",
    "Did the United Nations report over 114 million displaced people in 2024?",
    "Was Argentina’s inflation rate above 110% in 2023?",
    "Did Zoom’s user base fall below 250 million monthly active users in 2024?",
    "Did the number of satellites in orbit exceed 9,000 by 2024?",
    "Was the U.S. military budget above $850 billion in 2024?",
    "Did China export over 7 million electric vehicles in 2024?",
    "Was the average annual rainfall in California below 15 inches in 2023?",
    "Did Adobe acquire a major AI startup in 2024?",
    "Did Intel’s market share in the chip sector fall below 60% in 2024?",
    "Was the average smartphone battery capacity over 4,800mAh in 2024?"
]

contra_claims_ttl = []

# Define the same wrapper
def fetch_claims(question):
    try:
        result = get_contradict(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        contra_claims_ttl.append(future.result())


Fetching claims: 100%|██████████| 76/76 [00:51<00:00,  1.48it/s]


In [19]:
contra_claims_ttl

[{'claims': ["Indonesia's population was estimated to be 270 million by the end of 2023.",
   "According to recent statistics, Indonesia's population reached approximately 284 million in December 2023.",
   "The World Bank reported Indonesia's population at 278 million as of late 2023.",
   "A census conducted in late 2023 indicated that Indonesia's population could be as high as 281 million.",
   "Government estimates suggest that Indonesia's population was around 275 million at the end of 2023.",
   "Demographic studies show that Indonesia's population was approximately 283 million by the end of 2023.",
   'The Indonesian Bureau of Statistics reported a population figure of 279 million for the end of 2023.'],
  'original': "Was Indonesia's population 279 million by the end of 2023?"},
 {'claims': ["NASA's budget for the Artemis program is projected to reach $95 billion by 2025.",
   'As of 2023, NASA has allocated approximately $88 billion for the Artemis program.',
   "Reports indic

In [20]:
rows = []
for entry in contra_claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
contra_df = pd.DataFrame(rows)
contra_df['label'] = 'contradiction' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(contra_df.shape)
contra_df.head()

(532, 3)


Unnamed: 0,original_question,claim,label
0,Was Indonesia's population 279 million by the end of 2023?,Indonesia's population was estimated to be 270 million by the end of 2023.,contradiction
1,Was Indonesia's population 279 million by the end of 2023?,"According to recent statistics, Indonesia's population reached approximately 284 million in December 2023.",contradiction
2,Was Indonesia's population 279 million by the end of 2023?,The World Bank reported Indonesia's population at 278 million as of late 2023.,contradiction
3,Was Indonesia's population 279 million by the end of 2023?,A census conducted in late 2023 indicated that Indonesia's population could be as high as 281 million.,contradiction
4,Was Indonesia's population 279 million by the end of 2023?,Government estimates suggest that Indonesia's population was around 275 million at the end of 2023.,contradiction
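
Not every generated "contradiction" actually contradicts its question. In the raw output above, the last Indonesia claim reports 279 million, which is the figure in the question. The sketch below is a rough heuristic for flagging such rows for manual review. It assumes that four-digit tokens starting with 19 or 20 are years; `extract_values` and `restates_question_values` are hypothetical helpers.

```python
import re

# A number not preceded by a letter, digit, dot, or comma (so the 2 in "Q2" is skipped).
_NUMBER = re.compile(r"(?<![A-Za-z\d.,])\d[\d,]*(?:\.\d+)?")
_YEAR = re.compile(r"(19|20)\d{2}")

def extract_values(text):
    """Return the numeric values in text as strings, ignoring year-like tokens."""
    values = set()
    for token in _NUMBER.findall(text):
        token = token.rstrip(",")  # drop a trailing comma picked up from punctuation
        if _YEAR.fullmatch(token):
            continue  # assumption: a bare 19xx or 20xx token is a year
        values.add(token.replace(",", ""))
    return values

def restates_question_values(question, claim):
    """True if every non-year value in the question also appears in the claim."""
    question_values = extract_values(question)
    return bool(question_values) and question_values <= extract_values(claim)
```

This only catches exact restatements. Threshold questions need a different check: a claim of $95 billion does not contradict "more than $93 billion", yet it shares no value with the question and would pass unflagged.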


### Neutral

In [53]:
system_prompt = f"""
You are a news generator. Your task is to generate 10 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question, revolve the claims around this question. Make sure the sentences are not related in terms of the values can be related by background topic.
Like if the question is on revenue do not talk about revenue but you can talk about the company
Add varying sentences some with and without values. 
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_neutral = set_role(system_prompt, set_json=True)

In [21]:
# Questions used to generate neutral claims
hypothesis_questions = [
    "Did Apple's revenue exceed $94 billion in Q2 2024?",
    "Was Indonesia's population 279 million by the end of 2023?",
    "Did Spain’s unemployment rate drop to 12.3% in early 2024?",
    "Did NASA spend more than $93 billion on the Artemis program by 2025?",
    "Did the global e-commerce market reach $6.3 trillion in 2024?",  # new
    "Did South Korea invest over $15 billion in semiconductor R&D in 2023?",  # new
    "Was Nigeria’s internet penetration rate 55% in 2024?",  # new
    "Did Meta report 3.96 billion monthly active users in Q1 2025?",
    "Did Germany produce 248 terawatt-hours of renewable electricity in 2023?",
    "Did tuition fees at Harvard rise to $57,246 for the 2024 academic year?",
    "Did China’s military budget reach $230 billion in 2024?",
    "Did Singapore record 3,102 dengue cases in the first quarter of 2024?",
    "Was the U.S. federal deficit $1.6 trillion in fiscal year 2023?",
    "Did global smartphone sales fall below 1.2 billion units in 2024?",
    "Did India’s GDP grow by 6.8% in 2024?",
    "Was the average student loan debt in the United States $37,000 in 2023?",
    "Did Japan’s aging population exceed 29% in 2023?",
    "Did Brazil’s deforestation rate drop to 9,000 square kilometers in 2024?",
    "Did Tesla deliver over 1.8 million vehicles globally in 2023?",  # new
    "Did global CO2 emissions reach 36.8 billion metric tons in 2023?",  # new
    "Did the WHO report 1.2 million antibiotic-resistant infections in 2023?",  # new
    "Did YouTube generate $44.6 billion in ad revenue in 2023?",
    "Did the Tokyo Stock Exchange index increase by 3.2% in March 2025?",
    "Did Ethereum's market cap surpass $450 billion in 2024?",  # new
    "Did the 2024 Summer Olympics host over 11,000 athletes from 200+ countries?",  # new\
       "Did Netflix spend over $20 billion on content in 2024?",
    "Did the global AI market reach $310 billion in 2024?",
    "Was China the top exporter of solar panels in 2024?",
    "Did Starbucks operate more than 38,000 stores worldwide in 2024?",
    "Did the U.S. unemployment rate fall below 3.5% in April 2024?",
    "Did Africa’s population exceed 1.5 billion in 2024?",
    "Was the average lifespan in South Korea 83.5 years in 2024?",
    "Did Google’s parent company Alphabet generate over $320 billion in revenue in 2024?",
    "Was Bitcoin’s price above $70,000 in March 2024?",
    "Did the global cruise industry serve more than 30 million passengers in 2023?",
    "Did the global apparel market surpass $1.8 trillion in 2024?",
    "Did the number of active Facebook users in India exceed 450 million in 2024?",
    "Did Intel launch its 14th-generation CPUs in 2024?",
    "Was the global death toll from COVID-19 over 7 million by the end of 2024?",
    "Did India export over $420 billion worth of goods in 2024?",
    "Did the price of gold average above $1,900 per ounce in 2024?",
    "Was the inflation rate in Turkey above 50% in 2023?",
    "Did the Paris Metro carry over 1.6 billion passengers in 2024?",
    "Was the world’s tallest building still the Burj Khalifa in 2024?",
    "Did Meta launch a new version of the Quest headset in 2024?",
    "Did the average rent for a one-bedroom apartment in New York exceed $3,500 in 2024?",
    "Was Tesla’s market capitalization over $800 billion in early 2025?",
    "Did the United States import more than $3.5 trillion in goods in 2024?",
    "Did the average monthly temperature in Antarctica break records in 2024?",
    "Was the global video game market worth over $220 billion in 2024?",
    "Did OpenAI release GPT-5 in 2024?",
    "Was global paper consumption over 420 million tonnes in 2023?",
    "Did Boeing deliver more than 600 airplanes in 2024?",
    "Was Malaysia’s trade surplus above RM 200 billion in 2024?",
    "Did Taiwan’s TSMC hold over 55% of global chip foundry market share in 2024?",
    "Did Apple’s Vision Pro ship over 1 million units in its first quarter?",
    "Did the global pet food market exceed $120 billion in 2024?",
    "Did the Arctic sea ice minimum fall below 4 million square kilometers in 2023?",
    "Did YouTube Shorts exceed 70 billion daily views in 2024?",
    "Was the average smartphone screen size above 6.5 inches in 2024?",
    "Did the EU introduce a digital euro pilot in 2024?",
    "Did Pakistan’s textile exports fall below $16 billion in 2024?",
    "Was the global fertilizer market valued over $200 billion in 2024?",
    "Did the average oil price stay below $90 per barrel in 2024?",
    "Did the number of international tourists globally surpass 1.3 billion in 2024?",
    "Did Vietnam attract over $35 billion in FDI in 2024?",
    "Was Dubai International Airport the busiest in terms of international passengers in 2024?",
    "Did Disney+ subscriber growth slow below 5% in 2024?",
    "Was the global electric two-wheeler market worth over $70 billion in 2024?",
    "Did the number of unicorn startups worldwide exceed 1,400 in 2024?",
    "Did wheat prices surge over 30% due to climate disruptions in 2024?",
    "Was the iPhone the top-selling smartphone brand globally in Q4 2024?",
    "Did the European Central Bank hold interest rates steady in Q1 2025?",
    "Did the U.S. record over 300 mass shootings in 2024?",
    "Was the global urban population over 4.6 billion in 2024?"
]

neutral_claims_ttl = []

# Define the same wrapper
def fetch_claims(question):
    try:
        result = get_neutral(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        neutral_claims_ttl.append(future.result())

Fetching claims: 100%|██████████| 75/75 [01:09<00:00,  1.07it/s]


In [23]:
rows = []
for entry in neutral_claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
neutral_df = pd.DataFrame(rows)
neutral_df['label'] = 'neutral' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(neutral_df.shape)
neutral_df.head()

(730, 3)


Unnamed: 0,original_question,claim,label
0,Did the global e-commerce market reach $6.3 trillion in 2024?,The global e-commerce market was valued at approximately $5.2 trillion in 2023.,neutral
1,Did the global e-commerce market reach $6.3 trillion in 2024?,"In 2023, Amazon reported a customer base of over 300 million active accounts worldwide.",neutral
2,Did the global e-commerce market reach $6.3 trillion in 2024?,The number of online shoppers in China reached 900 million in 2023.,neutral
3,Did the global e-commerce market reach $6.3 trillion in 2024?,eBay's total sales volume was estimated at $100 billion in the last fiscal year.,neutral
4,Did the global e-commerce market reach $6.3 trillion in 2024?,The average cart abandonment rate for e-commerce sites was around 70% in 2023.,neutral


### Concatenate, shuffle, and export to CSV

In [24]:
combined_df = pd.concat([sim_df, contra_df, neutral_df], ignore_index=True)
# Shuffle the rows
shuffled_df = combined_df.sample(frac=1, random_state=42).reset_index(drop=True)

shuffled_df.head()

Unnamed: 0,original_question,claim,label
0,Did NASA spend more than $93 billion on the Artemis program by 2025?,"Analysts predict that NASA's investment in the Artemis program will reach $94 billion by 2025, depending on future funding approvals.",contradiction
1,Did Germany produce 248 terawatt-hours of renewable electricity in 2023?,"In 2023, Germany exported approximately 15 terawatt-hours of electricity to neighboring countries.",neutral
2,Did Google’s parent company Alphabet generate over $320 billion in revenue in 2024?,Alphabet's market capitalization was estimated at around $1.5 trillion in early 2024.,neutral
3,Was the average life expectancy in Japan still above 84 years in 2023?,"The life expectancy for citizens in Japan exceeded 84 years, reaching 84.7 years in 2023.",entailment
4,Did the European Central Bank hold interest rates steady in Q1 2025?,"The ECB's inflation target remains set at 2%, a key focus for its monetary policy strategy.",neutral
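
Because the three labels were generated from different question lists, some questions appear under only one label (the European Central Bank question above, for example, is only in the neutral list). A model could then learn to predict the label from the question alone. A minimal sketch for measuring this follows; `label_coverage` is a hypothetical helper that takes `(question, label)` pairs.

```python
from collections import defaultdict

def label_coverage(rows, all_labels=("entailment", "contradiction", "neutral")):
    """Map each question to the labels it is missing; fully covered questions are omitted."""
    seen = defaultdict(set)
    for question, label in rows:
        seen[question].add(label)
    missing = {}
    for question, labels in seen.items():
        absent = sorted(set(all_labels) - labels)
        if absent:
            missing[question] = absent
    return missing
```

With the DataFrame above, the pairs could be supplied as `zip(shuffled_df["original_question"], shuffled_df["label"])`.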


In [26]:
shuffled_df.to_csv('./datasets/bart_data.csv')

## Validation dataset

### Entailment


In [5]:
system_prompt = f"""
You are a news generator. Your task is to generate 5 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question, revolve the claims around this question. Make sure they agree exactly with the question but make it vary
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_claims = set_role(system_prompt, set_json=True)

# Validation questions used to generate entailment claims
hypothesis_questions = [
    "Did the S&P 500 index close above 4,800 points at the end of 2024?",
    "Was the average rent for a one-bedroom apartment in New York City $3,900 in 2023?",
    "Did global oil production exceed 93 million barrels per day in 2024?",
    "Did Amazon's carbon footprint reach 71.5 million metric tons in 2023?",
    "Were more than 230 million COVID-19 vaccine doses administered in Africa by 2024?",
    "Did the unemployment rate in South Africa stay above 32% in 2023?",
    "Did the global semiconductor market reach $600 billion in revenue in 2024?",
    "Was the literacy rate in Afghanistan below 40% in 2023?",
    "Did the number of international students in Canada surpass 900,000 in 2024?",
    "Did Netflix gain over 15 million new subscribers in Q1 2024?",
    "Was the average sea level 98 millimeters higher in 2023 compared to 1993?",
    "Did Google spend $28 billion on R&D in 2023?",
    "Did the number of billionaires worldwide reach 2,800 in 2024?",
    "Did Malaysia generate over 60% of its electricity from renewable sources in 2023?",
    "Did the global fashion industry contribute 10% of annual carbon emissions in 2024?"
]


claims_ttl = []

# Define the same wrapper
def fetch_claims(question):
    try:
        result = get_claims(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        claims_ttl.append(future.result())

rows = []
for entry in claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
val_sim_df = pd.DataFrame(rows)
val_sim_df['label'] = 'entailment' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(val_sim_df.shape)
val_sim_df.head()

Fetching claims: 100%|██████████| 15/15 [00:09<00:00,  1.57it/s]

(75, 3)





| | original_question | claim | label |
|---|---|---|---|
| 0 | Did Amazon's carbon footprint reach 71.5 million metric tons in 2023? | In 2023, Amazon's carbon footprint was reported to be approximately 71.5 million metric tons. | entailment |
| 1 | Did Amazon's carbon footprint reach 71.5 million metric tons in 2023? | The total carbon emissions attributed to Amazon in 2023 reached around 71.5 million metric tons. | entailment |
| 2 | Did Amazon's carbon footprint reach 71.5 million metric tons in 2023? | Amazon's environmental report for 2023 indicated a carbon footprint of about 71.5 million metric tons. | entailment |
| 3 | Did Amazon's carbon footprint reach 71.5 million metric tons in 2023? | According to estimates, Amazon's carbon footprint for the year 2023 was around 71.5 million metric tons. | entailment |
| 4 | Did Amazon's carbon footprint reach 71.5 million metric tons in 2023? | In its 2023 sustainability update, Amazon disclosed a carbon footprint of approximately 71.5 million metric tons. | entailment |
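
The `fetch_claims` wrapper returns an entry with an `"error"` key when a request fails, and the flattening loop then skips that entry silently because it has no `"claims"` list. A minimal sketch of how such entries could be separated out for a retry is shown below. The helper name `split_results` and the sample entries are hypothetical and only illustrate the shape of the data.

```python
def split_results(entries):
    """Separate usable entries from failed or malformed ones."""
    ok, failed = [], []
    for entry in entries:
        # An entry is usable only if it has no error and "claims" is a list
        if "error" in entry or not isinstance(entry.get("claims"), list):
            failed.append(entry)
        else:
            ok.append(entry)
    return ok, failed


# Hypothetical entries in the same shape that fetch_claims returns
sample = [
    {"original": "Q1?", "claims": ["a", "b"]},
    {"original": "Q2?", "error": "timeout"},
    {"original": "Q3?", "claims": "not a list"},
]

ok, failed = split_results(sample)
retry_questions = [entry["original"] for entry in failed]
print(len(ok), retry_questions)
```

The questions in `retry_questions` could be passed through the executor again before the DataFrame is built, so that a transient API error does not shrink one class unnoticed.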


### Contradiction


In [6]:
system_prompt = f"""
You are a news generator. Your task is to generate 10 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question; build every claim around it. Each claim must state a number that contradicts the question (for an "over X" question, a number below X), and the numbers and phrasing should vary across claims.
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_contradict = set_role(system_prompt, set_json=True)
# Questions used to generate the contradiction claims
hypothesis_questions = [
    "Did the average internet speed in South Korea exceed 100 Mbps in 2024?",
    "Was the population of the United States approximately 334 million in 2023?",
    "Did Twitter's daily active user count drop below 190 million in Q2 2024?",
    "Did the global electric vehicle market grow by 32% in 2023?",
    "Did the UK record over 45,000 tech-related job layoffs in 2024?",
    "Did the fertility rate in Japan fall to 1.26 births per woman in 2023?",
    "Did Microsoft invest more than $13 billion in OpenAI by 2024?",
    "Was the deforestation rate in Indonesia reduced to under 1 million hectares in 2023?",
    "Did France generate 72% of its electricity from nuclear energy in 2023?",
    "Was the average global life expectancy 73.4 years in 2024?",
    "Did India install over 12 GW of solar power capacity in 2023?",
    "Was the consumer price index (CPI) inflation rate in Argentina above 90% in 2024?",
    "Did the number of space launches globally exceed 200 in 2023?",
    "Did over 65% of Gen Z adults in the U.S. use TikTok as a news source in 2024?",
    "Was the global food waste volume estimated at 1.3 billion tonnes in 2024?"
]
contra_claims_ttl = []

# Wrap the API call so that one failed request is recorded instead of aborting the batch
def fetch_claims(question):
    try:
        result = get_contradict(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        contra_claims_ttl.append(future.result())

# Flatten the results into one row per claim; failed entries have no "claims" key and are skipped
rows = []
for entry in contra_claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
val_contra_df = pd.DataFrame(rows)
val_contra_df['label'] = 'contradiction' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(val_contra_df.shape)
val_contra_df.head()

Fetching claims: 100%|██████████| 15/15 [00:14<00:00,  1.01it/s]

(150, 3)





Unnamed: 0,original_question,claim,label
| | original_question | claim | label |
|---|---|---|---|
| 0 | Did the UK record over 45,000 tech-related job layoffs in 2024? | The UK saw approximately 42,000 tech-related job layoffs in the first half of 2024. | contradiction |
| 1 | Did the UK record over 45,000 tech-related job layoffs in 2024? | Reports indicate that the UK experienced 48,500 tech-related job layoffs in 2024, surpassing initial estimates. | contradiction |
| 2 | Did the UK record over 45,000 tech-related job layoffs in 2024? | In 2024, the number of tech-related job layoffs in the UK was reported to be around 39,000. | contradiction |
| 3 | Did the UK record over 45,000 tech-related job layoffs in 2024? | Analysts estimate that the UK recorded 46,200 tech-related job layoffs by the end of 2024. | contradiction |
| 4 | Did the UK record over 45,000 tech-related job layoffs in 2024? | Data from industry sources suggest that the UK had 50,000 tech-related job layoffs in early 2024. | contradiction |
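
Three of the five rows shown above (48,500, 46,200 and 50,000) give figures above 45,000, so they agree with an "over 45,000" question even though they are labelled `contradiction`. A rough way to flag such rows for review is sketched below. The helper names, the keyword list and the rule for dropping years are all assumptions; the check ignores units such as "million" and only applies to questions phrased with "over", "more than", "above", "exceed" or "surpass".

```python
import re

# Hypothetical pattern: integers, comma-grouped numbers and decimals
NUMBER = r"\d+(?:,\d{3})*(?:\.\d+)?"
NUMBER_RE = re.compile(NUMBER)
THRESHOLD_RE = re.compile(
    r"\b(?:over|more than|above|exceed|surpass)\s+[$€]?(" + NUMBER + ")",
    re.IGNORECASE,
)


def claim_values(text):
    """Return the numbers in text, skipping bare four-digit years."""
    values = []
    for token in NUMBER_RE.findall(text):
        is_year = token.isdigit() and 1900 <= int(token) <= 2100
        if not is_year:
            values.append(float(token.replace(",", "")))
    return values


def agrees_with_over_question(question, claim):
    """True if the first number in the claim exceeds the question's threshold."""
    match = THRESHOLD_RE.search(question)
    if match is None:
        return False  # not an "over X" question, so the heuristic does not apply
    threshold = float(match.group(1).replace(",", ""))
    values = claim_values(claim)
    return bool(values) and values[0] > threshold


question = "Did the UK record over 45,000 tech-related job layoffs in 2024?"
claims = [
    "The UK saw approximately 42,000 tech-related job layoffs in the first half of 2024.",
    "Reports indicate that the UK experienced 48,500 tech-related job layoffs in 2024, surpassing initial estimates.",
    "In 2024, the number of tech-related job layoffs in the UK was reported to be around 39,000.",
    "Analysts estimate that the UK recorded 46,200 tech-related job layoffs by the end of 2024.",
    "Data from industry sources suggest that the UK had 50,000 tech-related job layoffs in early 2024.",
]

flagged = [i for i, claim in enumerate(claims) if agrees_with_over_question(question, claim)]
print(flagged)
```

Flagged rows are candidates for relabelling or removal rather than certain errors, since a regex cannot judge time periods or units.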


### Neutral

In [7]:
system_prompt = f"""
You are a news generator. Your task is to generate 10 factual-sounding news-like sentences involving specific numbers (e.g., population, revenue, casualties, stock prices). 
You will be given a question; build the claims around its background topic. The claims may share the topic but must not address the quantity the question asks about.
For example, if the question is about a company's revenue, do not mention revenue, but you may mention other facts about the company.
Mix sentences that contain numbers with sentences that do not.
Each sentence must be plausible, concise, and in a neutral tone as if from a report.

Examples:
- "The GDP of Germany reached 4.2 trillion dollars in 2023."
- "Singapore recorded 3,102 dengue cases in the first quarter of the year."
- "Tesla sold over 420,000 units globally in the second quarter of 2022."

Return the output as a JSON array under key "claims".
"""

get_neutral = set_role(system_prompt, set_json=True)
# Questions used to generate the neutral claims
hypothesis_questions = [
    "Did the average retirement age in Germany reach 65.7 years in 2023?",
    "Was the global market for AI hardware valued at $42.4 billion in 2024?",
    "Did China install more than 55 GW of wind power capacity in 2023?",
    "Did the number of active freelancers in the U.S. surpass 70 million in 2024?",
    "Was the global ocean plastic pollution estimated at 11 million metric tons in 2023?",
    "Did the European Union allocate over €300 billion for green infrastructure by 2024?",
    "Did Apple sell more than 85 million iPhones globally in Q4 2023?",
    "Was the average daily screen time for teenagers above 7 hours in 2023?",
    "Did the percentage of cashless transactions in Sweden exceed 93% in 2024?",
    "Was the average hospital stay in the U.S. reduced to 4.8 days in 2023?",
    "Did the number of electric buses in China reach 600,000 by the end of 2024?",
    "Was the global cybersecurity market worth over $210 billion in 2024?",
    "Did the unemployment rate in the Eurozone fall to 6.4% in mid-2024?",
    "Did Canada’s greenhouse gas emissions fall below 700 million tonnes in 2023?",
    "Was the percentage of remote jobs in the U.S. tech sector over 28% in 2024?"
]


neutral_claims_ttl = []

# Wrap the API call so that one failed request is recorded instead of aborting the batch
def fetch_claims(question):
    try:
        result = get_neutral(question)
        result['original'] = question
        return result
    except Exception as e:
        return {"original": question, "error": str(e)}

# Run with tqdm-enabled progress bar
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(fetch_claims, q) for q in hypothesis_questions]
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching claims"):
        neutral_claims_ttl.append(future.result())

# Flatten the results into one row per claim; failed entries have no "claims" key and are skipped
rows = []
for entry in neutral_claims_ttl:
    original = entry.get("original", "")
    claims = entry.get("claims", [])
    for claim in claims:
        rows.append({"original_question": original, "claim": claim})

# Create the DataFrame
val_neutral_df = pd.DataFrame(rows)
val_neutral_df['label'] = 'neutral' 
pd.set_option("display.max_colwidth", None)   # Do not truncate text in cells
print(val_neutral_df.shape)
val_neutral_df.head()

Fetching claims: 100%|██████████| 15/15 [00:16<00:00,  1.10s/it]

(150, 3)





Unnamed: 0,original_question,claim,label
| | original_question | claim | label |
|---|---|---|---|
| 0 | Did the average retirement age in Germany reach 65.7 years in 2023? | The average retirement age in Germany was reported to be 65.7 years in 2023. | neutral |
| 1 | Did the average retirement age in Germany reach 65.7 years in 2023? | Germany's population aged 65 and older is projected to reach 21 million by 2030. | neutral |
| 2 | Did the average retirement age in Germany reach 65.7 years in 2023? | In 2022, approximately 30% of German workers were over the age of 55. | neutral |
| 3 | Did the average retirement age in Germany reach 65.7 years in 2023? | The life expectancy in Germany increased to 81.2 years in 2023. | neutral |
| 4 | Did the average retirement age in Germany reach 65.7 years in 2023? | Germany's workforce participation rate for individuals aged 60 and above was 25% in 2023. | neutral |
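
Row 0 above restates the question's own figure (65.7 years), which makes it an entailment rather than a neutral example. One rough screen for this kind of leakage is to flag neutral claims that share a non-year number with the question, as sketched below. The helper names and the year filter are assumptions, and a shared number is only a hint that the row needs a manual look.

```python
import re

# Hypothetical pattern: integers, comma-grouped numbers and decimals
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def non_year_numbers(text):
    """Return the set of numbers in text, skipping bare four-digit years."""
    numbers = set()
    for token in NUMBER_RE.findall(text):
        is_year = token.isdigit() and 1900 <= int(token) <= 2100
        if not is_year:
            numbers.add(float(token.replace(",", "")))
    return numbers


def shares_number(question, claim):
    """True if the claim repeats any non-year number from the question."""
    return bool(non_year_numbers(question) & non_year_numbers(claim))


question = "Did the average retirement age in Germany reach 65.7 years in 2023?"
claims = [
    "The average retirement age in Germany was reported to be 65.7 years in 2023.",
    "Germany's population aged 65 and older is projected to reach 21 million by 2030.",
    "In 2022, approximately 30% of German workers were over the age of 55.",
    "The life expectancy in Germany increased to 81.2 years in 2023.",
    "Germany's workforce participation rate for individuals aged 60 and above was 25% in 2023.",
]

leaky = [i for i, claim in enumerate(claims) if shares_number(question, claim)]
print(leaky)
```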


### Concatenate, shuffle, and save to CSV


In [8]:
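# Stack the entailment, contradiction and neutral sets into one frame
val_combined_df = pd.concat([val_sim_df, val_contra_df, val_neutral_df], ignore_index=True)
# Shuffle the rows with a fixed seed so the order is reproducible
val_shuffled_df = val_combined_df.sample(frac=1, random_state=42).reset_index(drop=True)

# Skip the index so that reloading the file does not add an "Unnamed: 0" column
val_shuffled_df.to_csv('bart_val_data.csv', index=False)
val_shuffled_df.head()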
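
Before tuning, the string labels have to be turned into integer class ids and the class balance is worth a quick check. The sketch below uses only the standard library and a tiny in-memory stand-in for `bart_val_data.csv`. The mapping follows the label order published for the `facebook/bart-large-mnli` checkpoint (0 contradiction, 1 neutral, 2 entailment), but it is an assumption here and should be confirmed against `label2id` in the config of whichever checkpoint is actually loaded.

```python
import csv
import io
from collections import Counter

# Assumed mapping; verify against the loaded model's config.label2id
LABEL2ID = {"contradiction": 0, "neutral": 1, "entailment": 2}

# Tiny hypothetical stand-in for the saved CSV, with the same three columns
sample_csv = io.StringIO(
    "original_question,claim,label\n"
    "Q1?,claim a,entailment\n"
    "Q1?,claim b,contradiction\n"
    "Q2?,claim c,neutral\n"
    "Q2?,claim d,contradiction\n"
)

rows = list(csv.DictReader(sample_csv))
for row in rows:
    # A KeyError here would point at an unexpected label string
    row["label_id"] = LABEL2ID[row["label"]]

label_counts = Counter(row["label"] for row in rows)
print(label_counts)
print([row["label_id"] for row in rows])
```

With pandas, the same mapping can be applied as `val_shuffled_df["label"].map(LABEL2ID)`.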