# Example pipeline for fake news classification model
Model link: https://huggingface.co/NoAtmosphere0/Roberta-large-fc

Input: 
- claim (str): the claim to be classified
- evidence (list[str]): a list of evidence passages that support or refute the claim

Output:
- label (str): the predicted label of the claim, either "false", "half-true", or "true"

In [17]:
example_claim = "The earth is flat." # Label: 0 (False)

example_evidences = [
    "The documentary 'Behind the Curve' follows several Flat Earth advocates, documenting experiments they conducted to prove their claims. Ironically, many of their tests yielded results consistent with a spherical Earth, sparking debates among their community.",
    "The Flat Earth Society continues to argue that the Earth is a flat disk, emphasizing that the horizon always appears level, regardless of altitude. Their members cite videos and photos taken at high altitudes, claiming no curvature is visible.",
    "Recent satellite imagery and GPS technology rely on the Earth's spherical shape to function accurately. Scientists point to these systems as definitive evidence of Earth's roundness, highlighting that flat models cannot explain global positioning.",
    "In a recent poll, 2% of respondents claimed they believe the Earth is flat, with many attributing their views to skepticism of mainstream science. However, the majority, 88%, affirmed the Earth is round, citing educational materials and personal observations."
]

In [21]:
def format_input(claim: str, evidences: list[str]) -> str:
    """
    Formats a claim and its evidence into a single input string for the fake news classifier.

    The format looks like

    ```python
    Claim: {claim} </s> \n Explanation: {evidence1} {evidence2} {evidence3} {evidence4}
    ```

    Args:
        claim (str): The claim to be classified.
        evidences (list[str]): The evidence passages that support or refute the claim.

    Returns:
        str: The formatted input string.
    """
    # Join all evidence passages with single spaces after the "Explanation:" marker.
    return f"Claim: {claim} </s> \n Explanation: {' '.join(evidences)}"
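As a quick sanity check of the template described in the docstring, the sketch below repeats `format_input` so that it runs on its own and prints the resulting string for a toy input. The evidence strings are made-up placeholders for illustration, not real evidence.

```python
def format_input(claim: str, evidences: list[str]) -> str:
    # Same template as the cell above, repeated so this snippet runs standalone.
    return f"Claim: {claim} </s> \n Explanation: {' '.join(evidences)}"


# Hypothetical toy inputs, for illustration only.
toy_claim = "The earth is flat."
toy_evidences = ["Evidence one.", "Evidence two."]

formatted = format_input(toy_claim, toy_evidences)

# repr() makes the embedded newline after "</s>" visible.
print(repr(formatted))
```

Note that the evidence passages are joined with single spaces, so the model sees them as one continuous explanation rather than as separate items.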


In [19]:
# Use a pipeline as a high-level helper
from transformers import pipeline
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipe = pipeline(model="NoAtmosphere0/Roberta-large-fc", device=device)

# Classify the example claim
result = pipe(format_input(example_claim, example_evidences))

# Print the result
print(result)



[{'label': 'false', 'score': 0.9998418092727661}]
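The pipeline returns a list with one dictionary per input, as the output above shows. A minimal sketch of pulling the label and score out of that structure follows; `extract_label` and `EXPECTED_LABELS` are hypothetical names introduced here, and the sample result is copied from the output above so the snippet runs without loading the model.

```python
# Labels listed in the Output section of this notebook.
EXPECTED_LABELS = {"false", "half-true", "true"}


def extract_label(result: list[dict]) -> tuple[str, float]:
    """Hypothetical helper: return (label, score) for a single-input pipeline result."""
    top = result[0]  # one dictionary per input; a single input was passed
    label = top["label"]
    if label not in EXPECTED_LABELS:
        raise ValueError(f"Unexpected label: {label!r}")
    return label, float(top["score"])


# Sample result copied from the cell output above.
sample_result = [{'label': 'false', 'score': 0.9998418092727661}]

label, score = extract_label(sample_result)
print(f"{label} ({score:.4f})")  # prints: false (0.9998)
```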


# Example pipeline for hallucination checker
Model link: https://huggingface.co/vectara/hallucination_evaluation_model



In [20]:
from transformers import AutoModelForSequenceClassification

pairs = [ # Test data, List[Tuple[str, str]]
    ("The capital of France is Berlin.", "The capital of France is Paris."), # factual but hallucinated
    ('I am in California', 'I am in United States.'), # Consistent
    ('I am in United States', 'I am in California.'), # Hallucinated
    ("A person on a horse jumps over a broken down airplane.", "A person is outdoors, on a horse."),
    ("A boy is jumping on skateboard in the middle of a red bridge.", "The boy skates down the sidewalk on a red bridge"),
    ("A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond man wearing a brown shirt is reading a book."),
    ("Mark Wahlberg was a fan of Manny.", "Manny was a fan of Mark Wahlberg.")
]

# Step 1: Load the model
model = AutoModelForSequenceClassification.from_pretrained(
    'vectara/hallucination_evaluation_model', trust_remote_code=True)

# Step 2: Use the model to predict
model.predict(pairs)  # Note: call predict() here; do not call model(pairs) directly.
# tensor([0.0111, 0.6474, 0.1290, 0.8969, 0.1846, 0.0050, 0.0543])

You are using a model of type HHEMv2Config to instantiate a model of type HHEMv2. This is not supported for all configurations of models and can yield errors.


tensor([0.0111, 0.6474, 0.1290, 0.8969, 0.1846, 0.0050, 0.0543])
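The scores are easier to read once they are mapped to labels. The sketch below assumes that a score near 1 means the second text of a pair is consistent with the first and a score near 0 means it is hallucinated, which agrees with the inline comments on the first three pairs. The 0.5 cutoff and the name `label_scores` are assumptions made for illustration, so the threshold should be tuned for the task at hand. The scores are copied from the output above as a plain list, so the snippet runs without loading the model.

```python
# Scores copied from the output above; with the real model this would be
# model.predict(pairs).tolist()
scores = [0.0111, 0.6474, 0.1290, 0.8969, 0.1846, 0.0050, 0.0543]

THRESHOLD = 0.5  # assumed cutoff, not a value taken from this notebook


def label_scores(scores: list[float], threshold: float = THRESHOLD) -> list[str]:
    """Hypothetical helper: map each consistency score to a readable label."""
    return ["consistent" if s >= threshold else "hallucinated" for s in scores]


labels = label_scores(scores)
for s, lab in zip(scores, labels):
    print(f"{s:.4f} -> {lab}")
```

With this cutoff, only the second and fourth pairs are labeled consistent; the remaining five fall below 0.5.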